From: H. Peter Anvin on 8 Mar 2010 16:20

On 03/08/2010 12:19 PM, Martin K. Petersen wrote:
>>>>>> "hpa" == H Peter Anvin <hpa(a)zytor.com> writes:
>
> hpa> On the flipside, though, there really is very little net benefit to
> hpa> 4K as opposed to 512 byte logical sectors: the additional protocol
> hpa> overhead is relatively minimal, and as long as writes are aligned
> hpa> full blocks, there shouldn't be any additional overhead on either
> hpa> the OS or the drive side. On the plus side, you get full
> hpa> compatibility with the existing software stack. The equation
> hpa> really seems rather simple.
>
> 4KB sectors are not a win for anybody except the drive vendors.
>

Obviously. However, larger physical storage unit sizes -- 4K for spinning media, but frequently much larger for flash, for example -- are already in wide use, and having a huge mishmash of logical block sizes isn't going to work very well.

> There is a push in the industry right now to keep the 512-byte logical
> blocks forever. The first step would be to report misaligned accesses
> or accesses that are not a multiple of the physical block size. Second
> step would be to eventually reject any write that's not a properly
> aligned multiple of the physical block size.

I personally suspect that that is the way it is going to go, rather than trying to change the software ecosystem to a different logical block size. It has been tried in the past and failed, with the sole exception of CD-ROMs, pretty much.

	-hpa

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
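
The condition discussed in the message above (a write that starts on a physical block boundary and covers whole physical blocks) can be sketched in a few lines. This is only an illustration, not code from the kernel or from any drive firmware; the function name and the 4096-byte default are assumptions made for the example.

```python
def is_aligned_write(offset_bytes: int, length_bytes: int,
                     physical_block_size: int = 4096) -> bool:
    """Return True if a write starts on a physical block boundary
    and its length is a whole number of physical blocks.

    All names and the 4096-byte default are hypothetical; a real
    device reports its own physical block size.
    """
    if physical_block_size <= 0:
        raise ValueError("physical_block_size must be positive")
    if offset_bytes < 0 or length_bytes <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    starts_on_boundary = offset_bytes % physical_block_size == 0
    covers_whole_blocks = length_bytes % physical_block_size == 0
    return starts_on_boundary and covers_whole_blocks
```

A 4096-byte write at offset 512 fails the first test, and a 512-byte write at offset 4096 fails the second; on a drive with 4 KiB physical sectors either one would force the drive to read, modify and rewrite a physical sector.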
From: Tejun Heo on 8 Mar 2010 21:30

Hello,

On 03/09/2010 05:12 AM, H. Peter Anvin wrote:
> Please correct the following bit in C-3:
>
> "A different partition format - GPT[6] - should be used beyond 2^32
> sectors, which could harm compatibility with older BIOSs or other
> operating systems which don't recognize the new format."
>
> BIOS does not care about the partition table format. There might be
> issues with > 2^32 sectors for BIOSes (e.g. truncating sector counts),
> but that would be unrelated.

Updated to,

  This might also be beneficial for operating systems which don't suffer from this limitation. A different partition format - GPT[6] - should be used beyond 2^32 sectors, which could harm compatibility with other operating systems which don't recognize the new format.

Thanks.

--
tejun
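
For reference, the 2^32-sector boundary mentioned above translates into a capacity that depends on the logical sector size. The arithmetic below is a minimal sketch that assumes sector numbers are limited to 32 bits, as in the DOS partition table; the function name is made up for the example.

```python
TIB = 2 ** 40  # bytes in one tebibyte


def capacity_at_32bit_sector_limit(logical_sector_size: int) -> int:
    """Bytes covered by 2^32 logical sectors of the given size.

    Hypothetical helper: it only multiplies the sector count by the
    sector size and does not model any particular partitioning tool.
    """
    if logical_sector_size <= 0:
        raise ValueError("logical_sector_size must be positive")
    return (2 ** 32) * logical_sector_size


if __name__ == "__main__":
    for size in (512, 4096):
        limit = capacity_at_32bit_sector_limit(size)
        print(f"{size}-byte logical sectors: {limit // TIB} TiB")
```

With 512-byte logical sectors the boundary is 2 TiB; with 4096-byte logical sectors the same 32-bit sector number reaches 16 TiB.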
From: Tejun Heo on 8 Mar 2010 21:40

Hello,

On 03/09/2010 04:58 AM, Karel Zak wrote:
>> Tejun> Reportedly, commonly used partitioners aren't ready to handle
>> Tejun> drives larger than 2 TiB in any configuration and alignment isn't
>
> The limit is specific for DOS partition table (with 512-byte log.
> sectors), but for example GPT uses 64-bit LBA. I believe that our
> partitioning tools don't introduce any other restriction.

Hmmm... the 'reportedly' was from Daniel Taylor or maybe I just misinterpreted the conversation. Daniel, can you please fill in?

>> Tejun> done properly for drives with 4 KiB physical sectors. 4 KiB
>> Tejun> logical sector support is broken in both the kernel
>>
>> Huh, what? My homedir is on a 4KiB LBS/PBS drive and has been for ~2
>> years.

By default, they aren't aligned properly, are they?

>> Tejun> (need more details and probably a whole section on partitioner
>> Tejun> behaviors)
>>
>> I'm Cc:'ing Karel Zak and Jim Meyering who have been doing all the
>> alignment work for fdisk and parted respectively. Karel, Jim: The full
>> writeup is here:
>>
>> http://ata.wiki.kernel.org/index.php/ATA_4_KiB_sector_issues
>>
>> It'd be great if you guys could share what you have been doing to the
>> tooling.
>
> small summary:
>
> - libblkid provides unified API to topology information, it supports:
>    - ioctls (kernel >= 2.6.32)
>    - sysfs (kernel >= 2.6.31)
>    - stripe chunk size and stripe width for DM, MD, LVM and evms on
>      old kernels
> - libparted and fdisk are linked against libblkid
>
> - fdisk supports 4KiB logical sector size (util-linux-ng >= 2.15)
> - fdisk supports 4KiB physical sector size (util-linux-ng >= 2.17)
> - fdisk uses 1MiB alignment (or more if optimal I/O size is bigger)
>   and alignment_offset for all partitions in non-DOS mode
>   (util-linux-ng >= 2.17.1)

That's great. Daniel, maybe you were testing older versions? Or maybe those failures were manifested from libata mishandling 4KiB r/w requests.

> - parted supports 4KiB physical sector size
> - parted uses 1MiB alignment for disks with unknown topology, disks
>   with topology information are aligned to optimal (or minimum) I/O
>   size (parted >= 2.1)

This will result in incorrect alignment for drives which lie about the physical sector size to work around BIOS/driver issues (C-1). It would probably be best to align to at least 1MiB.

> - EFI GPT code in the kernel has been updated to work properly with
>   4KiB sectors (kernel >= 2.6.33)

libata is broken for logical 4KiB ATA devices tho. I'll fix it up.

> - mkfs.{ext,xfs,gfs2,ocfs2} have been updated to work properly with
>   topology information, mkfs.{ext,xfs} are linked against libblkid
>   for compatibility with old kernel (for stripe chunk size / width)
>
> - Fedora-13/RHEL6 installer uses libparted with 4KiB support
>
> - alignment_offset & 4KiB support is planned for LUKS (cryptsetup)
>
>> Tejun> Unfortunately, the transition to 4 KiB sector size, physical only
>> Tejun> or logical too, is looking fairly ugly. Hopefully, a reasonable
>> Tejun> solution can be reached in not too distant future but even with
>> Tejun> all the software side updated, it looks like it's gonna cause
>> Tejun> significant amount of confusion and frustration.
>>
>> With regards to XP compatibility I don't think we should go too much out
>> of our way to accommodate it. XP has been disowned by its master and I
>> think virtualization will take care of the rest.

Yeah, good point. I'm just a bit worried that it might generate a lot of frustrated bug reports. Well, maybe we should just advise users to install Windows first and then install Linux.

>> FWIW, recent fdisk has a command line flag that will enable/disable DOS
>> compatible layout.
>
> yes, util-linux-ng 2.17.1, fdisk -c
>
> Note that non-DOS mode will be default in the next major
> util-linux-ng release.

I'll try to merge this information into the ata-4k doc. Thank you very much.

--
tejun
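
The alignment policy summarised in the message above (1MiB granularity, or more if the optimal I/O size is bigger, shifted by alignment_offset) can be sketched as follows. This is not the fdisk or parted implementation. The function and parameter names are hypothetical, and alignment_offset_bytes is assumed to be the byte position of the first naturally aligned boundary on the device.

```python
MIB = 1024 * 1024


def aligned_start_sector(min_start_sector: int,
                         logical_sector_size: int = 512,
                         optimal_io_size: int = 0,
                         alignment_offset_bytes: int = 0) -> int:
    """Return the first sector >= min_start_sector that sits on an
    alignment boundary.

    Boundaries are assumed to lie at
    alignment_offset_bytes + k * granularity, where granularity is
    1 MiB or the optimal I/O size, whichever is larger.
    """
    granularity = max(MIB, optimal_io_size)
    if granularity % logical_sector_size:
        raise ValueError("granularity must be a multiple of the sector size")
    if alignment_offset_bytes % logical_sector_size:
        raise ValueError("alignment offset must be a multiple of the sector size")

    start_bytes = min_start_sector * logical_sector_size
    # Distance past the previous aligned boundary (0 if already aligned).
    remainder = (start_bytes - alignment_offset_bytes) % granularity
    if remainder:
        start_bytes += granularity - remainder
    return start_bytes // logical_sector_size
```

Under these assumptions the traditional DOS start at sector 63 moves up to sector 2048 on a drive with 512-byte logical sectors. A drive whose first aligned boundary is 3584 bytes into the device (logical sector 7) gets sector 2055 when the partition may not start before sector 2048.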
From: Tejun Heo on 8 Mar 2010 21:50

Hello,

On 03/09/2010 12:18 AM, Martin K. Petersen wrote:
>>>>>> "Tejun" == Tejun Heo <tj(a)kernel.org> writes:
> Tejun> Please note that hdparm is misreporting the alignment offset. It
> Tejun> should be reporting 512 instead of 256 for offset-by-one drives.
>
> Already fixed. Your hdparm must be old.

Yeah, I know Mark fixed it but couldn't find where the tree was. SF only had old releases, so...

(other stuff replied further down the thread)

Thanks.

--
tejun
From: Jeff Garzik on 8 Mar 2010 21:50

On 03/08/2010 09:34 PM, Tejun Heo wrote:
> libata is broken for logical 4KiB ATA devices tho. I'll fix it up.

Does libata-dev.git#sectsize miss any details?

	Jeff