From: Tejun Heo on 8 Mar 2010 21:50

Hello, again.

On 03/09/2010 11:34 AM, Tejun Heo wrote:
>> - parted uses 1MiB alignment for disks with unknown topology, disks
>>   with topology information are aligned to optimal (or minimum) I/O
>>   size (parted >= 2.1)
>
> This will result in incorrect alignment for drives which lie about the
> physical sector size to work around BIOS/drivers issues (C-1). It
> would probably be best to align to at least 1MiB.

I misread it. C-1 would be disks w/o alignment information which will
be aligned to optimal_io_size which again would be 0 and thus 1MiB
alignment. So, this should work, right?

Thanks.

--
tejun
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
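
The fallback described in the message above (an optimal_io_size of 0 leading to 1MiB alignment) can be sketched roughly as follows. This is only an illustration of that reading, not parted's actual code; the names `choose_alignment` and `first_aligned_sector` are hypothetical.

```python
MIB = 1024 * 1024  # the 1MiB default alignment discussed in the thread


def choose_alignment(optimal_io_size: int, default: int = MIB) -> int:
    """Return the partition alignment in bytes.

    Assumption: a device without topology information reports an
    optimal I/O size of 0, in which case the default is used.
    """
    if optimal_io_size < 0:
        raise ValueError("optimal_io_size must not be negative")
    return optimal_io_size if optimal_io_size > 0 else default


def first_aligned_sector(min_start_sector: int,
                         logical_sector_size: int,
                         alignment: int) -> int:
    """Round min_start_sector up to the next alignment boundary.

    The result is expressed in logical sectors.
    """
    if alignment % logical_sector_size != 0:
        raise ValueError("alignment must be a multiple of the sector size")
    step = alignment // logical_sector_size
    # Ceiling division, then scale back to a sector number.
    return -(-min_start_sector // step) * step
```

With 512-byte logical sectors and no topology information, the first usable aligned start is sector 2048, which is the 1MiB boundary.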

From: Tejun Heo on 8 Mar 2010 22:00

Hello,

On 03/09/2010 11:42 AM, Jeff Garzik wrote:
> On 03/08/2010 09:34 PM, Tejun Heo wrote:
>> libata is broken for logical 4KiB ATA devices tho. I'll fix it up.
>
> Does libata-dev.git#sectsize miss any details?

I haven't looked at it yet. I'll review it soon but the thing is
without actual hardware it would be a bit difficult to tell. It's not
only the drivers. I have this mighty unhappy feeling that some
controllers (especially some of the SATA ones with internal state
machine to emulate SFF) would be sniffing the commands and making the
wrong assumption if 4KiB logical sector size is used, so we'll need to
test various controllers. Some PATA-SATA bridge chips will definitely
be having problems too. Then there are the USB and other bridges too
but well those aren't libata's problem at least. :-)

Thanks.

--
tejun

From: Martin K. Petersen on 8 Mar 2010 22:20

>>>>> "Tejun" == Tejun Heo <tj(a)kernel.org> writes:

>>> Huh, what? My homedir is on a 4KiB LBS/PBS drive and has been for
>>> ~2 years.

Tejun> By default, they aren't aligned properly, are they?

Single partition. I did the alignment manually.

Tejun> libata is broken for logical 4KiB ATA devices tho. I'll fix it
Tejun> up.

Matthew implemented support for this a while back...

Tejun> I'm just a bit worried that it might generate a lot of frustrated
Tejun> bug reports. Well, maybe we should just advise users to install
Tejun> windows first and then install Linux.

Unfortunately there is no simple solution given that we can't go back in
time and fix legacy DOS/XP behavior. The 1-alignment jumper (that some
drives have) fixes things for the first partition but will mess up our
alignment for subsequent ones unless the firmware actually reports the
shift.

So no matter what we do the user will have to have a bare minimum of
knowledge about 512-byte LBS/4 KB PBS drives. That sucks. But even
Windows users are presented with extra documentation and alignment
utilities during the transition.

Having a 1 MB alignment by default and hoping that devices that lie will
be 0-aligned is the best we can do, I think.

--
Martin K. Petersen	Oracle Linux Engineering
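
The point about the 1-alignment jumper can be made concrete with a small check. This is a sketch under stated assumptions: `is_physically_aligned` is a hypothetical helper, and `alignment_offset` is taken to mean the byte offset from the start of the device to its first physically aligned position, in the spirit of the Linux sysfs attribute of the same name. The value 3584 (7 logical sectors) for a 1-aligned 512-byte LBS/4 KB PBS drive is an illustrative assumption.

```python
def is_physically_aligned(start_lba: int,
                          logical_sector_size: int,
                          physical_sector_size: int,
                          alignment_offset: int = 0) -> bool:
    """Return True if a partition starting at start_lba begins on a
    physical sector boundary.

    alignment_offset is the reported shift in bytes (0 for a
    0-aligned drive).
    """
    start_bytes = start_lba * logical_sector_size
    return (start_bytes - alignment_offset) % physical_sector_size == 0


# Assumed 1-aligned drive: logical sector 63 sits on a physical boundary.
ONE_ALIGNED_OFFSET = 7 * 512

# The legacy DOS start (LBA 63) is aligned on the jumpered drive ...
legacy_ok = is_physically_aligned(63, 512, 4096, ONE_ALIGNED_OFFSET)
# ... but a 1 MB boundary (LBA 2048) is not, unless the partitioner
# knows about the shift and compensates for it.
one_mb_ok = is_physically_aligned(2048, 512, 4096, ONE_ALIGNED_OFFSET)
```

On a drive that is 0-aligned the situation is reversed: LBA 2048 lands on a physical boundary and LBA 63 does not, which is why an unreported shift defeats the 1 MB default.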

From: Martin K. Petersen on 8 Mar 2010 22:30

>>>>> "Tejun" == Tejun Heo <tj(a)kernel.org> writes:

>> http://people.redhat.com/msnitzer/docs/io-limits.txt

Tejun> Ah... this is great. I'll link the doc and shamelessly steal
Tejun> parts of it if that's okay with you.

There's also this one:

http://oss.oracle.com/~mkp/docs/linux-advanced-storage.pdf

It is more aimed at storage vendors than end users, though.

--
Martin K. Petersen	Oracle Linux Engineering

From: Daniel Taylor on 8 Mar 2010 22:50

-----Original Message-----
From: Tejun Heo [mailto:tj(a)kernel.org]
Sent: Monday, March 08, 2010 6:34 PM
To: Karel Zak
Cc: Martin K. Petersen; linux-ide(a)vger.kernel.org; lkml; Daniel Taylor; Jeff Garzik; Mark Lord; tytso(a)mit.edu; H. Peter Anvin; hirofumi(a)mail.parknet.co.jp; Andrew Morton; Alan Cox; irtiger(a)gmail.com; Matthew Wilcox; aschnell(a)suse.de; knikanth(a)suse.de; jdelvare(a)suse.de; Jim Meyering
Subject: Re: ATA 4 KiB sector issues.

Hello,

On 03/09/2010 04:58 AM, Karel Zak wrote:
>> Tejun> Reportedly, commonly used partitioners aren't ready to handle
>> Tejun> drives larger than 2 TiB in any configuration and alignment
>> Tejun> isn't
>
> The limit is specific for DOS partition table (with 512-byte log.
> sectors), but for example GPT uses 64-bit LBA. I believe that our
> partitioning tools don't introduce any other restriction.

Hmmm... the 'reportedly' was from Daniel Taylor or maybe I just
misinterpreted the conversation. Daniel, can you please fill in?

DLT> The problem that I see is that the installers and upper level applications do not make good choices for partition layout.
DLT> "parted", itself, seems to work OK in the latest version. One of the things I've heard since I started this process is that
DLT> there are some libraries associated with the process of partitioning/formatting. Perhaps the upper layers and those
DLT> libraries aren't synced up?

>> Tejun> done properly for drives with 4 KiB physical sectors. 4 KiB
>> Tejun> logical sector support is broken in both the kernel
>>
>> Huh, what? My homedir is on a 4KiB LBS/PBS drive and has been for ~2
>> years.

By default, they aren't aligned properly, are they?

>> Tejun> (need more details and probably a whole section on partitioner
>> Tejun> behaviors)
>>
>> I'm Cc:'ing Karel Zak and Jim Meyering who have been doing all the
>> alignment work for fdisk and parted respectively. Karel, Jim: The
>> full writeup is here:
>>
>> http://ata.wiki.kernel.org/index.php/ATA_4_KiB_sector_issues
>>
>> It'd be great if you guys could share what you have been doing to the
>> tooling.
>
> small summary:
>
> - libblkid provides unified API to topology information, it supports:
>   - ioctls (kernel >= 2.6.32)
>   - sysfs (kernel >= 2.6.31)
>   - stripe chunk size and stripe width for DM, MD, LVM and evms on
>     old kernels
> - libparted and fdisk are linked against libblkid
>
> - fdisk supports 4KiB logical sector size (util-linux-ng >= 2.15)
> - fdisk supports 4KiB physical sector size (util-linux-ng >= 2.17)
> - fdisk uses 1MiB alignment (or more if optimal I/O size is bigger)
>   and alignment_offset for all partitions in non-DOS mode
>   (util-linux-ng >= 2.17.1)

That's great. Daniel, maybe you were testing older versions? Or maybe
those failures were manifested from libata mishandling 4KiB r/w
requests.

DLT> As I said, above, it could be libraries. I was not aware that so much of the implementation was embedded there.

> - parted supports 4KiB physical sector size
> - parted uses 1MiB alignment for disks with unknown topology, disks
>   with topology information are aligned to optimal (or minimum) I/O
>   size (parted >= 2.1)

This will result in incorrect alignment for drives which lie about the
physical sector size to work around BIOS/drivers issues (C-1). It
would probably be best to align to at least 1MiB.

DLT> Please.

> - EFI GPT code in the kernel has been updated to work properly with
>   4KiB sectors (kernel >= 2.6.33)

libata is broken for logical 4KiB ATA devices tho. I'll fix it up.

> - mkfs.{ext,xfs,gfs2,ocfs2} have been updated to work properly with
>   topology information, mkfs.{ext,xfs} are linked against libblkid
>   for compatibility with old kernel (for stripe chunk size / width)
>
> - Fedora-13/RHEL6 installer uses libparted with 4KiB support
>
> - alignment_offset & 4KiB support is planned for LUKS (cryptsetup)
>
>> Tejun> Unfortunately, the transition to 4 KiB sector size, physical
>> Tejun> only or logical too, is looking fairly ugly. Hopefully, a
>> Tejun> reasonable solution can be reached in not too distant future
>> Tejun> but even with all the software side updated, it looks like
>> Tejun> it's gonna cause significant amount of confusion and frustration.
>>
>> With regards to XP compatibility I don't think we should go too much
>> out of our way to accommodate it. XP has been disowned by its master
>> and I think virtualization will take care of the rest.

Yeah, good point. I'm just a bit worried that it might generate a lot
of frustrated bug reports. Well, maybe we should just advise users to
install windows first and then install Linux.

DLT> Simple reality is that XP is "forever". Drives >2TiB, which may be USB-attached, used with XP will be MBR-partitioned
DLT> and use 4096-byte sectors. We need to be able to read/write those disks on Linux systems.

>> FWIW, recent fdisk has a command line flag that will enable/disable
>> DOS compatible layout.
>
> yes, util-linux-ng 2.17.1, fdisk -c
>
> Note that non-DOS mode will be default in the next major
> util-linux-ng release.

I'll try to merge this information into the ata-4k doc. Thank you
very much.

DLT> One last comment: I just tried to partition and format a >2TiB drive on fully updated Ubuntu 9.10 with GParted.
DLT> I selected not to cylinder align, use GPT and ext3, and to put 1 MiB preceding and following. libparted failed
DLT> with "unable to satisfy all constraints of the partition". Using "parted", I created the partition, and then
DLT> GParted was able to apply the ext3 file system.

--
tejun
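
The 2 TiB figure for the DOS partition table, and the reason 4096-byte logical sectors let MBR-partitioned drives go beyond it, follow from simple arithmetic. The sketch below assumes the usual back-of-the-envelope model in which the DOS partition table stores sector numbers in 32-bit fields; `mbr_max_addressable_bytes` is a hypothetical helper name.

```python
TIB = 2 ** 40        # bytes in one TiB
MBR_LBA_BITS = 32    # width of the sector fields in a DOS partition entry


def mbr_max_addressable_bytes(logical_sector_size: int) -> int:
    """Return the approximate capacity addressable through a DOS
    partition table for the given logical sector size in bytes."""
    if logical_sector_size <= 0:
        raise ValueError("logical_sector_size must be positive")
    return (2 ** MBR_LBA_BITS) * logical_sector_size


limit_512 = mbr_max_addressable_bytes(512) // TIB    # 512-byte sectors
limit_4096 = mbr_max_addressable_bytes(4096) // TIB  # 4096-byte sectors
```

Under this model 512-byte logical sectors give a 2 TiB limit and 4096-byte logical sectors give 16 TiB, while GPT avoids the issue by using 64-bit LBAs.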