From: Rahul on 17 Jan 2010 15:46

I find that I could in theory get a performance boost either by using a RAID5 via mdadm or by striping via LVM. Let's assume redundancy is not a concern, merely performance boosting.

What's the difference between these two approaches, and is one better than the other?

--
Rahul
From: Aragorn on 18 Jan 2010 06:32

On Sunday 17 January 2010 21:46 in comp.os.linux.misc, somebody identifying as Rahul wrote...

> I find that I could in theory get a performance boost either by using
> a RAID5 via mdadm or by striping via LVM. Let's assume redundancy is
> not a concern, merely performance boosting.
>
> What's the difference between these two approaches, and is one better
> than the other?

A true RAID 5 means that you need at least three disks, in which case the data will, per data segment, be striped over two disks, and the third disk will hold a parity block. Distribution of the parity blocks is staircased, meaning that the parity block is put on a different disk in the array for each data segment, like so...

  Data segment    Disk 1      Disk 2      Disk 3
  A               A-1         A-2         A-parity
  B               B-1         B-parity    B-2
  C               C-parity    C-1         C-2
  D               D-1         D-2         D-parity
  E               E-1         E-parity    E-2
  F               F-parity    F-1         F-2
  ...             ...         ...         ...

Writing to a RAID 5 is slower than writing to a single disk because with each write, the parity block must be updated, which means calculating the parity data and writing it to the pertaining disk. Reading from a (non-degraded) RAID 5, however, is fast and comparable to RAID 0, also known as "striping", because the parity blocks need not be read unless the array is running in degraded mode, i.e. with one of the disks failed and the missing data being recalculated from the parity blocks.

A plain stripeset, on the other hand, only requires two disks and simply does what the above does, but without parity blocks. So you'd have a set-up like this...

  Data segment    Disk 1      Disk 2
  A               A-1         A-2
  B               B-1         B-2
  C               C-1         C-2
  ...             ...         ...

In this case, you don't have any redundancy. Writing to the stripeset is faster than writing to a single disk, and the same applies to reading. It's not a 2:1 performance boost, due to the overhead of splitting the data for writes and re-assembling it upon reads, but there is a significant performance improvement, especially if you use more than two disks.

Now, you can use virtually any kind of software RAID set-up with /mdadm/ - including RAID 0 - and things like LVM can offer you a similar set-up. You don't even need either of them if you want to apply striping to swap, because that can be achieved by simply giving two swap partitions on separate disks an equal priority in "/etc/fstab".

If striping without redundancy is what you want, then you can go either way, i.e. RAID 0 via /mdadm/ - or via the older /dmraid/ - or striping implemented at the logical volume management level. The only difference is the layer of the kernel in which this is handled, so whether you set it up via /mdadm/ - or even via /dmraid/ - or via the logical volume manager, it is still software RAID, and I don't think there would be any significant - i.e. humanly noticeable - difference in performance.

There are, however, a few considerations you should take into account with both of these approaches, i.e. that you should not put the filesystem which holds the kernels and /initrd/ - and preferably not the root filesystem either[1] - on a stripe, because the bootloader recognizes neither software RAID nor logical volume management. It's a chicken-and-egg thing: the drivers for LVM and software RAID are in the Linux kernel, so you have to be able to load the Linux kernel before you can make use of those drivers. GRUB does not have any drivers for that, and the way LILO works, it would also not be able to load a kernel off of a striped filesystem.
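To make the comparison concrete, the variants look roughly like this from the command line. The device names (/dev/sdb1, /dev/sdc1 and the swap partitions /dev/sdb3, /dev/sdc3) are only placeholders, and the first two snippets are alternatives for the same pair of partitions, so adjust to your own disks...

  # RAID 0 stripeset via mdadm, with a filesystem on top
  mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/sdb1 /dev/sdc1
  mkfs.ext3 /dev/md0

  # ...or the same idea at the logical volume management level
  pvcreate /dev/sdb1 /dev/sdc1
  vgcreate vg0 /dev/sdb1 /dev/sdc1
  lvcreate --stripes 2 --stripesize 64 --size 100G --name stripe0 vg0
  mkfs.ext3 /dev/vg0/stripe0

  # ...and striped swap needs neither: equal priorities in "/etc/fstab" suffice
  /dev/sdb3  none  swap  sw,pri=1  0 0
  /dev/sdc3  none  swap  sw,pri=1  0 0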
[1] Having the root filesystem on a software RAID stripeset will work only if you have an initrd which contains *all* the required driver modules, since there is no control over the order of the automatic module loading by the kernel itself. It loads the modules according to the hardware it finds, and if it needs a module off of the root filesystem before the RAID or LVM modules have been loaded, then you're foobarred.

--
*Aragorn*
(registered GNU/Linux user #223157)
From: David Brown on 18 Jan 2010 16:13

Rahul wrote:
> I find that I could in theory get a performance boost either by using
> a RAID5 via mdadm or by striping via LVM. Let's assume redundancy is
> not a concern, merely performance boosting.
>
> What's the difference between these two approaches, and is one better
> than the other?

LVM is for logical volume management, mdadm is for administering multiple disk setups (i.e., software raid). LVM /can/ do basic striping, in that if you have two physical volumes allocated to the same volume group, then a logical volume can be striped across the two physical volumes. As another poster has said, you won't notice a performance difference between striping via LVM or mdadm. But you /will/ notice a difference in the administration and commands used - it is more convenient to use mdadm for raid than LVM.

My recommendation is that you use mdadm to create a raid from the raw drives or partitions on the drives, and if you want the volume management features of LVM (I find it very useful), put LVM on top of the mdadm raid.

As for the type of raid to use, that depends on the number of disks you have and the redundancy you want. raid5 is well known to be slower for writing, especially for smaller writes, and it can be risky for large disks in critical applications (since rebuilding takes so long, and wears the other disks). Mirroring is safer, and mdadm can happily do a raid10 (roughly a stripe of mirrors) on any number of disks for high speed and mirrored redundancy.

Booting from raids is complicated, but not as difficult as suggested by another poster. Modern grub can handle a /boot partition on a raid1 or raid0 mdadm setup, although it's a little inconvenient to install - you typically have to run grub manually to install the first-stage bootloader on each disk's boot sector individually.

The last server I configured had three disks. I partitioned each into a small partition (1G) and a big partition (the rest of the disk). The small partitions I joined in an mdadm raid1 (mirror) and use for /boot. The big partitions are in a raid10 mdadm array, used as an LVM physical volume, with logical volumes for various parts of the system and the virtual machines. It will happily run and boot with any one of the drives removed.
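In case it is useful, the rough sequence for that kind of layout looks something like the snippet below. The device names (/dev/sda, /dev/sdb, /dev/sdc) and the sizes are illustrative rather than the exact commands I used, so treat it as a sketch only...

  # small partitions (sdX1) mirrored for /boot
  mdadm --create /dev/md0 --level=1 --raid-devices=3 /dev/sda1 /dev/sdb1 /dev/sdc1
  mkfs.ext3 /dev/md0

  # big partitions (sdX2) as a three-disk raid10
  mdadm --create /dev/md1 --level=10 --raid-devices=3 /dev/sda2 /dev/sdb2 /dev/sdc2

  # LVM on top of the raid10 array
  pvcreate /dev/md1
  vgcreate vg0 /dev/md1
  lvcreate -L 20G -n root vg0
  lvcreate -L 4G -n swap vg0

  # install the first-stage bootloader on each disk's boot sector by hand
  grub
  grub> root (hd0,0)
  grub> setup (hd0)
  grub> root (hd1,0)
  grub> setup (hd1)
  grub> root (hd2,0)
  grub> setup (hd2)
  grub> quit

One thing to check is the raid metadata format for the /boot mirror. Depending on the grub version, it may only be able to read the member partitions as plain filesystems, which requires the raid superblock to sit at the end of the partition (metadata 0.90 or 1.0), so it is worth verifying what your mdadm version creates by default.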
From: Rahul on 19 Jan 2010 02:37

Aragorn <aragorn(a)chatfactory.invalid> wrote in news:hj1gta$2hp$5@news.eternal-september.org:

Thanks for the great explanation!

> Writing to a RAID 5 is slower than writing to a single disk because
> with each write, the parity block must be updated, which means
> calculating the parity data and writing it to the pertaining disk.

This is where I get confused. Is writing to a RAID5 slower than writing to a single disk irrespective of how many disks I throw at the RAID5? I currently have a 7-disk RAID5. Will writing to this be slower than a single disk? Isn't the parity calculation a fairly fast process, especially if one has a hardware-based card? And if the write gets split into 6 parts, shouldn't that speed up the process, since each disk is writing only 1/6th of the chunk?

> In this case, you don't have any redundancy. Writing to the stripeset
> is faster than writing to a single disk, and the same applies to
> reading. It's not a 2:1 performance boost, due to the overhead of
> splitting the data for writes and re-assembling it upon reads, but
> there is a significant performance improvement, especially if you use
> more than two disks.

Why doesn't a similar boost come out of a RAID5 with a large number of disks? Merely because of the parity calculation overhead?

> There are, however, a few considerations you should take into account
> with both of these approaches, i.e. that you should not put the
> filesystem which holds the kernels and /initrd/ - and preferably not
> the root filesystem either[1] - on a stripe, because the bootloader
> recognizes neither software RAID nor logical volume management.

Luckily that is not needed. I have a separate drive to boot from. The RAID is intended only for user /home dirs.

--
Rahul
From: Rahul on 19 Jan 2010 02:44
David Brown <david.brown(a)hesbynett.removethisbit.no> wrote in news:BtOdnakm8taiUsnWnZ2dnUVZ8qOdnZ2d(a)lyse.net:

Thanks, David!

> Rahul wrote:
>
> LVM is for logical volume management, mdadm is for administering
> multiple disk setups (i.e., software raid). LVM /can/ do basic
> striping, in that if you have two physical volumes allocated to the
> same volume group, then a logical volume can be striped across the
> two physical volumes. As another poster has said, you won't notice a
> performance difference between striping via LVM or mdadm. But you
> /will/ notice a difference in the administration and commands used.

Will putting LVM on top of mdadm slow things down? Or does LVM not have a significant performance penalty?

> My recommendation is that you use mdadm to create a raid from the raw
> drives or partitions on the drives, and if you want the volume
> management features of LVM (I find it very useful), put LVM on top of
> the mdadm raid.

This is exactly what I was trying to do. But LVM asks "stripe" or "no stripe". That I wasn't sure about.

> As for the type of raid to use, that depends on the number of disks
> you have and the redundancy you want. raid5 is well known to be
> slower for writing, especially for smaller writes, and it can be
> risky for large disks in critical applications

Maybe if I explain my situation you can offer some more comments. I have 3 physical "storage boxes" (MD-1000's from Dell). Each takes 15 SAS 15k drives of 300 GB each, i.e. I have a total of 45 drives of 300 GB each. Redundancy is important but not critical. Performance is more important.

My original plan was to split each box into two RAID5 arrays of 7 disks each and leave 1 as a hot spare. Thus I get 6 RAID5 arrays in all. They are visible as /dev/sdb, /dev/sdc, etc., but I want to mount a single /home on them. That's where I introduced LVM. But then LVM again introduces a striping option. Should I be striping or not? That's where I am confused about what my best option is (a rough sketch of what I was planning is below). It's hard to balance redundancy, performance and disk capacity. Any other creative options that come to mind?

> (since rebuilding takes so long, and wears the other disks).
> Mirroring is safer, and mdadm can happily do a raid10 (roughly a
> stripe of mirrors) on any number of disks for high speed and mirrored
> redundancy.
>
> Booting from raids is complicated, but not as difficult as suggested

Luckily I don't have to go down that path; I have a separate drive to boot from.

--
Rahul
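In case it makes the plan clearer, this is roughly what I was about to type. I am assuming here that the six arrays show up as /dev/sdb through /dev/sdg (only /dev/sdb and /dev/sdc are named above, so the rest are placeholders), and the two lvcreate lines are the alternatives I am trying to choose between...

  pvcreate /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg
  vgcreate home_vg /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg

  # no striping: LVM simply concatenates the six RAID5 arrays
  lvcreate -l 100%FREE -n home home_vg

  # striping: each write is spread across all six RAID5 arrays
  lvcreate -l 100%FREE -i 6 -I 256 -n home home_vg

  mkfs.ext3 /dev/home_vg/home
  mount /dev/home_vg/home /home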