From: Stan Hoeppner on
Roger Leigh put forth on 7/12/2010 5:45 PM:

> Have a closer look at lvcreate(8). The last arguments are:
>
> [-Z|--zero y|n] VolumeGroupName [PhysicalVolumePath[:PE[-PE]]...]

Good catch. As I said, I've never used it before, so I wasn't exactly sure how
it all fits together. It seemed logical that, when he went from testing the
mdadm device to the lvm volume and saw throughput drop by almost exactly a
factor of 10, a striping issue wrt lvm might be in play.

> AFAICT the striping options are entirely pointless when layered on
> RAID, and could be responsible for the performance issues if it
> can have a negative impact (such as thrashing the disks if you
> tell it to write multiple stripes to a single disc).

I would have thought so as well, but didn't understand the exact function of
-i at the time. I thought it was more like the xfs "-d sw=" switch.
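
For what it's worth, here's a rough, untested sketch of the difference (device
names and sizes are made up): lvcreate's -i/-I options make LVM do the
striping itself, while the xfs su/sw options only describe a stripe that
already exists below the filesystem:

  # LVM doing the striping: 2 stripes of 256 KiB across two PVs
  lvcreate -i 2 -I 256 -L 100G -n lvstriped vg0 /dev/md0 /dev/md1

  # XFS merely being told about an existing md stripe underneath
  # (su = chunk size, sw = number of data disks)
  mkfs.xfs -d su=256k,sw=2 /dev/vg0/lvstriped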

From another post it looks like the OP is making some good progress, although
there are still some minor questions unanswered.

--
Stan


From: Stan Hoeppner on
Aaron Toponce put forth on 7/12/2010 6:56 PM:

> The argument is not whether Linux software RAID 10 is standard or not,
> but the requirement of the number of disks that Linux software RAID
> supports. In this case, it supports 2+ disks, regardless what its
> "effectiveness" is.

Yes, it is the argument. The argument is ensuring _accurate_ information is
presented here for the benefit of others who will go searching for this
information.

The _accurate_ information is that Linux software md RAID 10 on anything less
than 4 disks, or using the md RAID 10 "F2" layout on any number of disks, is
not standard RAID 10. That is a very important distinction to make, and
that's the reason I'm making it. That's what the current "argument" is about.
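
For anyone who finds this thread later, the layout is chosen at array creation
time. A rough sketch (untested, device names are placeholders):

  # classic striped mirrors: 4 disks, near layout with 2 copies
  mdadm --create /dev/md0 --level=10 --layout=n2 --raid-devices=4 /dev/sd[bcde]

  # md's non-standard arrangement: far layout on an odd number of disks
  mdadm --create /dev/md1 --level=10 --layout=f2 --raid-devices=3 /dev/sd[fgh]

Only the first is equivalent to the industry definition of RAID 10.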

I made the statement that you can't run RAID 10 on 3 disks, and I, and the
list, were told that the information I presented was "incorrect". It wasn't
incorrect at all. The information presented in rebuttal is what was
incorrect. I'm setting the record straight.

Now, you can argue what RAID 10 is from now until you are blue in the face,
and the list is tired of hearing it. But that won't change the industry
definition of RAID 10. It's been well documented for over 15 years and won't
be changing any time soon.

--
Stan


From: Stan Hoeppner on
Arcady Genkin put forth on 7/12/2010 12:45 PM:
> I just tried to use LVM for striping the RAID1 triplets together
> (instead of MD). Using the following three commands to create the
> logical volume, I get 550 MB/s sequential read speed, which is quite
> faster than before, but is still 10% slower than what plain MD RAID0
> stripe can do with the same disks (612 MB/s).
>
> pvcreate /dev/md{0,5,1,6,2,7,3,8,4,9}
> vgcreate vg0 /dev/md{0,5,1,6,2,7,3,8,4,9}
> lvcreate -i 10 -I 1024 -l 102390 vg0
>
> test4:~# dd of=/dev/null bs=8K count=2500000 if=/dev/vg0/lvol0
> 2500000+0 records in
> 2500000+0 records out
> 20480000000 bytes (20 GB) copied, 37.2381 s, 550 MB/s
>
> I would still like to know why LVM on top of RAID0 performs so poorly
> in our case.

I'm curious as to why you're (apparently) wasting 2/3 of your storage for
redundancy. Have you considered a straight RAID 10 across those 30
disks/LUNs? Performance should be enhanced by about 50% or more over your
current setup (assuming you're not hitting your ethernet b/w limits
currently), and you'd only be losing half your storage to fault tolerance
instead of 2/3rds of it. RAID 10 has the highest fault tolerance of all
standard RAID levels and higher performance than anything but a straight stripe.
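
Something along these lines, purely as an untested sketch with placeholder
device names standing in for your 30 iSCSI LUNs:

  # one flat RAID 10 across all 30 LUNs, mirrored pairs striped together
  mdadm --create /dev/md0 --level=10 --layout=n2 --chunk=256 \
        --raid-devices=30 /dev/mapper/lun{01..30}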

I'm guessing lvm wouldn't have any problems atop a straight mdadm RAID 10
across those 30 disks. I'm also guessing the previous lvm problem you had was
probably due to running it atop nested mdadm RAID devices. Straight mdadm
RAID 10 doesn't create or use nested devices.
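
In other words, something like this on top of the hypothetical /dev/md0 from
the sketch above (again untested), with no -i/-I striping options since md is
already doing the striping:

  pvcreate /dev/md0
  vgcreate vg0 /dev/md0
  lvcreate -l 100%FREE -n lvol0 vg0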

I'm also curious as to why you're running software RAID at all, given that
pretty much every iSCSI target is itself an array controller with built-in
hardware RAID. Can you tell us a little bit about your iSCSI target devices?

--
Stan


From: Arcady Genkin on
On Mon, Jul 12, 2010 at 20:06, Stan Hoeppner <stan(a)hardwarefreak.com> wrote:

> I had the same reaction Mike.  Turns out mdadm actually performs RAID 1E with
> 3 disks when you specify RAID 10.  I'm not sure what, if any, benefit RAID 1E
> yields here--almost nobody uses it.

The people who are surprised to see us do RAID10 over three devices probably
overlooked that we do RAID10 with a cardinality of 3, which, in combination
with "--layout=n3", is almost equivalent to creating a three-way RAID1
mirror. I say "almost" because it is equivalent in that each of the three
disks is an exact copy of the others, but the difference is in performance.
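
For reference, this is the sort of array we are creating (device names are
placeholders, not our actual iSCSI LUNs):

  mdadm --create /dev/md0 --level=10 --layout=n3 --raid-devices=3 \
        /dev/sda /dev/sdb /dev/sdc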

We found out empirically (and then confirmed by reading a number of posts on
the 'net) that MD does not implement RAID1 in, let's say, the most desirable
way. In particular, it does not make use of the data redundancy for reads
when only one process is doing the reading. In other words, if you have a
three-way RAID1 mirror and only one reader process, MD reads from only one of
the disks, so you get no performance benefit from the mirror. If you have
more than one large read, or more than one process reading, then MD does the
right thing and uses the disks in what seems to be a round-robin fashion (I
may be wrong about this).
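
A quick way to see the effect, assuming a RAID1 array at /dev/md0 (placeholder
name) and bearing in mind that the exact behaviour depends on the kernel
version:

  # one sequential reader: roughly the throughput of a single member disk
  dd if=/dev/md0 of=/dev/null bs=1M count=4096

  # two concurrent readers at different offsets: md can serve them from
  # different mirrors, so aggregate throughput goes up
  dd if=/dev/md0 of=/dev/null bs=1M count=4096 &
  dd if=/dev/md0 of=/dev/null bs=1M count=4096 skip=8192 &
  wait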

When we tried using RAID10 with n=3 instead of RAID1, we saw much better
performance, and we verified that all three disks are bit-for-bit exact
copies.

> I just hope the OP gets prompt and concise drive failure information the
> instant one goes down, and has a tested array rebuild procedure in place.
> Rebuilding a failed drive in this kind of setup may get a bit hairy.

Actually, it's the other way around, because you get quite a bit of
redundancy from the three-way mirroring. You are still redundant if you lose
just one drive, and we are planning to have about four global hot spares
standing by in case a drive fails.
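
If it helps anyone, the "global" part can be handled with spare groups and
mdadm's monitor mode; a sketch only, with placeholder names and the UUIDs
left out:

  # /etc/mdadm/mdadm.conf: arrays sharing one pool of spares
  ARRAY /dev/md0 UUID=<uuid> spare-group=shared
  ARRAY /dev/md1 UUID=<uuid> spare-group=shared

  # add the spare drives to any one array in the group
  mdadm /dev/md0 --add /dev/sdx

  # the monitor moves spares between arrays in the same spare-group
  mdadm --monitor --scan --daemonise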
--
Arcady Genkin


From: Aaron Toponce on
On 07/12/2010 06:26 PM, Stan Hoeppner wrote:
> Now, you can argue what RAID 10 is from now until you are blue in the face,
> and the list is tired of hearing it. But that won't change the industry
> definition of RAID 10. It's been well documented for over 15 years and won't
> be changing any time soon.

Your lack of understanding of the content and subject matter is rather
unfortunate.

--
. O . O . O . . O O . . . O .
. . O . O O O . O . O O . . O
O O O . O . . O O O O . O O O