From: Arcady Genkin on
On Mon, Jul 12, 2010 at 22:28, Stan Hoeppner <stan(a)hardwarefreak.com> wrote:
> I'm curious as to why you're (apparently) wasting 2/3 of your storage for
> redundancy.  Have you considered a straight RAID 10 across those 30
> disks/LUNs?

This is a very good question. And the answer is: because Linux's MD
does not implement RAID10 the way we expected (as you have found out
for yourself). We started out thinking exactly that we'd have a
RAID10 stripe with cardinality of 3, instead of the multi-layered MD
design. But for us it's important to have full control over which
physical disks form the triplets (see below for discussion). MD's
so-called RAID10, by contrast, only guarantees that there will be
exactly N copies of each chunk on N different drives, but makes no
promise as to *which* drives.

The reason the drive assignment is important to us is that we can
achieve more data redundancy if we form each triplet from an iSCSI
disk that lives on a different iSCSI target (host).

Suppose that you have six iSCSI target hosts h0 through h5, each with
five disks d0 through d4. If you form the first triplet as (h0:d0,
h1:d0, h2:d0), and so forth through (h3:d4, h4:d4, h5:d4), then when
any one iSCSI host goes down, all triplets stay up and remain
redundant, only running on two copies instead of three.
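The layout above can be sketched and sanity-checked in a few lines. This is just an illustrative model (the host/disk pairs are abstract indices, not real device paths), showing that every triplet spans three distinct hosts, so any single-host failure leaves two copies:

```python
# Model the triplet layout: 6 iSCSI hosts (h0..h5), each exporting
# 5 disks (d0..d4), grouped into 10 mirror triplets such that no
# triplet contains two disks from the same host.
hosts, disks = 6, 5

# Triplet for disk d, group g mirrors d across three consecutive
# hosts: (h0,h1,h2) for g=0, (h3,h4,h5) for g=1.
triplets = [
    tuple((h, d) for h in range(g * 3, g * 3 + 3))
    for d in range(disks)
    for g in range(hosts // 3)
]

# Every triplet spans three distinct hosts...
assert all(len({h for h, _ in t}) == 3 for t in triplets)

# ...so losing any one host still leaves >= 2 live copies per triplet.
for failed in range(hosts):
    assert all(sum(h != failed for h, _ in t) >= 2 for t in triplets)

print(len(triplets))  # 10 triplets covering all 30 disks
```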

Linux's RAID10 implementation did not allow us to do this, so we had
to layer: first create the RAID1 (or RAID10 with n=3) triplets, then
stripe across them in a higher layer.
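The layered setup might look roughly like the following. This snippet only *generates* the mdadm command lines for illustration; the device paths (/dev/iscsi/t*-*, /dev/md*) are hypothetical placeholders, and the exact member ordering would follow the triplet scheme described above:

```python
# Sketch of the layered MD design: ten 3-way RAID1 mirrors, then a
# RAID0 stripe across them.  Device names are assumptions, not the
# actual paths used in the original setup.
mirrors = []
for k in range(10):
    # Three members per mirror, each (ideally) from a different host.
    members = " ".join(f"/dev/iscsi/t{k}-{c}" for c in range(3))
    mirrors.append(
        f"mdadm --create /dev/md{k} --level=1 --raid-devices=3 {members}"
    )

# Top-level RAID0 stripe over the ten mirror devices.
stripe = (
    "mdadm --create /dev/md10 --level=0 --raid-devices=10 "
    + " ".join(f"/dev/md{k}" for k in range(10))
)

for cmd in mirrors + [stripe]:
    print(cmd)
```

Since the mirrors are separate MD arrays, which physical disk lands in which triplet is entirely under your control, which is exactly the property plain MD RAID10 with n=3 does not give you.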

> I'm also curious as to why you're running software RAID at all given the fact
> than pretty much every iSCSI target is itself an array controller with built
> in hardware RAID.  Can you tell us a little bit about your iSCSI target devices?

Our boss wanted us to build this solution from commodity hardware
only, so we don't employ any fancy RAID controllers - all drives are
connected to on-board SATA ports. Staying away from "black box"
implementations as much as possible was also part of the wish list.

After dealing with all the idiosyncrasies of iSCSI and software RAID
under Linux, I am a bit skeptical whether what we are building will
actually be better than a black-box fiber-attached RAID solution, but
it surely is cheaper and more expandable.
--
Arcady Genkin


--
To UNSUBSCRIBE, email to debian-user-REQUEST(a)lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster(a)lists.debian.org
Archive: http://lists.debian.org/AANLkTinWFQgCutLg6Rm0C5pmGTGZwnvFXc-KBZSDQsU7(a)mail.gmail.com
From: Stan Hoeppner on
Arcady Genkin put forth on 7/12/2010 10:49 PM:

<very interesting and thorough background snipped for brevity>

> After dealing with all the idiosyncrasies of iSCSI and software RAID
> under Linux I am a bit skeptical whether what we are building is going
> to actually be better than a black-box fiber-attached RAID solution,
> but it surely is cheaper and more expandable.

I share your skepticism. Cheaper in initial acquisition cost, yes, but maybe
not in long-term reliability and serviceability. Have you yet performed manual
catastrophic iSCSI target node failure tests and monitored the
node/disk/array reconstruction process to verify it all works as expected
without user interruption? This is always the main concern with homegrown
storage systems of this nature, and it is where "black box" solutions
typically prove themselves more cost effective (at least in user good will
$$) than home-brew solutions.

I myself am a fan of Nexsan storage arrays. They offer some of the least
expensive, most feature-rich, and best-performing FC and iSCSI arrays on the
market. Given what you've built, it would appear the SATABeast would fit your
needs: 42 SATA drives in a 4U chassis, dual controllers with 4 x 4Gb FC ports
and 4 x 1GbE iSCSI ports, 600MB/s sustained per controller (1.2GB/s with both
controllers), up to 4GB of battery-backed read/write cache per controller,
and web management/SNMP/email alerts via a 10/100 management Ethernet port.
The web management interface is particularly nice, making it almost too easy
to configure and manage arrays and LUN assignments.

http://www.nexsan.com/satabeast.php

One of these will run somewhere between $20-40k depending on disk qty/size/rpm
and whether you want/need both controllers. They also offer a SAS version
with 15krpm drives at higher cost. I've installed a couple of the single-
controller SATABeast models and the discontinued SATABlade model. They've
performed flawlessly, with no drive failures to date. Last I checked, Nexsan
still uses only Hitachi (formerly IBM) UltraStar drives.

Good product/solution all around. If you end up in the market for a "black
box" storage solution after all, I'd recommend you start your search with
Nexsan. I'm not selling here, just a very happy customer.

--
Stan


--
Archive: http://lists.debian.org/4C3CE8E8.1010001(a)hardwarefreak.com