nested RAID level 0+0 – ultimate performance?

I'll take four of these, please... RAID 0+0 = win?
I briefly considered RAID for storage in my new system, but realized that RAID is basically useless as a backup mechanism. Others have made the basic case for why RAID sucks as backup better than I can; I went ahead and ordered a new Caviar Black with the 6 Gb/sec interface as my main drive, and will re-use my older Hitachi for regular internal backup and for large video files, torrents, etc. The regular Windows backup tool will be enough; I’ll also add a network disk on the router for network backup of all the machines, and probably get a service like Carbonite for offsite backup.

While researching RAID, though, I became fascinated by the concept of nested RAID (I had watched Inception twice on a recent flight :). Nested RAID levels are of course nothing new – RAID 1+0 and RAID 0+1 are the most common, giving you the advantages of both mirroring and striping: redundancy and performance.

But what if you nested RAID 0 twice? In other words, four disks, each pair a RAID 0 array, and then those arrays also in RAID 0?

RAID 0 gives you almost double the performance of a single disk (much as SLI gives you almost double the performance of a single GPU), at double the cost (double the drives). Does nesting RAID 0 scale linearly? Would RAID 0+0 give you almost 4x performance at 4x cost?
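
For a rough sense of what “linear” would even mean here, a back-of-the-envelope sketch (the per-drive throughput and the overhead fraction below are made-up placeholders, not benchmarks):

```python
# Back-of-the-envelope model of striped throughput: a sketch, not a benchmark.
# Assumes large sequential I/O that keeps every member busy, plus a made-up
# fixed overhead fraction charged for each layer of striping.

DRIVE_MBPS = 120        # hypothetical sequential throughput of one drive
LAYER_OVERHEAD = 0.03   # hypothetical 3% loss per striping layer

def striped_throughput(member_mbps, n_members):
    """Ideal N-way striping minus one layer's worth of overhead."""
    return member_mbps * n_members * (1 - LAYER_OVERHEAD)

single    = DRIVE_MBPS
flat_x4   = striped_throughput(DRIVE_MBPS, 4)   # one layer, four disks
pair      = striped_throughput(DRIVE_MBPS, 2)   # inner RAID 0 pair
nested_00 = striped_throughput(pair, 2)         # two pairs striped again

print(f"single drive : {single:6.1f} MB/s")
print(f"4-disk RAID 0: {flat_x4:6.1f} MB/s")
print(f"RAID 0+0     : {nested_00:6.1f} MB/s")
```

In this toy model the overhead is charged once per striping layer, so RAID 0+0 lands a hair below a flat 4-disk stripe; whether real controllers behave that way, or whether splitting the work across controllers keeps the overhead flat (as I speculate below), is exactly what a test would have to show.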

Triple SLI doesn’t quite give you triple performance, as there is some overhead in coordinating between the cards. In the case of RAID, however, the overhead is borne by the RAID controllers, and in theory each controller only has to worry about two logical units. So I would expect that nesting RAID 0 arrays would be less burdened by overhead and would come closer to true linear scaling.

Has anyone ever done this? It’s insanely expensive of course – four disks, with roughly 4x the risk of a drive failure and absolutely no redundancy at all. Though you could envision a RAID 0+0+1 array where you have four disks in RAID 0+0 and then add a simple RAID 1 mirror at the very top with a single much larger drive. An example would be RAID 0+0+1 with four 128 GB SSDs and one 500 GB hard disk. It would be easy to simply reduce the nesting level for performance comparisons, to see how RAID 0+0+1 fares against RAID 0+1, RAID 1+0, RAID 0, and RAID 1 as the baseline.

I don’t have 4 SSDs and a spare 500 GB disk lying around. Or 5 hard drives of any sort, frankly. But I bet the Tom’s folks have the hardware to spare lying around the bench. I’ve posted a forum topic there to see if I can get their attention.

If someone were to spend money on this, though, clearly the best hardware would be four of these SandForce-based 128 GB drives from ADATA, which basically have all the tech sites swooning. Couple that with a 500 GB WD Caviar Black for the +1 part of the RAID 0+0+1 array and you’d have serious hardware. Total cost for the drives alone would be about $850 as of this posting date, for 500 GB of usable storage. But if I’m right about the linear scaling, this would be ridiculously fast.

4 thoughts on “nested RAID level 0+0 – ultimate performance?”

  1. Actually, RAID levels are not limited to just two disks (as RAID 5, with its minimum of three disks, should show). Basically, what you’re describing is simply RAID 0 (striping) across all four disks. And I have seen this done (with 8 disks in a RAID 0 array) and seen the performance hit the limits of the communications bus (at the time, it was a SCSI-320 system). “Nesting” RAID 0 on RAID 0 will probably do the same thing as RAID 0 across 4 drives, with a bit more overhead going on.

    So yes, there are some performance benefits to using a 2+ disk RAID 0, but I think the biggest gains are limited to large sequential reads (so I can see video recording with raw images or a low-compression codec like MJPEG, and on-the-fly re-encoding sites, using such a setup); I don’t see much of an advantage for normal use. I find the biggest bottleneck in normal usage is seek time, something SSDs practically eliminate on their own. I find RAID 1 (mirroring) setups with a smart controller can really help with this on standard (non-SSD) hard drives by letting one process read from one disk while another process reads data from another part of the filesystem on the other disk; one process doesn’t have to wait for the other to finish. RAID 0 can help some, but because of how the striping works there’s no guarantee that the data each process needs comes from separate disks.

  2. Ah, I see what you mean. That makes sense. You can tell I’m a RAID neophyte 🙂 And since I’ve already concluded I don’t need RAID at all (and am not using any SSDs in my new build), it’s all a bit theoretical.

    Still, if you have a 4-disk RAID 0 setup, the RAID controller has to deal with four logical units. Isn’t there a possibility that RAID 0+0 would reduce that overhead, because you are distributing it across not just one RAID controller but three (one for each RAID 0 pair, and then one more for the top-level RAID 0)?

    The issue is how linear the scaling would be, and the reason scaling wouldn’t be linear is overhead. Adding more controllers would keep that overhead flat (at the expense of power consumption; there’s always a price).

    It looks like controllers are smart enough now that even sequential reads on SSDs gain from a RAID 0 setup – but we could drop SSDs and go with a traditional hard disk array, and I think you’re right that the gains would be more apparent. I wonder if you could even invert the cost justification – four 500 GB hard drives cost about $200, the same as a single 128 GB SSD. If RAID 0+0 really does scale linearly, you could see the drive array approach the SSD in performance for the same cost.

    I note with fascination that some of the more expensive SSDs sold right now are actually RAID 0 internally, invisible to the user. This model, for example. If you RAID 0’d these, you’d have effective RAID 0+0, not 4-disk RAID 0…

  3. Well, with 4 drives across a single controller, mapping an “address” to the right point on the right drive is a single calculation. With the RAID 0+0 setup you’re describing, it would be two calculations, one for each level. In reality, I doubt it would be a noticeable difference.
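
    To make the “one calculation vs. two” point concrete, here’s a minimal sketch of the address math (the stripe size and disk counts are arbitrary illustrative values, not what any real controller uses):

    ```python
    # Sketch of where a logical block lands: flat 4-disk RAID 0 does one stripe
    # calculation; nested RAID 0+0 does the same kind of calculation twice.

    STRIPE = 128  # stripe size in blocks (illustrative)

    def flat_raid0(lba, n_disks=4):
        """One calculation: logical block address -> (disk, offset on that disk)."""
        stripe_no, within = divmod(lba, STRIPE)
        disk = stripe_no % n_disks
        offset = (stripe_no // n_disks) * STRIPE + within
        return disk, offset

    def nested_raid00(lba):
        """Two calculations: the top-level RAID 0 picks a pair, that pair's RAID 0 picks a disk."""
        pair, pair_offset = flat_raid0(lba, n_disks=2)                  # outer array of two "units"
        disk_in_pair, disk_offset = flat_raid0(pair_offset, n_disks=2)  # inner array of two disks
        return pair * 2 + disk_in_pair, disk_offset

    for lba in (0, 100, 500, 1000):
        print(lba, "flat:", flat_raid0(lba), "nested:", nested_raid00(lba))
    ```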

    Lately, the trend has been to move the striping, mirroring, and parity checks above the controller, into the filesystem driver level. For example, Linux’s Btrfs (still practically alpha/beta quality right now) can be told to automatically stripe data across two or more disks for performance gains. This isn’t just moving RAID up to the software level; the file system itself is aware that it’s on multiple disks and can optimize how it stores data to take advantage of that.

    That said, this is still largely the realm of server systems; for normal users (and even most power users), the biggest gains I find are still at the seek-time level. With traditional hard drives, you still have physical heads that take time to move to the right track and then have to wait for the platter to rotate to the right sector before they can read/write data. A smart controller can mitigate this, though, by reordering commands so that the heads make fewer movements (instead of going from point 0 to point 30, reading, then back to point 20 and reading, it can go to point 20, read into memory, go on to point 30, read, and then send the saved data from point 20 back, etc.).
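
    A toy illustration of that reordering (just the principle; a real controller or OS does NCQ/elevator-style scheduling rather than simply sorting one batch of requests):

    ```python
    # Toy sketch of command reordering: service a batch of reads in position
    # order instead of arrival order, so the head sweeps across once.

    def head_travel(start, positions):
        """Total distance the head moves servicing positions in the given order."""
        travel, here = 0, start
        for p in positions:
            travel += abs(p - here)
            here = p
        return travel

    pending = [30, 20, 45, 5]  # arrival order of read positions (illustrative)
    print("arrival order:", head_travel(0, pending))          # lots of back-and-forth
    print("sorted order :", head_travel(0, sorted(pending)))  # one sweep, less travel
    ```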

    Again, RAID 0 can mitigate this some when there are multiple read/write commands coming in, since they might be spread out across different drives, but you still have that seek-time issue. Especially if the filesystem doesn’t hold all its information about where files and directories live in memory: to open a file whose location isn’t cached yet, the file system first has to read where that file is stored, and only then can it read the file’s data itself. So you have at least two read requests just to open a file. It gets worse if the file is fragmented; the drive head will be going back and forth to get it for you. RAID 0 doesn’t eliminate this.

    Because of all this, on standard hard drives, you won’t get a linear performance improvement out of RAID 0 for each additional drive unless the following conditions are met (see the rough sketch after this list):
    1) You are only doing sequential reads/writes of a single file;
    2) That file you’re reading from/the space you’re writing to is completely defragmented.
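
    A rough sketch of why those two conditions matter (the seek time and transfer rate are placeholder numbers, not measurements):

    ```python
    # Toy model: every request pays one seek, then streams its data from an
    # n-drive stripe (a simplification: it assumes the heads move in lockstep).

    SEEK_MS = 9.0        # placeholder average seek + rotational latency per request
    DRIVE_MBPS = 120.0   # placeholder per-drive sequential throughput

    def effective_mbps(request_kb, n_drives):
        """Effective throughput for requests of a given size on an n-drive stripe."""
        transfer_ms = (request_kb / 1024) / (DRIVE_MBPS * n_drives) * 1000
        return (request_kb / 1024) / ((SEEK_MS + transfer_ms) / 1000)

    for n in (1, 2, 4):
        print(f"{n} drive(s): 4 KB random {effective_mbps(4, n):5.2f} MB/s, "
              f"64 MB sequential {effective_mbps(64 * 1024, n):6.1f} MB/s")
    ```

    With numbers like these, the small seek-bound reads barely improve no matter how many drives are in the stripe, while the big sequential read scales almost linearly with the drive count.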

    With SSDs, though, because their seek times are so fast (traditional drives are measured in milliseconds, SSDs in *microseconds*), the above two conditions aren’t as big an issue, and putting them in a RAID 0 setup should scale fairly linearly, I’d suspect. (Though it’s short on actual info and full of silly hype, there’s this Samsung video of 24 of their SSDs in a RAID 0 setup hitting roughly 2 GB/s transfer rates: http://www.youtube.com/watch?v=96dWOEa4Djs .)

    So, yeah, four 500 GB traditional HDs in a RAID 0 may approach or even surpass the throughput of the SSD in certain scenarios, but they still won’t be able to touch the access time of the SSD, since the hard drives still have to wait on the physical limitations of drive heads and spinning platters.

  4. Well, with 4 drives across a single controller, mapping an “address” to the right point on the right drive is a single calculation. With the RAID 0+0 setup you’re describing, it would be two calculations, one for each level.

    I see what you mean. The calculations are not exactly in parallel; there is a delay as the instruction travels down the hierarchy. I wonder if anyone has RAID 0 striped a pair of those Colossus SSDs I linked? If there’s no performance gain from RAID 0 of such drives, then that would definitely put the nail in RAID 0+0’s coffin.

    Still, if you had four SSDs, it would be an interesting test to try them in a straight 4-disk RAID 0 and in RAID 0+0 for comparison, wouldn’t it?

    Clearly, RAID for hard disks should be strictly for redundancy, whereas RAID 0 only makes sense in an SSD context. And neither makes much sense in a home or office context unless you are really trying to eke out maximum speed.

    I am considering buying that $200 ADATA drive for my boot/applications drive. I have spent enough right now, though – I will wait six months, or until the next refresh/lower price point. Right now my single Caviar Black connected over 6 Gb/s SATA should be more than adequate – I am expecting Windows Experience scores in mid-7 territory. My old rig for the kids scored 4.5 on Vista, which isn’t bad considering Vista maxed out at 5.9. The new rig should get close to the max (it would definitely max out the Windows Experience Index on Vista, but the upper limit is higher in Windows 7).
