[zfs-discuss] 24 drives - what would you do?
scruffters at gmail.com
Mon May 18 11:34:30 EDT 2015
On Sun, May 17, 2015 at 8:37 PM, Kash Pande <kashpande at gmail.com> wrote:
> On 17/05/15 03:12 PM, scruffters via zfs-discuss wrote:
>> I wonder if anybody would be kind enough to critique my plan for
>> building a large pool of storage at home?
>> I'm looking at obtaining a 24 drive enclosure and filling it with 4TB
>> HGST NAS drives.
> Just out of curiosity - 2.5" drives offer 2TB of storage in less space.
> Have you considered using small form-factor disks with a higher density?
I hadn't, but there don't appear to be many 2.5" 2TB drives out there?
The options at 2TB seem to be rather consumer-oriented: WD
Green/Blue, HGST Travelstar and Samsung Spinpoint.
Anything else you recommend?
>> On the ZFS side of things I'm planning to create a single ZFS pool,
>> comprising 4x RAID-Z2 vdevs (each over 6 drives).
> Personally, I would do the mirrored approach despite the 50% storage
> efficiency losses because rebuilding things from exact copies is always
> going to be easier on the system than rebuilding from parity.
>> My focus is as a large block of storage for storing quite a large
>> dataset of stuff that I've gathered over the years. The reason I'm
>> thinking of that design is because I can't really afford to have a 50%
>> utilisation of a striped mirror.
> Don't look at it as 50% vs 66%, it's more or less equal once the
> negative consequences of parity are taken into account. Plus, raidz*
> has an issue with inefficient space usage on an ashift=12 array - are
> those 4TB drives advanced format? In one case, 3TB out of a 30TB pool
> was simply lost to storage inefficiency.
> That said, 1M block size for the filesystem (large_blocks) would help
> with this problem.
The HGST NAS is an advanced format drive (I think). From the spec:
Sector size: 4KB/sector (512B/sector emulation)
I'm not familiar with the large_blocks option. Will look it up.
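
To make the ashift=12 space-efficiency point concrete, here is a rough Python sketch of how much space a raidz vdev allocates for a single block. The formula follows the raidz allocation rule as commonly described (data sectors, plus parity sectors per stripe row, padded to a multiple of parity + 1); treat it as an approximation that ignores compression, metadata and other overheads. The function names are made up for illustration.

```python
def raidz_asize(psize, ndisks, nparity, ashift):
    """Approximate bytes allocated on one raidz vdev for a block of psize bytes."""
    sector = 1 << ashift                      # ashift=12 -> 4096-byte sectors
    data = -(-psize // sector)                # data sectors, rounded up
    data_cols = ndisks - nparity              # data columns per full stripe row
    parity = nparity * -(-data // data_cols)  # nparity sectors for every stripe row
    total = data + parity
    pad = nparity + 1                         # allocation is padded to a multiple of nparity + 1
    total = -(-total // pad) * pad
    return total * sector


def space_efficiency(psize, ndisks, nparity, ashift):
    """Fraction of the allocated space that holds actual data."""
    return psize / raidz_asize(psize, ndisks, nparity, ashift)


if __name__ == "__main__":
    # Compare the planned 6-disk raidz2 with a wider 10-disk raidz2, both at ashift=12.
    for ndisks in (6, 10):
        ideal = (ndisks - 2) / ndisks
        for kib in (4, 8, 16, 128, 1024):
            eff = space_efficiency(kib * 1024, ndisks, 2, 12)
            print(f"{ndisks}-disk raidz2, {kib:>4}K block: {eff:.1%} (ideal {ideal:.1%})")
```

Under these assumptions an 8K block on a 6-disk raidz2 occupies six 4K sectors, so only a third of the allocated space holds data. Larger records amortise the parity and padding, which is why a 1M record size helps.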
> Because of its problems, raidz is not suitable for environments where
> performance is of importance, though some people hear that and think,
> "Oh, I'm not looking for bleeding-edge performance"; they're really not
> aware that simply scrubbing your pool with raidz2 or raidz3 will just be
> absolutely horrible.
> Raidz* resilvering has the penalty of parity calculation - and disks are
> under greater load during rebuild as well.
> In an n-way (2 disks per vdev or more) mirror setup you'll have each
> disk under less load, and fewer disks will be active. Despite this, the
> resilver is generally much faster and your system is back on its feet
> sooner, allowing you to back up data and feel safe once more.
> Send/recv with raidz* pools will similarly suffer, especially when the
> pool is fragmented and you're sending incremental streams, the recv side
> can take a while to parse each stream as it does a lot of random IO
> that's bad for raidz*.
> I would only use raidz1,2,3 on environments where time to recovery is
> not a problem. I'd generally use mirrors for my backup servers because I
> dig that level of redundancy, but if I've got several backup boxes and
> I'm needing a large archive to store long-term backups, then I might use
> multiple 6 disk raidz2.
The dataset is mostly large video files from various projects over the
years, so I expect the workload will be mostly linear/sequential.
I ran a test on a 4-drive ZFS pool with no parity. It seemed to have
the IOPS required for my application, and as I understand it 4 x Z2
vdevs would deliver roughly the same?
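
The rule of thumb behind that comparison can be sketched in Python: a raidz vdev delivers roughly the random IOPS of a single disk, and a pool stripes across its vdevs. The 100 IOPS per disk figure is a placeholder assumption, not a measurement, and the function name is made up for illustration. Sequential throughput, which matters more for large video files, is not modelled here.

```python
def estimate_random_iops(vdevs, disks_per_vdev, kind, disk_iops=100):
    """Very rough random-IOPS estimate for a pool of identical vdevs.

    kind is "stripe" (single-disk vdevs), "raidz" or "mirror".
    disk_iops is an assumed per-disk figure, not a measured one.
    """
    if kind not in ("stripe", "raidz", "mirror"):
        raise ValueError(f"unknown vdev kind: {kind}")
    # Writes: each vdev behaves roughly like one disk, whatever its width.
    write = vdevs * disk_iops
    # Reads: every side of a mirror can serve reads independently;
    # a raidz vdev still behaves roughly like one disk.
    if kind == "mirror":
        read = vdevs * disks_per_vdev * disk_iops
    else:
        read = write
    return {"read": read, "write": write}


if __name__ == "__main__":
    print("4-disk stripe :", estimate_random_iops(4, 1, "stripe"))
    print("4 x 6 raidz2  :", estimate_random_iops(4, 6, "raidz"))
    print("12 x 2 mirror :", estimate_random_iops(12, 2, "mirror"))
```

By this estimate the 4-disk test pool and the 4 x raidz2 layout come out the same, while 12 two-way mirrors would offer several times more.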
Maybe a mirror would be more suitable, but for this kind of bulk
storage wouldn't the resilver be more reasonable because the dataset
is less 'bitty' - for want of a better word?!
Also, I found that streaming seemed to perform better with a 4 drive
z1 than a 4 drive mirror.
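
For reference, the 50% versus 66% capacity figures for this enclosure work out as follows. This is a minimal Python sketch that counts raw drive capacity only (24 x 4TB); it ignores TB/TiB differences, raidz allocation overhead and reserved space, and the function name is made up for illustration.

```python
def usable_tb(vdevs, disks_per_vdev, redundant_disks, drive_tb=4):
    """Raw usable capacity in TB for a pool of identical vdevs.

    redundant_disks is the number of disks per vdev that hold parity
    or mirror copies (2 for raidz2, 1 for a two-way mirror).
    """
    data_disks = disks_per_vdev - redundant_disks
    return vdevs * data_disks * drive_tb


if __name__ == "__main__":
    raw = 24 * 4
    layouts = {
        "4 x 6-disk raidz2": usable_tb(4, 6, 2),
        "12 x 2-way mirror": usable_tb(12, 2, 1),
    }
    for name, tb in layouts.items():
        print(f"{name}: {tb} TB usable of {raw} TB raw ({tb / raw:.0%})")
```

The gap between the two layouts is 16TB of raw capacity, before any raidz space inefficiency is taken into account.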
>> The rack I'm looking at has an
>> expander type backplane (single 8087), so I'll be using it with an LSI
>> 9211-4i in IT mode.
>> Would anybody advise against the above? It seems like quite a sane
>> approach, but some advice would be gratefully received.
>> Also, while I'm here, it appears that I can quite happily populate
>> only half of the bays, then add the rest later, to increase both
>> capacity and performance? (although it won't auto rebalance) Do I
>> understand the concepts correctly?
> I would advise against this sort of thing with raidz, it doesn't deal
> very nicely with being unbalanced - you are better off creating a new
> pool or replacing your disks with larger ones one-at-a-time.
OK, thanks for the heads-up on that.
In that case it would be better to populate the whole thing at once,
or create a second pool later.