[zfs-discuss] ZFS on RAID-5 array
gordan.bobic at gmail.com
Thu Apr 12 13:30:46 EDT 2018
On Thu, Apr 12, 2018 at 5:51 PM, Richard Elling <
richard.elling at richardelling.com> wrote:
> On Apr 11, 2018, at 2:39 AM, Gordan Bobic via zfs-discuss <
> zfs-discuss at list.zfsonlinux.org> wrote:
>> Short answer: Don't.
> Actually, there are quite a few large enterprises who have been running
> mission-critical systems
> like this for many years. It does work and can work very well.
> That said, if the real question is "given X disks, should I create a raidz
> or raid-5?", then the short answer is "use raidz".
>> Long answer:
>> By putting ZFS on a single virtual disk exposed by hardware RAID you will
>> lose ZFS' ability to auto-heal data, thus forfeiting a substantial fraction
>> of its value.
> An enterprise-class RAID array deals with bit rot, checksums, and
> auto-healing too. They can make
> a very nice complement to ZFS.
Right up to the point when the disk starts feeding you garbage instead of
errors (yes, it happens). At that point the controller won't be able to
tell if that happened without reading the whole stripe, which is exactly
what ZFS does anyway, and any theoretical performance advantage disappears.
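To make the point concrete, here is a toy sketch (plain files standing in for blocks, sha256sum standing in for ZFS's block checksums; all filenames are made up for illustration). Parity alone cannot tell which copy of a block is wrong when a disk silently returns the wrong bytes, but an end-to-end checksum stored apart from the data catches it on read:

```shell
# Write a "block" and store its checksum out-of-band, as ZFS does in the
# block pointer rather than next to the data itself.
printf 'important data' > block.orig
sha256sum block.orig | cut -d' ' -f1 > block.sum

# Simulate a disk that silently returns garbage instead of an I/O error.
printf 'garbage bytes!' > block.read

# A RAID controller passes block.read through unchecked; a checksumming
# reader detects the mismatch and can repair from a redundant copy.
if [ "$(sha256sum block.read | cut -d' ' -f1)" != "$(cat block.sum)" ]; then
    echo "checksum mismatch: repair from redundancy"
fi
```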
>> Because the underlying RAID geometry is entirely opaque, it is virtually
>> guaranteed that you will end up with a lot of misaligned access which will
>> potentially halve sustained read throughput.
>> Sustained write throughput can be as low as half of what ZFS can achieve
>> because of RMW required by parity RAID on writes.
>> In some _very_ specific uses, you may be able to get better small random
>> read performance if you have ZFS on top of hardware RAID5.
>> If you can reduce the exposed RAID block size to as low as 8KB, create the
>> pool with ashift=13 so that alignment at the bottom end is unaffected, set
>> recordsize to 8KB so that fs blocks won't straddle disk boundaries, and
>> your operations are mostly <= 8KB so the 8KB recordsize is in fact
>> appropriate, then having ZFS on hardware RAID5 could potentially yield
>> better random read performance than RAIDZ1 of the same shape.
>> Sustained write throughput can still be as bad as half due to RMW on
>> parity writes.
> Disagree, modern RAID arrays are quite adept at managing small writes
> efficiently. Attempting to match the internal representation is largely a
> waste of time, just like trying to match raidz sizes to
> recordsize/volblocksize.
It's a waste of time for bursty workloads. Once your sustained throughput
saturates the write caches, the performance will tank.
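The RMW penalty being argued about can be sketched with back-of-the-envelope arithmetic (the disk count here is an arbitrary example, not anyone's actual array): a sub-stripe write to parity RAID must read the old data and old parity before writing the new data and new parity, the classic four-I/O penalty, while a full-stripe write needs no reads at all. Write caches hide this until they fill; after that, the extra I/Os are paid on every small write.

```shell
# Illustrative RAID-5 geometry: 4 data disks + 1 parity disk (assumed).
DATA_DISKS=4

# Full-stripe write: one write per data disk plus one parity write, no reads.
FULL_STRIPE_IOS=$((DATA_DISKS + 1))

# Sub-stripe write of a single block: read old data, read old parity,
# write new data, write new parity -- 4 I/Os for 1 block of payload.
SMALL_WRITE_IOS=4

echo "full stripe: $FULL_STRIPE_IOS I/Os for $DATA_DISKS blocks of payload"
echo "small write: $SMALL_WRITE_IOS I/Os for 1 block of payload"
```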
> Systems engineering is a discipline that deals with trade-offs and
> compromises. ZFS vs RAID-5 is
> not immune to these forces.
Can you illustrate a specific case where ZFS on RAID5 will yield better
performance than RAIDZ of same geometry, that is NOT dependent on ensuring
good alignment as per what I described?
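For reference, the alignment setup described above would look roughly like this. This is a sketch only: the pool name, device path, and 8 KiB internal block size are placeholders, and it assumes the hardware RAID5 volume really does expose an 8 KiB block size.

```shell
# Hypothetical: hardware RAID5 volume exposed as /dev/sdb with an 8 KiB
# internal block size. ashift=13 (2^13 = 8192) makes ZFS treat 8 KiB as the
# sector size, so no ZFS I/O is smaller than the array's block.
zpool create -o ashift=13 tank /dev/sdb

# recordsize=8K keeps each filesystem block within a single 8 KiB array
# block, so reads and writes never straddle a disk boundary in the stripe.
zfs set recordsize=8K tank
```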