[zfs-discuss] L2ARC and SLOG on HW RAID with writeback cache

Gionatan Danti g.danti at assyoma.it
Fri Apr 27 17:23:17 EDT 2018


On 27-04-2018 21:29, Gordan Bobic via zfs-discuss wrote:
> If you have write-caching enabled, why not just skip SLOG
> altogether? If all your disks are hooked up to the same controller, all
> of your writes will be rendered asynchronous anyway.

Because leaving ZIL on the main pool tends to fragment it, lowering 
performance:
http://thomas.gouverneur.name/2011/06/20110609zfs-fragmentation-issue-examining-the-zil/
https://github.com/zfsonlinux/zfs/issues/3582

> Why? I have been running ZFS root on CentOS for quite some years now.

I would like to keep things as simple as possible and easily 
supportable; root on ZFS is an additional step with added complexity.

> Why do you think it's easier to replace a failed disk in hardware RAID
> than using zfs replace?

Failed disk on a good hardware RAID:
- notice the red/amber LED;
- pull out the bad disk;
- insert the new disk;
- done, without any CLI command.

With the OS on an MDRAID array and SLOG/L2ARC on dedicated partitions, a 
failed disk means:
- offline the failed disk/partition in ZFS;
- offline the failed disk in MDRAID (mdadm);
- replace the failed disk;
- online the replaced disk in MDRAID;
- online the replaced disk in ZFS.

To be clear, this is not a problem for me; however, this can be tedious 
for customers.
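
As a rough illustration of the MDRAID/ZFS steps above, here is a minimal 
Python sketch that only builds the command lines, without running them. 
The pool name, array name and partition names are hypothetical, and it 
assumes a mirrored SLOG, so that the ZFS side can be finished with 
"zpool replace" once the new disk is partitioned. An L2ARC partition 
would instead be dropped with "zpool remove" and re-added with 
"zpool add <pool> cache <partition>".

```python
import shlex


def replacement_commands(pool, md_array, failed_md_part, new_md_part,
                         failed_slog_part, new_slog_part):
    """Return the CLI steps, in order, as argument lists.

    Nothing is executed here: this is a dry-run helper. All names passed
    in are placeholders chosen by the caller.
    """
    return [
        # 1. offline the failed partition in ZFS
        ["zpool", "offline", pool, failed_slog_part],
        # 2. mark the failed member as faulty in MDRAID, then remove it
        ["mdadm", "--manage", md_array, "--fail", failed_md_part],
        ["mdadm", "--manage", md_array, "--remove", failed_md_part],
        # (the physical swap and partitioning of the new disk happen here)
        # 3. add the new member to the MDRAID array
        ["mdadm", "--manage", md_array, "--add", new_md_part],
        # 4. bring the new partition into the pool (assumes a mirrored SLOG)
        ["zpool", "replace", pool, failed_slog_part, new_slog_part],
    ]


if __name__ == "__main__":
    # Hypothetical layout: sda failed, sdc is its replacement.
    for cmd in replacement_commands("tank", "/dev/md0",
                                    "/dev/sda1", "/dev/sdc1",
                                    "/dev/sda2", "/dev/sdc2"):
        print(shlex.join(cmd))
```

Printing the commands rather than executing them keeps the sketch safe to 
run on any machine; an operator would review and run each line by hand.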

> Are you using a different controller for your bulk data? If not, what
> do you think SLOG will get you that the writeback cache on the
> controller won't get you without it?

Much reduced main pool fragmentation.

> Disable writeback cache, let ZFS take care of the lot, and never look
> back.
> Using hardware RAID might seem appealing but you are going to regret
> it the first moment you get bitten by bit rot that the hardware RAID
> didn't protect you from.

Sure, but the entire point of the discussion is that for L2ARC and SLOG, 
undetected and dangerous bit rot should be *really* unlikely, 
because:
- a checksum error in L2ARC will cause a transparent reload from the main 
pool;
- as the ZIL is normally never read (it is only replayed after a crash) 
*and* data stays in it for a very short time, a checksum error in the ZIL 
is very unlikely.
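
To make the L2ARC point above concrete, here is a toy Python model of a 
checksummed cache read with fallback to the main pool. It is not ZFS 
code: the dictionaries, the read_block() helper and the use of CRC32 as 
the checksum are all assumptions made for illustration only.

```python
import zlib


def read_block(key, l2arc, pool, checksums):
    """Return the block for `key`, preferring the cache when it verifies.

    l2arc     -- dict standing in for the cache device (may hold bad data)
    pool      -- dict standing in for the main pool (authoritative copy)
    checksums -- dict of expected CRC32 values, kept outside the cache
    """
    cached = l2arc.get(key)
    if cached is not None and zlib.crc32(cached) == checksums[key]:
        return cached  # cache hit, checksum verified

    # Cache miss or checksum mismatch: discard the cached copy and
    # reload from the main pool; the caller never sees the bad data.
    l2arc.pop(key, None)
    data = pool[key]
    l2arc[key] = data  # repopulate the cache with the good copy
    return data


if __name__ == "__main__":
    pool = {"blk0": b"important data"}
    checksums = {"blk0": zlib.crc32(pool["blk0"])}
    l2arc = {"blk0": b"importent data"}  # simulated bit rot in the cache
    assert read_block("blk0", l2arc, pool, checksums) == pool["blk0"]
```

The key design point is that the expected checksum lives outside the 
cache device, so corruption there cannot go unnoticed.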

Regards.

-- 
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti at assyoma.it - info at assyoma.it
GPG public key ID: FF5F32A8

