[zfs-discuss] L2ARC and SLOG on HW RAID with writeback cache

Richard Yao ryao at gentoo.org
Sun Apr 29 17:44:12 EDT 2018

> On Apr 29, 2018, at 4:37 PM, Gionatan Danti <g.danti at assyoma.it> wrote:
> Il 29-04-2018 21:48 Richard Yao via zfs-discuss ha scritto:
>> There is nothing in the hardware to protect against this. A
>> misdirected write (likely caused by vibration) could be detected if a
>> read is done afterward, but that has two problems. The first is that
>> nobody does it because it hurts performance. The second is that there
>> is no telling where the write went without stopping the world and
>> scrutinizing everything (for several hours) and trying to make sense
>> of how to fix it, which nobody does. It is in no way practical.
> T10 DIF/DIX *does* protect against misdirected writes: https://www.usenix.org/legacy/event/lsf07/tech/petersen.pdf
> However, it requires expensive SAS controllers, backplanes and disks.

Despite claims to the contrary by Oracle, I do not see how T10 protects against misdirected writes:


Every misdirected write affects at least two locations. There is the target location, which will have stale data. Then there is the clobbered location, which might be readable with the wrong data or might be completely unreadable due to misalignment. Proper handling of misdirected writes requires handling both locations.

My read of T10 is that it will write a DIF tag containing a hash of the LBA. Then when the data is read, that tag is read and verified against the LBA being read. If it is wrong, then a read error occurs. If it is correct, then the data is passed onward. That check handles the clobbered data case. However, a sector with stale data from a misdirected write still carries its old, valid tag, so it will pass the check.

A feature of hard drives designed for old mainframes was that they would read after writing to verify that data at the correct location had been updated. This handles the stale data case.

You need to handle both cases to protect against misdirected writes, but I have read nothing about T10 that says it requires a read after write to handle the stale data case. I also find nothing that talks about the loss in performance that implementing T10 would incur if it required drives to do a read after every write.
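
A minimal sketch of the argument above, as a toy model rather than anything resembling real drive firmware or the actual T10 tag layout. The `ToyDisk` class, the `lands_at` parameter, and the use of the plain LBA as the tag are all assumptions made for illustration:

```python
from dataclasses import dataclass


@dataclass
class Sector:
    data: bytes
    ref_tag: int  # stand-in for the LBA-derived tag stored with the data


class ToyDisk:
    """Toy model of a disk whose sectors carry an LBA-derived tag."""

    def __init__(self, num_sectors):
        # Every sector starts out with old data and a tag matching its own LBA.
        self.sectors = [Sector(b"old-%d" % lba, lba) for lba in range(num_sectors)]

    def write(self, lba, data, lands_at=None):
        # lands_at simulates a misdirected write: the tag is computed for the
        # intended LBA, but the sector is physically stored somewhere else.
        physical = lba if lands_at is None else lands_at
        self.sectors[physical] = Sector(data, lba)

    def read(self, lba):
        sector = self.sectors[lba]
        if sector.ref_tag != lba:
            raise IOError("tag mismatch at LBA %d" % lba)
        return sector.data


disk = ToyDisk(8)
disk.write(2, b"new", lands_at=5)  # intended for LBA 2, lands on LBA 5

# Clobbered location: the stored tag says LBA 2, so reading LBA 5 fails loudly.
try:
    disk.read(5)
    clobbered_detected = False
except IOError:
    clobbered_detected = True

# Target location: old data with its old, valid tag, so the read succeeds.
stale_data = disk.read(2)

print("clobbered location detected:", clobbered_detected)
print("target location returned:", stale_data)
```

In this model the read of LBA 5 raises an error, while the read of LBA 2 quietly returns the old contents. Only a read after write, or a checksum stored outside the sector itself, would notice that the new data never arrived.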

> That said, I reiterate: my question about HW RAID was limited to SATA-based L2ARC/SLOG.
> Any thoughts on that?

As always, I suggest giving the devices directly to ZFS. Having other things on the same device can limit the benefit of L2ARC and SLOG. If you want to share the disks for the main pool with a different filesystem for /, you can use partitions, but they should be aligned and you will need to set the noop I/O scheduler on the disk manually to avoid unnecessary overhead.
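
As a rough illustration of the alignment point, here is a small check of whether a partition's starting offset falls on a given boundary. The `is_aligned` helper is hypothetical, and the 1 MiB default boundary and 512-byte logical sector size are assumptions (common conventions, not something stated in this thread):

```python
def is_aligned(start_sector, logical_sector_size=512, alignment=1024 * 1024):
    """Return True if the partition's starting byte offset is a multiple of alignment."""
    return (start_sector * logical_sector_size) % alignment == 0


print(is_aligned(2048))  # 2048 * 512 bytes = exactly 1 MiB
print(is_aligned(63))    # legacy DOS-style offset of 32256 bytes
```

The first call prints True and the second prints False. Pass a different `alignment` (for example 4096) to check against the physical sector size instead.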

Another option is to put / on a zvol, have an initramfs import the pool and then mount / from the zvol. That is not very common, but I am told that Ubuntu/Debian users do it.

> Thanks.
> -- 
> Danti Gionatan
> Supporto Tecnico
> Assyoma S.r.l. - www.assyoma.it
> email: g.danti at assyoma.it - info at assyoma.it
> GPG public key ID: FF5F32A8