[zfs-discuss] cannot import 'home': I/O error Destroy and re-create the pool from a backup source

Chris Siebenmann cks at cs.toronto.edu
Fri Apr 27 15:36:11 EDT 2018


> I still fail to understand the meaning of CKSUM errors on radiz0 vdev
> and pool levels. Can someone explain this to me? Before I understand
> these numbers I discard them as non credible (noise).

 In general, any errors above the level of individual disks mean that
ZFS has found unrepairable damage. The question is how this damage can
happen given that only two of your disks reported specific errors and
you're using raidz2, which should in theory withstand errors on any
two disks.

 I don't have any answers because I don't know enough and I haven't
inspected the ZFS source code to find out, but I have a theory. On a
raidz vdev/pool, two sorts of errors can happen when a logical block is
read back and its checksum fails to verify. In the first sort, you can
recreate a block that passes the checksum by dropping some of the disk
blocks and reconstructing them from the available parity information.
In the second sort, you can't: no available combination of data and
parity disk blocks produces a logical block with the right checksum.

 In the first case, it makes sense for ZFS to mark the disks whose
disk blocks were dropped and then reconstructed from parity as having
checksum errors. Those blocks clearly had corrupted contents, and that
corruption was detected only by checksums (the disks didn't report
read errors). In the second case, it's probably not possible to assign
fault to any specific set of three or more disks; ZFS simply doesn't
know which disks (if any) have good data and which have bad data. It's
even possible that all the disks faithfully returned the data that was
written to them, but that the data was corrupted before it was written
out (in which case blaming the disks would be wrong).
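
 In terms of that sketch, my guess at the bookkeeping (again purely
illustrative, not how the ZFS code actually does it) is roughly:

def account_for_errors(result, vdev_errors, disk_errors):
    """result is the return value of try_reconstruct(): a set of
    column indexes in the repairable case, None in the other."""
    if result is None:
        # No combination verified, so no particular disks can be
        # blamed; the error is charged at the vdev (and pool) level.
        vdev_errors += 1
    else:
        # The dropped-and-rebuilt disk blocks clearly held bad data,
        # so each of those disks gets a CKSUM error of its own.
        for col in result:
            disk_errors[col] += 1
    return vdev_errors, disk_errors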

 So there's a reason why you might see vdev- and pool-level errors
on a raidzN vdev without errors on enough individual disks to account
for them.

 As for the difference in error counts between the pool and the
single vdev:

>         NAME                            STATE     READ WRITE CKSUM
>         home                            ONLINE       0     0 1.80K
>           raidz2-0                      ONLINE       0     0 9.03K

... I wonder if this is due to pool-level metadata redundancy. ZFS
normally stores two copies of all metadata, so the raidz2-0 CKSUM
count may include each individual bad copy of a metadata block, while
the smaller pool-level count includes only metadata blocks where both
copies are bad.

(Since data normally has no extra copies, both the pool and raidz2-0
CKSUM counts will include all of the bad data blocks.)
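
 As a toy illustration of how this theory could produce the numbers
above (these figures are entirely made up to roughly match the zpool
output, not anything ZFS reported):

bad_metadata_copies = 7530      # individual bad ditto copies, made up
metadata_both_copies_bad = 300  # blocks with both copies bad, made up
bad_data_blocks = 1500          # data blocks have one copy, made up

# Counted per bad copy at the vdev level, but only per unrecoverable
# block at the pool level; bad data blocks show up in both counts.
vdev_cksum = bad_metadata_copies + bad_data_blocks        # ~9.03K
pool_cksum = metadata_both_copies_bad + bad_data_blocks   # 1.80K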

	- cks

