[zfs-devel] self-healing IOs stats in zpool status

zfs-devel at kopka.net
Wed Aug 2 09:10:16 EDT 2017



On 02.08.2017 at 03:01, Tony Hutter via zfs-devel wrote:
>
> > To be clear here, I think that what you're seeing is specifically
> > that readahead IO does not cause these updates. I don't believe that
> > regular reads are issued as speculative ZIOs, only readahead ones.
> > Since you're reading your test files with sequential IO, I believe that
> > most of the reads will be initially issued as speculative readaheads
> > and then the real user-level read()s will be satisfied from the cache.
>
> Thanks for the clarification on that - yep, it's the speculative
> ("read-ahead") IOs that are not getting recorded, not the regular
> ones.  I can also confirm that self-healed writes also don't increment
> the error counters, but do generate events.  That can lead to some
> funny results if you happen to have zed running:
>
>   pool: mypool
>  state: DEGRADED
> status: One or more devices are faulted in response to persistent errors.
>     Sufficient replicas exist for the pool to continue functioning in a
>     degraded state.
> action: Replace the faulted device, or use 'zpool clear' to mark the device
>     repaired.
>   scan: none requested
> config:
>
>     NAME        STATE     READ WRITE CKSUM
>     mypool      DEGRADED     0     0     0
>       mirror-0  DEGRADED     0     0     0
>         md100   FAULTED      0     0     0  too many errors
>         md101   ONLINE       0     0     0
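Just to spell out how a vdev can end up FAULTED while its counters still
read 0/0/0: my (possibly wrong) reading of vdev_stat_update() is that
errors on speculative and repair ZIOs return before the per-vdev
READ/WRITE/CKSUM counters are bumped, while the ereports still reach
zed. A toy model of that split - none of it actual ZFS code:

    /*
     * Toy model, not ZFS code: errors on prefetch reads and self-heal
     * writes post an event (which zed may act on) but never touch the
     * per-vdev error counters shown by zpool status.
     */
    #include <stdio.h>
    #include <stdbool.h>
    #include <stdint.h>

    struct counters { uint64_t read, write, cksum; };

    static void account_error(struct counters *c, bool is_read, bool quiet)
    {
        if (quiet) {               /* speculative read or repair write */
            puts("ereport posted; counters left alone");
            return;
        }
        if (is_read)
            c->read++;
        else
            c->write++;
    }

    int main(void)
    {
        struct counters c = { 0, 0, 0 };
        account_error(&c, true, true);   /* read-ahead read error    */
        account_error(&c, false, true);  /* self-healed write error  */
        printf("READ %llu WRITE %llu CKSUM %llu\n",
            (unsigned long long)c.read, (unsigned long long)c.write,
            (unsigned long long)c.cksum);
        return 0;
    }
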
If the ejection was triggered by healed IO, that should IMHO be treated
as a bug, since a (somewhat) good drive was ejected.
Would it be possible to queue a read of the rewritten block at some later
point (so the drive has to fetch it from the platter instead of serving
it from its cache) to make sure the repair stuck? Should that read fail,
it could be treated as a more severe form of error.
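
To make the idea a bit more concrete, here is a standalone sketch - all
names are made up, nothing like this exists in ZFS today, and a real
implementation would additionally need a way to defeat the drive's cache
(a sufficiently long delay, or a raw read that bypasses caching):

    /*
     * Hypothetical "verify the repair later" sketch (not ZFS code):
     * remember where the healed block was rewritten and re-read it
     * after a delay, so the drive has to return it from the platter
     * rather than from its cache; if the read-back fails, escalate.
     */
    #include <stdio.h>
    #include <stdbool.h>
    #include <stdint.h>

    struct repair_verify {
        uint64_t offset;       /* location of the rewritten block     */
        uint64_t size;         /* length of the rewritten block       */
        uint64_t not_before;   /* earliest time to issue the re-read  */
    };

    /* stand-in for a checksummed re-read that avoids caches */
    static bool read_back_ok(const struct repair_verify *rv)
    {
        (void)rv;
        return true;           /* pretend the block verified fine */
    }

    static void process_verify(const struct repair_verify *rv, uint64_t now)
    {
        if (now < rv->not_before)
            return;            /* too early; the drive cache may be hot */
        if (read_back_ok(rv))
            printf("repair at offset %llu stuck\n",
                (unsigned long long)rv->offset);
        else
            printf("repair did not stick: treat as a fatal error\n");
    }

    int main(void)
    {
        struct repair_verify rv = { 4096, 131072, 100 };
        process_verify(&rv, 150);   /* issued well after the repair */
        return 0;
    }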

> > If there is a good reason not to count these in the current
> > read and checksum stats, I would definitely support adding
> > additional stats counters for them. I would even go so far
> > as having 'zpool status' automatically add them to the real
> > read/checksum counts unless you give it a flag to turn this
> > off.
I second this; it shouldn't be hidden.
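If I remember right, vdev_stat_t already carries a count of self-healed
bytes (vs_self_healed); what is missing are per-type error counters next
to the existing READ/WRITE/CKSUM ones. Something along these lines,
where the vs_healed_* names are purely my invention to make the idea
concrete:

    /*
     * Sketch only: the vs_healed_* fields are hypothetical; the other
     * fields mirror (from memory) what the per-vdev stats track today.
     */
    #include <stdint.h>

    struct vdev_stat_sketch {
        uint64_t vs_read_errors;          /* existing-style counters  */
        uint64_t vs_write_errors;
        uint64_t vs_checksum_errors;
        uint64_t vs_self_healed;          /* healed bytes (exists)    */
        uint64_t vs_healed_read_errors;   /* proposed, hypothetical   */
        uint64_t vs_healed_write_errors;  /* proposed, hypothetical   */
        uint64_t vs_healed_cksum_errors;  /* proposed, hypothetical   */
    };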

> I agree, especially with the part about adding the self-healed IOs
> into the regular totals *by default*, since I think that's what people
> expect.  I'm already working on a patch. Here's some sample output
> showing 40 self-healed read errors, plus 5 regular read errors ('-s' =
> "show self-healed"):
Given the existence of 'non-fatal' errors, I would expect the first set
of columns to report only the fatal ones, but that would only make sense
if failed repairs could be recorded there.

>
> $ zpool status -s
>
>   pool: mypool
>  state: ONLINE
>   scan: none requested
> config:
>                                                Non-Fatal
>     NAME        STATE     READ WRITE CKSUM  READ WRITE CKSUM
>     mypool      ONLINE       0     0     0     0     0     0
>       mirror-0  ONLINE       0     0     0     0     0     0
>         md100   ONLINE      45     0     0    40     0     0
>         md101   ONLINE       0     0     0     0     0     0
Could it make sense to discount the repaired blocks from the first
counters? And perhaps to use a more descriptive label, e.g.:
                          Fatal errors      Repaired data
    NAME        STATE     READ WRITE CKSUM  READ WRITE CKSUM
        md100   ONLINE       5     0     0    40     0     0
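
The arithmetic behind the two presentations is trivial either way; it
really is just a question of which split people expect to see by
default. A toy illustration with the 45/40 figures from above (not
zpool code, column names are my mock-up):

    /*
     * Toy illustration of the two display options, using the numbers
     * from the example above (45 read errors, 40 of them healed).
     */
    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
        uint64_t read_errors  = 45;   /* everything that went wrong   */
        uint64_t healed_reads = 40;   /* repaired transparently       */

        /* combined total first, healed shown as an extra column */
        printf("combined: READ %llu  (non-fatal %llu)\n",
            (unsigned long long)read_errors,
            (unsigned long long)healed_reads);

        /* only the fatal remainder in the first column */
        printf("split:    fatal READ %llu  repaired READ %llu\n",
            (unsigned long long)(read_errors - healed_reads),
            (unsigned long long)healed_reads);
        return 0;
    }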

Thanks for diving into this.

Gregor


