[zfs-discuss] Permanent errors in older snapshots

Gordan Bobic gordan.bobic at gmail.com
Sat Dec 24 07:01:48 EST 2016


Why does this even remotely matter? A snapshot will eventually stop being
relevant and it will get rotated out, and the errors will then disappear.

Restoring a file in a snapshot from a backup seems like a ridiculous idea.

On 24 Dec 2016 01:39, "Gregor Kopka (@zfs-discuss) via zfs-discuss" <
zfs-discuss at list.zfsonlinux.org> wrote:

> A function to rewrite defective, non-reconstructable (file) data could be
> helpful:
>
> ZFS knows the file in question, it knows the mode in which the data was
> written, and it knows the offset, length, and checksum the bad data would
> need to have.
> So it should be possible to feed that file (from a backup) back in for ZFS
> to rewrite the defective part in place (as the data there is currently
> known-bad), healing the defect.
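>
> In rough Python pseudo-code, the idea might look like this (a minimal
> sketch only: the function name, the flat offsets, and SHA-256 standing in
> for the block checksum are all illustrative, and real compression/layout,
> i.e. the mode the data was written in, is ignored):
>
> import hashlib
>
> def heal_block(pool_dev, offset, length, expected_sha256,
>                backup_file, file_offset):
>     """Rewrite one known-bad block, but only with verified data."""
>     # Pull the candidate bytes from the backup copy of the file.
>     with open(backup_file, 'rb') as f:
>         f.seek(file_offset)
>         candidate = f.read(length)
>
>     # Accept the candidate only if it matches the checksum ZFS already
>     # holds for the (currently unreadable) block.
>     if hashlib.sha256(candidate).hexdigest() != expected_sha256:
>         raise ValueError('backup data does not match recorded checksum')
>
>     # Rewriting in place is safe here: whatever sits on disk at this
>     # offset is already known-bad.
>     with open(pool_dev, 'r+b') as dev:
>         dev.seek(offset)
>         dev.write(candidate)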
>
> Gregor
>
> On 23.12.2016 at 19:38, Håkan Johansson via zfs-discuss wrote:
>
> I have been thinking about the below (original mail) problem for a while.
>
> The issue in short:
>
> If a scrub finds a file which has permanent errors (i.e. no good copies
> can be found or reconstructed), there is usually one way to recover/clear
> the error without having to recreate the entire filesystem: delete the file
> in question.  If the error is in a file which is part of a snapshot, the
> snapshot cannot be fixed, as it cannot be changed.  Thus the only way of
> clearing the error from the pool is to destroy the entire snapshot.
>
> Destroying an entire snapshot is sometimes a rather heavy-handed solution.
>
> Would it make sense to introduce a list of known (and acknowledged)
> damaged blocks that scrub would ignore and not report during checking?
> The user would acknowledge the damage by issuing some zpool/zfs command to
> add the blocks to the list.  Normal reads would still generate I/O errors.
> The blocks could be stored as offset+expected-checksum pairs, and would
> thus not allow any other brokenness to pass.
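>
> A minimal sketch of such a list and the scrub-side lookup (plain Python,
> illustrative names; nothing here is an actual ZFS structure):
>
> # Acknowledged-damage list: block offset -> checksum of the
> # acknowledged (bad) data, so no other brokenness can hide behind it.
> acknowledged = {}
>
> def acknowledge(offset, bad_cksum):
>     # Would be filled in by an explicit zpool/zfs command from the user.
>     acknowledged[offset] = bad_cksum
>
> def scrub_verdict(offset, stored_cksum, computed_cksum):
>     if computed_cksum == stored_cksum:
>         return 'ok'
>     # Stay silent only for this exact, already-acknowledged damage;
>     # any different mismatch at the same offset is still reported.
>     if acknowledged.get(offset) == computed_cksum:
>         return 'acknowledged'
>     return 'error'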
>
> Best regards,
> Håkan
>
>
> On Sat, Dec 10, 2016, at 11:37 PM, devsk via zfs-discuss wrote:
>
> Hi,
>
> I think this might have been discussed here before because I am very sure
> I myself ran into this issue several years ago.
>
> One fine day after the update to v0.6.5.8-r0-gentoo, a scrub found a file
> with a permanent error (I guess that means it can't correct the blocks in
> error) in an old snapshot. The file is unreadable (Input/Output error) in
> every snapshot taken after that one, over all the years since.
>
> # zpool status -v
>   pool: backup
>  state: DEGRADED
> status: One or more devices has experienced an error resulting in data
>         corruption.  Applications may be affected.
> action: Restore the file in question if possible.  Otherwise restore the
>         entire pool from backup.
>    see: http://zfsonlinux.org/msg/ZFS-8000-8A
>   scan: scrub repaired 0 in 9h16m with 1 errors on Sat Dec 10 23:19:10 2016
> config:
>
>         NAME                                            STATE     READ WRITE CKSUM
>         backup                                          DEGRADED     0     0     1
>           raidz2-0                                      DEGRADED     0     0     2
>             ata-WDC_WD4001FAEX-00MJRA0_WD-WCCxxxxxxxxx  ONLINE       0     0     0
>             ata-ST4000VN000-1H4168_Z302C80M             ONLINE       0     0     0
>             ata-WDC_WD4001FAEX-00MJRA0_WD-WCCxxxxxxxxx  ONLINE       0     0     0
>             /mnt/serviio/4tbFile                        OFFLINE      0     0     0
>
> errors: Permanent errors have been detected in the following files:
>
>         backup/zfs-backup@move_backup_to_4tb_external_sep25_2014:/Installs/clonezilla/live/filesystem.squashfs
>
>
> I tried to read the file in all subsequent snapshots (via the .zfs/snapshot
> directory) since Sept 2014, and it's unreadable in all of them. I can copy
> the correct file over and take new snapshots, and those are all fine. If I
> keep deleting snapshots, the scrub just keeps pointing at the next one.
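>
> For what it's worth, that per-snapshot check can be scripted along these
> lines (the mountpoint is illustrative, adjust to taste):
>
> import glob
>
> REL_PATH = 'Installs/clonezilla/live/filesystem.squashfs'
>
> for snapdir in sorted(glob.glob('/backup/zfs-backup/.zfs/snapshot/*')):
>     path = '%s/%s' % (snapdir, REL_PATH)
>     try:
>         with open(path, 'rb') as f:
>             while f.read(1 << 20):   # read the file end to end
>                 pass
>         print(snapdir, 'OK')
>     except OSError as exc:           # EIO from the damaged block
>         print(snapdir, exc)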
>
> None of the three disks shows any pending, uncorrectable or reallocated
> sectors. Overall the disks are healthy, and a scrub has never failed on
> them for as many months back as I can remember (or as far back as zpool
> history shows).
>
> Any ideas? I remember I had to restore from backup (of the backup, in this
> case) the last time I ran into this. Is there any other way? It's a pain to
> start over.
>
> Also, I want to add a 4TB disk to replace 4tbFile, but I am wondering
> whether the resilver will even succeed in this state. I am afraid it will
> fail at this snapshot and be a waste of time.
> Thanks
> -devsk
> _______________________________________________
> zfs-discuss mailing list
> zfs-discuss at list.zfsonlinux.org
> http://list.zfsonlinux.org/cgi-bin/mailman/listinfo/zfs-discuss

