[zfs-discuss] cannot import 'home': I/O error Destroy and re-create the pool from a backup source

Daniel Armengod daniel at armengod.cat
Sun Apr 22 09:56:35 EDT 2018


The perf_event_max_sample_rate message is innocuous :)

I'm sorry, but I'm at the limit of my (very scant) knowledge. I hope
someone else will be able to give you further advice. I think they'd
find it helpful if you pasted the output of zdb -l for each disk, so it
may be worth gathering that in advance.
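
If it helps, something along these lines would collect the labels in one
go (a rough sketch; the device paths and the "-part1" suffix are
assumptions, so adjust them to whatever /dev/disk/by-id shows on your
system):

    # Dump the ZFS label (including the txg) from each pool member;
    # two devices are shown as examples, list all of yours.
    for dev in /dev/disk/by-id/wwn-0x5000c500a41a0a00-part1 \
               /dev/disk/by-id/wwn-0x5000c500a41ae340-part1
    do
        echo "=== $dev ==="
        zdb -l "$dev"
    done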

Best of luck,


On 22/04/18 14:56, Anton Gubar'kov via zfs-discuss wrote:
> Thanks a lot for the advice. The rig is not bootable, so I'm using a
> rescue CD live system to inspect and repair; there is no auto-import in it.
> I ran dd of=/dev/null against all of my pool's devices (all 6 raidz2
> drives, plus ZIL and cache) overnight. All reads were successful, with
> no messages in dmesg related to storage/SCSI/ZFS. The overnight dd run
> produced only:
>
> [11271.659763] perf: interrupt took too long (2513 > 2500), lowering 
> kernel.perf_event_max_sample_rate to 79000
> [15930.168897] perf: interrupt took too long (3153 > 3141), lowering 
> kernel.perf_event_max_sample_rate to 63000
>
> I don't believe this signifies any problems with my disks/controllers.
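>
> For reference, the overnight check was essentially the following, run
> once per device (paths illustrative, block size just a convenience):
>
>     # Sequential read of the whole device; any media errors would
>     # surface in dmesg while this runs.
>     dd if=/dev/disk/by-id/wwn-0x5000c500a41a0a00 of=/dev/null bs=1M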
>
> All 6 devices and the ZIL show an identical txg in the zdb -l output.
>
> Replacing all 6 drives is overkill for my budget at the moment and
> defeats the very purpose I built this pool for: reliability, fault
> tolerance and backup.
>
> Is there a way to get more information about the import error? Are
> there command-line switches or environment variables to set? When I run
> zpool import, neither dmesg nor the system log shows any messages, and
> the diagnostic message produced by the zpool import command is not very
> helpful.
>
> The ZFS-8000-4J message referenced in the zpool import output relates
> to missing/failed devices, but there are no OFFLINE/REMOVED/UNAVAIL
> devices in the pool config. They are all present in /dev/disk/by-id and
> /dev, and successful reads are confirmed.
>
> Thanks for your support.
> KR
> Anton.
>
>
>
> On Sat, Apr 21, 2018 at 10:24 PM Daniel Armengod via zfs-discuss
> <zfs-discuss at list.zfsonlinux.org> wrote:
>
>     *Wait for someone more knowledgeable to provide advice*
>
>     I was in a similar situation just yesterday: a 4-drive RAIDZ1 with
>     one drive completely dead and another acting up.
>
>     What I did (again: wait for someone else to provide input on this)
>     was to:
>
>     * Disable ZFS automatic import on boot (in my case, systemctl
>     disable zfs-import.target). In general, boot the system stripped of
>     as many non-essential processes and services as you can.
>
>     * Check dmesg for error messages. Disks acting up will leave a
>     very recognizable pattern there.
>
>     * Make a full non-destructive, read-only badblocks pass on each
>     device (see the command sketch after this list). This will tell you
>     whether they can withstand reads. If any disks are
>     not-yet-dead-but-dying, the stress will leave messages in the
>     kernel ring buffer; check dmesg regularly. Identify how many flaky
>     drives you have, and pray you don't break the redundancy threshold.
>
>     * Check the ZFS data structures with zdb. With the pool unmounted,
>     for each member drive, run zdb -l /dev/<path_to_drive>. Make sure
>     to provide the partition/slice number, even if you gave ZFS the
>     whole disks to build the pool. Of particular note is the txg
>     number: it should be the same in all devices. I believe it should
>     be the same in *at least* n-2 devices for a raidz2.
>
>     * Go back and read the zdb manual; it's quite interesting.
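>
>     As a rough sketch of the read pass and the label check (device
>     names are illustrative; badblocks runs a read-only test by default,
>     and "-part1" is a guess at where the ZFS data partition lives):
>
>         # Read-only surface scan; watch dmesg in another terminal.
>         badblocks -sv /dev/sdb
>
>         # Dump the label and compare the txg across all members.
>         zdb -l /dev/disk/by-id/wwn-0x5000c500a41a0a00-part1 | grep txg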
>
>     In my case, 2 of the remaining RAIDZ1 disks - the healthy ones -
>     showed a txg number higher than the faulty one.
>
>     zpool import tank -XF did the trick for me. I was lucky and able to
>     recover all the data I thought I'd lose (if anything was lost, I've
>     yet to notice; it mostly contains anime :P).
>
>     After a successful recovery, re-import the pool with -o readonly=on
>     and zfs send all the datasets you care about somewhere safe and
>     reliable. Then you can do disk reshuffling until you can trust your
>     pool again.
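>
>     Roughly, the sequence looks like this (pool, dataset, snapshot and
>     host names are placeholders; -F/-X rewind the pool to an older txg
>     and can discard the most recent writes, so treat them as a last
>     resort):
>
>         # Last-resort import: rewind to an earlier, consistent txg.
>         zpool import -N -F -X tank
>
>         # Once it imports, go back to read-only and copy the data off.
>         zpool export tank
>         zpool import -o readonly=on -N tank
>
>         # Send an existing snapshot elsewhere (new snapshots cannot be
>         # created on a read-only pool).
>         zfs send -R tank/data@last-good | ssh backuphost zfs receive -u backup/tank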
>
>     Best of luck,
>
>
>     On 2018-04-21 19:36, Anton Gubar'kov via zfs-discuss wrote:
>>     Hi,
>>
>>     My backup server recently froze and ended up with a non-importable
>>     pool. Since it is the backup server, I have no further backups, so
>>     following the diagnostic message is not an option for me. I would
>>     like to recover this pool, as it contains some valuable data I can
>>     never reproduce (a video archive).
>>
>>     So my status today is:
>>
>>     root at sysresccd /root % zpool import -N -f  home
>>     cannot import 'home': I/O error
>>             Destroy and re-create the pool from
>>             a backup source.
>>
>>     root at sysresccd /root % zpool import -N
>>        pool: home
>>          id: 4810743847386909334
>>       state: ONLINE
>>      status: One or more devices contains corrupted data.
>>      action: The pool can be imported using its name or numeric
>>     identifier.
>>        see: http://zfsonlinux.org/msg/ZFS-8000-4J
>>      config:
>>
>>             home                             ONLINE
>>               raidz2-0                       ONLINE
>>                 wwn-0x5000c500a41a0a00       ONLINE
>>                 wwn-0x5000c500a41ae340       ONLINE
>>                 wwn-0x5000c500a41b4c57       ONLINE
>>                 wwn-0x5000c500a41b7572       ONLINE
>>                 wwn-0x5000c500a41ba99c       ONLINE
>>                 wwn-0x5000c500a41babe8       ONLINE
>>             cache
>>               sdj3
>>             logs
>>               wwn-0x30000d1700d9d40f-part2   ONLINE
>>
>>     I tried import -F and import -FX too, with no luck :-[
>>     I have reviewed all the similar cases that a Google search returned.
>>     I'm really confused, as I built the 6-drive raidz2 precisely for its
>>     resilience, and now I'm facing availability issues.
>>
>>     Can someone experienced provide advice on recovery?
>>
>>     My ZFS versions:
>>     v0.7.7-r0-gentoo
>>
>>     This is also my root pool, so I can't boot my normal rig; I'm
>>     booting a recovery environment using the systemrescuecd-based live
>>     system from https://wiki.gentoo.org/wiki/User:Fearedbliss.
>>
>>
>>     Thanks in advance.
>>     Anton.
>>
