[zfs-discuss] Re: Importing a semi-faulted pool

Gordan Bobic gordan.bobic at gmail.com
Thu Aug 15 06:47:39 EDT 2013


OK, so I got the pool to mount, but there are unrecoverable errors. :(

# zpool status -v
  pool: ssd
 state: ONLINE
status: One or more devices has experienced an error resulting in data
    corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
    entire pool from backup.
   see: http://zfsonlinux.org/msg/ZFS-8000-8A
  scan: scrub repaired 0 in 0h48m with 0 errors on Sun Aug 11 06:53:36 2013
config:

    NAME                                                   STATE     READ WRITE CKSUM
    ssd                                                    ONLINE       0     0   258
      mirror-0                                             ONLINE       0     0   261
        1976656507382779759                                UNAVAIL      0     0     0  was /dev/disk/for-zfs/ata-Integral_SATAII_and_USB_SSD_DC0110060D6250008-part4
        wwn-0x5000c50021f59168-part4                       ONLINE       0     0   261
      mirror-1                                             ONLINE       0     0     0
        ata-Integral_SATAII_and_USB_SSD_DC091012109940021  ONLINE       0     0     0
        ata-Integral_SATAII_and_USB_SSD_DC28101210CBE0012  ONLINE       0     0    46
    cache
      zram12                                               ONLINE       0     0     0
      zram13                                               ONLINE       0     0     0
      zram14                                               ONLINE       0     0     0
      zram15                                               ONLINE       0     0     0
      zram16                                               ONLINE       0     0     0
      zram17                                               ONLINE       0     0     0
      zram18                                               ONLINE       0     0     0
      zram19                                               ONLINE       0     0     0
      zram20                                               ONLINE       0     0     0
      zram21                                               ONLINE       0     0     0
      zram22                                               ONLINE       0     0     0
      zram23                                               ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        ssd/ariia@20130815:<0x1>
        ssd/ariia:<0x1>
        ssd/edi:<0x1>

I cannot zfs send the snapshots of any of these zvols, nor can I dd them
out; I just get I/O errors, even though I can verify that the underlying
media is readable. Is there a way I can get at least a partially corrupted
dd image off? There are a few files in there that I could do with
retrieving, since they have changed since the most recent backup. Is there
a way to tell ZFS to ignore errors and just feed back blocks of zeros for
the data it can't read cleanly?
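
If ZFS will serve partial reads of a zvol at all, my fallback would be dd
with conv=noerror,sync (carry on past read errors and pad each unreadable
input block with zeros), or GNU ddrescue, pointed at the zvol device node.
Purely as a sketch - assuming the zvol is exposed under /dev/zvol/ssd/ as
usual, and using a made-up /backup directory as the destination:

# dd if=/dev/zvol/ssd/ariia of=/backup/ariia.img bs=4K conv=noerror,sync
# ddrescue /dev/zvol/ssd/ariia /backup/ariia.img /backup/ariia.map

But I don't know whether ZFS will hand back anything at all for blocks that
fail their checksums, or just keep returning EIO.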

Gordan

On Thu, Aug 15, 2013 at 9:43 AM, Gordan Bobic <gordan.bobic at gmail.com> wrote:

> On Thu, Aug 15, 2013 at 8:57 AM, Gordan Bobic <gordan.bobic at gmail.com> wrote:
>
>> I have a situation where I've had a pool on 4 questionable-brand
>> (Integral) SSDs (a stripe of mirrors). The SMART data on these contains
>> nothing but pure lies - all wear-out indicators show no wear at all and
>> the overall status is "PASSED" - yet one of the disks appears to have run
>> out of write endurance and has locked itself into read-only mode. Even
>> that is somewhat dodgy: I don't know how much of what it feeds me on
>> reads is duff data, and reading it with dd using bs > 8KB results in
>> errors, whereas 8KB and 4KB reads seem to complete without erroring.
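>>
>> The plan for copying it off is small-block dd with conv=noerror,sync, so
>> that a failed read costs a single zero-filled 4KB block rather than
>> aborting the whole copy. Sketching it with /dev/sdX standing in for the
>> dying SSD and /dev/sdY for its replacement (not the real device names):
>>
>> # dd if=/dev/sdX of=/dev/sdY bs=4K conv=noerror,sync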
>>
>> Unfortunately, another disk dropped out, too, and now I'm in a situation
>> where zpool import shows the pool is there with that read-only disk and 2
>> other disks online, and the 4th disk faulted.
>>
>> I tried zpool import -f $poolname and that caused 5 kernel zfs threads to
>> go up to 100% CPU overnight without getting anywhere - it was still
>> sitting there doing nothing, just burning through CPU. There was no
>> explosive memory usage, nor any disk I/O that I could see during this
>> process.
>>
>> I'm going to dd the contents of the read-only disk to a new disk, but I
>> don't expect different results.
>>
>> Any suggestions on how to import a pool after SSD failures like this? The
>> pool is quite small - it has maybe 100GB of data on it (a couple of VM
>> images) - but it is deduped.
>
>
> To add a little more info, with the read-only disk now removed from the
> pool:
>
> # zpool import
>    pool: ssd
>      id: 11866534114522102781
>   state: FAULTED
>  status: One or more devices contains corrupted data.
>  action: The pool cannot be imported due to damaged devices or data.
>     The pool may be active on another system, but can be imported using
>     the '-f' flag.
>    see: http://zfsonlinux.org/msg/ZFS-8000-5E
>  config:
>
>     ssd                                                          FAULTED  corrupted data
>       mirror-0                                                   DEGRADED
>         ata-Integral_SATAII_and_USB_SSD_DC0110060D6250008-part4  UNAVAIL  corrupted data
>         ata-Integral_SATAII_and_USB_SSD_DC0110060D6250009-part4  UNAVAIL
>       mirror-1                                                   ONLINE
>         ata-Integral_SATAII_and_USB_SSD_DC091012109940021        ONLINE
>         ata-Integral_SATAII_and_USB_SSD_DC28101210CBE0012        ONLINE
>
>
> I'm sure I can add the read-only disk back and that disk will then show up
> as ONLINE, but then there is the situation of 5 kernel threads starting
> up, each eating 100% of CPU without doing any disk I/O (according to
> iostat) and never terminating. L2ARC is also not getting filled (I have 12
> ZRAM devices defined for this, and they don't get touched).
>
> This looks like a bug (especially since whatever those threads are doing
> consumes neither RAM nor disk I/O), but unfortunately, in this particular
> instance I didn't take my own advice to create the pool explicitly with
> version 26, so I cannot try getting it up and running with zfs-fuse.
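>
> (By that I mean pinning the pool version at creation time so that a
> userspace implementation can still read it - roughly along these lines,
> with the device names made up for illustration:
>
> # zpool create -o version=26 ssd mirror sda sdb mirror sdc sdd
>
> rather than letting zpool create default to the newest version.)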
>
> Gordan
>