[zfs-discuss] How can one import a zpool with missing cache device?
Chris Siebenmann
cks at cs.toronto.edu
Fri Nov 13 13:29:11 EST 2015
> The superuser question was active for a while and valiant efforts
> were made, but now it has stalled. Here's the latest version of the
> question, with all the collected info so far:
[...]
> $ sudo zdb -e pool5 says:
>
> Configuration for import:
> vdev_children: 2
[...]
> vdev_tree:
> type: 'root'
> id: 0
> guid: 14850262647910895720
> children[0]:
> type: 'raidz'
> id: 0
[...]
> children[1]:
> type: 'missing'
> id: 1
> guid: 0
I think that your pool is almost certainly toast and the missing cache
device is mostly a red herring.
Here is the thing: cache devices are *not* listed in the vdev tree.
Your zdb output here says that your pool is supposed to have two vdevs
('vdev_children: 2'), of which only one (the raidz vdev) can be found.
The second child vdev, id 1 and type 'missing', is not there; 'missing'
is a special vdev type that ZFS fills in when the pool data says
there should be a child vdev but it can't be found.
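To make that check concrete, here is a rough Python sketch that reads
zdb-style configuration text and compares vdev_children against the
top-level children it finds. It is not an existing ZFS tool. The
function name top_level_vdevs and the SAMPLE text are made up for
illustration; SAMPLE is modelled on the quoted output with the elided
parts left out. The sketch assumes the indentation zdb normally prints
is still present, since it uses indentation to tell top-level vdevs
apart from disks nested inside a raidz or mirror.

```python
import re

# Hypothetical sample shaped like the quoted `zdb -e pool5` output
# (quote markers removed, elided parts omitted, indentation assumed).
SAMPLE = """\
vdev_children: 2
vdev_tree:
    type: 'root'
    id: 0
    guid: 14850262647910895720
    children[0]:
        type: 'raidz'
        id: 0
    children[1]:
        type: 'missing'
        id: 1
        guid: 0
"""


def top_level_vdevs(text):
    """Return (vdev_children value, {child index: vdev type})."""
    expected = None
    top_indent = None  # indentation of the root's children[N] lines
    current = None     # index of the top-level child being read
    types = {}
    for line in text.splitlines():
        body = line.strip()
        indent = len(line) - len(line.lstrip())
        m = re.fullmatch(r"vdev_children:\s*(\d+)", body)
        if m:
            expected = int(m.group(1))
            continue
        m = re.fullmatch(r"children\[(\d+)\]:", body)
        if m:
            if top_indent is None:
                top_indent = indent
            # Deeper children[N] lines are disks inside a raidz/mirror.
            current = int(m.group(1)) if indent == top_indent else None
            continue
        m = re.fullmatch(r"type:\s*'([^']*)'", body)
        if m and current is not None and current not in types:
            types[current] = m.group(1)
    return expected, types


expected, types = top_level_vdevs(SAMPLE)
missing = sorted(i for i, t in types.items() if t == "missing")
print("pool expects", expected, "top-level vdevs")
print("top-level vdevs of type 'missing':", missing)
```

On the sample this reports that the pool expects two top-level vdevs
and that child 1 is of type 'missing', which is the situation described
above. A cache device would not show up in this count at all.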
Based on the pool information reported in
http://superuser.com/q/993695/515694
this may have ultimately happened because at one point you had the
same device (then 'ata-ST31500341AS_6VS073SA') in use both as the
sole device/vdev in the 'zones' pool and as the cache device of 'pool5'.
This may have caused ZFS to get badly confused and update the pool5
configuration to think that it had two real vdevs, not a single vdev and
a cache device.
It's possible that an expert ZFS programmer could recover from this
with a significant amount of manual patching of the ZFS/ZoL source
code. At a conceptual level, what they probably need to do is somehow
modify the visible pool configuration for pool5 so that it only has
'vdev_children: 1'. I don't believe there are any existing,
well-documented tools that can do this. However, if pool5 ever wrote real
data to the missing device/vdev (as opposed to just using it as a
cache), the pool will likely be toast (because you have actual missing
data at that point).
(An alternate approach would be to systematically disable every safety
measure in user level code and the kernel that keeps one from importing
this pool due to the inconsistency. Some Internet searches suggest that
people have done this sort of thing with zfs-fuse in the past.)
- cks