[zfs-discuss] zfs-discuss Digest, Vol 7, Issue 59
automaticgiant at gmail.com
Thu Jan 7 12:04:56 EST 2016
Hahaha. Try zpconfig on your pool with a cache disk! (laughing because
I just tried that.) Now, if only I could get it working on a pool that
I can't open.
On 7 January 2016 at 11:35, Chris Siebenmann <cks at cs.toronto.edu> wrote:
>> > Here is the thing: cache devices are *not* listed in the vdev tree.
>> Why is that? I want sauce for that claim, due, in part, to output I'll
>> provide shortly.
> This is what I see when I do a zdb dump of a pool with a L2ARC device
> (on ZFS on Linux):
> # zdb maindata
>
> Cached configuration:
>         vdev_children: 1
>         vdev_tree:
>             type: 'root'
>             id: 0
>             guid: 2318438208971947125
>             children[0]:
>                 type: 'mirror'
>                 id: 0
>                 guid: 8223432651739159300
>                 children[0]:
>                     type: 'disk'
>                     id: 0
>                     guid: 3188774935843120330
>                     path: '/dev/disk/by-id/ata-ST31000340AS_5QJ10BRT-part7'
>                 children[1]:
>                     type: 'disk'
>                     id: 1
>                     guid: 12137634640557355462
>                     path: '/dev/disk/by-id/ata-ST31000333AS_6TE0D4T6-part7'
> There's no L2ARC device listed, just the main mirrored pair of disks
> (eg the root has 'vdev_children: 1' for the single mirror vdev).
> 'zpool status' reports the L2ARC:
> I assume that L2ARC devices are not recorded in the vdev tree exactly
> so that they can be missing without causing the pool to fail to import.
> My understanding is that the on-disk ZFS vdev configuration tree is
> not explicitly stored anywhere accessible and instead is basically
> reconstructed on the fly and verified based on the checksum matching.
> This unfortunately leaves one up the creek for detailed information if
> devices are missing; unlike a conventional RAID system, there is no
> accessible source of what the missing stuff is supposed to be. Instead
> all ZFS can say is 'there are some bits missing and the checksum doesn't
> verify, but I have no idea what exactly I'm supposed to be looking for'.
> I believe ZFS doesn't even know the GUID(s) of the missing components,
> which is why the GUID of the 'missing' child is reported as 'guid: 0'
> in your original email. It might be possible to reconstruct a single
> missing GUID given an understanding of how the pool GUID sum is formed
> and all of the other GUIDs. I think the GUID sum is literally just the
> (wrapped-around) sum of all the GUIDs; if I'm doing the math right, this
> would make your single missing GUID value be 14707329497279011883.
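The arithmetic behind that guess can be sketched directly. This is a hypothetical illustration of the "wrapped-around sum" idea, not the actual ZFS code: the `guid_sum` below is constructed for the example rather than read from a real pool, and you would substitute the values from your own zdb output.

```python
# Sketch of recovering a single missing vdev GUID from the pool's GUID sum.
# Assumption (per the discussion above): the pool-wide guid_sum is just the
# 64-bit wrapping sum of all vdev GUIDs, so if exactly one GUID is unknown
# it is the difference mod 2**64.

MASK = (1 << 64) - 1  # arithmetic wraps at 64 bits

def missing_guid(pool_guid_sum, known_guids):
    """Return the one GUID needed to make the sum match."""
    return (pool_guid_sum - sum(known_guids)) & MASK

# Hypothetical example: three known GUIDs plus one "lost" GUID; we build a
# guid_sum from all four, then recover the lost one from the other three.
known = [2318438208971947125, 8223432651739159300, 3188774935843120330]
lost = 14707329497279011883
guid_sum = (sum(known) + lost) & MASK
print(missing_guid(guid_sum, known))  # prints 14707329497279011883
```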
>> > Your zdb output here says that your pool is supposed to have two
>> > vdevs ('vdev_children: 2') and you can find one (the raidz vdev).
>> > However, the second child vdev, id 1 and type 'missing', is not
>> > there; the 'missing' type is a special vdev type that ZFS fills in
>> > when the pool data says there should be a child vdev but it can't be
>> > found.
>> I don't know why zdb says that or which OS it was from, but zpool
>> import doesn't give quite the same information. I guess I'd have to
>> look through the nvlists to really understand what it was referring
>> to, but zfs-on-linux 0.6.5-pve6~jessi (proxmox) says:
> 'zpool import' is unfortunately uninformative here; it is explicitly
> coded to stop reporting details after the first problem it finds, which
> here is 'The pool was last accessed by another system'. An Illumos bug
> has been filed to fix this at some point:
> It also turns out that there's an Illumos bug for a potential underlying
> cause of your situation:
> zdb does not report the 'pool was last accessed by another system'
> issue, so it dumps the underlying on-disk pool information as best
> it can reconstruct it.
>> I haven't quite gotten around to tracing the import through user and
>> kernel space to find the exact error path, but I did replace the GUID
>> of a new cache disk with this one in all 4 copies of the vdev label,
>> with the same results. I think I'm going to try to grok the nvlists
>> on that cache disk to see if a vdev tree is present that I need to
>> modify, and/or grok the original pool5 vdev tree.
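For anyone wanting to poke at those labels directly: the "4 copies of the vdev label" sit at fixed offsets, two 256 KiB labels at the front of the device and two at the end, aligned to a 256 KiB boundary. A minimal sketch for locating them (the 1 GiB device size is made up for illustration):

```python
# Compute the byte offsets of the four ZFS vdev labels (L0..L3) on a
# device. Labels L0 and L1 are at the start; L2 and L3 are at the end,
# with the end position rounded down to a 256 KiB boundary.

SIZEOF_LABEL = 256 * 1024  # each vdev label is 256 KiB

def label_offsets(device_size):
    """Byte offsets of labels L0..L3 on a device of the given size."""
    end = (device_size // SIZEOF_LABEL) * SIZEOF_LABEL
    return [0, SIZEOF_LABEL, end - 2 * SIZEOF_LABEL, end - SIZEOF_LABEL]

print(label_offsets(1 << 30))  # 1 GiB device (hypothetical)
# [0, 262144, 1073217536, 1073479680]
```

From there you could read each 256 KiB region and feed the embedded nvlist to a proper XDR nvlist parser; writing that parser is beyond this sketch.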
> The 'missing' child is a vdev type of VDEV_TYPE_MISSING. The whole
> configuration reconstruction process seems to be done primarily in
> lib/libzfs/libzfs_import.c's get_configs(); see especially the bit
> that says:
> * Look for any missing top-level vdevs. If this is
> * the case, create a faked up 'missing' vdev as a
> * placeholder. We cannot simply compress the child
> * array, because the kernel performs certain checks to
> * make sure the vdev IDs match their location in the
> * configuration.
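The behavior that comment describes can be mimicked in a toy sketch (Python here, whereas the real get_configs() is C in lib/libzfs/libzfs_import.c): the child array must stay dense, with each entry's id equal to its index, so holes are filled with a faked-up 'missing' placeholder instead of being compressed away. The dict-based "nvlists" are a simplification, not the actual libnvpair structures.

```python
# Toy illustration of how a 'missing' top-level vdev placeholder gets
# faked up when the pool config says a child should exist but it wasn't
# found on any scanned device.

VDEV_TYPE_MISSING = 'missing'

def fill_missing_children(found, vdev_children):
    """found: dict mapping vdev id -> that vdev's config (a plain dict)."""
    children = []
    for i in range(vdev_children):
        if i in found:
            children.append(found[i])
        else:
            # Placeholder entry; note guid 0, matching what zdb reported
            # for the absent child in the original email.
            children.append({'type': VDEV_TYPE_MISSING, 'id': i, 'guid': 0})
    return children

# A pool that claims two top-level vdevs but where only id 0 was found:
cfg = fill_missing_children({0: {'type': 'raidz', 'id': 0, 'guid': 123}}, 2)
print(cfg[1])  # {'type': 'missing', 'id': 1, 'guid': 0}
```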
> There is also an interesting and potentially relevant 'XXX' comment
> in module/zfs/spa.c's spa_config_valid():
> * XXX - once we have 'readonly' pool
> * support we should be able to handle
> * missing data devices by transitioning
> * the pool to readonly.
> I believe that in theory it would be possible to hack
> spa_config_valid() et al to force a pool with a missing vdev
> to be considered valid and thus to import. I believe that the
> kernel ZFS code will immediately fail IO to such a vdev, possibly
> causing pool explosions if it actually needs data from there; see
> (You might want to do this using zfs-fuse, or at least on a
> sacrificial machine in case this has side effects on other pools.
> Obviously such hackery is only an emergency measure to get the data
> off the pool.)
> - cks