[zfs-discuss] Help with debugging pool import problem

Chris Siebenmann cks at cs.toronto.edu
Mon Apr 23 14:00:39 EDT 2018


> I am looking for some help in debugging a problem with pool import and
> any suggestions are highly appreciated.
> 
> First off, this is what I see when I try to import:
[...]
> An important point here is that the device "sdb" is a block device
> backed by my own driver. [...]
>
> I completely understand that the real problem could be in my code but
> I am looking for suggestions on how to debug pool import issues. If I
> can get more specific error (something beyond "corrupted data"), that
> would help me in narrowing down the root cause.

 At the moment there are no better diagnostics that are readily
available from the user-level code, as you've noticed. This is
very annoying in a number of situations, but the good news is
that improvements are coming.

 First, I believe that this error ultimately emerges from the kernel,
but you should check the user-level code involved in 'zpool import'
to be sure. Modifying the user-level code to spit out additional
information (or simply running it under gdb with appropriate breakpoints
set) will likely be a lot easier than the alternatives.

 As far as the kernel code goes, a change to split the previously
monolithic kernel code for importing a ZFS pool into multiple functions
was recently added to the development version of ZFS on Linux (this
is the commit for 'OpenZFS 7638 - Refactor spa_load_impl into several
functions'). If you can build the development ZFS on Linux and then
instrument which of these new functions are called and succeed (versus
fail), you can probably gather much more information about the specific
failing operation. Possible avenues for instrumentation are various
passive kernel function tracing methods like ftrace[*], or simply hand
modifying the code to stick printk()s in.
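
 As a rough illustration of the tracing route, here is a minimal Python
sketch that registers kretprobe events through the tracefs interface so
that each probed function's return value shows up in the trace. The
tracefs path, the 'ret_' event naming, and the helper function names are
my own assumptions for illustration. The symbols you pass in must be
checked against /proc/kallsyms on your build, since static functions
that the compiler inlined cannot be probed.

```python
import os

# Common tracefs mount point (an assumption; some systems expose it at
# /sys/kernel/debug/tracing instead).
TRACEFS = "/sys/kernel/tracing"


def kretprobe_definitions(symbols):
    """Build one kretprobe definition line per kernel symbol.

    Each line follows the kprobe_events syntax 'r:EVENT SYMBOL NAME=$retval',
    which records the probed function's return value under the name 'ret'.
    """
    return ["r:ret_{0} {0} ret=$retval".format(sym) for sym in symbols]


def install_kretprobes(symbols, tracefs=TRACEFS):
    """Register the probes and enable them (needs root on a real system)."""
    events_file = os.path.join(tracefs, "kprobe_events")
    for line in kretprobe_definitions(symbols):
        # Open in append mode so existing probes are left alone, and write
        # one definition per open so a bad symbol only rejects its own line.
        with open(events_file, "a") as f:
            f.write(line + "\n")
    # Enable every event in the default 'kprobes' event group.
    with open(os.path.join(tracefs, "events", "kprobes", "enable"), "w") as f:
        f.write("1\n")
```

 Run as root, something like install_kretprobes(["spa_load",
"spa_load_impl"]) followed by the failing 'zpool import' should leave
one line per function return in the tracefs 'trace' file; a non-zero
ret value points at the step that failed.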

 This split of spa_load_impl is the precursor to a second change that
will add better error reporting to 'zpool import' itself, but that
change is not yet ported to ZFS on Linux. However, there is an open
pull request with an initial integration of this code:
	https://github.com/zfsonlinux/zfs/pull/7459

A sufficiently daring person could build a version of ZFS on Linux
with this pull request integrated and then give it a try. Obviously
you want to do this on a scratch system, but if this is a test system
for a storage driver, it may qualify.

	- cks
[*: see e.g. https://jvns.ca/blog/2017/03/19/getting-started-with-ftrace/ ]

