[zfs-discuss] Help with debugging pool import problem

Raghuram Devarakonda draghuram at gmail.com
Mon Apr 23 17:15:51 EDT 2018


Thanks Chris. I will build the PR and see whether it helps.

Thanks,
Raghu

On Mon, Apr 23, 2018 at 2:00 PM, Chris Siebenmann <cks at cs.toronto.edu> wrote:
>> I am looking for some help in debugging a problem with pool import and
>> any suggestions are highly appreciated.
>>
>> First off, this is what I see when I try to import:
> [...]
>> An important point here is that the device "sdb" is a block device
>> backed by my own driver. [...]
>>
>> I completely understand that the real problem could be in my code but
>> I am looking for suggestions on how to debug pool import issues. If I
>> can get a more specific error (something beyond "corrupted data"), that
>> would help me in narrowing down the root cause.
>
>  At the moment there are no better diagnostics that are readily
> available from the user level code, as you've noticed. This is
> very annoying in a number of situations, but the good news is
> that improvements are coming.
>
>  First, I believe that this error ultimately emerges from the kernel,
> but you should check the user-level code involved in 'zpool import'
> to be sure. Modifying the user-level code to spit out additional
> information (or simply running it under gdb with appropriate breakpoints
> set) will likely be a lot easier than the alternatives.
>
>  As far as the kernel code goes, a change to split the previously
> monolithic kernel code for importing a ZFS pool into multiple functions
> was recently added to the development version of ZFS on Linux (this
> is the commit for 'OpenZFS 7638 - Refactor spa_load_impl into several
> functions'). If you can build the development ZFS on Linux and then
> instrument which of these new functions are called and succeed (versus
> fail), you can probably gather much more information about the specific
> failing operation. Possible avenues for instrumentation are various
> passive kernel function tracing methods like ftrace[*], or simply hand
> modifying the code to stick printk()s in.
>
>  This split of spa_load_impl is the precursor to a second change that
> will add better error reporting to 'zpool import' itself, but that
> change is not yet ported to ZFS on Linux. However, there is an open
> pull request with an initial integration of this code:
>         https://github.com/zfsonlinux/zfs/pull/7459
>
> A sufficiently daring person could build a version of ZFS on Linux
> with this pull request integrated and then give it a try. Obviously
> you want to do this on a scratch system, but if this is a test system
> for a storage driver, it may qualify.
>
>         - cks
> [*: see e.g. https://jvns.ca/blog/2017/03/19/getting-started-with-ftrace/ ]
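
For the kernel tracing route suggested above, here is a rough sketch of
one way to see which pool-load functions return an error, using
kretprobes through tracefs rather than printk()s. Everything specific in
it is an assumption and not something from this thread: the tracefs
mount point, the function-name prefixes (check what your build actually
exports), and the "zfsimp" event group name, which is made up for
illustration. It needs root and a kernel with kprobe events enabled.

```python
#!/usr/bin/env python3
"""Sketch: install kretprobes on ZFS pool-load functions via tracefs."""
import sys

# Assumption: tracefs is mounted here (older setups use
# /sys/kernel/debug/tracing instead).
TRACEFS = "/sys/kernel/tracing"
# Assumption: prefixes of the functions of interest; adjust to the names
# present in the zfs module you built.
PREFIXES = ("spa_load", "spa_ld_")
# Hypothetical event group name, chosen only for this example.
GROUP = "zfsimp"


def kretprobe_defs(available, prefixes=PREFIXES):
    """Build kretprobe definitions from available_filter_functions text.

    Module functions are listed there as "name [module]"; only functions
    from the zfs module whose names start with one of the prefixes are
    kept.
    """
    defs = []
    for line in available.splitlines():
        fields = line.split()
        if len(fields) < 2 or fields[1] != "[zfs]":
            continue
        name = fields[0]
        if name.startswith(prefixes):
            # Event names cannot contain dots (e.g. "foo.isra.0").
            event = name.replace(".", "_")
            # s32 prints the int return value as a signed number.
            defs.append(f"r:{GROUP}/{event} zfs:{name} ret=$retval:s32")
    return defs


def install(tracefs=TRACEFS):
    """Register the probes and enable them. Requires root."""
    with open(f"{tracefs}/available_filter_functions") as f:
        defs = kretprobe_defs(f.read())
    if not defs:
        return defs
    for d in defs:
        # One definition per write, appended to the existing probes.
        with open(f"{tracefs}/kprobe_events", "a") as f:
            f.write(d + "\n")
    with open(f"{tracefs}/events/{GROUP}/enable", "w") as f:
        f.write("1\n")
    return defs


if __name__ == "__main__" and "--install" in sys.argv:
    for d in install():
        print("installed:", d)
```

After running it with --install, retry the 'zpool import' and read the
"trace" file under the tracefs directory. Each probed function that
returned should show up with a ret= value, and a nonzero one is a
reasonable place to start looking. Functions that the compiler inlined
will not appear in available_filter_functions and so will not be probed.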