[zfs-discuss] Unlistable files

Andreas Dilger adilger at dilger.ca
Thu Apr 5 17:58:32 EDT 2018


On Apr 5, 2018, at 2:34 PM, Vladimir Brik <vladimir.brik at icecube.wisc.edu> wrote:
> 
> Hello.
> 
> I have run into a strange issue where files don't show up in directory
> listing but can be accessed by path directly. I wonder if somebody knows
> what might have caused this.
> 
> # find dst/a/foo
> dst/a/foo
> (as expected)
> 
> # find dst/a/ -name foo
> (no output)
> 
> # ls -l dst/a/foo
> -rw-r--r-- 1 xxx xxx 5991051 Feb 22 13:35 dst/a/foo
> (as expected)
> 
> # ls -l dst/a/ | grep foo
> (no output)
> 
> # cp dst/a/foo bar
> (this works; bar is created and can be listed)

There are a few potential issues that might cause this.  One is that if
getdents() returns an entry from the kernel with d_ino == 0, "ls" and
other directory-walking tools will skip that entry as "deleted" for
historical reasons.
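
As a rough illustration of the d_ino == 0 case, here is a minimal Python
sketch that decodes a raw getdents64() buffer and reports the names whose
inode number is zero.  It assumes the Linux struct linux_dirent64 layout;
parse_dirents() and zero_inode_names() are hypothetical helper names, and
obtaining the buffer itself (e.g. from a raw syscall) is left out.

```python
import struct

# Fixed-size header of struct linux_dirent64 (assumed layout):
# d_ino (u64), d_off (s64), d_reclen (u16), d_type (u8).
# The NUL-terminated name follows the header inside each record.
_HEADER = struct.Struct("=QqHB")


def parse_dirents(buf):
    """Return a list of (d_ino, name) pairs from a raw getdents64() buffer."""
    entries = []
    pos = 0
    while pos + _HEADER.size <= len(buf):
        d_ino, _d_off, d_reclen, _d_type = _HEADER.unpack_from(buf, pos)
        if d_reclen < _HEADER.size or pos + d_reclen > len(buf):
            break  # malformed or truncated record, stop parsing
        raw_name = buf[pos + _HEADER.size:pos + d_reclen]
        entries.append((d_ino, raw_name.split(b"\0", 1)[0]))
        pos += d_reclen
    return entries


def zero_inode_names(buf):
    """Names the kernel returned with d_ino == 0 (hypothetical helper)."""
    return [name for d_ino, name in parse_dirents(buf) if d_ino == 0]
```

If a name such as "foo" shows up in zero_inode_names(), the kernel did
return the entry and it is the userspace side that is dropping it.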

It might also be that "ls" and ZFS directory iteration do not play well
together, skipping some entries in the directory (e.g. hash collisions,
or if telldir() and seekdir() do not work properly).  If your problem
only happens on large directories then this is a possibility.
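
To get a list of affected names without reading "ls" output by eye, a
small Python sketch like the following could compare what the directory
listing returns against what lstat() can reach by path.  It is only a
sketch: find_unlisted() is a hypothetical helper, and it assumes you have
a list of names that should exist (for example, taken from the source
directory of the copy).

```python
import os


def find_unlisted(directory, candidates, lister=os.listdir):
    """Return candidate names that lstat() finds but the listing omits."""
    listed = set(lister(directory))
    unlisted = []
    for name in candidates:
        if name in listed:
            continue  # shows up in the listing, nothing wrong here
        try:
            os.lstat(os.path.join(directory, name))
        except FileNotFoundError:
            continue  # genuinely absent, not merely unlisted
        unlisted.append(name)  # reachable by path, missing from listing
    return unlisted
```

Something like find_unlisted("dst/a", os.listdir("src/a")) would then
list the files that are reachable by path but missing from the listing
("src/a" is assumed here by analogy with "dst/a").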

Run your "ls -l dst/a/" under strace and/or ltrace to see if these
entries are being returned from the kernel, but not printed by "ls",
or if they are not being returned by the kernel at all.  Something like:

    strace -f -e trace=open,getdents,lstat -v -y ls -l dst/a/

The exact system calls used for getdents() and lstat() may depend on
your kernel and userspace libraries (for example, newer systems may use
openat() and getdents64() instead).  Note that the "-e trace=" filter
suppresses all of the other system calls, which makes the output more
readable.

Another possibility is a bug in the ZFS ZAP processing code, which does
not iterate over the entries properly and doesn't return the names to
userspace via getdents() at all.  You could use "zdb" to dump the
directory to confirm the entry is there (it pretty much *HAS* to be, if
the "dst/a/foo" lookup works).  At that point, running with tracepoints,
or adding printk() debug messages and rebuilding the zfs.ko module, would
help debug where the problem is.

Cheers, Andreas

> The problem occurs when I run something like "cp -r src dst", where src
> is a directory with 12 sub-directories with 6999 files each, about 84K
> files total, 2.9TB. After copy finishes, dst is missing several thousand
> files according to find. (Similar thing happened when I tarred src and
> then unpacked it in a different location; according to tar --list the
> tarball contained all files.)
> 
> The cp command reported "No space left on device" for a couple of files.
> The filesystem has about 80TB free (zpool is about 50% full). The files
> for which "No space left on device" error was generated just weren't
> created, it seems, but other missing files are accessible by their full
> path but did not show up in directory listings (as shown above).
> 
> ls is reporting some sub-directories of dst have 7000 hard links instead
> of 7001 that the sub-directories in src have. All missing files seem to
> be from such sub-directories.
> 
> After rebooting the server, the missing files were no longer accessible
> by full path.
> 
> It seems the problem is reproducible. Missing files are not always the same.
> 
> I am running ZFS 0.7.7, Scientific Linux release 6.8. No ZFS snapshots.
> 
> If anybody can shed light on this, I would really appreciate it :)
> 
> 
> Vlad
