[zfs-discuss] Unlistable files

Vladimir Brik vladimir.brik at icecube.wisc.edu
Fri Apr 6 12:47:02 EDT 2018


A new symptom I noticed is that I am no longer able to access the
un-listable files by path directly if I run "echo 3 >
/proc/sys/vm/drop_caches".

I ran the strace command and I don't see getdents returning any entries
with d_ino == 0. The names of the missing files do not appear in the
output of strace at all, so it looks like the kernel does not return them.

I am not sure if the problem only happens with big directories. The
machine this is happening on is a file server, and there seem to be more
"file not found" errors than usual in the logs, but I can't tell if that
is caused by the same issue, or if the clients were simply trying to
open files that were never uploaded in the first place.
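One way to tell those two cases apart for the copied tree, as a minimal sketch: compare the names in a source directory against both the listing of the destination and a direct lookup by path. The function name classify_entries and its arguments are made up for illustration, and it assumes the source directory still holds the complete set of names and that no name contains a newline.

```shell
#!/bin/sh
# Hypothetical helper: classify every name in SRC by how it appears in DST.
#   unlistable: reachable by full path in DST, but absent from DST's listing
#   missing:    not reachable by full path in DST at all
# Assumes SRC holds the complete set of names and names contain no newlines.
classify_entries() {
    src=$1
    dst=$2
    listing=$(mktemp) || return 1

    # Read the DST directory once; -A keeps dotfiles but drops . and ..
    ls -A "$dst" > "$listing"

    ls -A "$src" | while IFS= read -r name; do
        if grep -qxF -e "$name" "$listing"; then
            continue    # shows up in the listing, nothing to report
        elif [ -e "$dst/$name" ] || [ -L "$dst/$name" ]; then
            printf 'unlistable\t%s\n' "$name"
        else
            printf 'missing\t%s\n' "$name"
        fi
    done

    rm -f "$listing"
}
```

Called as "classify_entries src/a dst/a", it prints one tab-separated line per problem file. Since the un-listable files stop being reachable by path after drop_caches, the split between the two categories may change depending on whether the caches were flushed before the run.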

> You could use "zdb" to dump the
> directory to confirm the entry is there
How do I do this?


Thanks,

Vlad


On 04/05/2018 04:58 PM, Andreas Dilger wrote:
> On Apr 5, 2018, at 2:34 PM, Vladimir Brik <vladimir.brik at icecube.wisc.edu> wrote:
>>
>> Hello.
>>
>> I have run into a strange issue where files don't show up in directory
>> listing but can be accessed by path directly. I wonder if somebody knows
>> what might have caused this.
>>
>> # find dst/a/foo
>> dst/a/foo
>> (as expected)
>>
>> # find dst/a/ -name foo
>> (no output)
>>
>> # ls -l dst/a/foo
>> -rw-r--r-- 1 xxx xxx 5991051 Feb 22 13:35 dst/a/foo
>> (as expected)
>>
>> # ls -l dst/a/ | grep foo
>> (no output)
>>
>> # cp dst/a/foo bar
>> (this works; bar is created and can be listed)
> 
> There are a few potential issues that might cause this.  One is if
> getdents() returns from the kernel with d_ino == 0, then "ls" and
> other directory walking tools will skip the entry as "deleted" for
> historical reasons.
> 
> It might also be that "ls" and ZFS directory iteration do not play well
> together, skipping some entries in the directory (e.g. hash collisions,
> or if telldir() and seekdir() do not work properly).  If your problem
> only happens on large directories then this is a possibility.
> 
> Run your "ls -l dst/a/" under strace and/or ltrace to see if these
> entries are being returned from the kernel, but not printed by "ls",
> or if they are not being returned by the kernel at all.  Something like:
> 
>     strace -f -e trace=open,getdents,lstat -v -y ls -l dst/a/
> 
> The exact system calls for getdents() and lstat() may depend on your
> kernel and userspace libraries.  Note that this will suppress all of
> the other system calls, but makes the output more readable.
> 
> Another possibility is a bug in the ZFS ZAP processing code, which does
> not iterate over the entries properly, and doesn't return the names to
> userspace via getdents() at all.  You could use "zdb" to dump the
> directory to confirm the entry is there (it pretty much *HAS* to be, if
> the "dst/a/foo" lookup works).  At that point, running with tracepoints,
> or adding printk() debug messages and rebuilding the zfs.ko module would
> help debug where the problem is.
> 
> Cheers, Andreas
> 
>> The problem occurs when I run something like "cp -r src dst", where src
>> is a directory with 12 sub-directories with 6999 files each, about 84K
>> files total, 2.9TB. After the copy finishes, dst is missing several thousand
>> files according to find. (Similar thing happened when I tarred src and
>> then unpacked it in a different location; according to tar --list the
>> tarball contained all files.)
>>
>> The cp command reported "No space left on device" for a couple of files.
>> The filesystem has about 80TB free (zpool is about 50% full). The files
>> for which "No space left on device" error was generated just weren't
>> created, it seems, but other missing files are accessible by their full
>> path but did not show up in directory listings (as shown above).
>>
>> ls is reporting that some sub-directories of dst have 7000 hard links
>> instead of the 7001 that the sub-directories in src have. All missing
>> files seem to be from such sub-directories.
>>
>> After rebooting the server, the missing files were no longer accessible by
>> full path.
>>
>> It seems the problem is reproducible. Missing files are not always the same.
>>
>> I am running ZFS 0.7.7, Scientific Linux release 6.8. No ZFS snapshots.
>>
>> If anybody can shed light on this, I would really appreciate it :)
>>
>>
>> Vlad

