Hung processes during copy operations

Ulrich Petri u.petri at gmail.com
Mon May 23 07:33:04 EDT 2011


Hi Brian,

On May 16, 11:26 pm, Brian Behlendorf <behlendo... at llnl.gov> wrote:
> Have you tried the latest source from the zfs master branch?  I recently
> committed a fix which might address this issue.  It's hard to be certain
> from the stacks included in your email, but it's certainly possible.
> Commit 21ade34 fixed issue #232, which was very similar to the problem
> you're describing.

I updated spl to 372c2572336 and zfs to d9bfe0f57, and at first it
seemed that this solved the problems. There were still some long
periods where rsync sat in state D (uninterruptible sleep), but it
eventually came out of them.

Unfortunately this weekend the system hung again.

This time I had a script running that logged the memory usage every 10
seconds (I thought maybe that was a clue to what is going on) and the
memory usage definitely looks unusual to me.
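
For anyone who wants to collect comparable data, a logger along these
lines would do. This is only a minimal sketch, not the exact script
that produced the graph: the function names, the output format and the
choice of the MemTotal/MemFree/Cached fields are assumptions made for
illustration.

```python
import time


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return values


def log_memory(out_path, interval=10, samples=None,
               meminfo_path="/proc/meminfo"):
    """Append one 'timestamp total free cached' line per interval.

    samples=None keeps logging until the process is interrupted.
    """
    count = 0
    with open(out_path, "a") as out:
        while samples is None or count < samples:
            with open(meminfo_path) as f:
                info = parse_meminfo(f.read())
            out.write("%d %d %d %d\n" % (
                time.time(),
                info.get("MemTotal", 0),
                info.get("MemFree", 0),
                info.get("Cached", 0),
            ))
            # Flush after every sample so buffered lines are not lost
            # if the logger itself gets stuck or killed.
            out.flush()
            count += 1
            if samples is None or count < samples:
                time.sleep(interval)
```

Calling log_memory("/var/tmp/mem.log") would then sample every 10
seconds until interrupted; the log path here is just an example, and
it should not live on the ZFS pool under test.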

You can see the graph here [1] (takes a while to load).

There is nothing running on this machine besides ZFS, rsync copying
data over and the usual system daemons present on a freshly installed
Ubuntu system.

When the system was in the hung state, everything that tried to
interact with the zfs volumes (e.g. zpool / zfs commands and updatedb)
also hung.

Here is the complete dmesg output [2]. The hung-task reporting
eventually stops because it reached the maximum number of reports.

Also look at the output of "ps fax" [3], taken shortly before I
rebooted the machine. Notice that the txg_sync process is also in
state D.
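
To pull out just the blocked processes rather than reading through the
whole process list, something like the following sketch could be used.
It is not how the listing in [3] was produced; the function names are
made up for illustration, and it assumes the usual Linux
/proc/<pid>/stat layout of "pid (comm) state ...".

```python
import os


def parse_stat_line(line):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or
    # parentheses, so split on the first '(' and the last ')'.
    left, _, right = line.rpartition(")")
    pid_str, _, comm = left.partition("(")
    state = right.split()[0]
    return int(pid_str), comm, state


def find_blocked(proc_root="/proc"):
    """List (pid, comm) for every process currently in state D."""
    blocked = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat_line(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited or entry was unreadable
        if state == "D":
            blocked.append((pid, comm))
    return sorted(blocked)
```

Running find_blocked() a few times some seconds apart shows whether
the same processes stay stuck in D or are merely waiting briefly on
I/O.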

I just saw another thread on the list [4] that seems to describe the
same (or at least very similar) problem. In both cases many small
files seem to be involved.

Bye
Ulrich

[1] http://ulo.pe/misc/zfs_rsync_mem.html
[2] http://cl.ly/3F2M0v3h3t3S422E0H3S
[3] http://cl.ly/2J2d3W1G0k003y092t1S
[4] http://groups.google.com/a/zfsonlinux.org/group/zfs-discuss/browse_thread/thread/489202a36d78c3b1#


