zpool scrub speed

devsk devsku at gmail.com
Sat May 7 20:16:29 EDT 2011


Jean,

I am not using MD or DM RAID variants. I am using the deadline IO
scheduler, and I have configured both CGROUPS and AUTOGROUP (a recent
2.6.38.5 feature which groups processes based on the session ID
created by the setsid call). PREEMPT_NONE is also set in the kernel
config.
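
In case it helps with reproducing the setup, this is roughly how it is
configured here (sda and the /boot/config path are only examples, so
adjust for your own system):

  # check and set the IO scheduler per block device
  cat /sys/block/sda/queue/scheduler
  echo deadline > /sys/block/sda/queue/scheduler

  # confirm autogroup is enabled (available since 2.6.38)
  sysctl kernel.sched_autogroup_enabled

  # confirm the preemption model the kernel was built with
  grep CONFIG_PREEMPT_NONE /boot/config-$(uname -r)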

So, I have all the right ingredients for ZFS on Linux to function
properly. I have a feeling this is something deep down in memory
management (some sort of deadlock), but that's pure speculation,
because when this happened I could not get any useful debug
information out of the system: it was a sudden hard lock.
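
For reference, the ARC cap discussed further down is set through the
zfs module parameters; roughly something like this (the modprobe.d
path is the usual convention and the 2G value is just what I settled
on, so treat it as a sketch):

  # /etc/modprobe.d/zfs.conf
  options zfs zfs_arc_max=2147483648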

-devsk


On May 7, 3:59 pm, Jean-Michel Bruenn <jean.bru... at ip-minds.de> wrote:
> Hello (again),
>
> if devsk got the same issue as I did, this is easily reproducible. Launch
> X11, start your ZFS stuff, mount it, and now run bonnie++ on it with
> some higher values.
>
> e.g. bonnie++ -d /mountedzfsdir -n 512
>
> As soon as it's at "reading with getc" the box starts to show high wait
> in top, while swap is not in use, RAM is not in use, and the CPU shows 99%
> idle. Hardware used: quad-core 3.1 GHz, 4 GB RAM, 8 GB swap, 4x 500 GB SATA
> with 32 MB cache each.
>
> Tried with RAID5, RAID10, and ZFS - every time, the same result.
>
> Tried with different hardware: dual-core 2 GHz, 3 GB RAM, same number of
> disks - same result.
>
> As soon as it's writing (deleting directories and such things) I can't
> move the mouse properly, everything hangs, and at some points I have to
> wait about a minute before I can continue doing anything (and again, no
> swap in use, no RAM in use, it's all free, not even cached).
>
> So... this should be easily reproducible. Switching from preempt to
> non-preempt (which is now the default) helps a bit. Switching from CFQ
> to deadline helps. According to a kernel dev this is a known bug, and
> the proper solution is using cgroups (as I've explained in the other
> mail). This is only reproducible under HIGH I/O - that's why bonnie or
> "scrub" is good for reproducing it.
>
> However, this was the case for "my" problems - switching to a box
> without X11 solved it; ZFS is totally stable and nothing hangs. It would
> be cool if you could verify that you have the "same" issue (because
> then it's not a ZFS issue - it's a kernel bug).
>
> Jean
>
> On Sat, 7 May 2011 15:23:33 -0700 (PDT) devsk
> <dev... at gmail.com> wrote:
> > It's not reproducible so far (as is typical with intermittent issues),
> > and I have run scrub multiple times before.
>
> > I reduced the ARC to 1.5GB but not to 128M. Previously, a very low ARC
> > WAS one of the reasons for lock-ups, so I have never tried that again.
> > And Brian had put in fixes for actually respecting the ARC limit in
> > rc4, so I kept it at 2GB because I have 12GB in this box.
>
> > -devsk
>
> > On May 7, 3:15 pm, "Fajar A. Nugraha" <l... at fajar.net> wrote:
> > > On Sun, May 8, 2011 at 5:09 AM, devsk <dev... at gmail.com> wrote:
> > > > So, it looks like scrub speed is also determined by how much memory
> > > > is fragmented, i.e. on a freshly booted box things are generally
> > > > faster, and as time goes on and zfs allocates and arc_reclaim
> > > > repeatedly reclaims memory, things begin to slow down.
>
> > > Is the lockup reproducible? Can you repeat the process, but this time
> > > setting zfs_arc_max and zfs_arc_min to the same value (something like
> > > 128M, or whatever sane value suits your memory size) through the
> > > module parameters?
>
> > > --
> > > Fajar
>
> --
> Jean-Michel Bruenn <jean.bru... at ip-minds.de>


