[zfs-discuss] kswapd[0-5] at 100% CPU, Load average spiked, NFS clients timing out on "ls"

Tren Blackburn iam at theendoftime.net
Fri Aug 2 14:44:37 EDT 2013


Here's a shot in the dark: do you have transparent hugepage support
enabled? I've had issues with it causing memory thrashing in the
past, and I usually just disable it outright.

Try adding "transparent_hugepage=never" to your kernel boot line and see if
that helps at all.
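If you want to check before rebooting, THP can usually be inspected and toggled at runtime via sysfs. The paths below are the common mainline-kernel locations; note that RHEL 6 kernels may use /sys/kernel/mm/redhat_transparent_hugepage instead, so adjust accordingly:

```shell
# Show the current THP setting; the active value appears in [brackets]
cat /sys/kernel/mm/transparent_hugepage/enabled

# Disable THP (and its defrag behavior) at runtime, no reboot needed
echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag

# To make it permanent, add to the kernel boot line in grub.conf:
#   transparent_hugepage=never
```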


On Fri, Aug 2, 2013 at 9:43 AM, Sander Klein <roedie at roedie.nl> wrote:

> AFAIK if you don't set zfs_arc_max it will never grow beyond 96GB of
> memory. Not counting memory usage due to fragmentation. Or am I wrong about
> that?
>
> Reducing it on a machine with 512GB of memory doesn't seem right to me.
>
> Are there other services running on this machine?
>
> Do you have a zil?
>
> Do you have any ssd's for caching?
>
> Greets,
>
> Sander
>
> On 2 aug. 2013, at 18:19, Gordan Bobic <gordan.bobic at gmail.com> wrote:
>
> Have you tried reducing zfs_arc_max (module option) and increasing
> vm.min_free_kbytes (sysctl)?
>
>
> On Fri, Aug 2, 2013 at 5:09 PM, <cshamis at gmail.com> wrote:
>
>> All:
>>
>> Every 48-96 hours or so, my system enters a "bogged-down" state.
>>  The load average shoots up (normally around 5.00, it spikes to 140 or
>> higher), logins take forever, NFS clients slow down, and everything just
>> grinds to a halt.  Processes start getting "D" status in top, and then you
>> know you're in trouble...
>>
>> Reboot fixes it; but I can't reboot this system regularly since it's
>> 24x7.
>>
>> I'm about to put "echo 3 > /proc/sys/vm/drop_caches" in a cron job as a
>> band-aid, because that's the only thing that works at getting the load
>> average down.  Note, the load average comes down, but the kswapd processes
>> still stay pegged at 100% CPU, and sometimes processes stay stuck in "D"
>> status in top.  So it's not perfect...
>>
>> My machine is a beast: an HP585 with 64 processor cores and 512 GB of
>> RAM, no swap, running RHEL 6.3 and ZFS master from GitHub, dated
>> 6/24/2013.
>>
>> Our system does heavy file I/O: maybe 2000 or so 6GB files per day, WORM.
>>  Files older than 30 days are deleted, so the ZFS is on a 50TB high-churn
>> pool. dedupe=off (never been enabled, ever), compression=lz4.
>>
>> Are there any tuning parameters I can try?  Can I turn off kswapd?  Add
>> RAM?  Take away RAM?  This problem is killing me.
>>
>> v/r
>> -C.
>>
>
>
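As a concrete sketch of the tuning Gordan suggests above (the specific values here are illustrative assumptions on my part, not tested recommendations from this thread):

```shell
# Cap the ARC at, e.g., 256GB (half of this machine's RAM) persistently;
# 274877906944 = 256 * 1024^3 bytes
echo "options zfs zfs_arc_max=274877906944" >> /etc/modprobe.d/zfs.conf

# Or change it live, without reloading the module:
echo 274877906944 > /sys/module/zfs/parameters/zfs_arc_max

# Raise min_free_kbytes so the kernel keeps a larger emergency reserve
# (e.g. 1GB here; the autotuned default is far lower even on big boxes)
sysctl -w vm.min_free_kbytes=1048576
echo "vm.min_free_kbytes = 1048576" >> /etc/sysctl.conf
```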

