[zfs-discuss] Re: ZFS memory usage and long reboots on OpenIndiana

richard.elling at richardelling.com
Wed Jan 8 17:10:42 EST 2014


Hi Liam,


On Wednesday, January 8, 2014 12:29:51 PM UTC-8, Liam wrote:
>
>
> The application is GlusterFS - there is no NFS sharing.  The rest of your 
> answers are inline...
>
> On Wed, Jan 8, 2014 at 10:19 AM, <richard... at richardelling.com> wrote:
>
>> On Tuesday, January 7, 2014 3:35:48 PM UTC-8, Liam wrote:
>>
>>> We're having some problems with ZFS on OpenIndiana 151a8.
>>>
>>> Our first problem is that after a heavy couple of days of writing (a 
>>> steady 50 MB/sec), we eat up all the memory on the server to the point 
>>> that it starts swapping and goes into a death spiral.
>>>
>>> I graph arc statistics and I see the following happen:
>>>
>>> arc_data_size decreases
>>> arc_other_size increases
>>> meta_size exceeds the meta_limit
>>>
>>
>> The meta_limit is not a limit on the metadata size. IMHO, it is misnamed, 
>> but
>> it also isn't likely to be your problem.
>>
>
> ok
>  
>
>>  
>>
>>>
>>> At some point the available free memory starts to decrease until the ARC 
>>> starts eating into swap.
>>>
>>
>> No. ARC never uses swap. If the ARC gets too big, then other apps can 
>> swap, but the ARC 
>> will also shrink. You can try reducing zfs_arc_max to allow more free RAM 
>> for your app.
>>
>
> In our case, the arc_meta_used value increases past the arc_meta_limit and 
> then never goes back down.  Without a reboot, arc_meta_used keeps growing 
> until the system starts to swap (it could be the system or the ARC, but 
> either way the box crashes).  Our arc_meta_limit is set to 1/2 of the total 
> memory on the system.  (See values below.)
>

You can safely ignore the arc_meta_limit tunable; it does not limit the ARC's 
metadata size. The big knob to turn is zfs_arc_max; reduce it to a level that 
leaves enough RAM for your app, plus some margin.
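
For example (the numbers here are purely illustrative; pick whatever leaves 
GlusterFS the headroom it needs), on a 64 GB box you could cap the ARC at 
roughly half of RAM in /etc/system and reboot:

set zfs:zfs_arc_max=34359738368

You can then watch the ARC size against the new cap with the arcstats kstat, 
something like:

kstat -p zfs:0:arcstats:size zfs:0:arcstats:c_max

arc_meta_used and arc_meta_limit live in the same kstat, if you want to keep 
graphing those as well.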
 

>
>  
>
>>  
>>
>>>
>>> Now a reboot fixes this, but a reboot takes *hours* to run.  After the 
>>> ARC fills, a reboot takes 5-6 hours to remount the ZFS datastore.  During 
>>> this time all the disks are flashing and the server is waiting.
>>>
>>
>> There is currently no known bug to cover this case. It is possible if you 
>> had destroyed a 
>> dataset and the zpool version does not have the async destroy feature 
>> enabled. But if
>> async destroy is enabled, then something else is going on.
>>
>
> We don't use snapshots and haven't destroyed any datasets.  The ZFS setup 
> is extremely simple.
>
> data = main zpool
> data/store = all our data goes in here
>
> We're not using dedup, compression, or anything out of the ordinary.  All 
> the rest of the zfs settings are default.
>

ok, that makes the long reboot harder to explain. OTOH, if it is a transient
condition, you might not see it again. If you do see it again, ping me 
directly.
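
If you want to double-check the async destroy point above, something along 
these lines should show the feature state (assuming the pool has been 
upgraded to feature flags; older pool versions won't list it):

zpool get feature@async_destroy data

It reports disabled, enabled, or active; since you haven't destroyed 
anything, it should simply read enabled on a feature-flags pool.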
 -- richard
 

>  
>
>>  
>>
>>>  
>>> I have no idea what it's doing during a reboot, as SSH hasn't started 
>>> yet.  It looks like it is doing an fsck...just thrashing the disk.
>>>
>>> Anybody have any idea what could possibly be going on?
>>>
>>
>> Have you changed any other tunables?
>>
>
> I have not changed anything else.  Everything else is default.
>
> thanks!
> liam
>
>  
>
>>  -- richard
>>  
>>
>>>
>>> System:
>>>
>>> OpenIndiana 151a8
>>> Dell R720
>>> 64 GB RAM
>>> LSI 9207-8e SAS controller
>>> Dell MD1220 JBOD w/ 4 TB SAS drives
>>> Gluster 3.3.2
>>>
>>> set zfs:zfs_arc_max=51539607552
>>> set zfs:zfs_arc_meta_limit=34359738368
>>> set zfs:zfs_prefetch_disable=1
>>>
>>> thanks!
>>> liam
>>>
>>
>
>


