[zfs-discuss] arc size vs memory usage

Brian Behlendorf behlendorf1 at llnl.gov
Thu Feb 2 18:22:09 EST 2012

On Wed, 2012-01-25 at 23:45 -0800, Niels de Carpentier wrote:
> > Maybe the definitive solution is to replace the ARC's
> > slab-based/slub-based memory allocation mechanism, with a different
> > non-slab non-slub memory allocation and organization backing.
> Probably at least some tighter integration with the kernel memory
> management. However, it should be possible to make some smaller changes
> that prevent most issues.

Yes, that's exactly the long term plan.  The spl's slab was always
intended as an interim solution to handle the memory management demands
of the existing zfs code.

Zfs relies heavily on the ability to do fast, large allocations of
contiguous memory.  In many environments this isn't an issue, but by
design the Linux kernel does not handle this case well.  Rather than
rewrite large chunks of the zfs code during the initial port, I wrote
the slab, which provides respectable performance for large virtual memory
allocations.

Going forward I plan to investigate reworking zfs to rely on Linux
friendly scatter-gather lists of pages rather than large chunks of
virtual memory.  Some of the expected benefits are:

* Reduced need for the spl vmem based slab.  Eventually, we may be
  able to rely entirely on the existing Linux memory allocators.

* Scatter-gather based arc buffers enable the possibility of mapping
  the ARC directly into the Linux page cache.  Most notably, that
  will remove the need to double cache for mmap()'ed I/O, and zfs
  caching will be correctly accounted for in the usual Linux places.

* Improved 32-bit Linux support due to reduced usage of the virtual
  memory address space.

* Improved performance.

* Reduced memory usage due to less overhead and fragmentation.
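
To make the scatter-gather idea concrete, here is a rough user-space
sketch of a buffer built from individually allocated pages instead of
one contiguous region.  It is only an illustration under stated
assumptions: the names (sg_buf, sg_alloc, sg_copy, sg_free) and the
4096-byte page size are hypothetical and are not taken from the spl or
zfs code, and calloc() stands in for a kernel page allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define SG_PAGE_SIZE 4096u /* assumed page size for this sketch */

/* Hypothetical buffer backed by separately allocated pages. */
struct sg_buf {
    size_t size;           /* logical length in bytes */
    size_t npages;         /* number of entries in pages[] */
    unsigned char **pages; /* each entry is one SG_PAGE_SIZE chunk */
};

static void sg_free(struct sg_buf *b)
{
    if (b == NULL)
        return;
    if (b->pages != NULL) {
        for (size_t i = 0; i < b->npages; i++)
            free(b->pages[i]); /* free(NULL) is a no-op */
        free(b->pages);
    }
    free(b);
}

/* No single allocation is larger than one page (plus the page table). */
static struct sg_buf *sg_alloc(size_t size)
{
    struct sg_buf *b = calloc(1, sizeof(*b));
    if (b == NULL)
        return NULL;
    b->size = size;
    b->npages = (size + SG_PAGE_SIZE - 1) / SG_PAGE_SIZE;
    b->pages = calloc(b->npages ? b->npages : 1, sizeof(*b->pages));
    if (b->pages == NULL) {
        sg_free(b);
        return NULL;
    }
    for (size_t i = 0; i < b->npages; i++) {
        b->pages[i] = calloc(1, SG_PAGE_SIZE);
        if (b->pages[i] == NULL) {
            sg_free(b);
            return NULL;
        }
    }
    return b;
}

/*
 * Copy len bytes between a flat buffer and the page list, starting at
 * logical offset off.  to_sg != 0 copies flat -> pages, otherwise
 * pages -> flat.  Returns 0 on success, -1 if the range is out of bounds.
 */
static int sg_copy(struct sg_buf *b, size_t off, void *flat, size_t len,
                   int to_sg)
{
    unsigned char *p = flat;

    assert(b != NULL && flat != NULL);
    if (off > b->size || len > b->size - off)
        return -1;
    while (len > 0) {
        size_t idx = off / SG_PAGE_SIZE;     /* which page */
        size_t in_page = off % SG_PAGE_SIZE; /* offset inside that page */
        size_t chunk = SG_PAGE_SIZE - in_page;
        if (chunk > len)
            chunk = len;
        if (to_sg)
            memcpy(b->pages[idx] + in_page, p, chunk);
        else
            memcpy(p, b->pages[idx] + in_page, chunk);
        off += chunk;
        p += chunk;
        len -= chunk;
    }
    return 0;
}
```

The cost of this layout is that every access has to walk page
boundaries, which is why code written against flat buffers would need
reworking rather than a drop-in replacement.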

Since this is going to be a substantial change, it's not going to happen
until after we tag a stable release.  So in the meantime, if it's
possible to tweak the existing slab implementation to improve things, I'm
all for it.

Niels, please feel free to submit any patches you come up with as pull
requests on GitHub.  I'm happy to review them!


> There is a 15 second timeout before the slab starts returning memory to the
> system, which probably needs to be lowered. It would be best if the
> timeout was ignored in case of memory pressure. This might already be
> implemented (but not working properly), haven't looked that closely yet.
> The SLAB inefficiency might be caused by a bug, but fixing it would help
> things as well.
> I'm looking into both issues, and if both are solved that should help
> things a lot.
> Niels
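
The release policy described in the quoted text above could be sketched
roughly as follows.  This is a hedged illustration only: the struct,
the function name, and the way memory pressure is signalled are all
hypothetical and do not reflect the actual spl slab code; only the
15 second delay comes from the discussion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SLAB_DELAY_SECS 15ul /* delay mentioned in the thread */

/* Hypothetical per-slab bookkeeping for this sketch. */
struct slab_info {
    unsigned long last_used; /* time of last use, in seconds */
    size_t objs_in_use;      /* objects still allocated from the slab */
};

/*
 * Decide whether an idle slab may be returned to the system.
 * Under memory pressure the delay is ignored; otherwise the slab
 * must have been unused for at least SLAB_DELAY_SECS.
 */
static bool slab_should_release(const struct slab_info *s,
                                unsigned long now, bool pressure)
{
    assert(s != NULL);
    if (s->objs_in_use != 0)
        return false; /* never release a slab with live objects */
    if (pressure)
        return true;  /* skip the timeout when memory is tight */
    return now - s->last_used >= SLAB_DELAY_SECS;
}
```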
