[zfs-discuss] zvol with dedupe space accounting

Alex 'CAVE' Cernat cave at cernat.ro
Fri Feb 26 05:29:13 EST 2016


On 25/2/2016 11:07 PM, Edward Ned Harvey (zfsonlinux) wrote:
> That being said, in my experience, zfs deduplication performs terribly, consumes *great* *gobs* of memory, and fails ungracefully when the system gets low on memory. So there's literally no situation that I would recommend anyone to use it. It's great theoretically, but never got developed enough to be good in practice, because of the Sun implosion and loss of funding for zfs development and close-sourcing of zfs.
>
> Compression, on the other hand, works so well, there's *almost* no situation where I would advise having it disabled.
Indeed, zfs deduplication is wonderful, is sublime, but if the
deduplication hash table doesn't fit in memory you will wish you'd
never heard of it :-P
Initially I looked at it as the holy grail (in theory), but forgot about
it after extensive tests.

IIRC the 'standard' memory consumption is about 5 GB of memory per 1 TB
of data; but, because the deduplication table can use only 25% of the
memory, you need about 20 GB of memory per 1 TB of data
(http://constantin.glez.de/blog/2011/07/zfs-dedupe-or-not-dedupe);
'huge' is an understatement, it's catastrophic :-P
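
To make the arithmetic above explicit, here is a minimal sketch in Python. The function name and its parameters are made up for illustration; the 5 GB per TB and 25% figures are just the rules of thumb quoted above, not values read from zfs itself.

```python
def ram_needed_gb(data_tb, ddt_gb_per_tb=5.0, ddt_share_of_memory=0.25):
    """Rough RAM estimate (GB) so the dedup table fits in its allowed share.

    ddt_gb_per_tb:       assumed dedup table size per TB of data (rule of thumb)
    ddt_share_of_memory: assumed fraction of memory the table may occupy
    """
    ddt_gb = data_tb * ddt_gb_per_tb      # size of the dedup table itself
    return ddt_gb / ddt_share_of_memory   # memory needed so the table is only 25% of it


for tb in (1, 4, 10):
    print(f"{tb} TB of data -> about {ram_needed_gb(tb):.0f} GB of RAM")
```

With these assumptions 1 TB of data comes out at 20 GB of RAM, and the estimate grows linearly with the amount of data.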

IMHO, the deduplication table should be stored somehow separately in
memory (so the limit of 1/4 of ARC would not apply to it). It could also
be stored separately on an SSD (maybe with a backup on 'normal' drives).
But it's easy to speak theoretically, without deep knowledge of zfs
internals, so let's not cast stones!

Microsuxx has an interesting approach to deduplication in Windows
Server 2012. AFAIK it's done 'offline' (i.e. as a scheduled, 'cron'-like
job). Both implementations have good points and issues, so let's not
compare apples and oranges.

I don't see any reason not to enable compression (I've read an
article about installing mysql even on a gzip-compressed volume; it's
kinda shocking at first, but if you read it from start to end you can
understand the logic).
When in doubt, use lz4 compression: it's very lightweight, and it
gives up quickly if it 'detects' a poor compression ratio. So the
overhead is really negligible.
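
The "give up when it doesn't pay off" idea can be sketched roughly like this in Python. This is only an illustration, not how zfs or lz4 do it internally: zlib stands in for lz4 (which is not in the Python standard library), the sketch compresses the whole block before deciding whereas lz4 bails out early, and store_block and its min_saving threshold are hypothetical names and values.

```python
import os
import zlib


def store_block(data, min_saving=0.125):
    """Return (bytes_to_store, was_compressed).

    Keeps the compressed form only if it saves at least `min_saving`
    of the original size (assumed threshold); otherwise stores the
    block as-is, so incompressible data costs nothing extra on disk.
    """
    compressed = zlib.compress(data, 1)  # level 1: fast, stand-in for lz4
    if len(compressed) <= len(data) * (1 - min_saving):
        return compressed, True
    return data, False


text_block = b"INSERT INTO t VALUES (1, 'aaaa');\n" * 1000  # highly repetitive
random_block = os.urandom(32 * 1024)                        # incompressible

for name, block in (("text", text_block), ("random", random_block)):
    stored, was_compressed = store_block(block)
    print(f"{name}: {len(block)} -> {len(stored)} bytes, compressed={was_compressed}")
```

The repetitive block is stored compressed, while the random one is stored untouched.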

Alex
