[zfs-discuss] Help with estimated memory need
Niels de Carpentier
niels at decarpentier.com
Sun Mar 11 13:16:46 EDT 2012
>> Solaris requires 2.6GB memory per TB data in the best case (only
>> blocks.) You probably want to double that because of inefficiencies
>> in the zfs on linux implementation.
> So this might be the answer I was looking for ?
Be aware that this figure is for a blocksize of 128KB. I think the default
blocksize for a zvol is 8KB, which means 16 times the number of blocks,
and a dedup table that is 16 times as large!
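To make the factor of 16 concrete, here is a small sketch of the block-count arithmetic. It assumes 1 TB means 1 TiB, that every block is full, and that no data is shared between blocks; the function name is only for illustration.

```python
KIB = 1024
TIB = 1024 ** 4

def block_count(data_bytes: int, block_size_bytes: int) -> int:
    """Number of blocks needed to hold data_bytes (ceiling division)."""
    return -(-data_bytes // block_size_bytes)

blocks_128k = block_count(TIB, 128 * KIB)  # 128KB blocks
blocks_8k = block_count(TIB, 8 * KIB)      # 8KB blocks (zvol default, as I recall)
ratio = blocks_8k // blocks_128k

print(f"128KB blocks per TiB: {blocks_128k}")
print(f"  8KB blocks per TiB: {blocks_8k}")
print(f"ratio: {ratio}")
```

Each block needs its own dedup table entry, so the table grows by the same ratio.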
>> Normally metadata (this includes the dedup tables) is limited to 25% of
>> the arc! So you'll need a lot of memory (or a fast ssd), and you'll need
>> to tune the arc so most can be used for metadata.
> Should I understand that I would need 2.6GB * 2 * 4 = 20 GB RAM only
> for ZFS arc for processing properly a 1 TB Zpool with dedup ? That would
> be HUGE RAM requirements !
You can change the arc settings to allow all of the arc to be used for
metadata, or to use the arc only for metadata. 8KB blocks are a killer
though, and result in insane memory requirements. With an 8KB block size
you would need 2.6GB * 2 * 16 = 83.2 GB RAM!
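The estimates in this thread can be reproduced with a rough sketch like the one below. It only encodes the numbers quoted above (2.6GB per TB at 128KB blocks, a factor of 2 for implementation inefficiency, and an optional fraction of the arc available for metadata). The function and parameter names are made up for illustration, and the result is a back-of-the-envelope figure, not a measurement.

```python
BASELINE_GB_PER_TB = 2.6   # figure quoted in this thread, for 128KB blocks
BASELINE_BLOCK_KB = 128

def dedup_ram_gb(pool_tb, block_kb, overhead=2.0, metadata_fraction=1.0):
    """Rough RAM estimate (GB) for keeping the dedup table in the arc.

    overhead          -- safety factor for implementation inefficiency
    metadata_fraction -- share of the arc that metadata may use (1.0 = all)
    """
    scale = BASELINE_BLOCK_KB / block_kb  # smaller blocks mean more entries
    return BASELINE_GB_PER_TB * pool_tb * scale * overhead / metadata_fraction

# 1 TB pool, 8KB blocks, arc fully usable for metadata
print(f"{dedup_ram_gb(1, 8):.1f} GB")
# 1 TB pool, 128KB blocks, metadata limited to 25% of the arc
print(f"{dedup_ram_gb(1, 128, metadata_fraction=0.25):.1f} GB")
```

The second call corresponds to the 2.6GB * 2 * 4 estimate quoted earlier, which comes out at 20.8 GB.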
Not practical for a home server, although you could use an ssd for l2arc
which would be much faster than normal disks, but much slower than memory.
Every l2arc entry also needs some data in the arc though, so you might
need more memory even if you use an ssd for l2arc.
This shows why people generally don't use dedup except in very limited
situations on small partitions. It's simply not worth it in most cases.
> I've been actually using it daily for more than 6 months. Dreadfully
> slow, getting slower every single day, but it works and has shown
> perfectly stable...
Ok, cool. There have been reports of hangs under load when using dedup,
but maybe your home server is not stressed enough to trigger the issue.
> Without deduplication, what would be your RAM advice for correct
> performance ? - but I would need to at least double my online storage...
Without dedup your current 4GB should be fine for a home server.
Maybe there is backup software that can do the deduplication without
relying on the filesystem. I know there are commercial solutions that can
do this, so there might be open source solutions that can do this as well.
Doing this on the application level seems much better, as you can match on
filename, instead of comparing with all other blocks on the filesystem.
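As a rough illustration of what application-level matching could look like, the sketch below groups files by name first and only hashes files that share a name. This is not how any particular backup tool works; the function names and the choice of SHA-256 are my own assumptions.

```python
import hashlib
import os
from collections import defaultdict

def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file's contents, read in chunks to bound memory use."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def find_duplicates(root):
    """Return groups of paths under root with the same name and content."""
    by_name = defaultdict(list)
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.isfile(path) and not os.path.islink(path):
                by_name[name].append(path)

    duplicates = []
    for paths in by_name.values():
        if len(paths) < 2:
            continue  # unique name: nothing to compare, nothing to hash
        by_digest = defaultdict(list)
        for path in paths:
            by_digest[file_digest(path)].append(path)
        duplicates.extend(sorted(g) for g in by_digest.values() if len(g) > 1)
    return duplicates
```

Only files that share a name are read and hashed, so the working set stays far smaller than a table covering every block on the filesystem. The price is that identical data stored under different names is missed.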
You could also have a look at lessfs, which is optimized for
deduplication. I don't have any experience with it, so I cannot say if
that will work for you.