[zfs-discuss] Re: Bottlenecking on CPU with SSDs, Tuning, Scheduling

Richard Yao ryao at gentoo.org
Tue Sep 10 17:40:28 EDT 2013


Deduplication performance is slow because each in-core miss on a write
requires at least one random seek. If you are deduplicating 4KB blocks
and you only have 4000 IOPS write performance, you are limited to 16MB/s
write throughput. Note that these are maximum figures that do not take
into account the transaction commit.
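
To make the arithmetic above concrete, here is a small back-of-the-envelope
sketch. The function name and the miss_fraction parameter are my own
illustration, not anything in ZFS, and it assumes exactly one random seek
per in-core miss, free in-core hits, and no transaction commit overhead.

```python
def dedup_write_ceiling(block_size_bytes, random_iops, miss_fraction=1.0):
    """Rough upper bound on dedup write throughput in bytes per second.

    Assumes one random seek per in-core miss and that in-core hits cost
    nothing; transaction commit overhead is ignored (hypothetical model).
    """
    if not 0.0 < miss_fraction <= 1.0:
        raise ValueError("miss_fraction must be in (0, 1]")
    # Each miss consumes one of the available random I/O operations, so the
    # number of blocks written per second is capped at iops / miss_fraction.
    return block_size_bytes * random_iops / miss_fraction


if __name__ == "__main__":
    ceiling = dedup_write_ceiling(4096, 4000)
    print(f"{ceiling / 1e6:.1f} MB/s")  # 16.4 MB/s, i.e. roughly 16MB/s
```

Anything that lowers miss_fraction, such as keeping more of the table in
memory, raises the ceiling proportionally under this model.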

With that said, it is possible to accelerate this process using
something called a Bloom filter. A Bloom filter is a probabilistic data
structure that can be used to answer questions about the contents of a
set. When asked if the set contains a given value, it will either say no
or maybe. Implementing this on the hashes would permit us to skip
lookups on some misses, which would improve performance.
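
As a rough sketch of the idea (this is not ZFS code; the class, the
function names, the SHA-256 checksum and the dict standing in for the
on-disk table are all assumptions for illustration), a filter placed in
front of the lookup could look like this:

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter: answers "no" or "maybe", never a false "no"."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Double hashing: derive all bit positions from two 64-bit values
        # taken from a single SHA-256 digest of the key.
        digest = hashlib.sha256(key).digest()
        h1 = int.from_bytes(digest[:8], "little")
        h2 = int.from_bytes(digest[8:16], "little") | 1
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos >> 3] |= 1 << (pos & 7)

    def might_contain(self, key):
        return all(self.bits[pos >> 3] & (1 << (pos & 7))
                   for pos in self._positions(key))


def write_block(data, bloom, on_disk_table):
    """Hypothetical dedup write path; on_disk_table maps checksum -> refcount."""
    checksum = hashlib.sha256(data).digest()
    if not bloom.might_contain(checksum):
        # Definite "no": the block is new, so the random-seek lookup is skipped.
        bloom.add(checksum)
        on_disk_table[checksum] = 1
        return "unique, lookup skipped"
    # "Maybe": the expensive lookup is still required.
    if checksum in on_disk_table:
        on_disk_table[checksum] += 1
        return "duplicate"
    on_disk_table[checksum] = 1
    return "unique, false positive"


if __name__ == "__main__":
    bloom = BloomFilter(num_bits=1 << 20, num_hashes=4)
    table = {}
    block = b"a" * 4096
    print(write_block(block, bloom, table))  # unique, lookup skipped
    print(write_block(block, bloom, table))  # duplicate
```

Note that a plain Bloom filter like this one cannot remove entries, so
freed blocks would leave stale bits behind and only cost extra "maybe"
answers, never wrong "no" answers.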

SandForce does not have to worry about the flash growing, so they can
use far smaller hashes for improved memory efficiency. To put things
into perspective, they can manage at least an order of magnitude less
data with an order of magnitude smaller hash size. Both factors
significantly increase the number of hashes that they can store in
memory at any given time in comparison to what ZFS can store with the
same amount of memory. Also, implementing a bloom filter in their
firmware would be a competitive advantage, so they likely already have
one implemented.

On 09/10/2013 03:38 PM, Gordan Bobic wrote:
> On 09/10/2013 08:22 PM, Niels de Carpentier wrote:
>>> I'm seeing the same in an effort to get a pair of Fusion-io ioDrive2
>>> cards to work as mirrored pool devices. I'm not even getting 30% of
>>> the sequential throughput that I *should* be seeing.
>>
>> Are you using dedup as well?
>>
>> It could be that dedup is causing txg_sync load, and is causing an
>> overall slowdown.
>> It would be interesting to know if the same happens without dedup.
> 
> If it is dedupe that is causing the insane txg_sync load and abysmal
> performance (and that is a big if), considering that the recent
> SandForce flash controllers dedupe the blocks anyway, and do so in
> realtime without any slowdown with a lot less CPU than a 3GHz Xeon, I'd
> like to hope that there are some vast improvements in ZFS planned WRT
> this particular feature.
> 
> Gordan

