[zfs-discuss] Feature Request: Trivial/Inline/In Place/Static/Over-write Deduplication. Tentative offer to implement

Ryan How ryan at zbit.net.au
Mon Oct 14 03:27:34 EDT 2013


Could this not be done using rsync's --inplace mode?

I wrote the world's simplest FUSE filesystem to allow reading a block 
device as if it were a file, so that it works with rsync. There are rsync 
patches around to work with block devices, but they are a bit hit and 
miss from what I've seen.

https://github.com/kartweel/block2file
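
To make the "only rewrite what changed" idea concrete, here is a rough 
user-space sketch in Python. It is not how rsync or ZFS do it 
internally; the function name write_changed_blocks and the 128 KiB 
default block size are just assumptions for illustration. It reads the 
source and the existing destination block by block and writes a block 
back only when the contents differ.

```python
def write_changed_blocks(src_path, dst_path, block_size=128 * 1024):
    """Copy src over dst in place, rewriting only the blocks that differ.

    Hypothetical helper for illustration. dst must already exist and is
    assumed to be at least as large as src (true for a zvol sized to the
    disk). Returns (blocks_compared, blocks_written).
    """
    compared = 0
    written = 0
    offset = 0
    # "r+b" opens the destination for read/write without truncating it.
    with open(src_path, "rb") as src, open(dst_path, "r+b") as dst:
        while True:
            new = src.read(block_size)
            if not new:
                break  # end of source
            dst.seek(offset)
            old = dst.read(len(new))
            compared += 1
            if old != new:
                # Only changed blocks are ever written back.
                dst.seek(offset)
                dst.write(new)
                written += 1
            offset += len(new)
    return compared, written
```

With the devices from the original mail this would be called as 
write_changed_blocks("/dev/sdx", "/dev/zvol/backups/disk-name"), run as 
a user that can read and write both. The cost is that the whole 
destination gets read back for the comparison, which is the trade-off 
against the checksum-only matching asked for below.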

Ryan

On 14/10/2013 3:12 PM, Schlacta, Christ wrote:
> I have a request for a feature that would be highly useful to me for 
> backup purposes.  I want to be able to turn on a simple deduplication 
> mode that will only work on in-place data.  Namely, when backing up 
> raw disk images (there exists no good utility to accomplish this) I 
> want to run dd if=/dev/sdx of=/dev/zvol/backups/disk-name and simply 
> take a backup of the whole 8 or 16GiB disk, but only actually commit 
> to zfs disk the changed blocks.  Any blocks that haven't changed 
> should simply be matched against their on-disk checksums, and 
> optionally existing data, and the writes silently discarded, returning 
> success.
>
> The semantics are simple.  If the block is the same, don't write it 
> back out again.  End of discussion.  Should be completely optional. 
>  This mechanism should NOT have the side effects of enabling 
> Deduplication.  Namely, no alternate hashing should be enforced.  No 
> DDT should be created.  Nothing on the pool should change except a 
> feature flag.  Disabling the feature should result in a pool that 
> otherwise looks exactly like only changed blocks were written, and 
> future writes to the data set should behave as if the behavior was 
> never enabled, namely, identical blocks should be over-written again.
>
> This feature should be enabled through the existing inheritable 
> dataset flag, deduplication (or dedup).  The following options 
> should be added:  trivial, trivial-verify.
>
> Additionally, this seems like a simple enough project for a beginner 
> to cut his or her teeth on.  If the rest of you agree, I'd like to get 
> some pointers, then look into implementing this feature myself in my 
> spare time this quarter.  If you think it's too complicated (I've 
> never even looked at the ZFS code itself, nor touched Linux kernel 
> code), I'd like to request that someone well suited to the task take 
> it on.
>
> Thank you in advance
> To unsubscribe from this group and stop receiving emails from it, send 
> an email to zfs-discuss+unsubscribe at zfsonlinux.org.




