[zfs-discuss] Feature Request: Trivial/Inline/In Place/Static/Over-write Deduplication. Tentative offer to implement

Schlacta, Christ aarcane at aarcane.org
Mon Oct 14 03:12:47 EDT 2013


I have a feature request that would be very useful to me for backup
purposes.  I want to be able to turn on a simple deduplication mode that
only applies to data being overwritten in place.  Namely, when backing up
raw disk images (there exists no good utility to accomplish this) I want to
run dd if=/dev/sdx of=/dev/zvol/backups/disk-name and simply take a backup
of the whole 8 or 16GiB disk, but only actually commit the changed blocks
to disk in ZFS.  Any block that hasn't changed should simply be matched
against its on-disk checksum, and optionally against the existing data, and
the write silently discarded, returning success.

The semantics are simple.  If the block is the same, don't write it back
out again.  End of discussion.  It should be completely optional.  This
mechanism should NOT have the side effects of enabling deduplication.
Namely, no alternate hashing should be enforced.  No DDT should be
created.  Nothing on the pool should change except a feature flag.
Disabling the feature should leave a pool that looks exactly as if only
the changed blocks had been written, and future writes to the dataset
should behave as if the feature had never been enabled; namely, identical
blocks should be overwritten again.
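
For illustration only, here is a minimal userspace sketch of those
semantics in Python.  It is not ZFS code: the function name
copy_changed_blocks, the 8 KiB block size, and the use of SHA-256 are all
assumptions made for the example.  It also has to read each old block back
to checksum it, whereas the proposal relies on the checksum ZFS already
stores for the block.  With verify=False it compares checksums only (the
proposed "trivial" behaviour); with verify=True it compares the data itself
(the proposed "trivial-verify" behaviour).

```python
import hashlib
import io

BLOCK_SIZE = 8192  # assumed block size for the example


def copy_changed_blocks(src, dst, block_size=BLOCK_SIZE, verify=False):
    """Copy src onto dst in place, skipping blocks that already match.

    src and dst are seekable binary file objects; dst must be readable
    and writable.  Returns a (written, skipped) tuple of block counts.
    """
    written = skipped = 0
    offset = 0
    while True:
        new = src.read(block_size)
        if not new:
            break  # end of source

        # Fetch the block currently stored at the same offset.
        dst.seek(offset)
        old = dst.read(len(new))

        if verify:
            # Compare the existing data byte for byte.
            same = old == new
        else:
            # Compare checksums only.
            same = (len(old) == len(new) and
                    hashlib.sha256(old).digest() == hashlib.sha256(new).digest())

        if same:
            skipped += 1  # unchanged: discard the write, report success
        else:
            dst.seek(offset)
            dst.write(new)
            written += 1

        offset += len(new)
    return written, skipped


if __name__ == "__main__":
    # Tiny in-memory demonstration with 8-byte blocks.
    source = io.BytesIO(b"A" * 8 + b"B" * 8 + b"C" * 8)
    target = io.BytesIO(b"A" * 8 + b"X" * 8 + b"C" * 8)
    print(copy_changed_blocks(source, target, block_size=8))
```

Against real files the target would be opened in "r+b" mode so that the
blocks that are skipped are left untouched on disk.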

This feature should be enabled through an inheritable dataset flag,
deduplication, or dedup, as it exists now.  The following options should
be added: trivial, trivial-verify.

Additionally, this seems like a simple enough project for a beginner to cut
his or her teeth on.  If the rest of you agree, I'd like to get some
pointers, then look into implementing this feature myself in my spare time
this quarter.  If you think it's too complicated (I've never even looked at
the ZFS code itself, nor touched Linux kernel code), I'd like to request
that someone well suited to the task take it on.

Thank you in advance
