[zfs-discuss] Summary: space overhead caused by record-/volblock-size
hartnegg at uni-freiburg.de
Mon Jul 10 10:38:19 EDT 2017
This is an attempt to summarize what I learned from this discussion.
Thanks to everybody who sent comments.
Foreword: this effect is only relevant with large physical disk sectors
and small logical blocks. This typically happens on zvols, not with files,
because the default recordsize (for files) is 128k while the default
volblocksize is only 8k. The effect is much larger on new disks with
large physical sectors (large ashift).
Turns out there are two reasons for the kind of space overhead which I
had asked about: additional parity sectors, and padding sectors.
additional parity sectors
Chunks of data are split into blocks of size recordsize (or volblocksize).
If the data size is not a multiple of recordsize, the last block will be
smaller. But parity sectors are added to each block, regardless of how
small it is. This can cause a much higher ratio of parity to data
sectors than a classic RAID would have.
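The effect described above can be sketched with a few lines of Python.
This is a simplified model, not the exact allocation code from the ZFS
source; the function name and the 6-disk raidz2 example geometry are my
own illustrative assumptions.

```python
import math

def raidz_parity_sectors(block_bytes, sector_bytes, ndisks, p):
    """Parity sectors needed to store one block on a raidz vdev.

    Simplified model: the block is split into data sectors, and p
    parity sectors are added for every stripe of up to (ndisks - p)
    data sectors.
    """
    data_sectors = math.ceil(block_bytes / sector_bytes)
    stripes = math.ceil(data_sectors / (ndisks - p))
    return stripes * p

# 8k volblock on a 6-disk raidz2 with 4k sectors (ashift=12):
# 2 data sectors still get 2 parity sectors -> 100% parity overhead.
print(raidz_parity_sectors(8192, 4096, 6, 2))    # -> 2

# 128k block: 32 data sectors, 16 parity sectors -> 50% overhead,
# the same ratio a classic RAID6 full stripe on 6 disks would have.
print(raidz_parity_sectors(131072, 4096, 6, 2))  # -> 16
```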
If volblocksize equals sectorsize, parity sectors are added to each data
sector.

padding sectors
The number of sectors allocated to store the data and parity sectors of
one chunk of data must be a multiple of p+1 (p = number of parity disks).
If it is not, padding sectors are added.
"So that when it is freed it does not leave a free segment which is too
small to be used (i.e. too small to fit even a single sector of data
plus p parity sectors)"
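The p+1 rule quoted above can be sketched like this (again a simplified
model with a helper name and example geometry of my own choosing):

```python
import math

def raidz_padded_asize(data_sectors, parity_sectors, p):
    """Round an allocation up to a multiple of p+1 sectors.

    Sketch of the quoted rule: the total allocated sectors must be a
    multiple of p+1, so that a freed segment can always hold at least
    one data sector plus its p parity sectors.
    """
    total = data_sectors + parity_sectors
    return math.ceil(total / (p + 1)) * (p + 1)

# 8k volblock (2 data sectors with 4k sectors) plus 2 parity sectors
# on raidz2 (p=2): 4 is not a multiple of 3, so 2 padding sectors are
# added and the block occupies 6 sectors (24k) on disk.
print(raidz_padded_asize(2, 2, 2))    # -> 6

# A full 128k block (32 data + 16 parity = 48 sectors) needs no padding.
print(raidz_padded_asize(32, 16, 2))  # -> 48
```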
For both possible sources of space overhead, the effect gets small when
recordsize or volblocksize is large.
The actual amount of space overhead is difficult to predict if you have
compression enabled and your data is compressible. Then the compression
often more than compensates for the space overhead.
Notes for those who put NTFS into a zvol:
Several people recommended NTFS compression instead of ZFS compression.
NTFS compression is not supported with cluster sizes > 4KB.
Each change of a single sector requires reading and rewriting the whole
volblock (if it is not already cached).
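A quick way to see the cost of that read-modify-write cycle (a sketch
with an illustrative function of my own; it ignores the read side, which
is skipped when the block is cached):

```python
def rmw_amplification(changed_bytes, volblocksize):
    """Write-amplification factor for a partial-block update.

    ZFS is copy-on-write, so changing any part of a volblock rewrites
    the whole volblock; the factor is the ratio of the two sizes.
    """
    return volblocksize / changed_bytes

# Updating one 4k NTFS cluster inside a 64k volblock rewrites 16x
# the changed data:
print(rmw_amplification(4096, 65536))  # -> 16.0
```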
While researching this topic, I found two more space overheads:
Unused space behind last metaslab
"Vdevs are divided into 200 or fewer metaslabs for the purpose of space
management. Metaslab size is always a number equal to 2^N, where N is
defined by the metaslab_shift parameter."
If the available space is not a multiple of 2^N, the remaining space is
left unused.
The value metaslab_shift can be found with
zdb -C $pool | grep metaslab_shift
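The unused remainder can be computed from that value (metaslab_shift is
a real parameter reported by zdb, but the vdev size in this example is a
hypothetical figure of my own):

```python
def metaslab_waste(vdev_bytes, metaslab_shift):
    """Space left behind the last metaslab (sketch of the quoted rule).

    Each metaslab covers 2**metaslab_shift bytes; whatever does not
    fill a whole metaslab at the end of the vdev stays unused.
    """
    metaslab_size = 1 << metaslab_shift
    return vdev_bytes % metaslab_size

# Hypothetical 4 TB vdev with metaslab_shift = 34 (16 GiB metaslabs):
# roughly 13.3 GiB behind the last metaslab goes unused.
print(metaslab_waste(4_000_000_000_000, 34))
```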
Slop space reservation: 1/32nd of the zpool capacity.
"to ensure that some critical ZFS operations can complete even in
situations with very low free space remaining"
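That reservation is simple arithmetic (a sketch; the fraction of 1/32
matches the default spa_slop_shift of 5 in the ZFS versions discussed
here, and the pool size in the example is hypothetical):

```python
def slop_space(pool_bytes):
    """Slop space reserved by ZFS: 1/32 of pool capacity.

    Equivalent to pool_bytes >> spa_slop_shift with the default
    spa_slop_shift = 5.
    """
    return pool_bytes >> 5

# A hypothetical 10 TiB pool reserves 320 GiB of slop space:
print(slop_space(10 * 2**40))  # -> 343597383680
```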
Capacity calculator http://wintelguy.com/zfs-calc.pl