[zfs-discuss] Issue with data

Richard Laager rlaager at wiktel.com
Wed Sep 25 22:06:08 EDT 2013


Set your ashift based on the physical parameters of the drive. If you
have 512B sectors, you *may* want to set ashift to 12 (4kB sectors)
anyway, in case a future replacement drive has 4kB sectors.

Then just be aware that you're going to waste space if your workload
ends up with ZFS writing blocks smaller than (number of data disks) *
2^ashift. You asked about a 6-disk raidz2. That has 4 data disks.
Assuming ashift=12 (4kB disk sectors), that's 4 * 4kB = 16kB. So
any block smaller than 16kB will waste space.
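
As a rough sketch of that arithmetic (the function name
full_stripe_bytes is made up for illustration and is not part of any
ZFS tool):

```python
def full_stripe_bytes(total_disks: int, parity: int, ashift: int) -> int:
    """Smallest block size that fills one full raidz stripe with data."""
    data_disks = total_disks - parity
    sector_bytes = 1 << ashift  # 2^ashift, e.g. ashift=12 means 4096 bytes
    return data_disks * sector_bytes


# 6-disk raidz2 with ashift=12, as in the example above.
print(full_stripe_bytes(6, 2, 12))  # 16384 bytes, i.e. 16kB
```

With ashift=9 (512B sectors) the same pool layout gives 4 * 512B = 2kB,
which is why the breakpoint moves when you pick a larger ashift.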

Whether you want to accept that trade-off depends on what you're
trying to do and your budget. For example, if you're using zvols with a
4kB block size (volblocksize), then your raidz2 effectively becomes a
triple-mirror (1x 4kB data + 2x parity), so you might as well do
triple-mirroring instead. It would give you higher read IOPS (as I don't
think ZFS special-cases d=1 raidz stripes to satisfy reads from parity
blocks). If you're doing 8kB zvols, then it's (2x 4kB data + 2x parity),
so only 50% of the raw space holds data instead of the 67% a full 4+2
stripe gives you. That means you're losing 25% of your expected
capacity. But if your goal is to survive *any two* drive failures, then
the only alternative is triple-mirroring, which would give you 50% of
your expected capacity with the same number and size of disks. So that's
the same trade-off you already expected; it's just that the breakpoint
is different.
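
To make those percentages concrete, here is a minimal sketch of the
simplified model used above: every stripe row of up to 4 data sectors
carries 2 parity sectors. The function name raidz_efficiency is
hypothetical, and the model deliberately ignores any extra allocation
padding raidz may add on top, so treat the numbers as an illustration
rather than what zfs list will report.

```python
import math


def raidz_efficiency(block_bytes: int, total_disks: int,
                     parity: int, ashift: int) -> float:
    """Fraction of allocated sectors that hold data (simplified model)."""
    sector_bytes = 1 << ashift
    data_disks = total_disks - parity
    data_sectors = math.ceil(block_bytes / sector_bytes)
    # Each row of up to data_disks data sectors needs `parity` parity sectors.
    rows = math.ceil(data_sectors / data_disks)
    parity_sectors = rows * parity
    return data_sectors / (data_sectors + parity_sectors)


# Expected efficiency of a 6-disk raidz2: a full 4 data + 2 parity stripe.
expected = raidz_efficiency(16 * 1024, 6, 2, 12)

for kb in (4, 8, 16):
    eff = raidz_efficiency(kb * 1024, 6, 2, 12)
    print(f"{kb}kB blocks: {eff:.0%} of raw space, "
          f"{eff / expected:.0%} of expected capacity")

# Triple-mirroring stores one copy of data out of three.
mirror = 1 / 3
print(f"triple mirror: {mirror:.0%} of raw space, "
      f"{mirror / expected:.0%} of expected capacity")
```

In this model 4kB blocks land on the same efficiency as the triple
mirror, and 8kB blocks sit halfway between that and the full stripe.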

If you're doing file-based access, keep in mind that recordsize is an
upper bound, not a lower bound: files smaller than recordsize are still
written as smaller blocks. So there's no point in tuning recordsize for
this.

-- 
Richard