[zfs-discuss] MySQL Performance Blog about ZoL

Chris Siebenmann cks at cs.toronto.edu
Thu May 30 16:45:57 EDT 2013


| >> I saw that the other day (I'm a MySQL admin among other things). Half
| >> the performance of XFS was a little disappointing, but it actually makes
| >> perfect sense and is expected. XFS on RAID10 will be spreading the reads
| >> between all the disks. ZFS will do every read from both mirrors in order
| >> to verify the integrity of the data, hence half the I/O capacity and
| >> thus half the throughput.
| >
| > Gordan, that doesn't sound right to me.  I thought ZFS spread the read
| > load over all elements of a mirror vdev, hence random reads are 2X (or
| > whatever) faster than writes.
| 
| I am not 100% sure, but it stands to reason that if the whole stripe is 
| checked on RAIDZ, checking both mirrors is probably also happening.

 The situations for mirroring and RAIDZ are different.

 In RAIDZ only a part of each stripe is stored on each disk, and the
block checksum is calculated over the *whole* stripe. This means that
you must read the whole stripe in order to be able to verify the
checksum. The only way that ZFS could return a single-disk read from a
RAIDZ stripe would be if it were willing to skip verifying the checksum,
and it's not; ZFS guarantees that if it gives you data, that data passed
its block checksum.

(Other RAID-N systems can satisfy a read from a single disk because
they do not have checksums and do not do full parity verification on
reads; if the disk drive didn't give them an error, they assume the
data is good and just give it to you.)
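
 As a rough illustration of the RAIDZ case, here is a toy Python sketch.
It is not ZFS code: the function names are made up, SHA-256 stands in
for the real block checksum, parity is left out entirely, and the block
is simply cut into equal pieces, one per data disk. It only shows why
one disk's share cannot be verified by itself.

```python
import hashlib

def checksum(data: bytes) -> bytes:
    # Stand-in for the ZFS block checksum (assumption: SHA-256 for simplicity).
    return hashlib.sha256(data).digest()

def raidz_write(block: bytes, data_disks: int):
    """Split one block into chunks, one per data disk (parity omitted)."""
    size = -(-len(block) // data_disks)  # ceiling division
    chunks = [block[i * size:(i + 1) * size] for i in range(data_disks)]
    # The checksum covers the whole block, not any individual chunk.
    return chunks, checksum(block)

def raidz_read(chunks, expected):
    """Reassemble the block from every data disk, then verify it."""
    block = b"".join(chunks)
    if checksum(block) != expected:
        raise IOError("checksum mismatch")
    return block, len(chunks)  # data and number of disks that were read

chunks, cksum = raidz_write(b"0123456789abcdef", data_disks=4)
data, disks_read = raidz_read(chunks, cksum)

# One disk's share does not match the block checksum, so it cannot be
# verified (or safely returned) without reading the others.
single_ok = checksum(chunks[0]) == cksum
```

 Every verified read in this model touches all of the data disks, which
is the point made above.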

 In mirroring each side of the mirror stores a full copy of the block
(because this is what is implied by mirroring). As a result you only
need to read one side of the mirror to get all of the data needed to
verify the checksum and thus return the read to user level.
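
 The mirror case can be sketched the same way. Again this is a toy
model with made-up names and SHA-256 standing in for the real checksum,
not how ZFS is actually written. The fallback to the other side when a
copy fails its checksum is not spelled out above; it is included here
as an assumption about what a checksumming mirror would reasonably do.

```python
import hashlib

def checksum(data: bytes) -> bytes:
    # Stand-in for the ZFS block checksum (assumption: SHA-256 for simplicity).
    return hashlib.sha256(data).digest()

def mirror_read(copies, expected, start=0):
    """Read mirror sides one at a time; return the first copy that verifies."""
    for n in range(len(copies)):
        candidate = copies[(start + n) % len(copies)]
        # A single side holds the full block, so it can be verified alone.
        if checksum(candidate) == expected:
            return candidate, n + 1  # data and number of sides read
    raise IOError("no mirror side passed its checksum")

block = b"0123456789abcdef"
expected = checksum(block)

# Healthy mirror: one read is enough.
good_data, good_reads = mirror_read([block, block], expected)

# First side is damaged: the second side has to be read as well.
healed_data, healed_reads = mirror_read([b"garbage", block], expected)
```

 Varying `start` between reads is one hypothetical way to spread the
read load over both sides of the mirror.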

	- cks


