[zfs-discuss] About zvol performance drop

Daobang Wang wangdbang at 163.com
Wed May 11 21:22:32 EDT 2016


Hello,

    Thank you for the reply. I will change the configuration to a mirror and try it again.

Best Regards,
Daobang Wang.

At 2016-05-11 18:20:34, "Phil Harman" <phil.harman at gmail.com> wrote:
>Of course, a fast log device is always a great idea when you want both performance and to keep your data.
>
>In this case a SLOG/ZIL device will be of zero benefit, because the OP has gone to great lengths to make sync writes a non-issue (i.e. sync=disabled and writeback).
>
>But even a log won't save you from eventually needing to write the data back to disk. At some point as the ARC fills up, you'll hit write throttling.
>
>With 8 drives in RAIDZ1 config and assuming the default zvol 8KB volblocksize and default ashift=9...
>
>ZFS will split one 8KB random write into 7 data + 1 parity writes of 3 sectors (1.5KB) per drive (you can check this with iostat).
>
>We know they're fast drives (in spinning rust terms), but we haven't been told their size. Let's guess at 600GB, in which case a 500GB zvol would span about 500/7/600 - i.e. about 12% of the drive.
>
>So, we might expect about 500-600 random IOPS per drive, and as we know each zvol 8KB random write will generate 1 write per drive, we might expect between 4-5MB / sec throughput once write throttling kicks in.
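>
>A quick way to sanity-check this while the fio job is running (a rough
>sketch only - "tank" is a placeholder for the actual pool name):
>
>    # ZFS view: write ops / bandwidth per member disk of the raidz1
>    zpool iostat -v tank 1
>
>    # OS view: w/s per drive should sit near the drive's random-write
>    # limit while the average request size stays tiny (~3 sectors)
>    iostat -xm 1
>
>    # back-of-envelope: ~550 IOPS per drive * 8KB per zvol write
>    echo $((550 * 8 / 1024))   # prints 4, i.e. roughly 4 MB/sec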
>
>Why do we get less than this? Because we haven't accounted for metadata.
>
>In conclusion, 3MB / sec is not unreasonable.
>
>Moral: don't use RAIDZ when you do a lot of small random IO.
>
>You could speed things up (and waste a whole lot of space) by using ashift=12.
>
>Best option is mirroring. But you'll still see only 10-12MB / sec with such small writes once write throttling kicks in.
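>
>For reference, creating such a striped-mirror layout with ashift=12
>might look roughly like this (a sketch only - "tank" and the sdc..sdj
>device names are placeholders, and recreating a pool destroys its
>contents):
>
>    zpool create -o ashift=12 tank \
>        mirror sdc sdd mirror sde sdf \
>        mirror sdg sdh mirror sdi sdj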
>
>> On 11 May 2016, at 07:47, Sebastian Oswald via zfs-discuss <zfs-discuss at list.zfsonlinux.org> wrote:
>> 
>> Hello,
>> 
>> Just to make sure: are the disks directly attached to the system by
>> a "dumb" HBA, not via a RAID controller as single-disk RAID0 or any
>> other funky configuration? RAID controllers can't cope with how ZFS
>> handles the disks and may/will "do funny things" (though not funny
>> for performance or your data...).
>> Also, was the pool created with a 4k blocksize (ashift=12)?
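>> One way to verify the ashift of an existing pool (a sketch - "tank"
>> is a placeholder for the actual pool name):
>>
>>     zdb -C tank | grep ashift    # expect "ashift: 12" for 4k vdevs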
>> 
>> ZFS gets its speed from spreading I/O over all available VDEVs. Major
>> rule of thumb: the more VDEVs, the more performance (at reasonable
>> numbers of VDEVs...).
>> RAIDZs are a tradeoff between usable space, redundancy and speed -
>> with priorities descending in this order.
>> For high (random) I/O applications like VM storage you should definitely
>> consider another disk layout. Maximum performance would be 4 x 2
>> mirroring, which also gives the best flexibility (you can grow the pool
>> two disks at a time), but only 50% usable space.
>> A good tradeoff with usable space is 2xRAIDz1 with 4 disks each. You
>> should benchmark both layouts and decide based on your requirements.
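>>
>> The 2 x RAIDZ1 variant could be created along these lines (a sketch
>> only - pool and disk names are placeholders), and then both layouts
>> can be benchmarked with the same fio job:
>>
>>     zpool create -o ashift=12 tank \
>>         raidz1 sdc sdd sde sdf \
>>         raidz1 sdg sdh sdi sdj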
>> 
>> Another important point is the L2ARC and SLOG. For high IOPS, add an
>> SSD-backed L2 cache and SLOG. This gives by far the biggest performance
>> boost for ZFS, especially when using it as a storage provider for
>> multiple systems with relatively low memory on the storage system. 32GB
>> is relatively low in ZFS-terms - always try to throw as much RAM at ZFS
>> as technically and financially possible. 
>> For SSD cache/SLOG make sure to only use proper server-grade SSDs as
>> consumer SSDs will be hammered to death within a few months (I killed 2
>> cheap 60GB SSDs in a test system within 3 months...). SATA/SAS SSDs are
>> fine, NVMe or PCIe are much better.
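>>
>> Adding the SSDs to an existing pool would look roughly like this (a
>> sketch - "tank" and the nvme partition names are placeholders; a
>> mirrored SLOG is safer, while the L2ARC needs no redundancy):
>>
>>     zpool add tank log mirror nvme0n1p1 nvme1n1p1
>>     zpool add tank cache nvme0n1p2 nvme1n1p2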
>> 
>> The backing spinning-disk layout should be improved anyway, because ZFS
>> throttles writes when data comes in faster than the vdevs can write it
>> out.
>> This behaviour is tuneable, but in almost every case the defaults are
>> perfectly fine and shouldn't be touched! Making things worse by tuning
>> these parameters is far easier and more common than actually gaining
>> any improvements.
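>>
>> If you just want to inspect those knobs without changing them, the
>> write-throttle module parameters can be read back - a sketch, assuming
>> a reasonably recent ZFS on Linux release that exposes them:
>>
>>     grep . /sys/module/zfs/parameters/zfs_dirty_data_max \
>>            /sys/module/zfs/parameters/zfs_delay_min_dirty_percent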
>> 
>> In short:
>> 1. change your disk layout
>> 2. add SSD-backed L2ARC and SLOG
>> 3. ....
>> 4. profit
>> 
>> 
>> Regards,
>> Sebastian Oswald
>> 
>> 
>>> Hi All,
>>> 
>>>    I set up a system (32GB DDR3) with 8 SAS disks (15000 RPM),
>>> created a raidz1 (sync disabled) and one 500GB zvol (sync disabled),
>>> and exported the zvol through a QLE2562 FC HBA with SCST 3.1.0
>>> (write back, fileio). The client runs CentOS 6.5 x86_64. The test
>>> command was "fio -filename=/dev/sdb -direct=1 -iodepth=32 -thread
>>> -numjobs=1 -ioengine=psync -rw=randwrite -bs=8k -size=64G
>>> -group_reporting -name=fio_8k_64GB_RW", run repeatedly. At the
>>> start the speed was about 260MB/s (iostat -xm 1), but after several
>>> runs performance dropped to 3MB/s.
>>> 
>>>    Would anybody give me a clue? What is the root cause, and how can
>>> I improve it?
>>> 
>>> Thank you very much.
>>> 
>>> Best Regards,
>>> Daobang Wang.
>> _______________________________________________
>> zfs-discuss mailing list
>> zfs-discuss at list.zfsonlinux.org
>> http://list.zfsonlinux.org/cgi-bin/mailman/listinfo/zfs-discuss