[zfs-discuss] Slow read performance

Richard Elling richard.elling at richardelling.com
Mon Apr 2 10:53:23 EDT 2018



> On Apr 2, 2018, at 12:39 AM, Gionatan Danti <g.danti at assyoma.it> wrote:
> 
> On 01-04-2018 at 19:05, Richard Elling via zfs-discuss wrote:
>> there is a lot of good info in this thread already, but I'd like to
>> draw your attention to prefetching...
>>> On Mar 29, 2018, at 5:43 AM, Alex Vodeyko via zfs-discuss
>>> <zfs-discuss at list.zfsonlinux.org> wrote:
>> ...
>>> I use "recordsize=1M" because we have big files and sequential I/O.
>> ...
>>> 1) iozone (write = 2.7GB/s, read = 1GB/s)
>> ...
>>> "arcstat" during reads:
>>> # arcstat.py 5 (100% pm and 50+% miss)
>>>     time  read  miss  miss%  dmis  dm%  pmis  pm%  mmis  mm%  arcsz     c
>>> 14:13:17  1.9K  1.0K     55     1    0  1.0K  100     2   42    62G   62G
>>> 14:13:22  2.1K  1.1K     54     0    0  1.1K  100     0   13    63G   62G
>>> 14:13:27  1.8K   980     54     1    0   979  100     2   42    62G   62G
>>> 14:13:32  1.6K   880     55     1    0   879  100     2   40    62G   62G
>> "pmis" is the number of prefetch misses: both data and metadata
>> "pm%" is the prefetch miss ratio to the number of total accesses.
>> First, check that prefetching is enabled (it is by default)
>> zfs_prefetch_disable = 0
>> For a sequential read workload, we expect the prefetcher to be
>> prefetching, and thus do not expect pm% = 100%.
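
[Editor's note: on ZFS on Linux, one quick way to confirm the prefetcher
is enabled is to read the module parameter from sysfs. This is an
illustrative sketch, not part of the original message; the path assumes a
stock ZoL installation.]

```shell
# Check whether prefetch is disabled (0 = prefetch enabled, the default)
cat /sys/module/zfs/parameters/zfs_prefetch_disable

# If it prints 1, prefetch can be re-enabled at runtime:
echo 0 > /sys/module/zfs/parameters/zfs_prefetch_disable
```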
>> The zfetch_array_rd_sz tunable is a limit on the size of the blocks
>> that are prefetched: if a block is larger than zfetch_array_rd_sz,
>> it is not prefetched. However, the default zfetch_array_rd_sz =
>> 1,048,576, so it should be fine with recordsize=1M. Be sure to check
>> its value.
>> A related tunable is zfetch_max_distance (default = 8 MiB), the
>> maximum number of bytes to prefetch per stream. This might be too
>> small for recordsize=1M.
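
[Editor's note: both tunables can be inspected and adjusted through the
same sysfs interface. The 32 MiB value below is only an illustrative
bump for a 1M-recordsize workload, not a tested recommendation.]

```shell
# Current values (defaults: 1048576 and 8388608 bytes)
cat /sys/module/zfs/parameters/zfetch_array_rd_sz
cat /sys/module/zfs/parameters/zfetch_max_distance

# Example: raise the per-stream prefetch distance to 32 MiB
echo $((32 * 1024 * 1024)) > /sys/module/zfs/parameters/zfetch_max_distance
```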
>> To help visualize the zfetch activity, I usually do the data
>> collection with Prometheus' node_exporter or InfluxDB's telegraf. But
>> if you are a CLI fan, I pushed a Linux version of zfetchstat to
>> https://github.com/richardelling/zfs-linux-tools
>>> "top" shows only 8 "z_rd_int" processes during reads (and only one
>>> "z_rd_int" running), while there were 32 running z_wr_iss processes
>>> during writes.
>> This could be another clue that prefetching is disabled or not
>> working as desired. However, in my experience, it is better to
>> observe the detailed back-end I/O distribution and classification
>> with "zpool iostat -q", where prefetches are often, but not always,
>> in the asyncq_read category.
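
[Editor's note: a typical invocation for watching the queues during a
benchmark run might look like this; the pool name "tank" is a
placeholder.]

```shell
# Per-queue I/O breakdown, refreshed every 5 seconds;
# prefetches usually land in the asyncq_read columns
zpool iostat -q tank 5

# -l adds per-queue latency, which helps spot a saturated read queue
zpool iostat -ql tank 5
```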
>> -- richard
> 
> This is very interesting, thanks.
> What surprises me is that, based on iostat and zpool iostat, the pool always sees 128K-sized I/Os and a low util% value. Any clue about that?

1MiB / 8 = 128kiB
 -- richard
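
[Editor's note: the arithmetic behind that one-liner, assuming the
128K I/Os come from each 1 MiB record being split across 8 data disks;
the disk count is inferred from the reply, not stated in this excerpt.]

```shell
# 1 MiB record divided across 8 data disks -> per-disk I/O size in KiB
echo "$(( (1024 * 1024) / 8 / 1024 )) KiB"   # prints "128 KiB"
```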
