[zfs-discuss] Slow read performance

Gionatan Danti g.danti at assyoma.it
Mon Apr 2 03:39:31 EDT 2018


Il 01-04-2018 19:05 Richard Elling via zfs-discuss ha scritto:
> there is a lot of good info in this thread already, but I'd like to
> draw your attention to prefetching...
> 
>> On Mar 29, 2018, at 5:43 AM, Alex Vodeyko via zfs-discuss
>> <zfs-discuss at list.zfsonlinux.org> wrote:
>  ...
> 
>> I use "recordsize=1M" because we have big files and sequential I/O.
>  ...
> 
>> 1) iozone (write = 2.7GB/s, read = 1GB/s)
>  ...
> 
>> "arcstat" during reads:
>> # arcstat.py 5   (100% pm% and 50+% miss%)
>>     time  read  miss  miss%  dmis  dm%  pmis  pm%  mmis  mm%  arcsz    c
>> 14:13:17  1.9K  1.0K     55     1    0  1.0K  100     2   42    62G  62G
>> 14:13:22  2.1K  1.1K     54     0    0  1.1K  100     0   13    63G  62G
>> 14:13:27  1.8K   980     54     1    0   979  100     2   42    62G  62G
>> 14:13:32  1.6K   880     55     1    0   879  100     2   40    62G  62G
> 
> "pmis" is the number of prefetch misses (both data and metadata).
> "pm%" is the ratio of prefetch misses to total accesses.
> 
> First, check that prefetching is enabled (it is by default)
> zfs_prefetch_disable = 0
> 
> For a sequential read workload, we expect the prefetcher to be
> prefetching, and thus do not expect pm% = 100%.
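
For reference, on ZFS on Linux the toggle can be checked at runtime via the
module parameters (the sysfs path below is the usual location; adjust if your
build differs):

```shell
# Read the prefetch toggle; 0 means prefetching is enabled (the default).
cat /sys/module/zfs/parameters/zfs_prefetch_disable

# Re-enable it at runtime if it was turned off (requires root):
# echo 0 > /sys/module/zfs/parameters/zfs_prefetch_disable
```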
> 
> The zfetch_array_rd_sz tunable limits the size of blocks that are
> prefetched: if a block is larger than zfetch_array_rd_sz, it is not
> prefetched. However, the default is 1,048,576 bytes (1 MiB), so it
> should be fine with recordsize=1M. Be sure to check its value.
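
If I read that rule correctly, the gate is a simple size comparison; a quick
sketch of it (my interpretation, not the actual kernel code):

```shell
#!/bin/sh
# A 1 MiB record is not larger than the default zfetch_array_rd_sz,
# so it should still be eligible for prefetch.
zfetch_array_rd_sz=1048576        # default, in bytes
recordsize=$((1024 * 1024))       # recordsize=1M

if [ "$recordsize" -le "$zfetch_array_rd_sz" ]; then
    echo "prefetch eligible"
else
    echo "not prefetched"
fi
```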
> 
> A related tunable is zfetch_max_distance (default 8 MiB), the maximum
> number of bytes to prefetch per stream. This might be too small for
> recordsize=1M.
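
Back-of-the-envelope (my arithmetic, using the defaults above): with an
8 MiB distance and 1 MiB records, the prefetcher can run at most 8 records
ahead per stream, which is not much readahead depth:

```shell
#!/bin/sh
# Records of readahead per stream = zfetch_max_distance / recordsize.
zfetch_max_distance=$((8 * 1024 * 1024))    # 8 MiB default
recordsize=$((1024 * 1024))                 # recordsize=1M
echo $((zfetch_max_distance / recordsize))  # prints 8
```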
> 
> To help visualize the zfetch activity, I usually do the data
> collection with Prometheus'
> node_exporter or influxdb's telegraf. But if you are a CLI fan, then I
> pushed a Linux
> version of zfetchstat to
> https://github.com/richardelling/zfs-linux-tools
> 
>> "top" shows only 8 "z_rd_int" processes during reads (and only one
>> "z_rd_int" running), while there were 32 running z_wr_iss processes
>> during writes.
> 
> This could be another clue about prefetching not being enabled or not
> working as
> desired. However, in my experience, it is better to observe the
> detailed back-end I/O
> distribution and classification with "zpool iostat -q" where
> prefetches are often, but
> not always, in the asyncq_read category.
>  -- richard
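
I will also try watching the queues as suggested; something like the
following (the pool name is just an example):

```shell
# Sample queue-class statistics every 5 seconds; prefetch reads usually
# land in the asyncq_read columns.
zpool iostat -q tank 5
```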

This is very interesting, thanks.
What surprises me is that, based on iostat and zpool iostat, my pool 
always sees 128K-sized I/Os and a low util% value. Any clue about that?

Regards.

-- 
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti at assyoma.it - info at assyoma.it
GPG public key ID: FF5F32A8
