[zfs-discuss] Slow read performance
richard.elling at richardelling.com
Sun Apr 1 13:05:02 EDT 2018
There is a lot of good info in this thread already, but I'd like to draw your attention to prefetching.
> On Mar 29, 2018, at 5:43 AM, Alex Vodeyko via zfs-discuss <zfs-discuss at list.zfsonlinux.org> wrote:
> I use "recordsize=1M" because we have big files and sequential I/O.
> 1) iozone (write = 2.7GB/s, read = 1GB/s)
> "arcstat" during reads:
> # arcstat.py 5 (100% pm and 50+% miss)
> time      read  miss  miss%  dmis  dm%  pmis  pm%  mmis  mm%  arcsz  c
> 14:13:17  1.9K  1.0K  55     1     0    1.0K  100  2     42   62G    62G
> 14:13:22  2.1K  1.1K  54     0     0    1.1K  100  0     13   63G    62G
> 14:13:27  1.8K  980   54     1     0    979   100  2     42   62G    62G
> 14:13:32  1.6K  880   55     1     0    879   100  2     40   62G    62G
"pmis" is the number of prefetch misses, counting both data and metadata.
"pm%" is the prefetch miss percentage: prefetch misses relative to the total number
of prefetch accesses (not relative to all reads, as the rows above show).
First, check that prefetching is enabled (it is by default): zfs_prefetch_disable = 0.
For a sequential read operation, we expect the prefetcher to be prefetching,
and thus do not expect pm%=100%.
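
To put a number on how much of the miss traffic is prefetch misses, here is a minimal Python sketch that works on one row of the quoted arcstat output. It assumes the column order of the quoted header and that the "K" suffix on counts means thousands; the function names are hypothetical and not part of arcstat.

```python
# Sketch: what fraction of all ARC misses in one arcstat row were prefetch misses.
# Assumption: count suffixes are decimal (K = 1000), as is usual for arcstat counters.
SUFFIXES = {"K": 1e3, "M": 1e6, "G": 1e9}

def parse_count(text):
    """Convert an abbreviated count such as '1.9K' to a float."""
    if text and text[-1] in SUFFIXES:
        return float(text[:-1]) * SUFFIXES[text[-1]]
    return float(text)

def prefetch_share_of_misses(line):
    """Return pmis / miss for one row laid out like the quoted output."""
    fields = line.split()
    # Column order from the quoted header:
    # time read miss miss% dmis dm% pmis pm% mmis mm% arcsz c
    miss = parse_count(fields[2])
    pmis = parse_count(fields[6])
    return pmis / miss if miss else 0.0

row = "14:13:27 1.8K 980 54 1 0 979 100 2 42 62G 62G"
print(round(prefetch_share_of_misses(row), 3))
```

For the 14:13:27 row, nearly every miss is a prefetch miss (979 of 980), which is why the prefetcher is the first thing to look at.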
The zfetch_array_rd_sz tunable parameter is a limit on the size of the blocks that are
prefetched. Basically, if a block is larger than zfetch_array_rd_sz, then it is not prefetched.
However, the default zfetch_array_rd_sz = 1,048,576 bytes (1 MiB), so it should be fine with
your recordsize=1M. Be sure to check its value.
A related tunable is zfetch_max_distance (default = 8 MiB), the maximum number of bytes
to prefetch per stream. This might be too small for recordsize=1M, since 8 MiB is only
eight 1 MiB records ahead of the reader.
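
A minimal Python sketch of the checks described above, assuming ZFS on Linux exposes the three tunables as files under /sys/module/zfs/parameters. The helper names (read_param, check_prefetch) are hypothetical, and the block size is passed in by hand rather than read from the dataset.

```python
import os

# Assumed location of the ZFS module parameters on Linux.
PARAM_DIR = "/sys/module/zfs/parameters"

def read_param(name, param_dir=PARAM_DIR):
    """Read one integer module parameter; return None if it cannot be read."""
    try:
        with open(os.path.join(param_dir, name)) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None

def check_prefetch(block_size, param_dir=PARAM_DIR):
    """Return a list of findings for a given block size in bytes."""
    findings = []
    disabled = read_param("zfs_prefetch_disable", param_dir)
    rd_sz = read_param("zfetch_array_rd_sz", param_dir)
    max_dist = read_param("zfetch_max_distance", param_dir)
    if disabled:
        findings.append("prefetch is disabled (zfs_prefetch_disable != 0)")
    if rd_sz is not None and block_size > rd_sz:
        findings.append("block size exceeds zfetch_array_rd_sz; blocks are not prefetched")
    if max_dist is not None:
        findings.append("zfetch_max_distance covers %d blocks per stream"
                        % (max_dist // block_size))
    return findings

# Example: check against a 1 MiB block size (prints nothing if ZFS is not loaded).
for finding in check_prefetch(1 << 20):
    print(finding)
```

With the defaults quoted above and a 1 MiB block size, the only finding would be that zfetch_max_distance covers 8 blocks per stream.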
To help visualize the zfetch activity, I usually do the data collection with Prometheus'
node_exporter or influxdb's telegraf. But if you are a CLI fan, then I pushed a Linux
version of zfetchstat to
> "top" shows only 8 "z_rd_int" processes during reads (and only one
> "z_rd_int" running), while there were 32 running z_wr_iss processes
> during writes.
This could be another clue about prefetching not being enabled or not working as
desired. However, in my experience, it is better to observe the detailed back-end I/O
distribution and classification with "zpool iostat -q" where prefetches are often, but
not always, in the asyncq_read category.