[zfs-discuss] zfs io stats

Richard Elling richard.elling at richardelling.com
Mon Dec 3 00:53:11 EST 2018

> On Dec 2, 2018, at 3:30 PM, Chris Siebenmann <cks at cs.toronto.edu> wrote:
> I wrote:
>> Is there any way to pull out the average wait and run times for pool
>> IOs? It feels like there probably is, but I can't think clearly enough
>> to spot how to compute it.
> Having thought about this more, I believe that there is.
> Here is the derivation, so people can tell me if I got something wrong
> or overlooked something.
> In general, if we have a utilization time, the average
> queue size, and the number of requests, the average request time is[*]:
> 	util / (nrequests / aveq)
> or, simplified:
> 	(util * aveq) / nrequests
> For ZFS pool kstats, nrequests is (reads+writes). 'util' is rtime or
> wtime, depending on run or wait.  The average queue size is 'rlentime /
> rtime' (and similarly for wait), which means that this simplifies down
> to:
> 	(rlentime) / (reads+writes)
> 	(wlentime) / (reads+writes)
> or the total sum, which is really the interesting number:
> 	(rlentime + wlentime) / (reads+writes)
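
A minimal sketch of the simplified formulas above, applied to the difference between two samples of the cumulative pool kstat counters. The function name `avg_io_times` and the dict layout are hypothetical, chosen only for illustration; the counter names (reads, writes, rlentime, wlentime) are the ones used in the quoted message. The results are in whatever time unit rlentime and wlentime use (assumed here, not confirmed by the message, to be nanoseconds).

```python
def avg_io_times(prev, curr):
    """Average wait, run, and total time per I/O between two samples.

    prev and curr are dicts of cumulative counters: reads, writes,
    rlentime, wlentime. Results are in the unit of the *lentime counters
    (assumed to be nanoseconds).
    """
    nrequests = (curr["reads"] - prev["reads"]) + (curr["writes"] - prev["writes"])
    if nrequests <= 0:
        # No I/O completed in the interval, so an average is undefined.
        return None
    run = (curr["rlentime"] - prev["rlentime"]) / nrequests
    wait = (curr["wlentime"] - prev["wlentime"]) / nrequests
    return {"wait": wait, "run": run, "total": wait + run}


# Made-up sample values, for illustration only.
prev = {"reads": 1000, "writes": 500,
        "rlentime": 4_000_000_000, "wlentime": 1_000_000_000}
curr = {"reads": 1600, "writes": 900,
        "rlentime": 6_000_000_000, "wlentime": 1_500_000_000}
result = avg_io_times(prev, curr)
```

Working on deltas rather than raw counters matters because the kstat values accumulate from pool import, so the raw ratio would be a lifetime average rather than the average over the sampling interval.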

In general, yes. But the overall number is actually not as useful for identifying
bottlenecks as the detail. There is some open research work needed to
understand how the ZIO scheduler is tuned or tunable. Also, for disks
there are multiple queues in the data path and congestion is not easily 

> This appears to generate plausible numbers when I graph it in Prometheus
> while putting test loads on a scratch pool.


> 	- cks
> [*: If the disk is in use for a second and over that second we had
>    10 requests that were issued back to back for a queue depth of 1,
>    clearly each request took 1/10th of a second to complete (on average,
>    etc). If we had 20 requests with a queue depth of one, each request
>    must have taken 1/20th of a second. If we had 20 requests with a queue
>    depth of 2, they each must have taken 1/10th of a second to complete.
>    Or at least this sounds convincing to me.
> ]
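
The three cases in the footnote can be checked with a few lines of arithmetic. This is only a sketch of the quoted formula, `(util * aveq) / nrequests`; the function name `avg_request_time` is hypothetical, and it says nothing about whether the formula's assumptions hold for a real device.

```python
def avg_request_time(util, aveq, nrequests):
    """Average request time from busy time, average queue depth, and count."""
    return (util * aveq) / nrequests


def avg_request_time_unsimplified(util, aveq, nrequests):
    """The same quantity in the first form given: util / (nrequests / aveq)."""
    return util / (nrequests / aveq)


# The footnote's cases: one second of busy time in each.
case_a = avg_request_time(1.0, 1, 10)   # 10 requests, queue depth 1
case_b = avg_request_time(1.0, 1, 20)   # 20 requests, queue depth 1
case_c = avg_request_time(1.0, 2, 20)   # 20 requests, queue depth 2
```

With these inputs the results come out to 1/10th, 1/20th, and 1/10th of a second, matching the footnote.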

Except modern disks are not M/M/1 queues, so you can forget all of that
thinking. For example, an SSD that can do a rate of 750,000 IOPS can also
have a minimum response time of 100usec.
 -- richard
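
One way to see why the SSD example does not fit a single-server model is Little's law (average number in the system = throughput * average response time). The sketch below uses only the two figures from the message; the function names are hypothetical, and treating 100usec as the response time at full rate is an assumption made for illustration.

```python
def requests_in_flight(iops, response_time_s):
    """Little's law: average concurrency = throughput * response time."""
    return iops * response_time_s


def serial_service_time_s(iops):
    """Per-request time if one server handled requests strictly one at a time."""
    return 1.0 / iops


iops = 750_000
response_time_s = 100e-6  # 100usec, the figure from the message

concurrency = requests_in_flight(iops, response_time_s)
serial_usec = serial_service_time_s(iops) * 1e6
```

Under these assumptions the device would need roughly 75 requests in flight at once, whereas a strictly serial server at that rate would have to finish each request in about 1.3usec, far below the 100usec response time.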
