[zfs-discuss] Sudden write() + fsync() performance drop

Miguel Wang m.wang at salesforce.com
Thu Feb 4 12:12:11 EST 2016


Gregor -

zpool status -v is clean. After one day's "rest", one "bad" server comes
back with its normal fsync() performance [7 MB/s, as in my initial email].
I also found that on some servers the pool has unbalanced disk use, as
shown below; however, unbalanced servers can have either good or bad
fsync() performance, and it is the fsync() performance that directly
tracks the application performance. What fsync() performance depends on
is still a mystery.
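
For reference, the kind of workload I mean is roughly a write()+fsync()
loop like the sketch below (just an illustration, not our actual
benchmark; the test file path is only an example):

#!/usr/bin/env python3
# Rough sketch of a write()+fsync() loop; not the actual benchmark.
# The test file path below is only an example.
import os, time

BLOCK = 16 * 1024                      # 16K writes, InnoDB page sized
COUNT = 2000                           # number of write()+fsync() pairs
buf = os.urandom(BLOCK)

path = "/local0/mysqldata/fsync_test.bin"
fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
start = time.time()
for _ in range(COUNT):
    os.write(fd, buf)
    os.fsync(fd)
elapsed = time.time() - start
os.close(fd)
os.unlink(path)

print("%.2f MB/s over %d fsyncs" % (BLOCK * COUNT / elapsed / 1e6, COUNT))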

I wanted to believe that the number of disks working at the same time is
directly related to the application performance, which would be an easy
explanation, but that turns out not to be the case. I do not think the
unbalanced disk use is a good thing, but it looks like a separate issue.

               capacity     operations    bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
local0      5.23T   592G  1.73K  2.24K  18.6M  20.8M
  xvdb       660G  83.7G    171      1  1.77M  5.33K
  xvdc       660G  84.0G    198      0  2.09M      0
  xvdd       662G  81.9G    179      0  1.84M      0
  xvde       661G  83.3G    177      0  1.81M      0
  xvdf       661G  83.2G    190      0  2.01M      0
  xvdg       736G  7.60G    499  2.24K  5.35M  20.8M
  xvdh       659G  85.0G    167      1  1.73M  5.33K
  xvdi       661G  83.2G    188      1  1.94M  5.33K
----------  -----  -----  -----  -----  -----  -----
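
To double-check the "how many disks are working at the same time" idea,
something like the rough sketch below (untested; the pool name local0 and
the xvd device prefix are from our setup) could count how many leaf vdevs
actually take writes in each zpool iostat interval:

#!/usr/bin/env python3
# Rough sketch (untested): count how many leaf vdevs take writes in each
# 5-second interval of `zpool iostat -v local0 5`.  Note that the first
# report zpool iostat prints is the average since import.
import subprocess

def to_num(s):
    """Parse zpool iostat values such as '2.24K', '20.8M' or '0'."""
    mult = {"K": 1e3, "M": 1e6, "G": 1e9, "T": 1e12}
    return float(s[:-1]) * mult[s[-1]] if s[-1] in mult else float(s)

proc = subprocess.Popen(["zpool", "iostat", "-v", "local0", "5"],
                        stdout=subprocess.PIPE, universal_newlines=True)
busy = seen = 0
for line in proc.stdout:
    fields = line.split()
    if len(fields) == 7 and fields[0].startswith("xvd"):
        seen += 1
        if to_num(fields[4]) > 0:      # column 5 = write operations/s
            busy += 1
    elif line.startswith("----") and seen:
        print("%d of %d disks had writes in this interval" % (busy, seen))
        busy = seen = 0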


On Thu, Feb 4, 2016 at 3:24 AM, Gregor Kopka <zfs-discuss at kopka.net> wrote:

>
>
> On 03.02.2016 at 19:03, Miguel Wang via zfs-discuss wrote:
>
>> The good:
>>     local0/mysqldata/data  recordsize  128K  local
>>
>> The bad:
>>     local0/mysqldata/data  recordsize  16K   local
>>
>> NB: The "bad" server used to have a 128K recordsize; that is why it has a
>> better compression ratio and more free space. The "good" server had a 16K
>> recordsize on the volume from the start, when we rebuilt the server.
>
> Now I am confused.
> So you set the good server from 16k to 128k, and the bad the other way
> around?
>
> Apart from that:
> Anything in dmesg, smartctl or zpool status -v indicating that a drive
> might have problems (which could be only in certain areas that are unused
> after the restore from backup)?
> Enough memory in the systems? Maybe the bad one was swapping when the
> slowdown happened (perhaps because the ARC grew too large)?
>
> Gregor
>

