zpool scrub speed

Gunnar Beutner gunnar at beutner.name
Wed May 18 14:40:59 EDT 2011


I've seen something similar here with several metadata-heavy pools (e.g.
50-100M files) on OpenSolaris/FreeBSD and now Linux. The scrub seems to
take forever (about 1-5 MB/s - not quite as bad as what you're seeing)
for the first couple of percent. During that stage the scrub seems to
be doing mostly random I/O.

However, after about 5-10% the scrub runs at full speed (around
100 MB/s per device). So maybe it's just that - or maybe it's something
else entirely. :)
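
If you want to see whether your pool follows the same pattern, something
like this rough sketch (it only assumes the pool is called "wtf", as in
your status output) logs the reported scrub rate so you can tell when it
picks up; "zpool iostat -v wtf 5" is also handy for watching per-device
throughput in the meantime:

#!/bin/bash
# Log the scrub progress line every 5 minutes; the reported rate should
# jump noticeably once the scrub gets past the metadata-heavy phase.
while true; do
    date
    zpool status wtf | grep scanned
    sleep 300
done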

Regards,
Gunnar

On 16.05.2011 23:12, Nils Bausch wrote:
> Hi Steve,
> 
> Thanks for the reply. The iostat was taken right after I typed
> zpool status - which is at the top of my post, showing a mere 852K/s.
> Posting my zpool history would be far too long, as I use rolling
> backup snapshots per hour/day/week/month, e.g.:
> 
> #!/bin/bash
> zfs destroy -r wtf@7daysago > /dev/null 2>&1
> zfs rename -r wtf@6daysago @7daysago > /dev/null 2>&1
> zfs rename -r wtf@5daysago @6daysago > /dev/null 2>&1
> zfs rename -r wtf@4daysago @5daysago > /dev/null 2>&1
> zfs rename -r wtf@3daysago @4daysago > /dev/null 2>&1
> zfs rename -r wtf@2daysago @3daysago > /dev/null 2>&1
> zfs rename -r wtf@yesterday @2daysago > /dev/null 2>&1
> zfs snapshot -r wtf@yesterday
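> 
> A loop-based version of the same daily rotation (just a sketch that
> should do the same thing) would be:
> 
> #!/bin/bash
> # Rotate the daily snapshots: drop the oldest, shift the rest back by
> # one day, then take a fresh @yesterday snapshot.
> zfs destroy -r wtf@7daysago > /dev/null 2>&1
> for i in 6 5 4 3 2; do
>     zfs rename -r wtf@${i}daysago @$((i+1))daysago > /dev/null 2>&1
> done
> zfs rename -r wtf@yesterday @2daysago > /dev/null 2>&1
> zfs snapshot -r wtf@yesterday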
> 
> The problem I face with this scrub speed is that I won't be able to
> verify my running pool, as the scrub slows to a crawl; I stopped it
> at around 120 KB/s today.
> 
> Thanks
> Nils
> 
> On May 16, 4:22 pm, "Steve Costaras" <stev... at chaven.com> wrote:
>> From what point(s) in the scrub process were the iostats taken?
>>
>> From their output it seems that you have very small request sizes (~22K, ~3K, and ~13K across the three samples), with sda showing largely different results. I would like to see a 'zpool status' and a 'zpool history wtf'.
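>>
>> i.e. something like this (assuming the pool is still imported as wtf;
>> tail keeps the history output manageable):
>>
>> zpool status -v wtf
>> zpool history wtf | tail -n 100
>>
>> Running 'zpool iostat -v wtf 5' while the scrub is going would also
>> show per-vdev throughput.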
>>
>> At first glance, though, this looks like a highly fragmented pool that is just thrashing your drives like crazy.
>>
>>
>>
>> -----Original Message-----
>> From: Nils Bausch [mailto:nils.bau... at googlemail.com]
>> Sent: Monday, May 16, 2011 09:13 AM
>> To: 'zfs-discuss'
>> Subject: Re: zpool scrub speed
>>
>> Hi,
>>
>> I have started a scrub on my running raidz1 pool and am seeing very
>> slow speeds. Below are my observations:
>>
>> FordPrefect ~ # zpool status
>>   pool: wtf
>>  state: ONLINE
>>   scan: scrub in progress since Mon May 16 10:46:54 2011
>>     12.4G scanned out of 3.23T at 852K/s, (scan is slow, no estimated time)
>>     0 repaired, 0.38% done
>> config:
>>
>>         NAME                                    STATE     READ WRITE CKSUM
>>         wtf                                     ONLINE       0     0     0
>>           raidz1-0                              ONLINE       0     0     0
>>             ata-SAMSUNG_HD154UI_S1XWJ1KSB33767  ONLINE       0     0     0
>>             ata-SAMSUNG_HD154UI_S1XWJ1KSB33765  ONLINE       0     0     0
>>             ata-SAMSUNG_HD154UI_S1XWJ1KSC11001  ONLINE       0     0     0
>>             ata-SAMSUNG_HD154UI_S1XWJ1MSC01749  ONLINE       0     0     0
>>
>> errors: No known data errors
>>
>> FordPrefect ~ # iostat -x 5
>> Linux 2.6.37-gentoo-r4 (FordPrefect) 05/16/11 _x86_64_ (8 CPU)
>>
>> avg-cpu: %user %nice %system %iowait %steal %idle
>> 0.80 0.24 1.36 0.04 0.00 97.55
>>
>> Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
>> sdb 0.32 2.28 2.59 12.43 101.01 221.45 42.94 0.30 19.80 76.13 8.04 3.41 5.12
>> sdc 0.31 2.12 2.57 11.78 100.73 219.64 44.66 0.42 29.12 104.31 12.72 4.30 6.16
>> sdd 0.30 2.14 2.60 12.33 100.73 219.58 42.92 0.20 13.30 48.05 5.98 2.88 4.30
>> sde 0.32 2.26 2.59 12.47 101.01 221.45 42.82 0.31 20.55 80.37 8.11 3.44 5.18
>> sda 0.05 6.94 0.15 3.95 1.71 43.58 22.14 0.11 26.11 2.46 26.99 1.15 0.47
>>
>> avg-cpu: %user %nice %system %iowait %steal %idle
>> 1.02 9.97 2.37 0.12 0.00 86.51
>>
>> Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
>> sdb 0.00 0.40 40.80 2.20 80.90 3.90 3.94 9.29 205.62 216.61 1.91 22.32 95.98
>> sdc 0.00 0.40 31.60 2.40 71.30 3.90 4.42 9.42 263.42 283.27 2.08 27.99 95.18
>> sdd 0.00 0.40 47.00 2.60 73.70 4.20 3.14 9.01 178.75 184.54 73.92 19.96 98.98
>> sde 0.00 0.40 42.40 2.40 84.80 3.90 3.96 9.33 200.80 212.06 2.00 21.56 96.60
>> sda 0.00 67.20 0.00 4.80 0.00 128.00 53.33 2.77 164.62 0.00 164.62 35.50 17.04
>>
>> avg-cpu: %user %nice %system %iowait %steal %idle
>> 0.10 2.03 1.08 7.98 0.00 88.81
>>
>> Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
>> sdb 0.00 3.00 18.20 38.00 40.00 314.80 12.63 5.25 98.49 286.62 8.39 10.73 60.28
>> sdc 0.00 1.20 26.60 34.80 47.40 299.40 11.30 9.15 153.42 341.22 9.88 15.83 97.22
>> sdd 0.00 2.60 13.60 34.40 36.70 364.80 16.73 1.13 26.65 71.46 8.94 6.60 31.70
>> sde 0.00 2.40 17.20 38.60 28.70 358.30 13.87 4.38 84.29 256.15 7.70 9.44 52.70
>> sda 0.00 45.20 0.00 13.20 0.00 393.60 59.64 6.94 675.68 0.00 675.68 35.41 46.74
>>
>> FordPrefect ~ # zfs list
>> NAME              USED  AVAIL  REFER  MOUNTPOINT
>> wtf              2.43T  1.58T   146G  /wtf
>> wtf/ejic          115M  1.58T   115M  /wtf/ejic
>> wtf/nils          394G  1.58T  56.3G  /wtf/nils
>> wtf/nils/backup   137G  1.58T   134G  /wtf/nils/backup
>> wtf/nils/tm       179G  1.58T   170G  /wtf/nils/tm
>> wtf/shared       1.79T  1.58T  1.75T  /wtf/shared
>> wtf/vmuser        116G  1.58T  96.8G  /wtf/vmuser
>>
>> FordPrefect ~ # zpool list
>> NAME   SIZE  ALLOC   FREE  CAP  DEDUP  HEALTH  ALTROOT
>> wtf   5.44T  3.23T  2.20T  59%  1.03x  ONLINE  -
>>
>> My setup was made with OpenSolaris and I have carried it over to
>> zfs-fuse and now to zfsonlinux, running the latest zpool version 28.
>> Back with zfs-fuse I used dedup, but disabled it because it was a
>> memory hog and performed poorly under zfsonlinux. While the scrub is
>> running I am not doing anything resource-hungry on the system, so I am
>> a bit stumped by the slow performance - even the zpool status output
>> says the scan is slow. The scrub started at around 4 MB/s and has been
>> slowing down ever since. dmesg does not show any kernel errors, so
>> nothing seems to have crashed - yet. Any ideas why this is happening?
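>>
>> In case the leftover dedup data matters here: one way I know of to
>> check whether deduplicated blocks are still on the pool (just a rough
>> check - 'zpool list' shows the pool-wide dedup ratio, and as far as I
>> know 'zdb -DD' prints the dedup table histogram, though zdb output on
>> a live pool may not be fully consistent):
>>
>> zpool list wtf
>> zdb -DD wtf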
>>
>> Nils


-- 
Gunnar Beutner
Oberhäuserstrasse 167
91522 Ansbach

E-Mail: gunnar at beutner.name
Mobiltelefon: 0171 95 818 49


