[zfs-discuss] zpool scrub repeatedly detects checksum errors

Achim Gottinger achim at ag-web.biz
Fri Sep 13 06:21:36 EDT 2013


On 13.09.2013 11:17, Gregor Kopka wrote:
>
> On 13.09.2013 04:37, Achim Gottinger wrote:
>> Hi,
>>
>> Recently I checked my zpool on Debian wheezy with a scrub and it
>> detected around 450 checksum errors; two of them remained
>> uncorrectable, resulting in two defective files, which I deleted.
>> The pool consists of a stripe of two 2TB drives plus 10GB of cache and
>> 512MB for logs.
>> Wheezy runs as a VM with 4 cores and 4GB RAM on top of Xen Cloud
>> Platform. The two 2TB drives are physically RAID1 arrays on the host,
>> on an Adaptec 6805E RAID controller. Because the VM is
>> limited to a maximum of six virtual disks, each with a maximum of
>> 2TB, I had to use the above approach to get a ~4TB pool inside the VM.
>> The pool is accessed via NFSv3 and Samba (4).
>> In the past I had a few kernel lockups due to ARC-related memory
>> issues; I ended up with a max ARC size of 1GB back then, and afterwards
>> it was stable.
>> Back on topic: after the scrub I checked both RAID1 arrays, running
>> verify and fix on the Adaptec controller, and they were both OK. The
>> physical drives themselves had no errors (SMART/hardware/parity),
>> just a few aborted-command warnings. I noted the aborted-command
>> values, cleared the warnings from the zpool and the discs, and reran a
>> scrub.
>> Once again it found about 450 checksum errors on each drive; this
>> time all of them were correctable, but how can that be? The
>> aborted-command counters on the physical discs involved did not change
>> during the test.
>> So now I wonder where these checksum errors come from when the
>> involved discs are intact and all previous errors could be fixed?
> In case the errors are in zfs metadata then they're correctable (since
> zfs keeps multiple copies of metadata).

Thank you for the reply, Gregor!

This is the output with the results of the second scrub, run after I had checked the involved RAID1 arrays and discs for errors.
The discs and RAID arrays are OK, the system has run flawlessly since the previous scrub, and I had cleared the errors before I ran the scrub again.
I wonder where these checksum errors come from, because I expected them to have been fixed by the first scrub and by my removing the two affected files.
I run zfs/spl version 0.6.2, btw.
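
For reference, the clear / scrub / check cycle described above can be sketched in Python roughly as follows. This is only an illustration, not the exact commands I typed: the helper names are made up, the pool name "zpool" is the one from this thread, and actually running it needs root and an imported pool.

```python
import subprocess


def scrub_cycle_commands(pool):
    """Return the zpool invocations for one clear/scrub/status cycle."""
    return [
        ["zpool", "clear", pool],         # reset the per-device error counters
        ["zpool", "scrub", pool],         # start a scrub; the command returns at once
        ["zpool", "status", "-v", pool],  # show scrub progress, counters, affected files
    ]


def run_scrub_cycle(pool):
    """Run the commands in order (hypothetical helper; needs root)."""
    for cmd in scrub_cycle_commands(pool):
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Print the commands instead of running them, so this is safe anywhere.
    for cmd in scrub_cycle_commands("zpool"):
        print(" ".join(cmd))
```

Since the scrub runs in the background, the status call right after it only shows the scrub in progress; the final counters are available once the scan line reports completion.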


  pool: zpool
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://zfsonlinux.org/msg/ZFS-8000-9P
  scan: scrub repaired 1,07M in 5h51m with 0 errors on Fri Sep 13 01:15:56 2013
config:

        NAME        STATE     READ WRITE CKSUM
        zpool       ONLINE       0     0     0
          xvde      ONLINE       0     0   446
          xvdf      ONLINE       0     0   459
        logs
          xvda3     ONLINE       0     0     0
        cache
          xvdd1     ONLINE       0     0     0

errors: No known data errors
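
To compare the counters between scrubs without eyeballing the table, something like the following rough Python sketch could pull the CKSUM column out of saved `zpool status` text. The function name is made up, and the parsing is an assumption based on the layout shown above; counts that zpool prints in abbreviated form (I believe large values can appear like 1.2K) would simply be skipped.

```python
def parse_cksum_counts(status_text):
    """Map device name -> CKSUM count from the config table of `zpool status`."""
    counts = {}
    in_config = False
    for line in status_text.splitlines():
        fields = line.split()
        if fields[:1] == ["NAME"]:
            in_config = True          # header row of the device table
            continue
        if not in_config:
            continue
        if fields and fields[0].startswith("errors:"):
            break                     # end of the device table
        # Device rows look like: NAME STATE READ WRITE CKSUM
        if len(fields) >= 5 and all(f.isdigit() for f in fields[2:5]):
            counts[fields[0]] = int(fields[4])
    return counts


SAMPLE = """\
NAME        STATE     READ WRITE CKSUM
zpool       ONLINE       0     0     0
   xvde      ONLINE       0     0   446
   xvdf      ONLINE       0     0   459
logs
   xvda3     ONLINE       0     0     0
cache
   xvdd1     ONLINE       0     0     0
errors: No known data errors
"""

counts = parse_cksum_counts(SAMPLE)
noisy = {dev: n for dev, n in counts.items() if n > 0}
print(noisy)  # {'xvde': 446, 'xvdf': 459}
```

Group rows such as "logs" and "cache" have no counters and are ignored, so only real devices end up in the result.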

>
> Having the RAID find nothing is quite normal, since it'll happily feed 
> you garbage because it doesn't know anything about the data on disk.
>
> With the 6 drives limit on the VM it would be better to hand the 4 
> data disks to the VM (since it would fit the limitation, in case the 
> cache and log drives are from one SSD you could hand it completely and 
> create the partitions inside the VM) and let ZFS handle them directly, 
> then you would have had better info which disk failed (where the 
> checksum errors come from) and also the ability of zfs to repair them 
> from the other side of the mirror.
Indeed that would be better, but one drive slot is occupied by the system, 
one by the DVD, and I need another free slot for drives I mount from other 
VMs from time to time, so I had to go the RAID0 route.
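
As I understand Gregor's point about repairing from the other side of the mirror, the idea can be sketched in Python like this. It is a toy model, not how ZFS is implemented: SHA-256 stands in for whatever checksum the pool actually uses, and the function names are invented for illustration.

```python
import hashlib


def checksum(block):
    """Stand-in checksum (assumption: SHA-256, chosen only for the demo)."""
    return hashlib.sha256(block).hexdigest()


def mirror_read(copies, expected):
    """Return (good_block, repaired_count), fixing bad copies in place.

    copies:   list of bytes, one entry per mirror side (mutated on repair)
    expected: checksum stored separately from the data itself
    """
    good = next((c for c in copies if checksum(c) == expected), None)
    if good is None:
        # No side matches: with a single copy (a plain stripe) this is
        # what an uncorrectable error looks like.
        raise IOError("all copies fail the checksum: uncorrectable")
    repaired = 0
    for i, c in enumerate(copies):
        if checksum(c) != expected:
            copies[i] = good          # rewrite the bad side from the good one
            repaired += 1
    return good, repaired


data = b"important block"
stored = checksum(data)
sides = [b"importent block", data]    # first side silently corrupted
block, repaired = mirror_read(sides, stored)
print(block == data, repaired)        # True 1
```

A RAID1 controller has no separately stored checksum, so when its two sides differ it cannot tell which one is right; in my setup ZFS sees only one copy per block of file data, which matches the two uncorrectable files from the first scrub.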
>
> Gregor
>
> To unsubscribe from this group and stop receiving emails from it, send 
> an email to zfs-discuss+unsubscribe at zfsonlinux.org.
