[zfs-discuss] zfs large pool - many checksum errors, no read or write errors

Badi' Abdul-Wahid abdulwahidc at gmail.com
Mon May 16 16:49:19 EDT 2016


Francois, here is an instance of the errors I get.


May 15 07:15:02 namo kernel: blk_update_request: I/O error, dev sda, sector 1930956344
May 15 07:15:02 namo kernel: ata1: EH complete
May 15 07:16:18 namo kernel: ata1.00: exception Emask 0x10 SAct 0x10000 SErr 0x280100 action 0x6 frozen
May 15 07:16:18 namo kernel: ata1.00: irq_stat 0x08000000, interface fatal error
May 15 07:16:18 namo kernel: ata1: SError: { UnrecovData 10B8B BadCRC }
May 15 07:16:18 namo kernel: ata1.00: failed command: READ FPDMA QUEUED
May 15 07:16:18 namo kernel: ata1.00: cmd 60/00:80:c8:58:23/01:00:74:00:00/40 tag 16 ncq 131072 in
                                      res 40/00:84:c8:58:23/00:00:74:00:00/40 Emask 0x10 (ATA bus error)
May 15 07:16:18 namo kernel: ata1.00: status: { DRDY }
May 15 07:16:18 namo kernel: ata1: hard resetting link

I see several instances of these until
ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
after which the errors subside.
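
For anyone who wants to check their own logs for the same pattern, here is a rough Python sketch that counts BadCRC SError reports per ATA port and records the last negotiated link speed. The regular expressions are written against the message shapes quoted above only; the function name summarize_ata_log is made up for illustration, and other kernel versions may word these lines differently.

```python
import re
from collections import Counter

# Patterns are assumptions based only on the log lines quoted in this message.
SERROR = re.compile(r"(ata\d+): SError: \{([^}]*)\}")
LINK_UP = re.compile(r"(ata\d+): SATA link up ([\d.]+) Gbps")


def summarize_ata_log(lines):
    """Count BadCRC SError reports per port and keep the last link speed seen."""
    crc_errors = Counter()
    link_speed = {}
    for line in lines:
        m = SERROR.search(line)
        if m:
            # The braces hold space-separated flag names, e.g. "UnrecovData 10B8B BadCRC".
            if "BadCRC" in m.group(2).split():
                crc_errors[m.group(1)] += 1
            continue
        m = LINK_UP.search(line)
        if m:
            link_speed[m.group(1)] = float(m.group(2))
    return crc_errors, link_speed
```

Feeding it the lines of a saved kernel log (for example the output of dmesg written to a file) should show whether the CRC errors cluster on one port and whether that port ended up at 3.0 Gbps.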

Looking at smartctl for the disk (ata-WDC_WD3001FFSX-68JNUN0_WD-WMC1F0E426P7):
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)
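
The same check can be scripted across several disks. A minimal sketch, assuming the "SATA Version is:" line has the shape shown above; the helper name link_downgraded is hypothetical, and the text passed in is whatever the smartctl information output contains for one disk:

```python
import re

# Assumed line shape: "SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)"
SATA_LINE = re.compile(
    r"SATA Version is:\s+.*?([\d.]+) Gb/s \(current: ([\d.]+) Gb/s\)"
)


def link_downgraded(smartctl_info):
    """Return True if the current link speed is below the drive's maximum.

    Returns None when no matching line is found, e.g. when the
    "(current: ...)" part is absent from the output.
    """
    m = SATA_LINE.search(smartctl_info)
    if m is None:
        return None
    max_speed, current_speed = float(m.group(1)), float(m.group(2))
    return current_speed < max_speed
```

A True result only says the link negotiated down; it does not say whether the cause is the disk, the cable, or the backplane.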



On Mon, May 16, 2016 at 12:36 PM, Badi' Abdul-Wahid <abdulwahidc at gmail.com>
wrote:

> Francois Stark <francois at postmasters.co.za> writes:
>
> > ... Thanks for the suggestion - but wouldn't I be getting read or write
> > errors?
> >
> > I am only getting ZFS checksum errors - no read or write errors. I also
> > don't see any errors in the kernel for the disks.
> >
> > Can you paste the kind of errors you have found with the WD disks?
>
> Sure, I'll send it this evening when I get a chance.
>
> >
> > Thanks
> >
> > ________________________________________
> > From: zfs-discuss [zfs-discuss-bounces at list.zfsonlinux.org] On Behalf
> > Of Badi' Abdul-Wahid
> > Sent: 16 May 2016 04:27 PM
> >
> > This looks similar to something I see on my system.
> > I've got a couple of the WD FFSX drives (same as yours but 7200rpm).
> > Do you see any errors in your kernel logs regarding these disks?
> > For me, libata reports bad sectors and I can always reproduce these
> > errors when doing a dd test or a scrub.
> > After a few minutes, the link speed gets downgraded from 6 to 3 Gbps and
> > the issue subsides.
> > I've been able to "hide" the problem by setting libata.force=3 in the
> > kernel parameters during boot.
> > Can you pull a couple out and connect them directly to the motherboard?
> > When I tested this, skipping the backplane board, the issue disappeared
> > as well.
> > At this point I'm in the process of replacing my WD Red drives.
> > My guess at this point is that the vibration is beyond what the disks
> > can handle, despite the manufacturer claims.
> >
> > FWIW, my pool is a mix of WD, HGST, and Seagate disks in 3 mirrors of
> > two disks.
> >
>
> --
> Badi' Abdul-Wahid
>



-- 

Badi' Abdul-Wahid