[zfs-discuss] Replacing disk in RAIDZ2 pool with autoreplace=on

mabi mabi at protonmail.ch
Mon Nov 12 08:53:02 EST 2018


I am planning to replace a single disk in my 12-disk RAIDZ2 ZFS pool. The pool has the autoreplace=on property set. Does this mean I can simply pull the disk and insert a replacement into the same slot, without entering a single command on the server, and the resilver will start automatically?
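Before pulling the disk I plan to verify the prerequisites, roughly like this (assuming the usual zfs-zed systemd unit on ZFS on Linux; adjust for your distro):

```shell
# Confirm autoreplace is actually on for the pool
zpool get autoreplace data

# On ZFS on Linux, autoreplace is acted on by the ZFS Event Daemon (ZED),
# so it must be running for the hands-off replacement to work
systemctl status zfs-zed
```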

I was also wondering about the ashift=12 value I use on this pool. Will the replacement disk automatically inherit ashift=12? Based on the zpool man page this does not seem to be the case when running "zpool replace" manually.
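If the automatic path does not work out, my understanding (please correct me if wrong) is that a manual replace can pin the ashift explicitly, since "zpool replace" accepts -o ashift=... on ZFS on Linux. A sketch, with a placeholder for the new disk's by-id path:

```shell
# Manual fallback: replace the failing disk in place, forcing the
# pool's ashift so the new vdev does not end up with an auto-detected
# value that differs from the rest of the pool.
# NEW-DISK-ID is a placeholder for the replacement disk's /dev/disk/by-id name.
zpool replace -o ashift=12 data sda /dev/disk/by-id/NEW-DISK-ID

# Then watch the resilver progress
zpool status -v data
```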

For the sake of completeness: I am replacing this 6 TB SAS disk because it shows signs of failure after this month's pool scrub, as you can see below.

kernel log:
sd 0:0:0:0: [sda] tag#2 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
sd 0:0:0:0: [sda] tag#2 Sense Key : Medium Error [current] [descriptor]
sd 0:0:0:0: [sda] tag#2 Add. Sense: Unrecovered read error
sd 0:0:0:0: [sda] tag#2 CDB: Read(16) 88 00 00 00 00 01 16 20 3b 38 00 00 00 f8 00 00
blk_update_request: critical medium error, dev sda, sector 4666178576

relevant disk from "zpool status":
  pool: data
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
	attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
	using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://zfsonlinux.org/msg/ZFS-8000-9P
  scan: scrub repaired 108K in 21h28m with 0 errors on Sun Nov 11 21:52:40 2018
	NAME                                                  STATE     READ WRITE CKSUM
	data                                                  ONLINE       0     0     0
	  raidz2-0                                            ONLINE       0     0     0
	    sda                                               ONLINE      10     0     0

and finally relevant output from "smartctl -a /dev/sda" after a longtest:

SMART Self-test log
Num  Test              Status                 segment  LifeTime  LBA_first_err [SK ASC ASQ]
     Description                              number   (hours)
# 1  Background long   Failed in segment -->       -   25535        4666178576 [0x3 0x11 0x0]
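Since the ashift question really comes down to the replacement disk's sector size, I will also check the new disk before inserting it, something like this (/dev/sdX is a placeholder for the new device):

```shell
# Physical/logical sector sizes of the replacement disk; a 4096-byte
# physical sector matches ashift=12 (2^12 = 4096)
lsblk -o NAME,PHY-SEC,LOG-SEC /dev/sdX

# smartctl reports the block sizes in its information section as well
smartctl -i /dev/sdX
```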

For those who are curious, this is a nearly four-year-old Seagate Enterprise ST6000NM0034, which carries a 5-year warranty.
