[zfs-discuss] Correct/best procedure to replace a dying (but still working) disk ?

Turbo Fredriksson turbo at bayour.com
Sat Aug 10 15:02:07 EDT 2013

On Aug 10, 2013, at 7:46 AM, Joachim Grunwald wrote:

> On 09.08.2013 21:22, Turbo Fredriksson wrote:
>> Since 'replace' doesn't work (yet - isn't implemented, I think), you could/should have just
>> detached the failing disk and then attached the new one in its place.
>> From the man page:
>>        zpool attach [-f] [-o property=value] pool device new_device
>>        zpool add [-fn] [-o property=value] pool vdev ...
>>        zpool detach pool device
> Still not clear to me what the correct command would have been.

Before you messed things up and added your disk as a vdev instead of adding it TO
a vdev, you should have:

	* detached a failing disk
	   zpool detach raid-z sda

	* attached the new disk in its place (or at least to its vdev)
	   zpool attach raid-z raidz2-0 sdk
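
A sketch of that whole sequence, using the device names from this thread (sda failing, sdk new). Note the caveat quoted below: the man page documents detach/attach only for mirrors, so this may well fail on a raidz vdev - check 'zpool status' before and after each step:

```shell
# Confirm the current pool layout before touching anything
zpool status raid-z

# Take the failing disk out of service, then detach it from the pool
zpool offline raid-z sda
zpool detach raid-z sda

# Attach the replacement alongside the existing raidz2-0 vdev
# ('attach' pairs new_device with an existing device; it does NOT
#  create a new top-level vdev the way 'add' does)
zpool attach raid-z raidz2-0 sdk

# Watch the resilver progress until it completes
zpool status -v raid-z
```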

> man zpool attach says "The existing device cannot be part of a raidz configuration." This is why I didn't consider that command.

And sdk WASN'T a part of a raidz config before you added it as one...

> The second line is unclear. The pool is raid-z, but what would be the correct vdev?
>>>         NAME                          STATE     READ WRITE CKSUM
>>>         raid-z                        ONLINE       0     0     0
>>>           raidz2-0                    ONLINE       0     0     0
>>>             [...]
>>>           sdk                         ONLINE       0     0     0
raid-z is the pool, raidz2-0 the vdev. And so is (now!) sdk, which is very confusing
(I think even sda, sdb etc. are vdevs! :).

The point is, 'vdev' stands for 'Virtual Device'. I'm not sure if the man page is wrong/deliberately
fuzzy, or ...

>> So 'offline' the disk, 'detach' it and then 'attach' it to the correct vdev MIGHT work...
> First thing is, I might just make sure I have a complete and current backup of the pool
> and then do the tests and if it fails, recreate it.

Gregor (in the next mail in the thread) claims that removing a vdev isn't
possible (I've never tried it, and he's good, so I'll take his word for it). Which means
you'd have to scratch your data and recreate the pool correctly.

Which kind of sounds reasonable when I think about it. You have increased your pool
by the size of sdk, which is easy. But reducing a pool's size is difficult in ANY
system (MD, LVM etc.).

> But in the end it is quite difficult to determine which functions in the manual
> are implemented and which are not, to figure out the correct procedure.

Yes, that's a bother, BUT reading the manpage should have shown you that
'replace' would have been the correct command (as Gregor showed).

	zpool replace [-f] pool old_device [new_device]
           Replaces old_device with new_device. This is equivalent to attaching
	   new_device, waiting for it to resilver, and then detaching old_device.
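
Following that man page snippet, the one-command version would look like this (again with the device names from this thread; whether it works depends on 'replace' actually being implemented in the version at hand):

```shell
# Replace the failing sda with sdk in one step;
# ZFS resilvers onto the new disk, then drops the old one
zpool replace raid-z sda sdk

# If the new disk sits in the same physical slot as the old one,
# new_device can be omitted:
#   zpool replace raid-z sda
```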

And (last I heard) it isn't implemented (correctly?), so it should have given you an
error when you tried using it.

Then you should have come here and asked your question (or googled a little), and
we could have recommended a workaround (the manpage snippet above actually
gives you the clues to one):

	zpool detach pool device
           Detaches device from a mirror.

	zpool attach [-f] [-o property=value] pool device new_device
           Attaches new_device to an existing zpool device.

which is much different from:

	zpool add [-fn] [-o property=value] pool vdev ...
           Adds the specified virtual devices to the given pool.
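
Side by side, the difference that bit here (the -n dry-run flag is the safe way to see what 'add' would do before committing):

```shell
# WRONG for this case: 'add' grows the pool with a brand-new top-level
# vdev - a single, unredundant disk sitting next to raidz2-0, and
# (per Gregor) there is no way to remove it again
zpool add raid-z sdk

# Safer habit: dry-run with -n first to preview the resulting layout
# without changing anything
zpool add -n raid-z sdk
```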

Oh, and another thing: NEVER, EVER, EVER use 'force' unless you are absolutely,
100% sure you know what you're doing! This is true for _any_ command!
Imagine you're an idiot and then imagine you're in
the government. Oh, sorry. Now I'm repeating myself
- Mark Twain

