[zfs-discuss] !HELP! Lost my pool?!?

Manuel Amador (Rudd-O) rudd-o at rudd-o.com
Wed Jul 20 15:48:59 EDT 2011


Ah, dude, not to be an asshole to you, but you get what you deserve...

And yes, you are the only one.

On Tue, 2011-07-19 at 19:56 -0700, UbuntuNewbie wrote:
> Status update:
> I am still unable to access the "lost" pool.
> I am still hoping to have a chance at recovering, but...
> It looks as if it is getting worse:
> - Backup: I failed, for reasons that could be called "lack of disk
> space". I made several attempts, each time waiting almost a day for
> the copy to finish (yes, even eSATA takes time, and the data doesn't
> compress well). I also observed that plugging in drives on the eSATA
> bus messes up my device naming (sda, sdb, sdc, and so on), especially
> for ZFS, since I haven't yet switched it to an id-based configuration
> (luckily, my Linux had already been configured that way).
> - Solaris: really impossible to run on my PC. It was possible to
> use it inside VirtualBox (before the problem with the pool appeared).
> I am aware that VirtualBox can use a real disk; I have done so
> earlier from Windows. But I am not willing to take such risks with my
> pool at stake on a system I am still learning to use (Ubuntu is still
> a new OS for me).
> - In between, I attempted the -fF suggestion.
> That is when new information turned up:
> At first, zpool status gave something like "import failed due to a
> missing label".
> When I tried to force it (-fF), I got "...failed... I/O-Error".
> As I didn't want to come back here without verifying that...
> I created another pool on a spare partition of the same disk, of
> ALMOST the same size, turned compression on for a new filesystem on
> it, and started to dd...
> I could see some progress (a compression factor of about 1.3 should
> have been enough to get the data onto the new pool).
> When I came back... the system had rebooted (by itself)! The NEW pool
> did not import, the error message being:
> "The pool metadata is corrupted and the pool cannot be opened.
> Recovery is possible, but will result in some data loss. Returning the
> pool to its state as of (1 hour ago) should correct the problem.
> Approximately 1 second of data must be discarded, irreversibly.
> Recovery can be attempted by executing 'zpool clear -F TANK2' . A
> scrub of the pool is strongly recommended after recovery."
> And the status reveals 2 checksum errors (corrupted data) on the new
> pool and 8 (!) on the underlying device.
> That is where I left it, as it indicates to me that there must be a
> serious problem with the software/hardware I am using at present, and
> the enjoyment diminishes day by day. Are you guys running the latest zfs
> updates without problems?
> Am I really the only one?
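
On the device-naming problem from the "Backup" item above: a minimal sketch for finding the persistent names of a drive, so the pool does not depend on sda/sdb/sdc ordering. The function name and its optional second argument are made up for illustration, and it assumes a Linux system where udev populates /dev/disk/by-id.

```shell
# persistent_names DEVICE [BYID_DIR]
# Print every symlink in BYID_DIR that resolves to DEVICE.
# BYID_DIR defaults to /dev/disk/by-id; it is a parameter only so the
# function can be tried on a scratch directory first.
persistent_names() {
    dev=$(readlink -f "$1") || return 1
    dir=${2:-/dev/disk/by-id}
    for link in "$dir"/*; do
        [ -L "$link" ] || continue
        if [ "$(readlink -f "$link")" = "$dev" ]; then
            printf '%s\n' "$link"
        fi
    done
}

# Once the names are known, a healthy pool can be switched over by
# exporting it and importing it again from that directory
# (assumption: the pool is called "tank"):
#   zpool export tank
#   zpool import -d /dev/disk/by-id tank
```

For example, `persistent_names /dev/sdb` lists the by-id names that currently point at sdb. The export/import step only helps once the pool imports at all.
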
> Brian Behlendorf wrote:
> > You should be able to recover your pool by importing it with the -F
> > option.  This will roll back your pool a few seconds/minutes to a point
> > in the past where the metadata was consistent.
> >
> > zpool import -fF tank
> >
> > This sort of damage can occur if your system crashes and the disks are
> > not properly honoring cache flush commands.  Lots of consumer grade
> > hardware can suffer from this sort of problem.  If you know your
> > hardware has this problem consider disabling the disk write back caches.
> I understand, but wouldn't know how to do it. If I could verify at all
> that this is the cause...
> > It will cost performance but improve your data integrity.
> >
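
Since the question of how to disable the write-back cache came up: a rough sketch using hdparm, assuming SATA/eSATA drives that honour the ATA write-cache setting and that hdparm is installed. The helper names and the DRY_RUN switch are invented for this example; only `hdparm -W` itself is the real tool.

```shell
# Run a command, or just print it when DRY_RUN=1 is set.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Show the current write-cache setting of one drive (needs root).
show_write_cache() {
    run hdparm -W "$1"
}

# Turn the drive's write-back cache off (needs root).
disable_write_cache() {
    run hdparm -W 0 "$1"
}
```

Typical use would be `show_write_cache /dev/sdX` followed by `disable_write_cache /dev/sdX` for each disk in the pool. On many drives the setting may not survive a power cycle, so it probably has to be reapplied at boot. This does not prove the cache was the cause of the damage; it only removes one suspect.
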
> > > Steve Costaras wrote:
> > > > As for your issue specifically:
> > > > - First thing fix your PC for stability.
> When the dd led to a reboot, there wasn't much else running (except an
> empty desktop).
> As I have 16 GB RAM, I believe zfs runs into some problem that leads
> to the reboot. I could see more and more RAM being used before I
> left...
> > > > - Third, if possible, create 1:1 bit copies of all your ZFS drives so you can always get back to your starting point:
> > > >  dd if=/dev/{disk} of=/mnt/{disk}.img bs=512 conv=noerror,sync
> > > >  (bs=4096 if you have ashift=12/newer AFT drive).
> > >
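
The dd line above can be wrapped in a small helper so the block size is chosen in one place. This is only a sketch: the function name is made up, and the paths you pass in are your own.

```shell
# image_disk SRC DST [BS]
# Copy SRC to DST block by block. BS defaults to 512; pass 4096 for
# ashift=12 / AFT drives, as suggested in the quoted advice.
# conv=noerror,sync keeps going past read errors and pads each short
# or failed read with zeros, so later data stays at the right offset.
image_disk() {
    src=$1
    dst=$2
    bs=${3:-512}
    dd if="$src" of="$dst" bs="$bs" conv=noerror,sync
}
```

For example, `image_disk /dev/sdX /mnt/sdX.img 4096`. A raw image is as large as the source device itself, so the target needs at least that much free space. If the copy finished without read errors, `cmp /dev/sdX /mnt/sdX.img` should report no differences.
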
> > > working on this right now
> see above
> > > Thanks for those hints. Will give feedback on how it went.

