[zfs-discuss] Re: [zfs-fuse] Source RPM for zfs-fuse with pool v26 support

Durval Menezes durval.menezes at gmail.com
Wed Aug 14 11:07:11 EDT 2013


Hi Gordan,

On Wed, Aug 14, 2013 at 11:44 AM, Gordan Bobic <gordan.bobic at gmail.com>wrote:

> On Wed, Aug 14, 2013 at 3:32 PM, Durval Menezes <durval.menezes at gmail.com>wrote:
>
>> On Wed, Aug 14, 2013 at 11:02 AM, Gordan Bobic <gordan.bobic at gmail.com>wrote:
>>
>>> On Wed, Aug 14, 2013 at 2:53 PM, Durval Menezes <
>>> durval.menezes at gmail.com> wrote:
>>>
>>>> On Wed, Aug 14, 2013 at 10:35 AM, Gordan Bobic <gordan.bobic at gmail.com>wrote:
>>>>
>>>>> You're welcome. It seems to support fs <= v5 and pool <= v26, so it
>>>>> should be able to handle zfs send|receive to/from the other implementations.
>>>>>
>>>>
>>>> I can confirm I was able to import a v26 pool and mount its v5 FS, both
>>>> created (and the FS populated) using ZoL 0.6.1; right now I'm running a
>>>> "tar c | tar d" to verify whether any data/metadata differences show up on
>>>> this pool (which is mounted using zfs-fuse) when compared against the
>>>> snapshot in the other pool I populated it from (which is mounted at the
>>>> same time, on the same machine, running ZoL 0.6.1). How cool is that,
>>>> huh? :-)
>>>>
>>>
>>> Are you saying you have one mounted using ZoL and the other using Z-F?
>>>
>>
>> Yep :-)
>>
>>
>>> I have to say I hadn't really considered that option...
>>>
>>
>> It's working great, and it's not difficult at all to do: I just installed
>> the Z-F binaries (zpool, etc.) in a separate, out-of-PATH directory; then
>> it's just a matter of running the binaries from there when one wants to do
>> anything with the Z-F pools/FSs, and running the ZoL binaries (from PATH)
>> when one wants to handle the ZoL pools/FSs.
>>
>
> I'd be concerned about cross-importing pools by accident. That is likely
> to lead to corruption as they are being accessed simultaneously. Be careful.
>

Humrmrmr.... doesn't importing a pool mark it on disk as busy, in such a
way as to require "-f" for a subsequent import? That is, I should be safe
as long as I'm careful about when I specify -f to import, right?
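
Just to illustrate what I mean (the pool name and the zfs-fuse install path
below are made up, and this is only how I understand the import protection,
so take it with a grain of salt):

    # clean hand-over of a pool from zfs-fuse to ZoL:
    /opt/zfs-fuse/sbin/zpool export tank    # the out-of-PATH zfs-fuse zpool
    zpool import tank                       # the ZoL zpool, from PATH

    # if the pool was NOT exported first, a plain "zpool import tank" may
    # refuse because the pool still looks active; only then, and only after
    # making sure nothing else has it mounted, would I resort to:
    zpool import -f tank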


>>>> Seriously now, I don't really expect to find any differences nor hit any
>>>> issues, but I will post a reply to this message when it's finished (and
>>>> cross-post to the ZoL list so our friends there know there's a reliable
>>>> v26/v5 zfs-fuse available).
>>>>
>>>
>>> Great. Of course the real test will be when somebody's pool breaks due
>>> to an implementation bug - something that may not happen for months/years
>>> (and hopefully never will, but it's nice to have some level of insurance
>>> like this).
>>>
>>
>> Nahh.... :-) I think I found a really easy way to break a ZoL pool (and
>> in fact anything that depends on flush for consistency): just run it in a
>> VirtualBox VM with host disk caching turned on, then power off the VM
>> without exporting the pool... I know that shouldn't be so, but that's how
>> I managed to screw up a perfectly good pool (in fact, the one I'm now
>> diff'ing) when testing imports with multiple OSes: the very first time I
>> shut down the VM without exporting it (while running OpenIndiana, FWIW)
>> the pool wouldn't import anymore until Nils made me aware of the -X
>> complement to the -F switch...
>>
>
> That isn't a ZoL-specific issue, or even a ZFS-specific issue - it's just
> that other file systems won't notice as much wrong, even if some of your
> data is trashed.
>

I fully agree. If I were testing other filesystems, I would have them
checked via the "tar c|tar d" method no matter what...
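
For the record, the check itself is just GNU tar's --diff mode, roughly like
this (mountpoints and snapshot name below are made up):

    # stream the zfs-fuse-mounted copy and compare it against the ZoL
    # snapshot it was populated from; any line of output is a difference
    ( cd /zfsfuse-pool/fs && tar -cf - . ) | \
        ( cd /zol-pool/fs/.zfs/snapshot/mysnap && tar -df - )

(tar -d only compares what tar records -- contents, sizes, modes, times and
ownership -- so it's not a cryptographic comparison, but it's plenty to catch
the kind of corruption we're talking about.)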


>> In fact, when the diff is over I think I'm going to try and crash that pool
>> again just to see whether this latest zfs-fuse can import it.
>>
>>
>>> On a totally unrelated note, I'm now re-pondering switching even my
>>> rootfs-es to ZFS - yesterday I hit what seems to be multiple disk
>>> failures manifesting as massive silent data corruption, and preliminary
>>> investigation suggests that MDRAID1+ext4 has disintegrated pretty
>>> thoroughly. Most inconvenient.
>>>
>>
>> Ouch.
>>
>> FWIW, if we were betting, I would put all my cash on ext4 being the
>> culprit: in my experience the whole extX family sucks in terms of
>> robustness/reliability. I've not used extX in machines under my direct
>> supervision at least since 2001, when ReiserFS became solid enough for
>> production use, but at a client which insists on running only
>> plain-bland-boring EL6 and which is subjected to frequent main power/UPS
>> SNAFUs, they used to lose at least 2 or 3 ext4 hosts every time they lost
>> power, and that in a universe of 20-30 machines... by contrast, an
>> especially critical machine of theirs which I personally insisted on
>> configuring with ReiserFS running on top of Linux md just kept running
>> for years through that selfsame power hell, never missing a beat. This
>> (and many other experiences) has led me to have a lot of faith in Linux
>> md and ReiserFS, and a lot of doubt about extX.
>>
>
> I rather suspect the fundamental limitation of MD RAID1 here. If the
> disk is silently feeding back duff data, all bets are off,
>

This sure can happen, but in my experience it's very rare... in the many
dozens of disks that have been under my direct care over the last few
years, only once have I found a disk silently producing duff data... and
I'm very careful: even on the ones where I wasn't running ZFS, I think I
would have detected any issues (see below).


> and unlike ZFS, traditional checksumless RAID has no hope of guessing
> which disk (if not both) is feeding back duff data for each block.
>

This is indeed the case. But it can detect the duff (unless of course all
disks in the array are returning exactly the same duff, which I find
extremely improbable): it's just a matter of issuing a "check" command to
the md device, which I do every day on every md array.
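
For reference, that daily "check" is nothing fancier than this (md0 is just
an example device):

    # ask md to read and compare all members of the array:
    echo check > /sys/block/md0/md/sync_action

    # when it finishes, a non-zero value here means the members disagree
    # somewhere (though, as you say, md can't tell which copy is right):
    cat /sys/block/md0/md/mismatch_cnt

IIRC the EL boxes even ship a raid-check cron script that does essentially
this, but I run it daily myself anyway.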


>
>
>> In fact, my involvement with ZFS (initially FUSE and then ZoL) started in
>> 2008 when it became clear that there would be no ongoing significant
>> maintenance for ReiserFS... and, depending on performance, I may even end
>> up running ZoL on top of md :-)
>>
>
> Knowing your data is trashed is of limited usefulness (but better than
> nothing, I guess).
>

Well, one can always restore the affected data from a backup... although I
agree, it's much better to have it restored automatically from a good copy
elsewhere in the array.
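
With ZFS, of course, that's basically what a periodic scrub gives you (pool
name below is just an example):

    # read every block and let ZFS repair from redundancy where it can:
    zpool scrub tank
    # then look at the CKSUM columns and the "errors:" line:
    zpool status -v tank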


> Not sure why you'd run ZFS on top of MD, though, rather than give ZFS the
> raw disks/partitions.
>

In a word: performance. It's still unclear to me how RAIDZ2 will perform
when compared to RAID6... the reasonable behavior would be for ZFS to
suffer only under a sustained random-write load (ie, write IOPS), as only
then would it have to reread the whole RAID stripe to recalculate its
syndrome (ie, exactly like RAID6), but I've read conflicting reports that
other loads would suffer too... so I'm gonna test it myself. In fact, I'm
bringing up an experimental machine here right now with six 1TB SATA disks
exactly to test that.
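
The rough plan, in case anyone is curious (device names are made up, and the
fio line is just a starting point, not a tuned benchmark):

    # ZFS side: a single raidz2 vdev over the six disks
    zpool create -o ashift=12 testpool raidz2 /dev/sd[b-g]

    # md side: RAID6 over the same six disks (wiped in between, of course),
    # with a filesystem -- or even ZoL -- on top
    mdadm --create /dev/md0 --level=6 --raid-devices=6 /dev/sd[b-g]

    # then hammer both with a sustained random-write load, e.g.:
    fio --name=randwrite --directory=/testpool --rw=randwrite \
        --bs=4k --size=4g --numjobs=4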

Cheers,
-- 
   Durval.




>
>
> Gordan
>