Native zfs vs zfs-fuse

Brian Behlendorf behlendorf1 at llnl.gov
Mon May 9 19:47:12 EDT 2011


This is great!  I'm happy to see it builds on all of these vanilla
kernels.  Historically I've just manually tested on the latest kernels
for a specific set of distributions.  This usually covers what most
people would try, but it certainly leaves holes in the testing coverage!

As for your bad news... it's good news!  This is exactly the sort of
thing we wanted to catch with CI testing.  And on the upside it looks
like a pretty easy fix.  We just need to fall back to the
d_obtain_alias() compatibility code for 2.6.28; I had to do something
similar with the kernel-blk-end-request.m4 macro.

-- 
Thanks,
Brian 

On Mon, 2011-05-09 at 12:53 -0700, Gunnar Beutner wrote:
> Some good and some bad news:
> 
> Rudimentary output for the first few test runs:
> 
> https://www.beutner.name/public/zfsci-09-05-2011/
> 
> For now this just shows the 'master' Git branch. While there are test
> results for other branches (the job scripts automatically discover new
> branches for the configured list of Git repositories when figuring out
> whether there are any untested commits), the "results-to-HTML" script is
> really just a hack and needs some more work to show all available test
> results in a user-friendly manner (per-branch HTML pages with
> per-distribution tables, etc.).
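> 
> Roughly what I have in mind for the rewrite is something along these
> lines - a minimal sketch only; the paths, field names and HTML layout are
> illustrative and not necessarily what the current hack uses:
> 
> #!/usr/bin/env python
> # Sketch: group result job.json files by branch and emit one HTML page
> # per branch with a per-distribution table of job results.
> import glob, json, os
> from collections import defaultdict
> 
> RESULTS_DIR = "/var/lib/zfsci-data/results"   # illustrative paths
> OUTPUT_DIR = "/var/www/zfsci"
> 
> # branch -> distribution -> list of job results
> pages = defaultdict(lambda: defaultdict(list))
> for path in glob.glob(os.path.join(RESULTS_DIR, "result-*", "job.json")):
>     job = json.load(open(path))
>     branch = job["input"]["zfs-git-repo"]["branch"]
>     pages[branch][job["input"]["distribution"]].append(job)
> 
> for branch, distros in pages.items():
>     html = ["<html><body><h1>Branch: %s</h1>" % branch]
>     for distro, jobs in sorted(distros.items()):
>         html.append("<h2>%s</h2><table border='1'>" % distro)
>         html.append("<tr><th>job</th><th>kernel</th><th>tasks passed</th></tr>")
>         for job in jobs:
>             passed = sum(1 for t in job.get("output", {}).values()
>                          if t["status"] == "PASSED")
>             html.append("<tr><td>%s</td><td>%s</td><td>%d</td></tr>"
>                         % (job["job_id"], job["input"]["kernel-version"],
>                            passed))
>         html.append("</table>")
>     html.append("</body></html>")
>     with open(os.path.join(OUTPUT_DIR, branch + ".html"), "w") as f:
>         f.write("\n".join(html))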
> 
> I'm planning to implement different priorities for the jobs, so for
> example benchmarks and the ZFS test suite can be run while the build
> boxes are otherwise idle. This should allow me to safely run the test
> suite for all commits (rather than just once a day) without
> unnecessarily delaying the build tests.
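> 
> The priority scheme would boil down to something like this - a rough
> sketch only; the "job-class" field is hypothetical and the real job
> scripts may end up doing this differently:
> 
> # Pick the next job for an idle build box: build jobs first, then the
> # test suite, then benchmarks.  Lower number = higher priority.
> import glob, json
> 
> PRIORITIES = {"build": 0, "testsuite": 10, "benchmark": 20}
> 
> def next_job(job_dir="build/jobs"):
>     pending = []
>     for path in glob.glob(job_dir + "/*.json"):
>         job = json.load(open(path))
>         prio = PRIORITIES.get(job.get("job-class", "build"), 0)
>         pending.append((prio, path, job))
>     if not pending:
>         return None
>     pending.sort()          # highest-priority (lowest number) job first
>     return pending[0][2]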
> 
> Now for the bad news: looks like my NFS patch broke builds on kernel
> 2.6.28 - oops :)
> 
> Regards,
> Gunnar
> 
> On 08.05.2011 01:05, Brian Behlendorf wrote:
> > Very cool!  This is a great start.  I'd love to see us using a framework
> > like this to ensure we always maintain the highest quality of
> > development.  In my ideal world the CI system would minimally provide
> > the following:
> >
> > * For every commit to master, validate the spl/zfs build on N distributions.
> > * Nightly run the full ZFS Test Suite on N distributions.
> > * Nightly/weekly performance testing on real hardware (not VMs).
> > * All results should be publicly available
> > * Developers can manually trigger builds for development branches
> >
> > Concerning the kernel autotest framework: the more I look at it, the
> > more I agree with you that it may not be suited for this, particularly
> > since I see they haven't posted results for kernels newer than 2.6.36.
> >
> > However, there's another existing CI candidate called Buildbot that I've
> > been reading the manual for.  It looks like it might be a really nice
> > fit to augment what you're already working on.  It's a GPL project
> > written in Python and is used by the Python, Mozilla, Chromium,
> > Wireshark, Subversion, etc. projects for their CI needs.
> >
> > http://trac.buildbot.net/
> >
> > It looks like it already covers the mundane stuff on your TODO list (web
> > interface, logging, scheduling, etc).  And since it's written in Python
> > I'm sure it could be easily extended to integrate into your scheme for
> > quickly setting up distributions on real hardware.  In fact, they have a
> > section in the manual called 'Latent Buildslaves' for exactly this sort
> > of thing, although their examples are for EC2 or libvirt.
> >
> > http://buildbot.net/buildbot/docs/current/Latent-Buildslaves.html#Latent-Buildslaves
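> >
> > From skimming the manual, a minimal master.cfg for the build testing
> > might look roughly like the following.  This is a sketch from my reading
> > of the 0.8.x docs - module paths, arguments, the repo URL and the slave
> > names are unverified placeholders, so check them against the manual:
> >
> > # master.cfg sketch: two build slaves, one factory that checks out zfs
> > # and builds it, and one per-commit scheduler on the 'master' branch.
> > from buildbot.buildslave import BuildSlave
> > from buildbot.scheduler import Scheduler
> > from buildbot.process.factory import BuildFactory
> > from buildbot.steps.source import Git
> > from buildbot.steps.shell import ShellCommand
> > from buildbot.config import BuilderConfig
> >
> > c = BuildmasterConfig = {}
> > c['slaves'] = [BuildSlave("debian-squeeze", "secret"),
> >                BuildSlave("centos-5.6", "secret")]
> > c['slavePortnum'] = 9989
> >
> > f = BuildFactory()
> > f.addStep(Git(repourl="git://github.com/behlendorf/zfs.git", mode="copy"))
> > f.addStep(ShellCommand(command=["./autogen.sh"], description="autogen"))
> > f.addStep(ShellCommand(command=["./configure"], description="configure"))
> > f.addStep(ShellCommand(command=["make"], description="make"))
> >
> > c['builders'] = [BuilderConfig(name="zfs-" + s.slavename,
> >                                slavenames=[s.slavename], factory=f)
> >                  for s in c['slaves']]
> > c['schedulers'] = [Scheduler(name="per-commit", branch="master",
> >                              treeStableTimer=60,
> >                              builderNames=[b.name for b in c['builders']])]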
> >
> > The Latent Buildslaves probably only make sense for the
> > performance-critical testing where VMs aren't appropriate.  For the quick
> > build testing, setting up a bunch of long-lived build slave VMs would be nice.
> >
> > One nice feature is that it's designed to support distributed build
> > slaves.  If someone wants to ensure we're testing on their
> > distribution, they just need to offer up the hardware/VM for a build
> > slave.  They can administer the host and we can easily get the automated
> > testing results.
> >
> > Anyway, great work so far!  It's looking very promising.  If you get a
> > little time you might skim the Buildbot manual.  Maybe this is
> > something we can leverage, maybe not.
> >
> > Thanks,
> > Brian
> >
> >
> > On Sat, 2011-05-07 at 07:42 -0700, Gunnar Beutner wrote:
> >> Autotest looks quite interesting.  However, there seems to be a bit of a
> >> lack of documentation - or maybe I just didn't look at it thoroughly enough.
> >>
> >> For now I've continued with writing my own scripts, which by now can
> >> automatically build images for multiple distributions (just Debian
> >> squeeze, CentOS 5.6 and OpenSUSE 11.x so far) with various kernels
> >> (2.6.26 - 2.6.39) and continuously run tests in a loop.
> >>
> >> There are scripts to build jobs for all the possible combinations of the
> >> input parameters (distribution, kernel, git branch/commit, filesystem
> >> type, etc.):
> >>
> >> root@charon:~/zfsci# ./zfsci-build-jobs
> >> Successfully built 294 job descriptions.
> >>
> >> root@charon:~/zfsci# python -mjson.tool
> >> build/jobs/c96bd1d61225c9e8c87d3c5abd93150f.json
> >> {
> >>     "input": {
> >>         "distribution": "debian",
> >>         "fs-type": "zfs",
> >>         "kernel-version": "2.6.30",
> >>         "zfs-git-commit": 1304532000,
> >>         "zfs-git-repo": {
> >>             "branch": "master",
> >>             "spl": "behlendorf-spl",
> >>             "zfs": "behlendorf-zfs"
> >>         }
> >>     },
> >>     "job_id": "c96bd1d61225c9e8c87d3c5abd93150f"
> >> }
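> >>
> >> The enumeration itself is basically just a cross product over the input
> >> parameters.  A stripped-down sketch of the idea (illustrative only; the
> >> real script handles more parameters, discovers git commits, etc., and
> >> the filesystem/kernel lists here are just examples):
> >>
> >> # Build a job description for every combination of the input parameters
> >> # and use a hash of the inputs as the job ID.
> >> import hashlib, itertools, json, os
> >>
> >> DISTRIBUTIONS = ["debian", "centos", "opensuse"]
> >> KERNELS = ["2.6.26", "2.6.30", "2.6.35", "2.6.39"]
> >> FS_TYPES = ["zfs", "zfs-fuse", "ext4"]
> >>
> >> if not os.path.isdir("build/jobs"):
> >>     os.makedirs("build/jobs")
> >>
> >> count = 0
> >> for distro, kernel, fstype in itertools.product(DISTRIBUTIONS, KERNELS,
> >>                                                 FS_TYPES):
> >>     job_input = {"distribution": distro, "kernel-version": kernel,
> >>                  "fs-type": fstype,
> >>                  "zfs-git-repo": {"branch": "master",
> >>                                   "spl": "behlendorf-spl",
> >>                                   "zfs": "behlendorf-zfs"}}
> >>     job_id = hashlib.md5(json.dumps(job_input,
> >>                                     sort_keys=True).encode("utf-8")).hexdigest()
> >>     with open("build/jobs/%s.json" % job_id, "w") as f:
> >>         json.dump({"input": job_input, "job_id": job_id}, f, indent=4)
> >>     count += 1
> >> print("Successfully built %d job descriptions." % count)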
> >>
> >> The PXE-bootable "master" image picks one of those job descriptions from
> >> an NFS share, installs the selected distribution, kernel and build
> >> tools, and sets up the node to automatically run the tests on boot.
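> >>
> >> One way to make sure two nodes don't pick up the same job description
> >> from the NFS share is an atomic rename.  A sketch of that idea (paths
> >> are illustrative, and this isn't necessarily how the current master
> >> image does it):
> >>
> >> # Claim the first available job description by renaming it into a
> >> # "claimed" directory on the same NFS mount; the rename either succeeds
> >> # or fails as a whole, so only one node gets each job.
> >> import glob, json, os, socket
> >>
> >> JOB_DIR = "/mnt/zfsci/jobs"
> >> CLAIMED_DIR = "/mnt/zfsci/jobs-claimed"
> >>
> >> def claim_job():
> >>     for path in sorted(glob.glob(os.path.join(JOB_DIR, "*.json"))):
> >>         claimed = os.path.join(CLAIMED_DIR, socket.gethostname() + "-" +
> >>                                os.path.basename(path))
> >>         try:
> >>             os.rename(path, claimed)  # fails if another node got here first
> >>         except OSError:
> >>             continue
> >>         return json.load(open(claimed))
> >>     return None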
> >>
> >> Once the tests are completed the node reboots and the master image tries
> >> to extract the test results:
> >>
> >> root@charon:~/zfsci# python -mjson.tool
> >> /var/lib/zfsci-data/results/result-1304780569/job.json
> >> {
> >>     "input": {
> >>         "distribution": "debian",
> >>         "fs-type": "zfs",
> >>         "kernel-version": "2.6.35",
> >>         "rootpart": "/dev/sda1",
> >>         "swappart": "/dev/sda2",
> >>         "testpart": "/dev/sda3",
> >>         "zfs-git-commit": 1304712000,
> >>         "zfs-git-repo": {
> >>             "branch": "master",
> >>             "spl": "behlendorf-spl",
> >>             "zfs": "behlendorf-zfs"
> >>         }
> >>     },
> >>     "job_id": "cd4da6f9e1f659e590c3de206e863dc8",
> >>     "output": {
> >>         "<class '__main__.BuildKernelTask'>": {
> >>             "status": "SKIPPED"
> >>         },
> >>         "<class '__main__.BuildSPLTask'>": {
> >>             "run_end": 1304780359.8432441,
> >>             "run_start": 1304780318.95593,
> >>             "status": "PASSED"
> >>         },
> >>         "<class '__main__.BuildZFSTask'>": {
> >>             "run_end": 1304780441.2562261,
> >>             "run_start": 1304780364.5518751,
> >>             "status": "PASSED"
> >>         },
> >>         "<class '__main__.CreateExtFSTask'>": {
> >>             "run_end": 1304780443.9385149,
> >>             "run_start": 1304780443.938489,
> >>             "status": "SKIPPED"
> >>         },
> >>         "<class '__main__.CreateZPoolTask'>": {
> >>             "run_end": 1304780442.927438,
> >>             "run_start": 1304780442.339977,
> >>             "status": "PASSED"
> >>         },
> >>         "<class '__main__.DepmodTask'>": {
> >>             "run_end": 1304780242.4898851,
> >>             "run_start": 1304780242.0290351,
> >>             "status": "PASSED"
> >>         },
> >>         "<class '__main__.InstallZFSFuseTask'>": {
> >>             "run_end": 1304780360.88288,
> >>             "run_start": 1304780360.882858,
> >>             "status": "SKIPPED"
> >>         },
> >>         "<class '__main__.SaveCrashDumpsTask'>": {
> >>             "run_end": 1304780569.548317,
> >>             "run_start": 1304780569.535913,
> >>             "status": "PASSED"
> >>         },
> >>         "<class '__main__.ZFSDepDebianTask'>": {
> >>             "run_end": 1304780315.426764,
> >>             "run_start": 1304780296.448957,
> >>             "status": "PASSED"
> >>         },
> >>         "<class '__main__.ZFSDepsCentOSTask'>": {
> >>             "run_end": 1304780316.4739571,
> >>             "run_start": 1304780316.4739261,
> >>             "status": "SKIPPED"
> >>         },
> >>         "<class '__main__.ZFSDepsOpenSUSETask'>": {
> >>             "run_end": 1304780317.4897759,
> >>             "run_start": 1304780317.489748,
> >>             "status": "SKIPPED"
> >>         }
> >>     }
> >> }
> >>
> >> (And possibly other files that can be used to diagnose problems in more
> >> detail.)
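> >>
> >> A quick way to summarise those job.json files (a sketch - the field
> >> names match the example above, everything else is made up):
> >>
> >> # Print a one-line summary per extracted result.  Assumes a "FAILED"
> >> # status exists alongside PASSED/SKIPPED.
> >> import glob, json, sys
> >>
> >> def summarize(results_dir="/var/lib/zfsci-data/results"):
> >>     for path in sorted(glob.glob(results_dir + "/result-*/job.json")):
> >>         job = json.load(open(path))
> >>         statuses = [t["status"] for t in job.get("output", {}).values()]
> >>         print("%s %s/%s/%s: %d passed, %d skipped, %d failed"
> >>               % (job["job_id"],
> >>                  job["input"]["distribution"],
> >>                  job["input"]["kernel-version"],
> >>                  job["input"]["fs-type"],
> >>                  statuses.count("PASSED"),
> >>                  statuses.count("SKIPPED"),
> >>                  statuses.count("FAILED")))
> >>
> >> if __name__ == "__main__":
> >>     summarize(*sys.argv[1:])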
> >>
> >> An average test run takes about 5 minutes (installing the system,
> >> building spl/zfs, creating a test zpool and post-processing the results)
> >> - which means I'm getting about 50 test results per hour using my
> >> current test setup (http://charon.shroudbox.net/public/zfsci-cluster).
> >>
> >> There are still quite a few items on my TODO list though:
> >>
> >> * capturing stdout/stderr output (especially for 'early' failures, e.g.
> >> during the install process) and system information (IP address, etc.)
> >> * integrating the watchdog script with my remote-controlled power strip
> >> (the watchdog script would've worked just fine, but after a couple dozen
> >> reboots the on-board Realtek NIC isn't detected anymore - and the test
> >> boxes need to be power-cycled to fix that)
> >> * conditionally building job descriptions (right now the script creates
> >> jobs even when we already have results for the same input parameters)
> >> * web thingie/visualization to make the results more user-friendly
> >> * support for more distributions (Ubuntu should be trivial to add,
> >> should probably also support Fedora)
> >> * making the scripts more robust and cleaning them up
> >> * modifying the job scripts so they automatically pick up new git branches
> >> * documentation
> >> * benchmarks (sort of low priority, considering all the other things
> >> that need to be fixed first)
> >> * and a few other things
> >>
> >> Regards,
> >> Gunnar
> >>
> >> On 28.04.2011 19:49, Brian Behlendorf wrote:
> >>> Good idea.  We absolutely need something like this.  It's really the
> >>> best way to ensure the quality of the port.  I've had similar thoughts
> >>> myself, but for the moment I am still making do with a dozen or so VMs.
> >>> I'd love to have the infrastructure in place to automatically run a
> >>> development branch through N different distributions checking for build
> >>> failures and performance regressions.
> >>>
> >>> A while ago I came across a project at kernel.org called Autotest.  It
> >>> is a framework which was designed for fully automated Linux kernel
> >>> testing.  It appears to be under active development and quite far along.
> >>> Plus, since it's designed for kernel testing, I would think extending it
> >>> to kernel module testing might not be too hard.  In fact it may already
> >>> support this.  It might be worth spending the time to see what they've
> >>> done and if it's something we can use.
> >>>
> >>>   http://autotest.kernel.org/
> >>>   http://autotest.kernel.org/wiki/WhitePaper
> >>>
> >
> >
> 
> 
> --
> Gunnar Beutner
> Oberhäuserstrasse 167
> 91522 Ansbach
> 
> E-Mail: gunnar at beutner.name
> Mobiltelefon: 0171 95 818 49


