
Feature: System Administration

Using ZFS through FUSE

By Ben Martin on June 19, 2008 (9:00:00 AM)


ZFS is an advanced filesystem created by Sun Microsystems but not supported in the Linux kernel. The ZFS on FUSE project allows you to use ZFS through the Linux kernel as a FUSE filesystem, which means a ZFS filesystem is accessible just like any other filesystem the Linux kernel supports.

Apart from any technical or funding issues, a major reason that ZFS support has not been integrated into the Linux kernel is that Sun has released it under its Common Development and Distribution License, which is incompatible with the GPL used by the kernel. There are also patent issues with ZFS. However, the source code for ZFS is available, and running ZFS through FUSE does not violate any licenses, because you are not linking CDDL and GPL code together. You're on your own as far as patents go.

The idea of running what is normally an in-kernel filesystem through FUSE will make some in-kernel filesystem developers grumble about inefficiency. When an application makes a call into the kernel, a context switch must be performed. The x86 architecture is not particularly fast at performing context switches. Because a FUSE filesystem runs outside the kernel, the kernel must at times perform a context switch to the FUSE filesystem process. This means that overall more context switches are required to run a filesystem through FUSE than in-kernel. However, accessing information stored on disk is so much slower than performing a context switch that performing two context switches instead of one is likely to have minimal impact, if any, on benchmarks. It has been reported that NTFS running through FUSE delivers results comparable to those of a native Linux filesystem.
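A rough back-of-the-envelope calculation shows why disk-bound workloads hide the FUSE overhead. The latency figures below are order-of-magnitude assumptions, not measurements from this article:

```python
# Order-of-magnitude latency assumptions (not measured in this article):
# a context switch on 2008-era x86 costs on the order of microseconds,
# while a single seek on a 7200 rpm disk costs on the order of milliseconds.
context_switch_us = 5.0   # assumed cost of one extra kernel<->FUSE switch
disk_seek_us = 8000.0     # assumed average seek time (~8 ms)

overhead = context_switch_us / disk_seek_us
print(f"extra switch adds {overhead:.2%} to a seek-bound request")  # 0.06%
```

Under these assumptions, the extra switch is lost in the noise whenever the request actually touches the disk; only cache-hot workloads expose the difference.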


No packages for zfs-fuse exist for Ubuntu, openSUSE, or Fedora. As of this writing, the latest release of zfs-fuse, 0.4.0 beta, dates from March 2007. Looking at the source repository for the 0.4.x version of zfs-fuse, the developers have made many desirable additions since then -- for example, the ability to compile with recent versions of gcc, which the March 2007 release lacks. I used the 0.4.x version from the source repository instead of the latest released tarball and performed benchmarking on a 64-bit Fedora 8 machine.

The source repository uses the Mercurial revision control system, which is itself available in the main Hardy and Fedora 9 repositories. To compile zfs-fuse you will need SCons and the development package for libaio. Both of these are packaged for Hardy (libaio-dev, scons), openSUSE 10.3 1-Click installs (libaio-devel, scons), and in the Fedora 9 repository. The installation step places five executables into /usr/local/sbin.

$ hg clone
$ cd 0.4.x/src
$ scons
$ sudo scons install
$ sudo zfs-fuse

Once the zfs-fuse daemon is started, you use the zpool and zfs commands to set up your ZFS filesystems. If you have not used ZFS before, you might like to read the OpenSolaris introduction or the more serious documentation for it.


I tested performance inside a VMware Server virtual machine. I created a new virtual disk, preallocating 8GB of space for it. The use of virtualization would likely affect the overall benchmark, but the relative performance of ZFS vs. the in-kernel filesystem should still be indicative of the performance you might expect from ZFS running through FUSE. As the in-kernel Linux filesystem I used XFS, because it performs well on large files such as those created by the Bonnie++ benchmark I used.

The design of ZFS is a little different from that of most Linux filesystems. Given one or more partitions, you set up a ZFS "pool," then create as many filesystems as you like inside that pool. For the benchmark I created a pool on a single partition of the 8GB virtual disk and created two ZFS filesystems in that pool. To benchmark XFS I created an XFS filesystem directly on the partition that ZFS was using, wiping out the ZFS data in the process.

Shown below is the setup and benchmarking of ZFS. First I use fdisk to create a new partition spanning the whole disk. I use the zpool create command to create new pools, associating physical disks with the pool. The -n option reports what would have been done without actually making the pool; I include its output here to make things easier to follow. Once I create the tank/testfs ZFS filesystem with the zfs command, I have a new filesystem that I can access through the Linux kernel at /tank/testfs, as shown using the standard df command. I then ran the Bonnie++ benchmark multiple times to make sure that the figures were not taken from a first run that was disadvantaged in any manner.

# fdisk /dev/sdd
...
Disk /dev/sdd: 8589 MB, 8589934592 bytes
255 heads, 63 sectors/track, 1044 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
...
/dev/sdd1               1        1044     8385898+  83  Linux
...
# zfs-fuse
# zpool create -n tank /dev/sdd1
would create 'tank' with the following layout:

        tank
          sdd1
# zpool create tank /dev/sdd1
# zpool list
NAME        SIZE    USED   AVAIL    CAP  HEALTH  ALTROOT
tank       7.94G   92.5K   7.94G     0%  ONLINE  -
# zfs create tank/testfs
# df -h /tank/testfs/
Filesystem            Size  Used Avail Use% Mounted on
tank/testfs           7.9G   18K  7.9G   1% /tank/testfs
$ cd /tank/testfs
$ /usr/sbin/bonnie++ -d `pwd`
...
$ /usr/sbin/bonnie++ -d `pwd`
Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
linuxcomf8       4G 12373  24 14707  11 10604   8 33935  50 36985   3 109.0   0
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16  2272  17  3657  20  2754  18  2534  15  3736  20  3061  20
linuxcomf8,4G,12373,24,14707,11,10604,8,33935,50,36985,3,109.0,0,16,2272,17,3657,20,2754,18,2534,15,3736,20,3061,20

The commands below show how the Bonnie++ benchmark was performed on the XFS filesystem. Once again, I ran the benchmarks multiple times.

# mkfs.xfs /dev/sdd1
meta-data=/dev/sdd1              isize=256    agcount=8, agsize=262059 blks
         =                       sectsz=512   attr=0
data     =                       bsize=4096   blocks=2096472, imaxpct=25
         =                       sunit=0      swidth=0 blks, unwritten=1
naming   =version 2              bsize=4096
log      =internal log           bsize=4096   blocks=2560, version=1
         =                       sectsz=512   sunit=0 blks, lazy-count=0
realtime =none                   extsz=4096   blocks=0, rtextents=0
# mkdir /raw
# mount /dev/sdd1 /raw
$ cd /raw
$ /usr/sbin/bonnie++ -d `pwd`
...
$ /usr/sbin/bonnie++ -d `pwd`
Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
linuxcomf8       4G 38681  65 34840   6 16528   6 18312  40 18585   5 365.8   2
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16  1250  26 +++++ +++  3032  39  2883  69 +++++ +++  3143  59
linuxcomf8,4G,38681,65,34840,6,16528,6,18312,40,18585,5,365.8,2,16,1250,26,+++++,+++,3032,39,2883,69,+++++,+++,3143,59

As you can see from the benchmark results, for output operations ZFS through FUSE achieves only about 30-60% of the performance of XFS. On the other hand, the caching that FUSE performs allowed zfs-fuse to perform noticeably better than XFS on both the character and block input tests. In real-world terms, this means there is little or no speed penalty for using ZFS through FUSE on a filesystem that is read more than it is written. Write operations do suffer a performance loss with zfs-fuse as opposed to an in-kernel filesystem, but the loss should not render the system unusable. As always, you should benchmark the task you have at hand to make sure you get the performance you expect.
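These relative figures can be recomputed directly from the two Bonnie++ transcripts; the numbers below are copied from the benchmark output above:

```python
# Throughput figures (K/sec) taken from the Bonnie++ runs in this article.
zfs_fuse = {"out_chr": 12373, "out_blk": 14707, "in_chr": 33935, "in_blk": 36985}
xfs      = {"out_chr": 38681, "out_blk": 34840, "in_chr": 18312, "in_blk": 18585}

# Prints: out_chr 32%, out_blk 42%, in_chr 185%, in_blk 199%
for test in zfs_fuse:
    ratio = zfs_fuse[test] / xfs[test]
    print(f"{test}: zfs-fuse runs at {ratio:.0%} of XFS")
```

The sequential-output ratios (32% and 42%) fall at the low end of the 30-60% range, while both input tests come out well ahead of XFS.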

There are many issues with running ZFS under Linux. For instance, the fact that the zfs-fuse FUSE process runs as the root user implies potential security issues and gives any bugs that might be present in zfs-fuse free rein over the system. Also, the sharenfs ZFS directive does not currently work with zfs-fuse, and if you wish to export your ZFS filesystems manually, you'll likely have to recompile your FUSE kernel module too.

zfs-fuse does bring the flexibility of creating many filesystems with ZFS, and the way quotas and space reservations are handled can simplify system administration on Linux. Because ZFS uses pools to let you quickly create as many filesystems as you like, it's not uncommon to create a new ZFS filesystem in your pool for each new project you work on. Filesystems being quick and easy to create works well with the rest of ZFS administration, where you can snapshot a ZFS filesystem in its current state and export the current filesystem or a snapshot to another machine; as mentioned above, though, the sharenfs directive is not currently supported by zfs-fuse.
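As a hedged sketch of that snapshot-and-export workflow (the pool and filesystem names follow the article; the remote host and destination dataset are hypothetical, and it assumes root access, a running zfs-fuse daemon, and a zfs-fuse build that supports zfs send/recv):

```shell
# Take a point-in-time snapshot of the filesystem created earlier.
zfs snapshot tank/testfs@nightly
# List the snapshots in the pool to confirm it exists.
zfs list -t snapshot
# Replicate the snapshot to another machine ("otherhost" and
# "backup/testfs" are hypothetical names for illustration).
zfs send tank/testfs@nightly | ssh otherhost zfs recv backup/testfs
```

Because snapshots are copy-on-write, taking one is nearly instant and consumes space only as the live filesystem diverges from it.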

ZFS also reimplements much functionality that the Linux kernel already provides, such as software RAID and logical volume management (LVM). One downside of this, as noted on page 60 of the March 2008 ZFS administration documentation, is that you cannot attach an additional disk to an existing RAID-Z configuration. With Linux software RAID, you can grow an existing RAID-5 array, adding new disks as you desire.
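For comparison, growing a Linux software RAID-5 array with mdadm looks like this (a sketch only: the array and disk device names are hypothetical, it requires root, and the reshape needs a reasonably recent 2.6 kernel):

```shell
# Add a new disk to the array, then reshape the RAID-5 set from
# 3 to 4 active devices. /dev/md0 and /dev/sde1 are hypothetical.
mdadm --add /dev/md0 /dev/sde1
mdadm --grow /dev/md0 --raid-devices=4
# Watch the reshape progress; the array stays online throughout.
cat /proc/mdstat
```

Once the reshape finishes, you would also grow the filesystem on top of the array (for example with xfs_growfs for XFS) to use the new space.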

Ben Martin has been working on filesystems for more than 10 years. He completed his Ph.D. and now offers consulting services focused on libferris, filesystems, and search solutions.



on Using ZFS through FUSE

Note: Comments are owned by the poster. We are not responsible for their content.

Using ZFS though FUSE

Posted by: Anonymous [ip:] on June 19, 2008 10:39 AM
through, not though.

Thanks for the great article.


Using ZFS though FUSE

Posted by: Anonymous [ip:] on June 19, 2008 10:48 AM
zpool create - not zool create

But a great article, anyway.



Using ZFS though FUSE

Posted by: Anonymous [ip:] on June 19, 2008 01:58 PM
why has nobody made a GPL linux port? NTFS, HFS, FAT, etc were never released by their makers as GPL compatible code, yet they have been ported.

or why not release it as a module. quite a few gnu/linux users are happy to taint their kernel with graphics drivers. tainting it with a still-opensource-but-not-gpl module seems far less of an issue.


Re: Using ZFS though FUSE

Posted by: Anonymous [ip:] on June 19, 2008 07:27 PM
None of those file systems were ported to Linux, they were reverse-engineered. ZFS FUSE is based on Sun's own source code, hence the license issue.

As for a kernel module, FUSE _is_ a kernel module - It works similarly to the shims that the graphics drivers use between their binary blobs and the kernel to abstract the kernel interface. The only difference is FUSE does this to facilitate easing porting of a file-system to different platforms, nVidia/ATI do it to exploit what is essentially a GPL loophole. Of course, FUSE runs the filesystem in userspace as well, but that is about the only downside of this approach.


Re(1): Using ZFS though FUSE

Posted by: Anonymous [ip:] on June 20, 2008 02:00 AM

ZFS should be able to exploit a better legal 'loophole' - it is arguably not a "derived work" of the linux kernel. It existed first as part of Solaris and its development had nothing to do with linux.

There is nothing in the GPL (v2 at least) - that I am aware of anyway - that states that the kernel code is sacred (GPL only) and user code is open slather. That concept has been fabricated and has no _provable_ basis.

The kernel code/user code boundary is not the issue - the "derived work" clause is.

I believe that ZFS would _not_ be a derived work of the linux kernel - in the same way that a car is not a derived work of a tyre. ZFS requires and uses the kernel and its interfaces, but it was developed independently from linux and not derived from it in any meaningful way.

Now let the flames begin.


Using ZFS though FUSE

Posted by: Anonymous [ip:] on June 19, 2008 02:41 PM
great it is.


Using ZFS though FUSE

Posted by: Anonymous [ip:] on June 19, 2008 04:25 PM
It's not clear from your post how much RAM is assigned to the VMWare guest and how much of it is actually free. Since ZFS is *very* memory-hungry it doesn't seem to be a good idea to use it on a machine with less than 2GB physical RAM.

IMHO ZFS is not worth of using anyway but if you'd be able to repeat the benchmark with much larger RAM + perhaps also a physical machine instead of virtualized guest it would be interesting to see the difference. Maybe you could also install OpenSolaris 2008.05 and try to run some benchmarks on the same amount of RAM as Linux+FUSE...


Using ZFS natively on Linux

Posted by: Anonymous [ip:] on June 19, 2008 07:16 PM
Silliness about licenses aside, is there anyone who's managed to make a native module or kernel patch that will allow ZFS to run without FUSE? I really couldn't care less about compatibility of licenses, and if the software works, then the software works. I gotta imagine someone's come up with a hack.


Using ZFS though FUSE

Posted by: Anonymous [ip:] on June 21, 2008 01:04 PM
ZFS is not as memory hungry as everyone's making out. What it does is like to use unused memory for caching purposes, which is a logical thing to do. But since that caching footprint shows in any diagnostic tools, everyone goes "Hurrrrrrr eatin' memories!" If your applications need memory, ZFS relinquishes as much of its ARC as necessary.

Something the FUSE version lacks is the ARC cache. Lack of this disables the prefetching functionality as well hampers the IO pipeline. If those things were enabled, you'd gain more performance out of it.


Using ZFS though FUSE

Posted by: Anonymous [ip:] on June 25, 2008 08:26 AM
"running ZFS through FUSE does not violate any licenses, because you are not linking CDDL and GPL code together"

Note that linking CDDL and GPL is not prohibited (at least from the GPL POV, I'm not that familiar with CDDL) for you as an end user.
You are not allowed to distribute the result though.

- Peder


