Feature: Networking

Benchmarking NFSv3 vs. NFSv4 file operation performance

By Ben Martin on June 20, 2008 (9:00:00 AM)

NFS version 4, published in April 2003, introduced stateful client-server interaction and "file delegation," which allows a client to gain temporary exclusive access to a file on a server. NFSv4 brings security improvements such as RPCSEC_GSS, the ability to send multiple operations to the server at once, new file attributes, replication, client-side caching, and improved file locking. Although NFSv4 improves on previous versions in a number of ways, this article investigates just one of them -- performance.

One issue with migrating to NFSv4 is that all of the filesystems you export have to be located under a single top-level exported directory. This means you have to change your /etc/exports file and also use Linux bind mounts to mount the filesystems you wish to export under your single top-level NFSv4 exported directory. Because the manner in which filesystems are exported in NFSv4 requires fairly large changes to system configuration, many folks might not have upgraded from NFSv3. This administration work is covered in other articles. This article provides performance benchmarks of NFSv3 against NFSv4 so you can get an idea of whether your network filesystem performance will be better after the migration.
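As a rough illustration of the layout described above, the following minimal sketch generates the bind-mount and export lines for a single NFSv4 top-level directory. The export root `/export`, the source filesystems, and the client network are hypothetical examples, and the export options shown are one common choice rather than the configuration used for these benchmarks.

```python
# Sketch: generate /etc/fstab and /etc/exports lines for an NFSv4 export root.
# All paths and the client network used here are hypothetical examples.

def nfs4_export_plan(root, filesystems, clients):
    """Return (fstab_lines, exports_lines) for exporting `filesystems` under `root`."""
    fstab_lines = []
    # fsid=0 marks the single top-level directory that NFSv4 clients see as "/".
    exports_lines = [f"{root} {clients}(rw,fsid=0,no_subtree_check,sync)"]
    for source in filesystems:
        name = source.strip("/").replace("/", "_")
        target = f"{root}/{name}"
        # A bind mount makes the existing filesystem appear under the export root.
        fstab_lines.append(f"{source} {target} none bind 0 0")
        # nohide lets clients descend into the filesystem mounted below the root.
        exports_lines.append(f"{target} {clients}(rw,nohide,no_subtree_check,sync)")
    return fstab_lines, exports_lines


if __name__ == "__main__":
    fstab, exports = nfs4_export_plan("/export", ["/home", "/srv/data"], "192.168.0.0/24")
    print("# /etc/fstab additions")
    print("\n".join(fstab))
    print("# /etc/exports")
    print("\n".join(exports))
```

The generated lines would still need to be reviewed by hand before being added to a real system, since the right options depend on the site.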

I ran these performance tests using an Intel Q6600-based server with 8GB of RAM. The client was an AMD X2 with 2GB of RAM. Both machines were using Intel gigabit PCIe EXPI9300PT NICs, and the network between the two machines had virtually zero traffic on it for the duration of the benchmarks. The NICs provide a very low latency network, as described in a past article. While testing performance for this article I ran each benchmark multiple times to ensure performance was repeatable. The difference in RAM between the two machines changes how Bonnie++ is run by default. On the server, I ran the test using 16GB files, and on the client, 4GB files. Both machines were running 64-bit Fedora 9.
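The file sizes above follow from Bonnie++ defaulting to a test file twice the size of RAM. A minimal sketch of that arithmetic, building an explicit command line instead of relying on the default, might look like the following. The target directory `/mnt/nfs` is a hypothetical mount point; `-d`, `-s`, and `-r` are the Bonnie++ options for the test directory, file size in megabytes, and RAM size in megabytes.

```python
# Sketch: derive the Bonnie++ test file size from the amount of RAM.
# The directory passed in is a hypothetical example.

def bonnie_argv(directory, ram_mb):
    """Build a bonnie++ command line using a test file twice the size of RAM."""
    size_mb = 2 * ram_mb  # twice RAM, so the page cache cannot hold the whole file
    return ["bonnie++", "-d", directory, "-s", str(size_mb), "-r", str(ram_mb)]


if __name__ == "__main__":
    # 8GB server and 2GB client, as in the test setup described above.
    for label, ram_mb in (("server", 8 * 1024), ("client", 2 * 1024)):
        print(label, " ".join(bonnie_argv("/mnt/nfs", ram_mb)))
```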

The filesystem exported from the server was an ext3 filesystem created on a RAID-5 over three 500GB hard disks. The exported filesystem was 60GB in size. The stripe_cache_size was 16384, meaning that for a three-disk RAID array, 192MB of RAM was used to cache pages at the RAID level. Default cache sizes for distributions might be in the 3-4MB range for the same RAID array. Using a larger cache directly improves write performance of the RAID. I also ran benchmarks locally on the server without using NFS to get an idea of the theoretical maximum performance NFS could achieve.
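The 192MB figure can be checked with a little arithmetic: the RAID stripe cache holds one page per member disk for each cache entry. The sketch below assumes a 4KB page size and, for comparison, a default stripe_cache_size of 256 entries; both are assumptions rather than values stated above.

```python
# Sketch: memory used by the md RAID stripe cache.
PAGE_SIZE = 4096  # bytes; assumed page size on x86-64


def stripe_cache_bytes(stripe_cache_size, member_disks, page_size=PAGE_SIZE):
    """One page per member disk is cached for every stripe cache entry."""
    return stripe_cache_size * member_disks * page_size


if __name__ == "__main__":
    # 256 is an assumed default; 16384 is the value used for these benchmarks.
    for entries in (256, 16384):
        mib = stripe_cache_bytes(entries, 3) / 2**20
        print(f"stripe_cache_size={entries}: {mib:.0f} MiB")
```

With three disks this gives 3MB for the assumed default and 192MB for a stripe_cache_size of 16384.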

Some readers may point out that RAID-5 is not a desirable configuration, and certainly running it on only three disks is not a typical configuration. However, the relative performance of NFSv3 to NFSv4 is our main point of interest. I used a three-disk RAID-5 because it had a filesystem that could be recreated for the benchmark. Recreating the filesystem removes factors such as file fragmentation that can adversely affect performance.

I tested NFSv3 with and without the async option. The async option allows the NFS server to respond to a write request before the data is actually on disk. The NFS protocol normally requires the server to ensure data has been written to storage successfully before replying to the client. Depending on your needs, you might be running mounts with the async option on some filesystems for the performance improvement it offers, though you should be aware of what async implies for data integrity: in particular, potentially undetectable data loss if the NFS server crashes.
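For reference, the four NFS configurations in the tables below can be written as mount invocations. This is a minimal sketch that assumes the option strings are passed with `mount -o` exactly as listed in the tables and that NFSv4 is mounted with the `nfs4` filesystem type; the server name, export paths, and mount point are hypothetical.

```python
# Sketch: the four benchmarked configurations as mount command lines.
# Server name, export paths, and mount point are hypothetical examples.
CONFIGS = {
    "NFSv3": ("nfs", "noatime,nfsvers=3"),
    "NFSv3 async": ("nfs", "noatime,nfsvers=3,async"),
    "NFSv4": ("nfs4", "noatime"),
    "NFSv4 async": ("nfs4", "noatime,async"),
}


def mount_argv(fstype, options, remote, mountpoint):
    """Build the argument list for one mount invocation."""
    return ["mount", "-t", fstype, "-o", options, remote, mountpoint]


if __name__ == "__main__":
    for label, (fstype, options) in CONFIGS.items():
        # NFSv4 paths are relative to the single top-level exported directory.
        remote = "server:/bench" if fstype == "nfs4" else "server:/export/bench"
        print(f"{label}: {' '.join(mount_argv(fstype, options, remote, '/mnt/bench'))}")
```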

The table below shows the Bonnie++ input, output, and seek benchmarks for the NFSv3 and NFSv4 mounts, as well as the same benchmark run locally on the server. As expected, read performance is almost identical with or without the async option. You can perform more than five times as many "seeks" over NFS when using the async option, presumably because the server can skip some of them when a subsequent seek is issued before the initial one has completed. Unfortunately, block sequential output for NFSv4 is no better than for NFSv3. Without the async option, output was about 50MB/sec, whereas the local filesystem managed about 91MB/sec. With the async option, sequential block output over the NFS mount came much closer to local disk speed.

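| Configuration | Seq. output, per char (K/sec) | % CPU | Seq. output, block (K/sec) | % CPU | Seq. output, rewrite (K/sec) | % CPU | Seq. input, per char (K/sec) | % CPU | Seq. input, block (K/sec) | % CPU | Random seeks (/sec) | % CPU |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| local filesystem | 62340 | 94 | 91939 | 22 | 45533 | 19 | 43046 | 69 | 109356 | 32 | 239.2 | 0 |
| NFSv3 noatime,nfsvers=3 | 50129 | 86 | 47700 | 6 | 35942 | 8 | 52871 | 96 | 107516 | 11 | 1704 | 4 |
| NFSv3 noatime,nfsvers=3,async | 59287 | 96 | 83729 | 10 | 48880 | 12 | 52824 | 95 | 107582 | 10 | 9147 | 30 |
| NFSv4 noatime | 49864 | 86 | 49548 | 5 | 34046 | 8 | 52990 | 95 | 108091 | 10 | 1649 | 4 |
| NFSv4 noatime,async | 58569 | 96 | 85796 | 10 | 49146 | 10 | 52856 | 95 | 108247 | 11 | 9135 | 21 |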
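The ratios quoted in the discussion can be rechecked from the block-output and seek columns. The sketch below copies those figures as I read them from the table above; where adjacent columns run together, the split between them is an assumption.

```python
# Sketch: recompute the ratios discussed in the text.
# Block output is in K/sec and seeks in /sec, as read from the table above.
RESULTS = {
    "local": {"block_out": 91939, "seeks": 239.2},
    "nfs3": {"block_out": 47700, "seeks": 1704},
    "nfs3_async": {"block_out": 83729, "seeks": 9147},
    "nfs4": {"block_out": 49548, "seeks": 1649},
    "nfs4_async": {"block_out": 85796, "seeks": 9135},
}


def ratio(metric, config, baseline):
    """Return how many times larger `config` is than `baseline` for `metric`."""
    return RESULTS[config][metric] / RESULTS[baseline][metric]


if __name__ == "__main__":
    for version in ("nfs3", "nfs4"):
        print(f"{version}: async seek speedup {ratio('seeks', version + '_async', version):.1f}x")
        print(f"{version}: block output at {ratio('block_out', version, 'local'):.0%} of local")
        print(f"{version}: async block output at {ratio('block_out', version + '_async', 'local'):.0%} of local")
```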

The table below shows the Bonnie++ benchmarks for file creation, read, and deletion. Notice that the async option has a tremendous impact on file creation and deletion.

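| Configuration | Seq. create (/sec) | % CPU | Seq. read (/sec) | % CPU | Seq. delete (/sec) | % CPU | Random create (/sec) | % CPU | Random read (/sec) | % CPU | Random delete (/sec) | % CPU |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| NFSv3 noatime,nfsvers=3 | 186 | 0 | 6122 | 10 | 182 | 0 | 186 | 0 | 6604 | 10 | 181 | 0 |
| NFSv3 noatime,nfsvers=3,async | 3031 | 10 | 8789 | 11 | 3078 | 9 | 2921 | 11 | 11271 | 13 | 3069 | 9 |
| NFSv4 noatime | 98 | 0 | 6005 | 13 | 193 | 0 | 93 | 0 | 6520 | 11 | 192 | 0 |
| NFSv4 noatime,async | 1314 | 8 | 7155 | 13 | 5350 | 12 | 1298 | 8 | 7537 | 12 | 5060 | 9 |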
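To put numbers on the effect of async, the sequential create and delete rates can be compared directly. This sketch uses the figures as I read them from the table above (files per second); the column boundaries are an assumption where the figures run together.

```python
# Sketch: compare sequential file create and delete rates (files per second).
CREATE = {"nfs3": 186, "nfs3_async": 3031, "nfs4": 98, "nfs4_async": 1314}
DELETE = {"nfs3": 182, "nfs3_async": 3078, "nfs4": 193, "nfs4_async": 5350}


def speedup(table, config, baseline):
    """Return the rate of `config` divided by the rate of `baseline`."""
    return table[config] / table[baseline]


if __name__ == "__main__":
    for version in ("nfs3", "nfs4"):
        print(f"{version}: async create speedup {speedup(CREATE, version + '_async', version):.1f}x")
        print(f"{version}: async delete speedup {speedup(DELETE, version + '_async', version):.1f}x")
    print(f"NFSv4 create rate relative to NFSv3: {speedup(CREATE, 'nfs4', 'nfs3'):.2f}")
    print(f"NFSv4 delete rate relative to NFSv3: {speedup(DELETE, 'nfs4', 'nfs3'):.2f}")
```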

To test more day-to-day performance I extracted the linux- uncompressed Linux kernel source tarball and then deleted the extracted sources. Note that the original source tarball was not compressed in order to ensure that the CPU of the client was not slowing down extraction.

| Configuration | Find (m:ss) | Remove (m:ss) |
|---|---|---|
| local filesystem | 0:01 | 0:03 |
| NFSv3 noatime,nfsvers=3 | 9:44 | 2:36 |
| NFSv3 noatime,nfsvers=3,async | 0:31 | 0:10 |
| NFSv4 noatime | 9:52 | 2:27 |
| NFSv4 noatime,async | 0:40 | 0:08 |
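A test of this kind can be reproduced with a small timing harness. The sketch below is an assumption about how such a measurement could be made, not the procedure used for the figures above: it uses Python's `tarfile` and `shutil` modules rather than the `tar` and `rm` commands, and the tarball and working directory are supplied by the caller.

```python
# Sketch: time the extraction of an uncompressed tarball and the removal of
# the extracted tree. Paths are supplied by the caller and are hypothetical.
import shutil
import tarfile
import time
from pathlib import Path


def time_extract_and_remove(tarball, workdir):
    """Return (extract_seconds, remove_seconds) for one extract/delete cycle."""
    target = Path(workdir) / "extracted"
    target.mkdir(parents=True, exist_ok=True)

    start = time.perf_counter()
    with tarfile.open(tarball) as archive:
        archive.extractall(target)
    extract_seconds = time.perf_counter() - start

    start = time.perf_counter()
    shutil.rmtree(target)
    remove_seconds = time.perf_counter() - start
    return extract_seconds, remove_seconds


def as_m_ss(seconds):
    """Format a duration in the m:ss style used in the table above."""
    minutes, secs = divmod(int(round(seconds)), 60)
    return f"{minutes}:{secs:02d}"
```

Pointing `workdir` at a directory on the NFS mount and running the function once per configuration would give timings comparable in form to the table, though client-side caching between runs would need to be controlled for.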

Wrap up

These tests show no clear performance advantage to moving from NFSv3 to NFSv4.

NFSv4 file creation is actually about half the speed of file creation over NFSv3, but NFSv4 can delete files faster than NFSv3. By far the largest speed gains come from running with the async option on, though using it can lead to data loss if the NFS server crashes or is rebooted.

Ben Martin has been working on filesystems for more than 10 years. He completed his Ph.D. and now offers consulting services focused on libferris, filesystems, and search solutions.

Comments on Benchmarking NFSv3 vs. NFSv4 file operation performance

Benchmarking NFSv3 vs. NFSv4 file operation performance

Posted by: Johannes Truschnigg on June 20, 2008 09:50 AM
If you compare NFSv2/3 and NFSv4's performance when transferring large files over GBit links, you'll be nothing less than staggered by what margin NFSv4 (with "rsize=32768,wsize=32768" as mount-time options) manages to beat its predecessor. NFSv4 offers substantial benefit for my particular usage. Window-size is an important factor for NFSv2/3-filesystems, too.

You can further tweak network filesystem performance by reducing the number of interrupts your NIC generates: Intel Pro/1000 networking hardware, for example, offers a driver option to use rx-polling (CONFIG_E1000_NAPI=Y), which sped up my NFSv4 setup at home from ~85Mb/sec to as much as ~115Mb/s when transferring large files.


Async win

Posted by: Anonymous on June 23, 2008 09:44 PM
Is there any reason to not run with the async flag if the mounting clients are read only?


Benchmarking NFSv3 vs. NFSv4 file operation performance

Posted by: Anonymous on June 24, 2008 09:34 PM
I am able to achieve the write speeds here (near wire speed) but am getting between 30-50MB/s when reading with bonnie++ on the NFS client. I have been through every NFS and TCP tweak fathomable (and have tried UDP as well). The NFS server's OS can easily read 200MB/s or more, but this is not being translated to the NFS client. Any ideas or things to check?

On a side note, under TCP I'm noticing the NFS server's TCP window is being advertised lower than a single MTU (always 501 bytes or lower). This seems related, but no amount of TCP tweaking could increase it. Both the client and host are CentOS 5 x86_64.

