Network latency and bandwidth are the two metrics most likely to be of interest when you benchmark a network. Even though most service and product advertising focuses on bandwidth, at times the latency can be a more important metric. Here's a look at three projects that include tools to test your network performance: nepim "network pipemeter," LMbench, and nuttcp.
For this article I built each utility from source on a 64-bit Fedora 9 machine. I used nepim version 0.51, LMbench version 3, and nuttcp version 5.5.5.
For testing, I used a network link with two Gigabit Ethernet network interface cards configured for network bonding. As you'll see from the results, however, something obviously is not functioning correctly because I was unable to achieve the 2 gigabit theoretical bandwidth. The nepim and nuttcp benchmarks below show that communication from the server is faster than sending data to the server, which might be an effect of the network interface bonding.
nepim is packaged for openSUSE 11 as a 1-Click install but is not in the Fedora or Ubuntu repositories. It requires the liboop library, which is likewise packaged for openSUSE but not for Fedora or Ubuntu. You can install liboop using the usual ./configure; make; sudo make install procedure.
nepim does not use autotools to build. To compile, you change into the src directory and run make. I found that to compile nepim I had to include an additional define in the Makefile to avoid a duplicate definition of a data structure. The change is shown below, together with the installation, as there is no install target in the Makefile.
$ cd src
$ vi Makefile
...
ifeq ($(PLATFORM),Linux)
CFLAGS += -DHAVE_STDINT -DHAVE_SUSECONDS_T \
    -DHAVE_SIGHANDLER_T -DHAVE_IP_MREQN -DHAVE_IP_MREQ -DHAVE_INET6_S6_ADDR32 -DHAVE_GROUP_SOURCE_REQ
ifdef ENABLE_DLOPEN
LDFLAGS += -ldl
endif
endif
...
$ make
$ sudo install -m 555 nepim /usr/local/bin
If you invoke nepim without any command-line options it starts in server mode. In this mode nepim will listen on every interface on the system and accept both UDP and TCP connections. Running nepim with the -c option puts it into client mode and lets you specify one or more servers that nepim should connect to in order to benchmark the network link.
Starting nepim on the server is shown below, together with the output from the server when a client connects. The lines beginning with "6 cur" are printed as the benchmark is being performed. The final lines showing the mean (average), minimum, and maximum figures for kilobits per second in both directions and packets sent and received per second are printed when the client has disconnected.
$ nepim
nepim - network pipemeter - version 0.51
server: tcp_read=32768 tcp_write=32768 udp_read=4096 udp_write=4096
3: TCP socket listening on ::,1234
...
6: TCP incoming: ::ffff:192.168.10.200,38176
6: send=1 bit_rate=-1 pkt_rate=-1 interval=2 duration=10 delay=250000 ....
6: sweep write: floor=32768 ceil=32768
6: pmtud_mode=1 path_mtu=1500 mss=1448 tos=0 ttl=64 mcast_ttl=64 win_recv=87380 win_send=16384 sock_ka=1 nodelay=0
          kbps_in   kbps_out    rcv/s    snd/s
6 cur 8 273194.97  676940.88  2619.50  2583.00
6 cur 6 235308.55  722075.62  2435.50  2754.50
6 cur 4 223439.58  723386.38  2282.00  2759.50
6 cur 2 255724.64  702152.69  2346.00  2678.50
6 avg 0 246072.14  708041.75  2413.30  2701.10
6 min 0 223439.58  676940.88  2282.00  2583.00
6 max 0 273194.97  723386.38  2619.50  2759.50
write: errno=104: Connection reset by peer
write: connection lost on TCP socket 6
6: pmtud_mode=1 path_mtu=1500 mss=1448 tos=0 ttl=64 mcast_ttl=64 win_recv=603680 win_send=256360 sock_ka=1 nodelay=0
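If you want to compare several runs, the summary rows can be pulled out of the nepim output with a few lines of Python. This is only a sketch: the function name and the dictionary keys are made up for illustration, and it assumes the TCP row layout shown above (connection id, label, remaining seconds, then the four rate columns).

```python
def parse_nepim_summary(output):
    """Collect the avg/min/max rows of nepim's TCP output into a dict."""
    columns = ("kbps_in", "kbps_out", "rcv_per_s", "snd_per_s")  # hypothetical key names
    summary = {}
    for line in output.splitlines():
        fields = line.split()
        # Summary rows look like: "6 avg 0 246072.14 708041.75 2413.30 2701.10"
        if len(fields) >= 7 and fields[1] in ("avg", "min", "max"):
            summary[fields[1]] = dict(zip(columns, map(float, fields[3:7])))
    return summary


sample = """\
6 cur 2 255724.64 702152.69 2346.00 2678.50
6 avg 0 246072.14 708041.75 2413.30 2701.10
6 min 0 223439.58 676940.88 2282.00 2583.00
6 max 0 273194.97 723386.38 2619.50 2759.50
"""

stats = parse_nepim_summary(sample)
# Treating 1 kbps as 1000 bits per second, convert the average to Mbps.
print(round(stats["avg"]["kbps_out"] / 1000, 1), "Mbps out on average")
```

The UDP output carries extra leading columns, so it would need a slightly different field offset.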
By default, traffic goes only from the server to the client after the client connects to the server. You can change this by using the -s option to have the client send to the server, or the -d option for communication in both directions. The session on a client that connects to the above server is shown below.
$ nepim -c 192.168.10.200 -d
nepim - network pipemeter - version 0.51
client: tcp_read=32768 tcp_write=32768 write_floor=32768 write_ceil=32768 step=1
not a UNIX domain path: 192.168.10.200: errno=2: No such file or directory
...
3: TCP socket connected to 192.168.10.210,1234
3: sending: hello server_send=1 bit_rate=-1 pkt_rate=-1 stat_interval=2 ...
3: greetings sent to 192.168.10.210,1234
3: pmtud_mode=1 path_mtu=1500 mss=1448 tos=0 ttl=64 mcast_ttl=1 win_recv=87380 win_send=16384 sock_ka=1 nodelay=0
          kbps_in   kbps_out    rcv/s    snd/s
3 cur 8 675722.31  273269.25  2696.00  1086.00
3 cur 6 719693.06  235371.25  3278.50   953.50
3 cur 4 725370.31  223025.72  3067.50   898.50
3 cur 2 700528.75  255723.53  2785.00  1019.00
3 avg 0 706910.69  246072.14  2943.30   986.20
3 min 0 675722.31  223025.72  2696.00   898.50
3 max 0 725370.31  273269.25  3278.50  1086.00
3: pmtud_mode=1 path_mtu=1500 mss=1448 tos=0 ttl=64 mcast_ttl=1 win_recv=1688544 win_send=99000 sock_ka=1 nodelay=0
nepim: no event sink registered
nepim: done
If you start the nepim server using -U /tmp/nepim-socket it will use local domain stream sockets instead of TCP/IP networking. You supply the path to the socket to the -c option on the client to connect to this local socket for benchmarking. This is useful if you want to know how fast nepim can possibly communicate without having the network card slow things down.
Shown below are the figures for nepim run against a local domain socket on an Intel Q6600 quad-core CPU. The Q6600 can manage about 7 gigabits per second in each direction. The CPU was running at slightly over 50% capacity for the duration of the test, giving the nepim client and server a full CPU core each.
           kbps_in    kbps_out     rcv/s     snd/s
3 cur 8 7100432.50  7105806.50  27203.50  27106.50
3 cur 6 7268335.50  7266631.50  27887.00  27720.00
3 cur 4 7105020.00  7108296.50  27196.00  27116.00
3 cur 2 7189823.50  7188644.00  27557.00  27422.50
3 avg 0 7154958.50  7156819.50  27413.10  27301.10
3 min 0 7100432.50  7105806.50  27196.00  27106.50
3 max 0 7268335.50  7266631.50  27887.00  27720.00
To run more than a single session at once, use the -n option when running the client and supply the number of connections you would like. When I used -n 2 in the local socket test, each stream achieved about 4 to 4.5 gigabits/second, so bandwidth was improved but not doubled.
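To put "improved but not doubled" into numbers, the arithmetic can be sketched as follows. The helper name is invented for illustration, the 4 to 4.5 gigabit figures are the approximate per-stream range quoted above, and kbps is assumed to mean 1000 bits per second.

```python
def speedup(per_stream_gbps, streams, single_stream_kbps):
    """Aggregate multi-stream bandwidth relative to one stream (1 kbps = 1000 bit/s assumed)."""
    single_gbps = single_stream_kbps / 1e6
    return per_stream_gbps * streams / single_gbps


SINGLE_AVG_KBPS = 7154958.50  # "avg" kbps_in of the single-connection local socket run

for per_stream in (4.0, 4.5):  # approximate per-stream range seen with -n 2
    ratio = speedup(per_stream, 2, SINGLE_AVG_KBPS)
    print(f"{per_stream * 2:.1f} Gbps aggregate is {ratio:.2f}x the single stream")
```

Under these assumptions the second stream adds roughly 12% to 26% on top of the single-stream figure.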
When you run the nepim client with the -u option, it will use UDP instead of TCP for communications. The output for UDP includes statistics about the number of packets that were lost, as shown below.
$ nepim -c 192.168.10.200 -d -u
...
              kbps_in   kbps_out     rcv/s     snd/s   loss    ooo  LOST
3 0 0 cur 8 595738.62  808632.31  18180.50  24677.50  .0495  .0262  1894
3 0 0 cur 6 505872.38  868532.25  15438.00  26505.50  .0090  .0050  2174
3 0 0 cur 4 585842.69  825393.12  17878.50  25189.00  .0177  .0097  2817
3 0 0 cur 2 563150.88  872955.88  17186.00  26640.50  .0232  .0115  3633
3 0 0 avg 0 546350.69  866831.56  16673.30  26453.60  .0389  .0190  6749
3 0 0 min 0 505872.38  808632.31  15438.00  24677.50  .0090  .0050
3 0 0 max 0 595738.62  872955.88  18180.50  26640.50  .0495  .0262
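The meaning of the loss columns is not spelled out in the output itself. One reading that fits the numbers above is that LOST is a running count of lost packets and loss is the fraction lost / (received + lost) for the reporting period. The sketch below checks that reading against two rows; it is an inference from this one run, not something taken from the nepim documentation, and the function name is hypothetical.

```python
def loss_fraction(lost_packets, received_per_second, seconds):
    """Fraction of packets lost, given a lost count and a receive rate over a period."""
    received = received_per_second * seconds
    return lost_packets / (received + lost_packets)


# First 2-second interval: LOST=1894, rcv/s=18180.50, reported loss .0495
first_interval = loss_fraction(1894, 18180.50, 2)

# Whole 10-second run ("avg" row): LOST=6749, rcv/s=16673.30, reported loss .0389
whole_run = loss_fraction(6749, 16673.30, 10)

print(f"first interval: {first_interval:.4f}, whole run: {whole_run:.4f}")
```

Both values agree with the loss column to four decimal places, which suggests about 5% of packets were lost in the first interval and just under 4% over the whole run.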
nepim is great for seeing what the bandwidth is in both directions between two machines. Being able to test send, receive, and bidirectional throughput speeds lets you see if you are having issues in only one direction. The UDP tests can also show you how many packets are being lost so you can see whether connecting a switch between two hosts leads to more packets being lost or not.
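When scripting a series of runs, the client options discussed so far can be assembled programmatically. The following is a minimal sketch that uses only the flags described in this article (-c, -s, -d, -u, -n); the function and its parameter names are made up for illustration, and the flags have not been checked against nepim versions other than 0.51.

```python
def nepim_client_command(server, direction="receive", udp=False, connections=1):
    """Build an argument list for a nepim client run.

    direction: "receive" (default, server sends to client), "send" (-s) or "both" (-d).
    """
    direction_flags = {"receive": [], "send": ["-s"], "both": ["-d"]}
    command = ["nepim", "-c", server] + direction_flags[direction]
    if udp:
        command.append("-u")  # UDP instead of TCP
    if connections > 1:
        command += ["-n", str(connections)]  # several sessions at once
    return command


# Bidirectional UDP test against the server used in this article.
print(" ".join(nepim_client_command("192.168.10.200", direction="both", udp=True)))
```

The resulting list can be handed to subprocess.run() on a machine where nepim is installed.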
Next we'll take a look at LMbench, which includes many tools to benchmark network, memory, filesystem, and other system components' performance. For this article I'll explore only the network-related benchmarks.
LMbench is packaged for Ubuntu but not for Fedora or openSUSE. It does not use autotools to build. Once you have run make to generate the executables, they can be executed directly from where they are created. The build procedure is shown below:
$ tar xzvf /.../lmbench3.tar.gz
$ cd ./lmbench*
$ make -k
$ cd ./bin/x86_64-linux-gnu
$ ls
bw_file_rd  lat_connect    lat_proc          lib_mem.o     loop_o
bw_mem      lat_ctx        lat_rpc           lib_sched.o   memsize
bw_mmap_rd  lat_fcntl      lat_select        lib_stats.o   mhz
bw_pipe     lat_fifo       lat_sem           lib_tcp.o     msleep
bw_tcp      lat_fs         lat_sig           lib_timing.o  par_mem
bw_unix     lat_http       lat_syscall       lib_udp.o     par_ops
disk        lat_mem_rd     lat_tcp           lib_unix.o    stream
enough      lat_mmap       lat_udp           line          timing_o
flushdisk   lat_ops        lat_unix          lmbench.a     tlb
getopt.o    lat_pagefault  lat_unix_connect  lmdd
hello       lat_pipe       lib_debug.o       lmhttp
To measure the TCP bandwidth between two hosts, start bw_tcp -s on the server and bw_tcp servername on the client. The client session is shown below:
$ ./bw_tcp 192.168.10.210
0.065536 88.32 MB/sec
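bw_tcp reports megabytes per second, whereas nepim and nuttcp report bits per second, so a conversion helps when comparing them. The sketch below assumes that the first column is the message size in megabytes and that LMbench's MB is 10^6 bytes; the value 0.065536 (65536 bytes) is consistent with that reading, but it is an inference rather than a documented fact. The function name is hypothetical.

```python
def parse_bw_tcp(line):
    """Split a bw_tcp result line into (message size in MB, throughput in MB/sec)."""
    size_mb, rate_mb_per_s = (float(field) for field in line.split()[:2])
    return size_mb, rate_mb_per_s


size_mb, rate_mb_per_s = parse_bw_tcp("0.065536 88.32 MB/sec")

# 8 bits per byte; with MB = 10^6 bytes the result is in units of 10^6 bits/s.
rate_mbps = rate_mb_per_s * 8
print(f"{size_mb * 1e6:.0f}-byte messages, about {rate_mbps:.1f} Mbps")
```

Under that assumption the result is roughly 707 Mbps, in line with the faster direction of the nepim TCP test.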
Many benchmarks in the LMbench suite follow the same pattern as bw_tcp shown above. That is, you start a server by supplying -s as the sole argument and run clients by passing the host name or IP address of the server. Shown below are the TCP and UDP latency tests, as well as a test of the latency to simply complete a TCP/IP connection over the network. These clients offer very few options to allow you to experiment with different network queue sizes and other variables that might affect performance.
$ ./lat_tcp 192.168.10.210
TCP latency using 192.168.10.210: 685.9002 microseconds
$ ./lat_udp 192.168.10.210
UDP latency using 192.168.10.210: 1378.2743 microseconds
$ ./lat_connect 192.168.10.210
TCP/IP connection cost to 192.168.10.210: 185.5484 microseconds
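Because each latency tool prints a single "label: value microseconds" line, the results are easy to collect for before-and-after comparisons. A minimal sketch, assuming the output format shown above (the function name is made up for illustration):

```python
import re

# Matches lines such as "TCP latency using 192.168.10.210: 685.9002 microseconds".
LATENCY_LINE = re.compile(r"^(.*?):\s*([0-9.]+) microseconds$", re.MULTILINE)


def parse_lmbench_latency(output):
    """Map each result label to its latency in microseconds."""
    return {label: float(value) for label, value in LATENCY_LINE.findall(output)}


session = """\
$ ./lat_tcp 192.168.10.210
TCP latency using 192.168.10.210: 685.9002 microseconds
$ ./lat_udp 192.168.10.210
UDP latency using 192.168.10.210: 1378.2743 microseconds
$ ./lat_connect 192.168.10.210
TCP/IP connection cost to 192.168.10.210: 185.5484 microseconds
"""

latencies = parse_lmbench_latency(session)
for label, microseconds in latencies.items():
    print(f"{label}: {microseconds / 1000:.3f} ms")
```

Command lines in the transcript are skipped because they do not end in "microseconds".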
The LMbench network tests have few options but do provide you with an easy way to measure your current network bandwidth and latency. When you are changing the kernel module options for your current NIC or replacing a NIC with a new one, LMbench provides a quick test to see how much your latency has improved.
Finally, we'll take a look at nuttcp, which includes many options for tweaking buffer lengths, nodelay options, and type of service fields to see what impact this has on your network performance. nuttcp can show either overall bandwidth or the bandwidth achieved in the last second.
nuttcp is available in the Fedora 9 repositories but not for openSUSE or Ubuntu. Build and installation are shown below:
$ tar xjvf nuttcp-5.5.5.tar.bz2
$ cd ./nuttcp*
$ cc -O3 -o nuttcp nuttcp-5.5.5.c
$ strip nuttcp
$ sudo install -m 555 nuttcp /usr/local/bin/
Start the server using nuttcp -S. The client can be invoked with many options, followed by the server host name(s) at the end of the command line. The test below prints the bandwidth every second (specified by the -i1 option) while the test is running and runs for 10 seconds before completing.
$ nuttcp -v -v -i1 192.168.10.210
nuttcp-t: v5.5.5: socket
nuttcp-t: buflen=65536, nstream=1, port=5001 tcp -> 192.168.10.210
nuttcp-t: time limit = 10.00 seconds
nuttcp-t: connect to 192.168.10.210 with mss=1448
nuttcp-t: send window size = 8192, receive window size = 43690
nuttcp-r: v5.5.5: socket
nuttcp-r: buflen=65536, nstream=1, port=5001 tcp
nuttcp-r: interval reporting every 1.00 second
nuttcp-r: accept from 192.168.0.200
nuttcp-r: send window size = 8192, receive window size = 43690
85.3719 MB / 1.00 sec = 715.9765 Mbps
86.3684 MB / 1.00 sec = 724.5411 Mbps
85.9188 MB / 1.00 sec = 720.7551 Mbps
84.4201 MB / 1.00 sec = 708.2533 Mbps
87.7772 MB / 1.00 sec = 736.2222 Mbps
86.7372 MB / 1.00 sec = 727.5696 Mbps
91.4327 MB / 1.00 sec = 767.0191 Mbps
89.4166 MB / 1.00 sec = 750.2228 Mbps
85.4859 MB / 1.00 sec = 717.0937 Mbps
87.0377 MB / 1.00 sec = 729.9696 Mbps
nuttcp-t: 870.1633 MB in 10.00 real seconds = 89091.75 KB/sec = 729.8396 Mbps
nuttcp-t: 13923 I/O calls, msec/call = 0.74, calls/sec = 1392.10
nuttcp-t: 0.0user 22.3sys 0:10real 224% 0i+0d 0maxrss 0+3pf 16198+1383csw
nuttcp-r: 870.1633 MB in 10.00 real seconds = 89083.52 KB/sec = 729.7722 Mbps
nuttcp-r: 55254 I/O calls, msec/call = 0.19, calls/sec = 5524.09
nuttcp-r: 0.0user 6.7sys 0:10real 67% 0i+0d 0maxrss 0+20pf 62619+635csw
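The per-second lines are convenient to post-process, and the summary line shows how nuttcp's units relate to each other. The sketch below extracts the interval rates and reproduces the Mbps figure from the KB/sec figure; it assumes KB means 1024 bytes and Mbps means 10^6 bits per second, which is what the numbers in this run imply. The helper names are hypothetical.

```python
import re

# Matches interval lines such as "85.3719 MB / 1.00 sec = 715.9765 Mbps".
INTERVAL_LINE = re.compile(r"([0-9.]+) MB /\s+([0-9.]+) sec =\s+([0-9.]+) Mbps")


def interval_rates(output):
    """Return the per-interval throughput values in Mbps."""
    return [float(match.group(3)) for match in INTERVAL_LINE.finditer(output)]


def kb_per_sec_to_mbps(kb_per_sec):
    """Convert KB/sec (1024-byte KB assumed) to Mbps (10^6 bit/s assumed)."""
    return kb_per_sec * 1024 * 8 / 1e6


sample = """\
85.3719 MB / 1.00 sec = 715.9765 Mbps
86.3684 MB / 1.00 sec = 724.5411 Mbps
85.9188 MB / 1.00 sec = 720.7551 Mbps
nuttcp-t: 870.1633 MB in 10.00 real seconds = 89091.75 KB/sec = 729.8396 Mbps
"""

rates = interval_rates(sample)
print("slowest interval:", min(rates), "Mbps; fastest:", max(rates), "Mbps")
print("summary check:", round(kb_per_sec_to_mbps(89091.75), 4), "Mbps")
```

The summary line is not picked up by the interval pattern, so it does not skew the per-second statistics.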
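You can also run multiple streams at once; use -N3 to start three connections, for example. The runs below also pass -B and -D. According to nuttcp's usage text, these options do not choose the direction of the test: -B is a receiver-side option (only output full blocks of the size given with -l) and -D is a transmitter-side option (do not buffer TCP writes, which sets TCP_NODELAY). The direction is selected with -t (the client transmits, which is the default) or -r (the client receives).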
$ nuttcp -v -v -N3 -B 192.168.10.210
nuttcp-t: v5.5.5: socket
nuttcp-t: buflen=65536, nstream=3, port=5001 tcp -> 192.168.10.210
nuttcp-t: time limit = 10.00 seconds
nuttcp-t: connect to 192.168.10.210 with mss=1448
nuttcp-t: send window size = 8192, receive window size = 43690
nuttcp-t: 1239.8698 MB in 10.00 real seconds = 126944.75 KB/sec = 1039.9314 Mbps
nuttcp-t: 19838 I/O calls, msec/call = 0.52, calls/sec = 1983.52
nuttcp-t: 0.0user 41.2sys 0:10real 413% 0i+0d 0maxrss 0+3pf 4758+3081csw
nuttcp-r: v5.5.5: socket
nuttcp-r: buflen=65536, nstream=3, port=5001 tcp
nuttcp-r: accept from 192.168.0.200
nuttcp-r: send window size = 8192, receive window size = 43690
nuttcp-r: 1239.8698 MB in 10.00 real seconds = 126934.93 KB/sec = 1039.8509 Mbps
nuttcp-r: 29899 I/O calls, msec/call = 0.34, calls/sec = 2989.25
nuttcp-r: 0.0user 8.5sys 0:10real 86% 0i+0d 0maxrss 0+18pf 12519+1847csw
$ nuttcp -v -v -N3 -D 192.168.10.210
...
nuttcp-r: v5.5.5: socket
nuttcp-r: buflen=65536, nstream=3, port=5001 tcp
nuttcp-r: accept from 192.168.0.200
nuttcp-r: send window size = 8192, receive window size = 43690
nuttcp-r: 806.2317 MB in 10.00 real seconds = 82545.65 KB/sec = 676.2140 Mbps
nuttcp-r: 67104 I/O calls, msec/call = 0.15, calls/sec = 6709.39
nuttcp-r: 0.0user 5.7sys 0:10real 57% 0i+0d 0maxrss 0+18pf 73018+378csw
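To relate these figures to the single-stream run and to the bonded link, a quick calculation is enough. The numbers are copied from the nuttcp-t summary lines above, and the 2000 Mbps figure is the theoretical capacity of the two bonded Gigabit Ethernet ports mentioned at the start of the article; the variable names are only illustrative.

```python
single_stream_mbps = 729.8396   # nuttcp-t summary of the one-stream -i1 run
three_stream_mbps = 1039.9314   # nuttcp-t summary of the -N3 -B run
bonded_link_mbps = 2000.0       # two bonded Gigabit Ethernet ports, theoretical

scaling = three_stream_mbps / single_stream_mbps
utilisation = three_stream_mbps / bonded_link_mbps

print(f"3 streams: {scaling:.2f}x one stream, {utilisation:.0%} of the bonded link")
```

Three streams therefore bring the total to a little over half of what the bonded pair could carry in theory.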
nuttcp provides similar options to nepim and is heavily focused on measuring the network bandwidth between hosts. Both tools can print bandwidth statistics while the test is taking place: nuttcp every second when run with -i1, and nepim by default (every two seconds in the runs shown earlier). The nuttcp man page shows many options for the type of service and buffer sizes that you can explicitly set when running nuttcp so you can see whether your particular hardware and drivers perform poorly in certain configurations. Running nepim --help will show many more options for configuring the buffers, window sizes, and TCP options.
It is a matter of personal convenience whether you should use nuttcp or nepim. Since nepim is packaged for openSUSE and nuttcp is packaged for Fedora, it might boil down to what distribution you are running as to which of these two tools to use.
Both nepim and nuttcp provide options for setting the size of network packet queues and other more advanced options, such as the TCP maximum segment size, in order to improve the network performance by changing the software setup at each end. Meanwhile, the LMbench tests are quick to run and provide useful insight into your available bandwidth and latencies on your network link.
Ben Martin has been working on filesystems for more than 10 years. He completed his Ph.D. and now offers consulting services focused on libferris, filesystems, and search solutions.