Have you ever wanted to deliberately slow down your network to test your web service over a flaky connection? You can do it with netem, the Network Emulator. It’s built right into the Linux kernel.
It’s nice to develop something locally, throw it behind Apache or NGINX and hit http://localhost:24242 to test it out. But what if you need to simulate a network that isn’t near-zero latency and 30-odd Gbps?
You have a couple of options:
- Forward some ports and phone a friend
- Host it on a cloud provider in a remote region
- Get a second computer, tether it through your phone’s data plan, and test it that way (I know you have done this)
With netem, you don’t need a phone, a second computer, a cloud host, or even friends.
Since this is intended to be more of a field guide, I’m not going to go into the details of the Linux networking subsystem. There are entire books on that. But it’s helpful to have a general overview.
netem lives in the kernel’s networking stack. It’s configured with a tool called tc, for traffic control, which has shipped with Linux for at least the last decade as a component of the iproute2 suite of networking power tools.
Luckily, the dude who wrote netem also wrote a paper explaining what it is and how to use it all the way back in 2005. In brief, netem is one of a set of queuing disciplines which sit between the kernel TCP/IP stack and the hardware (or virtualized) network interface. The role of a queuing discipline is to decide how to prioritize packets leaving the stack destined for the interface.
From the aforementioned paper:
Most queuing disciplines attempt to solve the problem of deciding how to prioritize outgoing packets based on a set of rules in order to ensure quality of service.
netem “solves” this problem by limiting egress rate, delaying packets for random intervals and sending them in bursts, or doing other aggravating things that happen in real-world networks.
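Attaching netem to an interface by hand usually looks something like the sketch below. A few assumptions: eth0 is a made-up interface name (check yours with `ip link`), the real commands need root, and netem_cmd is a hypothetical helper that only prints the commands so you can read them before running anything.

```shell
#!/bin/sh
# Dry run: print the tc commands instead of executing them.
# DEV is an assumed interface name, not something from the article.
DEV="eth0"

# Hypothetical helper: $1 = netem options, e.g. "delay 60ms"
netem_cmd() {
    printf 'tc qdisc add dev %s root netem %s\n' "$DEV" "$1"
}

ADD_CMD="$(netem_cmd 'delay 60ms')"
SHOW_CMD="tc qdisc show dev $DEV"
DEL_CMD="tc qdisc del dev $DEV root"

echo "$ADD_CMD"    # attach netem as the root queuing discipline
echo "$SHOW_CMD"   # inspect what is currently attached
echo "$DEL_CMD"    # remove it again, restoring the default qdisc
```

Paste the printed lines into a root shell to apply them for real. Because netem sits on the way out of the stack, it only shapes traffic leaving that interface.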
I wrote a little framework using Docker Compose to demonstrate how netem affects both bandwidth and latency by bringing up two containers, adding the netem queuing discipline to one of the containerized network interfaces, then running ping and iperf3. We’ll be using it to demonstrate the havoc netem can simulate.
First, try running it without any options:
$ export NETEM=''
$ docker-compose up
You should see typical localhost performance: sub-millisecond latency and >10 Gbps bandwidth.
Let’s sprinkle some latency on our perfect network:
$ NETEM='delay 60ms' docker-compose up
You’ll notice that not only was latency affected, but bandwidth too. By a lot.
The reason is that TCP is a reliable delivery protocol. Specifically, the sender can only have a limited amount of unacknowledged data in flight, an amount known as the receive window, before it must stop and wait for the receiver to acknowledge receipt.
In the real world (or our simulation of it) where latency exists, those acknowledgements also take time to reach the other end, slowing the whole transmission down.
Take a look at these two plots, generated by Wireshark:
Within the first two seconds, the client scales up the TCP receive window to roughly 3 MiB, which is the full capacity of the client’s receive buffer (limited by the kernel parameter net.ipv4.tcp_rmem). You can see quite clearly the corresponding linear increase in throughput.
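To put rough numbers on that, here is a back-of-the-envelope sketch: with at most one receive window in flight per round trip, throughput tops out at about window size divided by round-trip time. The 60 ms round trip assumes the netem delay dominates, the 64 KiB starting window is just an illustrative value, and ceiling_bps is a hypothetical helper.

```shell
#!/bin/sh
# Rough ceiling on TCP throughput: at most one receive window per round trip.
# Hypothetical helper: $1 = window in bytes, $2 = round-trip time in ms.
ceiling_bps() {
    echo $(( $1 * 8 * 1000 / $2 ))
}

RTT_MS=60   # assumed round trip, dominated by the netem delay

SMALL_BPS="$(ceiling_bps 65536 "$RTT_MS")"                 # 64 KiB window (example value)
FULL_BPS="$(ceiling_bps $((3 * 1024 * 1024)) "$RTT_MS")"   # ~3 MiB window from the text

echo "64 KiB window: about $((SMALL_BPS / 1000000)) Mbit/s"
echo "3 MiB window:  about $((FULL_BPS / 1000000)) Mbit/s"
```

Even with the window fully opened, the ceiling sits in the hundreds of megabits per second, a long way from what localhost manages with no added delay.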
We can also add latency with some random jitter to make our network act even more like the real world by adding a second parameter to delay, like this:
$ NETEM='delay 60ms 10ms' docker-compose up
Now the ping results dance around 60 ms, give or take up to 10 ms.
There’s even a way to change the distribution curve of the latency filter for some seriously detailed testing. But I’ll leave that as an exercise for you, dear reader.
We can also directly throttle the throughput of our network by specifying a target rate:
$ NETEM='rate 5mbit' docker-compose up
iperf3-client_1 | [ 5] 4.00-5.00 sec 585 KBytes 4.80 Mbits/sec
iperf3-client_1 | [ 5] 5.00-6.00 sec 581 KBytes 4.76 Mbits/sec
iperf3-client_1 | [ 5] 6.00-7.00 sec 584 KBytes 4.78 Mbits/sec
...
Why aren’t we hitting exactly 5 megabits per second?
As discussed earlier, TCP is a pretty complicated protocol. Every data packet ships with headers which carry information like the source IP address and port, the destination, a timestamp, the aforementioned receive window (a 16-bit field, scaled by a factor negotiated when the connection opens), etc. All that extra stuff takes up space that the application, in this case iperf3, doesn’t see.
As far as netem is concerned, it is limiting the throughput to exactly 5 Mbps, but your application will see a bit less because of the TCP overhead.
There’s even a technical term for throughput minus TCP and link layer overhead: goodput.
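Here is a rough sketch of that arithmetic. None of the sizes below come from the article: they assume a 1500-byte MTU, IPv4 without options, a TCP header carrying the timestamp option, and that the 14-byte Ethernet header counts against the netem rate.

```shell
#!/bin/sh
# Assumed per-packet sizes in bytes (typical values, not measured here).
MTU=1500
ETH_HDR=14      # Ethernet header, assumed to count against the netem rate
IP_HDR=20       # IPv4 header without options
TCP_HDR=32      # 20-byte TCP header plus 12 bytes of timestamp option

RATE_KBPS=5000  # netem 'rate 5mbit'

PAYLOAD=$((MTU - IP_HDR - TCP_HDR))   # bytes the application actually sees
ON_WIRE=$((MTU + ETH_HDR))            # bytes charged against the rate limit
GOODPUT_KBPS=$((RATE_KBPS * PAYLOAD / ON_WIRE))

echo "payload ${PAYLOAD} of ${ON_WIRE} bytes -> about ${GOODPUT_KBPS} kbit/s of goodput"
```

Under those assumptions the estimate lands at roughly 4.78 Mbit/s, which is in line with the iperf3 numbers above.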
Honorable mention: UDP
If TCP is registered mail, UDP is carpet bombing. UDP doesn’t care if packets make it to their destination at all. It’s up to the application to handle corruption, duplication, reordering, and all the stuff TCP does on its own.
The advantage is that UDP is fast, both in terms of latency and throughput. That’s why UDP is used more often for transmitting things like lossy video streams, where a couple of bytes missing here and there won’t do too much harm, and for applications that encapsulate other traffic that already has error correction, like an encrypted VPN tunnel carrying TCP.
Also, if Google gets its way, a new UDP-based application protocol called QUIC will become the future of the internet. QUIC will move all the complicated congestion control and error correction stuff now done by the operating system to the application level, giving developers more control over the networking stack.
We can run iperf3 in UDP mode, but we have to specify the bandwidth at the client, because with UDP the sender doesn’t have the slightest idea whether the receiver is getting any data at all. Carpet bomb.
With iperf3 in UDP mode sending at 10 Mbps and the netem rate set to 5 Mbps, we can demonstrate packet loss.
iperf3-client_1 | [ ID] Interval Transfer Bitrate Jitter Lost/Total Datagrams
iperf3-client_1 | [ 5] 5.00-6.00 sec 592 KBytes 4.85 Mbits/sec 0.609 ms 443/862 (51%)
iperf3-client_1 | [ 5] 6.00-7.00 sec 594 KBytes 4.87 Mbits/sec 0.568 ms 444/864 (51%)
iperf3-client_1 | [ 5] 7.00-8.00 sec 592 KBytes 4.85 Mbits/sec 0.619 ms 444/863 (51%)
Right on the money.
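Why 51% and not a clean 50%? A quick sketch of the math, with some guesses baked in: the 1448-byte datagram payload is inferred from the roughly 862 datagrams per second in the output above, and the UDP, IPv4, and Ethernet header sizes are assumed to count against the netem rate.

```shell
#!/bin/sh
SEND_BPS=10000000          # iperf3 UDP target rate, 10 Mbit/s
LIMIT_BPS=5000000          # netem 'rate 5mbit'
PAYLOAD=1448               # assumed datagram payload in bytes (inferred, see above)
HEADERS=$((8 + 20 + 14))   # UDP + IPv4 + Ethernet headers (assumed)

# Datagrams per second offered by the sender vs. let through by the rate limit.
SENT_PPS=$((SEND_BPS / (PAYLOAD * 8)))
PASSED_PPS=$((LIMIT_BPS / ((PAYLOAD + HEADERS) * 8)))
LOSS_PCT=$(((SENT_PPS - PASSED_PPS) * 100 / SENT_PPS))

echo "sent ${SENT_PPS}/s, passed ${PASSED_PPS}/s, lost about ${LOSS_PCT}%"
```

The sender’s 10 Mbps counts payload only, while the 5 Mbps limit also has to carry the headers, so slightly more than half of the datagrams get dropped.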
netem can do a lot more than what was covered here, like introducing random data corruption, time slotting to simulate ‘bursty’ links where bandwidth is non-continuous, packet duplication, etc. I recommend exploring the netem man page (links below) to learn more.
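As a starting point, here is a dry-run sketch of a few of those options. The interface name eth0 and the percentages are arbitrary examples, and the script only prints the tc commands rather than running them (the real thing needs root).

```shell
#!/bin/sh
# Dry run: print one tc command per impairment instead of applying it.
# DEV is an assumed interface name; the percentages are arbitrary examples.
DEV="eth0"

CMDS="$(
    for opts in 'loss 1%' 'corrupt 0.1%' 'duplicate 1%'; do
        # 'replace' swaps out whatever root qdisc is already attached
        printf 'tc qdisc replace dev %s root netem %s\n' "$DEV" "$opts"
    done
)"

echo "$CMDS"
```

If the Docker Compose framework passes NETEM through to netem unchanged (an assumption on my part), the same option strings should work there too, for example NETEM='loss 1%'.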
We don’t live in a perfectly networked world. Tuning your app’s performance for sub-optimal network connections (sitting on the subway in a tunnel during rush hour, airport WiFi) can have a huge impact on user experience.