`msg-sim` Linux implementation
Context
Since #54, we have an initial implementation of our network simulation crate for macOS, but we're still missing one for Linux. The simulation crate lets us start simulated endpoints in Rust tests, which we can then use to emulate a real-world environment. The macOS implementation uses dummynet with some ifconfig commands and a hacky localhost alias to configure a simulated endpoint. We want to do the same for Linux, and luckily Linux has better tools for this kind of thing:
- `ip` command: with the `ip` command, we can create network namespaces, virtual ethernet devices, dummy interfaces etc. I think dummy interfaces specifically will be most useful in the beginning (it's basically a separate localhost but with normal MTU for more accurate simulation).
- `tc` command: the `tc` (traffic-control) command can be used with `netem` (a kernel module) to shape traffic.
Some inspiration:
- https://github.com/tylertreat/comcast
- https://github.com/celestiaorg/bittwister
Other notes:
- These commands all require root access. Maybe there's a way to utilize network namespaces to not need root access? Also, the `CAP_NET_ADMIN` capability allows us to do this stuff without root, but I don't think it's useful when running tests. And granting capabilities also requires root.
- What other things can we do with network namespaces? A completely simulated p2p network where each peer is in its own namespace and can have its own latency, bandwidth and packet loss rates? This will need `veth` (virtual ethernet) devices as well.
I've had a look at this, and here are some of my thoughts/notes:
On dummy interfaces
It's very easy to spin up a dummy interface on Linux and play with it, e.g.
# Make sure that the dummy kernel module is loaded,
# which provides support for the creation of dummy network interfaces
sudo modprobe dummy
# `ip link` configures network devices. We want
# to add a new one, called `dummy0`, of type `dummy`
sudo ip link add dummy0 type dummy
# We configure it by assigning an IP address to it.
# We also need to specify the network device
sudo ip addr add 192.168.1.1/24 dev dummy0
# Verify the dummy interface is correctly configured using
ip addr show dummy0
# Now you should be able to ping it:
ping 192.168.1.1
# To remove it
sudo ip link delete dummy0
However, adding delay or loss to the dummy interface with `tc` + `netem` didn't work, for reasons I haven't pinned down. The advice I found online is to work with namespaces instead, so that's what I did.
On namespaces and local P2P network
With `ip` it's easy to spin up linked virtual ethernet devices in separate namespaces, where each end of the link has its own IP address and network emulation rules.
To get something that serves the same purpose as a dummy interface, we can put one end of a veth pair in a separate namespace and keep the other end in the host environment, like so:
# create namespace ns1
sudo ip netns add ns1
# create veth devices linked together
sudo ip link add veth-host type veth peer name veth-ns1
# move veth-ns1 device to ns1 namespace
sudo ip link set veth-ns1 netns ns1
# associate ip addr to veth-host device and spin it up
sudo ip addr add 192.168.1.2/24 dev veth-host
sudo ip link set veth-host up
# same but from ns1 namespace
sudo ip netns exec ns1 ip addr add 192.168.1.1/24 dev veth-ns1
sudo ip netns exec ns1 ip link set veth-ns1 up
# this should work
sudo ip netns exec ns1 ping 192.168.1.2
# this too
ping 192.168.1.1
# add latency etc to veth-ns1 from ns1 namespace
sudo ip netns exec ns1 tc qdisc add dev veth-ns1 root netem delay 3000ms loss 50%
# this should be slow
ping 192.168.1.1
# bonus: attach a toy web server in another terminal; should take a while to respond
sudo ip netns exec ns1 python3 -m http.server --bind 192.168.1.1 8000
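For the Rust side, here is a rough sketch of how the steps above could be generated and run from code. Everything in it is hypothetical and not taken from the existing crate: the function names `veth_setup_commands` and `run`, and the choice to build the commands as argument vectors and shell out through `std::process::Command`. Running the commands still needs root.

```rust
use std::io;
use std::process::Command;

/// Build the `ip` invocations from the walkthrough above, in order.
/// Each command is returned in argv form so it can be logged or executed.
/// (Hypothetical helper: names and signature are illustrative only.)
pub fn veth_setup_commands(
    ns: &str,
    host_dev: &str,
    ns_dev: &str,
    host_cidr: &str,
    ns_cidr: &str,
) -> Vec<Vec<String>> {
    [
        format!("ip netns add {ns}"),
        format!("ip link add {host_dev} type veth peer name {ns_dev}"),
        format!("ip link set {ns_dev} netns {ns}"),
        format!("ip addr add {host_cidr} dev {host_dev}"),
        format!("ip link set {host_dev} up"),
        format!("ip netns exec {ns} ip addr add {ns_cidr} dev {ns_dev}"),
        format!("ip netns exec {ns} ip link set {ns_dev} up"),
    ]
    .iter()
    .map(|line| line.split_whitespace().map(String::from).collect::<Vec<String>>())
    .collect()
}

/// Execute one command and turn a non-zero exit status into an error.
pub fn run(argv: &[String]) -> io::Result<()> {
    let (program, args) = argv
        .split_first()
        .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidInput, "empty command"))?;
    let status = Command::new(program).args(args).status()?;
    if status.success() {
        Ok(())
    } else {
        Err(io::Error::new(
            io::ErrorKind::Other,
            format!("`{}` exited with {status}", argv.join(" ")),
        ))
    }
}
```

Calling `veth_setup_commands("ns1", "veth-host", "veth-ns1", "192.168.1.2/24", "192.168.1.1/24")` and passing each entry to `run` should reproduce the manual setup. Keeping command construction separate from execution means the construction part can be unit-tested without root.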
In the same fashion, we can spin up a local p2p network where each veth device lives in a separate namespace: we link the devices together, assign IP addresses, and add network emulation rules as desired.
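To make the p2p idea a bit more concrete, here is a minimal sketch that only plans the commands for a full mesh, with one veth pair per pair of peers. The naming scheme (`peerN`, `vethA-B`), the one-/30-per-link address plan under `10.0.0.0/16`, and the full-mesh topology are all assumptions made for illustration; a bridge-based topology would be an equally valid choice.

```rust
/// One veth link between two peer namespaces.
/// Index 0 and 1 are the two ends. All names are hypothetical.
#[derive(Debug, PartialEq)]
pub struct Link {
    pub ns: [String; 2],
    pub dev: [String; 2],
    pub addr: [String; 2],
}

/// Plan one link per unordered pair of peers.
pub fn full_mesh(peers: usize) -> Vec<Link> {
    // The 10.0.<link>.0/30 plan only has room for 256 links,
    // which covers at most 23 peers (23 * 22 / 2 = 253 links).
    assert!(peers <= 23, "address plan supports at most 23 peers");
    let mut links = Vec::new();
    for a in 0..peers {
        for b in (a + 1)..peers {
            let subnet = links.len(); // one /30 subnet per link
            links.push(Link {
                ns: [format!("peer{a}"), format!("peer{b}")],
                dev: [format!("veth{a}-{b}"), format!("veth{b}-{a}")],
                addr: [format!("10.0.{subnet}.1/30"), format!("10.0.{subnet}.2/30")],
            });
        }
    }
    links
}

/// Shell commands that create the namespaces and wire up every link.
pub fn mesh_commands(peers: usize) -> Vec<String> {
    let mut cmds: Vec<String> = (0..peers)
        .map(|i| format!("ip netns add peer{i}"))
        .collect();
    for link in full_mesh(peers) {
        cmds.push(format!(
            "ip link add {} type veth peer name {}",
            link.dev[0], link.dev[1]
        ));
        for end in 0..2 {
            let (ns, dev, addr) = (&link.ns[end], &link.dev[end], &link.addr[end]);
            cmds.push(format!("ip link set {dev} netns {ns}"));
            cmds.push(format!("ip netns exec {ns} ip addr add {addr} dev {dev}"));
            cmds.push(format!("ip netns exec {ns} ip link set {dev} up"));
        }
    }
    cmds
}
```

For two peers this yields nine commands: two namespace creations, one veth pair, and three configuration steps per end. Per-peer latency, bandwidth and loss would then be attached with `tc` inside each namespace.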
On the actual implementation
I think a good strategy would be:
- extending the existing simulation code with Linux support. I imagine something like `dummynet.rs` but for Linux only, let's say `ip_tc.rs`, containing the logic to create a veth pair between the host and a target living in a separate namespace, with the provided emulation parameters; that logic would then back `Simulation::start` on Linux targets.
- since it is possible to create a local p2p network on Linux with the techniques mentioned above, it'd be cool to create and use a `P2PSimulator` struct made for this purpose. I still have to see what can be reused to avoid major refactorings.
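As a starting point for the `ip_tc.rs` idea, the emulation parameters could be modelled as a small struct that knows how to render itself as `tc ... netem` arguments. This is only a sketch: `NetemParams`, its fields, and `in_namespace` are hypothetical names, and the real simulator API may carry these parameters differently. The `delay`, `loss` and `rate` keywords are standard `netem` options.

```rust
use std::time::Duration;

/// Emulation parameters for one endpoint (hypothetical type).
#[derive(Debug, Clone, Default, PartialEq)]
pub struct NetemParams {
    pub delay: Option<Duration>,
    pub loss_percent: Option<f64>,
    pub rate_kbit: Option<u64>,
}

impl NetemParams {
    /// Render `tc qdisc add dev <dev> root netem ...` in argv form.
    /// Parameters that are `None` are left out of the command.
    pub fn tc_args(&self, dev: &str) -> Vec<String> {
        let mut args: Vec<String> = ["tc", "qdisc", "add", "dev", dev, "root", "netem"]
            .iter()
            .map(|s| s.to_string())
            .collect();
        if let Some(delay) = self.delay {
            args.push("delay".into());
            args.push(format!("{}ms", delay.as_millis()));
        }
        if let Some(loss) = self.loss_percent {
            args.push("loss".into());
            args.push(format!("{loss}%"));
        }
        if let Some(rate) = self.rate_kbit {
            args.push("rate".into());
            args.push(format!("{rate}kbit"));
        }
        args
    }
}

/// Prefix a command so that it runs inside the given network namespace.
pub fn in_namespace(ns: &str, argv: Vec<String>) -> Vec<String> {
    let mut full: Vec<String> = ["ip", "netns", "exec", ns]
        .iter()
        .map(|s| s.to_string())
        .collect();
    full.extend(argv);
    full
}
```

With `delay: Some(Duration::from_millis(3000))` and `loss_percent: Some(50.0)`, `in_namespace("ns1", params.tc_args("veth-ns1"))` renders the same `tc` line used in the manual walkthrough.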
On root access
Here I think there is not much we can do, since modifying the network stack requires root access. I think it's better to ask for root privileges up front than to alter the user's system configuration, which would require root privileges anyway and may feel hacky/sketchy. For example, both tylertreat/comcast and celestiaorg/bittwister require root access to run the binary.
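If we go with "just require root", the simulator could fail early with a clear message instead of surfacing a cryptic `ip` error halfway through setup. A minimal sketch, assuming Linux and using only the standard library by reading the `Uid:` line of `/proc/self/status` (whose fields are the real, effective, saved and filesystem UIDs). The function names are hypothetical, and the check deliberately ignores the `CAP_NET_ADMIN` case.

```rust
use std::fs;

/// Extract the effective UID from the contents of `/proc/self/status`.
pub fn parse_effective_uid(status: &str) -> Option<u32> {
    status
        .lines()
        .find_map(|line| line.strip_prefix("Uid:"))?
        .split_whitespace()
        .nth(1)? // fields are: real, effective, saved, filesystem
        .parse()
        .ok()
}

/// Return an error with a readable message unless we are running as root.
pub fn ensure_root() -> Result<(), String> {
    let status = fs::read_to_string("/proc/self/status")
        .map_err(|e| format!("cannot read /proc/self/status: {e}"))?;
    match parse_effective_uid(&status) {
        Some(0) => Ok(()),
        Some(uid) => Err(format!(
            "network simulation needs root, but the effective uid is {uid}"
        )),
        None => Err("could not determine the effective uid".to_string()),
    }
}
```

Calling `ensure_root()` at the top of the Linux code path would let tests bail out, or be skipped, before any namespace is created.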
Great writeup, thanks for clarifying all the steps. Also agreed on the strategy! One question: when adding delay / bandwidth to a veth inside of a namespace, is it symmetrical or only one way? If it's one way, it will be different from how it works on MacOS over the loopback interface.
Keep in mind that the API I've settled on for the simulator is by no means final, if you think you can improve on it or bump into any limitations, feel free to propose changes to it.
Let's start with just single endpoint simulation and we can then expand it to P2P.
> One question: when adding delay / bandwidth to a veth inside of a namespace, is it symmetrical or only one way? If it's one way, it will be different from how it works on MacOS over the loopback interface.
It is asymmetrical, i.e. one way only! For instance, if I apply a delay to the device veth-ns1 (in namespace ns1) as in the code snippet above, it affects all the packets leaving veth-ns1, but not the packets arriving on it.
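Since a root `netem` qdisc only shapes egress traffic, one way to approximate symmetric behaviour is to attach a matching rule to the host end of the pair as well. A small sketch of that idea follows. `symmetric_delay_commands` is a hypothetical helper, and whether the result matches the macOS loopback behaviour closely enough is untested.

```rust
/// Build one `tc` command per direction so that traffic is delayed both
/// when leaving the namespace and when leaving the host.
/// (Hypothetical helper; the delay is applied once per direction.)
pub fn symmetric_delay_commands(
    ns: &str,
    host_dev: &str,
    ns_dev: &str,
    one_way_ms: u64,
) -> [String; 2] {
    [
        // Namespace -> host direction: egress of the device inside the namespace.
        format!("ip netns exec {ns} tc qdisc add dev {ns_dev} root netem delay {one_way_ms}ms"),
        // Host -> namespace direction: egress of the device on the host.
        format!("tc qdisc add dev {host_dev} root netem delay {one_way_ms}ms"),
    ]
}
```

With both rules in place, a ping across the pair should see roughly twice `one_way_ms` of added round-trip time, because each direction is delayed once.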