VXLAN engine is inconsistent with request capture

Open monrax opened this issue 2 years ago • 2 comments

Often, only the response part of the HTTP message is displayed when using --input-raw-engine vxlan together with --output-stdout; the corresponding request is missing.

I previously mentioned this issue in #1095. The environment and the instructions for reproducing the issue are the same, except for the last step.

Environment: AWS

How to repeat issue: Launch two t3-type EC2 instances and set up a VPC traffic mirror filter and session between them. The ENI of one instance acts as the mirror target and the ENI of the other as the mirror source. Create an inbound rule on the target's security group to allow UDP traffic on port 4789. SSH into both machines:

  • On the target machine: clone this repo, compile gor, and run the following command:
sudo ./gor --input-raw :8323 --input-raw-engine vxlan --input-raw-vxlan-vni 123 --input-raw-bpf-filter "(src port 8323) or (dst port 8323)" --output-stdout

In this case, 123 was chosen for VXLAN ID when creating the mirror session.

  • On the source machine:
echo world > hello.txt && python3 -m http.server 8323

In this case, a simple web server is exposed on port 8323. Remember to create an inbound rule in the security group of the source machine so that port 8323 is reachable from your local machine.

From your local machine, curl this simple server at http://<source machine public ip>:8323/hello.txt

Expected result: Both parts of the HTTP message are printed to stdout on the target machine, including the request (1) and the response (2).

Actual result: Only HTTP responses (2) are printed. See attached image.

image

Additional info: It appears that sometimes the issue does not happen when accessing the web server from the browser, instead of using curl or another client like wget or Insomnia.

Note: I experienced another issue while trying this engine (#1095): only headers show up, without the body. Both issues could be related, but we cannot be sure without further debugging.
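
For readers unfamiliar with the VXLAN ID (VNI) used in the reproduction steps above, here is a minimal Python sketch of how the 8-byte VXLAN header that precedes each mirrored frame (sent over UDP port 4789) can be decoded. It follows the header layout in RFC 7348 and is not taken from GoReplay's code; the function names `parse_vxlan` and `build_vxlan` are hypothetical and exist only for this illustration.

```python
import struct

VXLAN_UDP_PORT = 4789    # UDP port allowed in the target's security group
VXLAN_HEADER_LEN = 8     # flags (1 byte), reserved (3), VNI (3), reserved (1)
VNI_VALID_FLAG = 0x08    # the "I" flag: set when the VNI field is valid


def parse_vxlan(udp_payload: bytes):
    """Return (vni, inner_ethernet_frame) from the payload of a VXLAN UDP datagram."""
    if len(udp_payload) < VXLAN_HEADER_LEN:
        raise ValueError("too short to contain a VXLAN header")
    if not udp_payload[0] & VNI_VALID_FLAG:
        raise ValueError("VNI flag is not set")
    # The VNI is the upper 24 bits of the second 32-bit word.
    (word,) = struct.unpack_from("!I", udp_payload, 4)
    return word >> 8, udp_payload[VXLAN_HEADER_LEN:]


def build_vxlan(vni: int, inner_frame: bytes) -> bytes:
    """Build a synthetic VXLAN payload (hypothetical helper for testing only)."""
    return struct.pack("!II", VNI_VALID_FLAG << 24, vni << 8) + inner_frame


vni, inner = parse_vxlan(build_vxlan(123, b"inner frame"))
print(vni, inner)
```

With the mirror session configured as above, only datagrams whose VNI decodes to 123 would be expected to match --input-raw-vxlan-vni 123.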

monrax avatar Jul 07 '22 22:07 monrax

After a couple of weeks of trying different things to get to the root cause, I've learned the following:

  • This issue is not related to VXLAN, or AWS VPC Mirroring. You can actually reproduce the issue just by launching an EC2 instance, doing

    wget https://github.com/buger/goreplay/releases/download/1.3.3/gor_1.3.3_x64.tar.gz && tar xzf gor_1.3.3_x64.tar.gz
    

    then

    ./gor --input-raw :8323 --input-raw-bpf-filter "(src port 8323) or (dst port 8323)" --output-stdout
    

    and, finally

    echo world > hello.txt && python3 -m http.server 8323
    

    before using curl from the client machine, which brings me to the second item:

  • I failed to mention above that the OS I was running on the client machine was Windows. After some testing with different environments, I realized this issue only occurred when using Windows machines as clients to reach Amazon EC2 instances as servers through their public addresses, particularly when using tools like curl. When using browsers like Chrome or Firefox, the behavior was inconsistent: sometimes gor showed request data, sometimes it didn't.

  • After comparing the Hex streams for each captured packet between a request made from another EC2 instance and my own Windows partition, I realized there was a difference: there were some extra trailing bytes for the ACK corresponding to the request (just before the one with the HTTP payload) when using Windows as a client.

  • After taking a look at the codebase (and adding a bunch of fmt.Printf statements 😅), I verified that the trailing bytes were the issue: these trailers are part of the Ethernet frame, but the latest release of GoReplay is currently unable to interpret them as such. Instead, it only removes the headers and assumes that everything left is the payload of the inner layer. In this case that is the payload of the TCP layer, which is then treated as HTTP data. I attach a few screenshots below: image image image

  • I haven't found anything online explaining why Windows adds those Ethernet trailing bytes, but I was able to verify that it happens by using different PCs running Windows 7, 10 and 11 (including a Windows VM in Azure) as clients. Thankfully, @buger has opened a PR that addresses this issue (thank you!). I have tested it and it works fine with both regular pcap capture and the original vxlan use case. I believe this issue can be closed after merging it.
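
The trailer problem described in the bullets above can be illustrated with a small Python sketch. It is not GoReplay's code and not the fix from the PR; it only contrasts the "strip headers and keep the rest" approach with one that stops at the end of the IPv4 datagram, as given by the IPv4 Total Length field. The function names, addresses, ports and the six zero bytes used as a trailer are all assumptions made for this synthetic example, and the frame is assumed to be plain Ethernet/IPv4/TCP with no VLAN tag.

```python
import struct

ETH_HEADER_LEN = 14  # destination MAC, source MAC, EtherType (no VLAN tag assumed)


def naive_tcp_payload(frame: bytes) -> bytes:
    """Strip the Ethernet, IPv4 and TCP headers and treat whatever is left as payload."""
    ip_start = ETH_HEADER_LEN
    ip_header_len = (frame[ip_start] & 0x0F) * 4
    tcp_start = ip_start + ip_header_len
    tcp_header_len = (frame[tcp_start + 12] >> 4) * 4
    return frame[tcp_start + tcp_header_len:]


def bounded_tcp_payload(frame: bytes) -> bytes:
    """Same as above, but stop at the end of the IPv4 datagram (Total Length field)."""
    ip_start = ETH_HEADER_LEN
    ip_header_len = (frame[ip_start] & 0x0F) * 4
    (ip_total_len,) = struct.unpack_from("!H", frame, ip_start + 2)
    tcp_start = ip_start + ip_header_len
    tcp_header_len = (frame[tcp_start + 12] >> 4) * 4
    return frame[tcp_start + tcp_header_len:ip_start + ip_total_len]


def build_frame(payload: bytes, trailer: bytes) -> bytes:
    """Build a synthetic Ethernet/IPv4/TCP frame; checksums are left as zero."""
    eth = bytes(6) + bytes(6) + struct.pack("!H", 0x0800)
    # TCP header: ports, seq, ack, data offset (5 words), ACK flag, window, checksum, urgent
    tcp = struct.pack("!HHIIBBHHH", 50000, 8323, 0, 0, 5 << 4, 0x10, 65535, 0, 0)
    total_len = 20 + len(tcp) + len(payload)
    ip = struct.pack("!BBHHHBBH4s4s", 0x45, 0, total_len, 0, 0, 64, 6, 0,
                     bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2]))
    return eth + ip + tcp + payload + trailer


# A bare ACK (no TCP payload) followed by six trailing bytes in the Ethernet frame.
ack = build_frame(b"", b"\x00" * 6)
print(naive_tcp_payload(ack))    # the trailer bytes are mistaken for payload
print(bounded_tcp_payload(ack))  # empty, because the ACK carries no data
```

In this sketch the naive parser reports six bytes of "payload" for a segment that carries none, which matches the symptom described above: the junk bytes end up being treated as the start of the HTTP message.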

monrax avatar Jul 29 '22 14:07 monrax

Hey @monrax, I think we are experiencing the same issue. Can you check whether it is related to MTU and whether changing the MTU fixes it? #1134

RoeiGanor avatar Oct 12 '22 15:10 RoeiGanor