vert.x icon indicating copy to clipboard operation
vert.x copied to clipboard

Introduce support for pcap traffic capturing of network traffic

Open geoand opened this issue 3 years ago • 51 comments

Motivation:

The idea here is to allow users to analyze HTTP traffic consumed and produced via Vert.x using Wireshark. This capabity relies on Netty's io.netty.handler.pcap.PcapWriteHandler and has also been mentioned in https://github.com/vert-x3/issues/issues/566.

P.S. I tested this by also patching Quarkus to use this change and tried an application that uses RESTEasy Reactive and the Reactive REST Client. Here is what the Wireshark screenshot looks like: Screenshot from 2022-01-28 16-51-38

geoand avatar Jan 28 '22 14:01 geoand

we should have it also on NetSocket as it remains useful for other parts of vertx stack like MQTT or SQL client that use NetSocket

vietj avatar Jan 28 '22 16:01 vietj

Good point. I'll have a look on Monday

geoand avatar Jan 28 '22 16:01 geoand

Capturing also add to the NetClientImpl pipeline

geoand avatar Jan 31 '22 06:01 geoand

I think that the option declaration could simply be in TCPSSLOptions or perhaps NetworkOptions since it could work for UDP as well ?

vietj avatar Jan 31 '22 07:01 vietj

Indeed, PR updated

geoand avatar Jan 31 '22 08:01 geoand

Very cool!

FroMage avatar Jan 31 '22 08:01 FroMage

@vietj is there anything else that this PR needs, or should I mark it ready for review?

geoand avatar Feb 01 '22 08:02 geoand

I need to make a final review for this

vietj avatar Feb 03 '22 14:02 vietj

I did a quick review of the VertxPcapWriteHandler and I can see a few issues with the static metadata map and the pcap write handler.

  1. it seems to me that when a channel is closed it will close the underlying output stream and all other concurrent channel use will not be able to write anymore.

  2. the shared static map access seems to be racy, e.g two FileOutputStream could be created for the same file.

Overall this seems to come from the fact that we haven't properly defined the lifecycle of the pcap files and therefore we should first define how this should be used.

e.g the pcap file could be created when the HTTP server is starting and closed when the HTTP server is closed, etc...

vietj avatar Feb 15 '22 15:02 vietj

it seems to me that when a channel is closed it will close the underlying output stream and all other concurrent channel use will not be able to write anymore.

Although theoretically true, close is called by handlerRemoved which properly guards handles the case of multiple writers.

the shared static map access seems to be racy, e.g two FileOutputStream could be created for the same file.

I don't see how could this happen, as Metadata (which contains the reference to the OutputStream) is only created via fileToMetadata.computeIfAbsent(pcapCaptureFile, Metadata::new);

e.g the pcap file could be created when the HTTP server is starting and closed when the HTTP server is closed, etc...

I could be wrong of course, but this is exactly what the code in the constructor and in handlerRemoved is meant handle.

geoand avatar Feb 15 '22 15:02 geoand

If I understand correctly, an HTTP server will have concurrent writes to the pcap file.

  • when the first client will connect to the server, it will create the pcap file.
  • the original PcapWriteHandler will close the output stream when the channel handler is removed from the pipeline.
  • the second client will reuse the same output stream which has been closed when the first client socket is closed and will likely throw a IOException because the stream is closed.

Beside concurrent access, the server will keep writing to the pcap file, and it is not clear when it will finish to write to it. Let's say I want to use the file, when should I open the file and make sure that I do have all the data I'm supposed to have ?

So my point is how should one use this feature in practice ? e.g that might be a cloud application that needs debugging through this feature. What are the other usage we see for this ? we need to answer these questions before going forward.

vietj avatar Feb 15 '22 15:02 vietj

Fair points, I'll come back tomorrow with answers

geoand avatar Feb 15 '22 16:02 geoand

Hi, what happens incase of HTTPS(SSL/TLS), can we have ability dump encrypted message?

Unev avatar Feb 15 '22 20:02 Unev

what happens incase of HTTPS(SSL/TLS), can we have ability dump encrypted message?

Yes, the messages are shown decrypted (in my original screenshot, one of the HTTP calls is actually an HTTPS call)

geoand avatar Feb 16 '22 03:02 geoand

It seems like this does have issues with concurrent writes. I'll have to investigate further

geoand avatar Feb 16 '22 04:02 geoand

The only reliable way I have found so far to get this to work for multiple event loop threads under load is have each instance write to its own file and then have the user invoke mergecap in order to create a capture file that has all the information or just drag and drop all the capture files into the wireshark window.

It's not a great user experience, but it's not terrible either...

geoand avatar Feb 21 '22 12:02 geoand

for now I'm really interested by the client and server use case, i.e how / when does the user start and end capture.

I think that we should have an API to control start/stop recording to give user the control of the lifecycle of the dump and then we could have predefined ways to control this, e.g when the HTTP server closes then the pcap file is flushed and closed, etc...

On Mon, Feb 21, 2022 at 1:25 PM Georgios Andrianakis < @.***> wrote:

The only reliable way I have found so far to get this to work for multiple event loop threads under load is have each instance write to its own file and then have the user invoke mergecap in order to create a capture file that has all the information.

It's not a great user experience, but it's not terrible either...

— Reply to this email directly, view it on GitHub https://github.com/eclipse-vertx/vert.x/pull/4260#issuecomment-1046823876, or unsubscribe https://github.com/notifications/unsubscribe-auth/AABXDCXPXZG5TQ3ZEXPNY5DU4IVMPANCNFSM5NA6JK3A . You are receiving this because you were mentioned.Message ID: @.***>

vietj avatar Feb 21 '22 12:02 vietj

for now I'm really interested by the client and server use case, i.e how / when does the user start and end capture.

The way things are now, the bytes are written straight to a FileOutputStream, so the capture files are pretty much immediately available on the file system.

geoand avatar Feb 21 '22 12:02 geoand

yes but the question is when does this stop ?

On Mon, Feb 21, 2022 at 1:44 PM Georgios Andrianakis @.***> wrote:

for now I'm really interested by the client and server use case, i.e how / when does the user start and end capture.

The way things are now, the bytes are written straight to a FileOutputStream, so the capture files are pretty much immediately available on the file system.

— Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you were mentioned.Message ID: @.***>

vietj avatar Feb 21 '22 12:02 vietj

It does not stop.

If we want to control that, we need to have our own OutputStream that can receive such "commands".

geoand avatar Feb 21 '22 13:02 geoand

I think that we should have an API to control start/stop recording

How do you envision this would work? How would the user enable this?

geoand avatar Feb 22 '22 13:02 geoand

How is this for an alternative:

We introduce an SPI in Vert.x that allows for customizing the Netty Handler chain. That way in Quarkus we can introduce the the PcapHandler in whatever way we like. WDYT?

geoand avatar Mar 03 '22 08:03 geoand

Bump :)

geoand avatar Mar 22 '22 12:03 geoand

@geoand you can already do this with NetSocketInternal (add handlers) can you check with this API ? if that works we could expose the same API for HTTP.

vietj avatar Mar 22 '22 13:03 vietj

Ah, I wasn't aware of NetSocketInternal#channelHandlerContext. That is super useful, so yeah, it would be great to have it for the HTTP API as well in order to cater for advanced use cases :)

geoand avatar Mar 22 '22 14:03 geoand

also I'm not convinced anymore by the interest of the feature because dump is achieved in the application layer instead of the network layer and it is not an exact dump of what goes over the wire at the TCP level (instead it's a fake dump created from what the JVM received) so the usefulness seems limited and could actually be misleading when debugging.

vietj avatar Mar 22 '22 14:03 vietj

@geoand you can read this guide https://github.com/vert-x3/advanced-vertx-guide

vietj avatar Mar 22 '22 14:03 vietj

also I'm not convinced anymore by the interest of the feature because dump is achieved in the application layer instead of the network layer and it is not an exact dump of what goes over the wire at the TCP level (instead it's a fake dump created from what the JVM received) so the usefulness seems limited and could actually be misleading when debugging.

It is indeed a fake capture, which is why I only think this makes sense for developers when they are writing the application and it's not fit for production usage.

geoand avatar Mar 22 '22 14:03 geoand

@geoand you can read this guide https://github.com/vert-x3/advanced-vertx-guide

Thanks for the info!

If possible, In Quarkus, I'd like to avoid having to go down to that low-level just to be able to add a handler :)

geoand avatar Mar 22 '22 14:03 geoand

Just wanted to say that I think this is actually very useful for me as a developer, especially if we can make it work with WebSockets. It's been frustrating to debug TCP communication (especially when it's https) and to know exactly what is the JVM receiving or not (to detect bugs or find out what's going on), so this would improve my life considerably! :+1:

jvican avatar Mar 22 '22 16:03 jvican