Missing JA4 fingerprints in output

Open elpy1 opened this issue 1 year ago • 2 comments

Hi :wave:. While working on a personal project that implements JA4, I noticed some discrepancies when comparing JA4 (TCP) fingerprint output against some of the TLS PCAP files in your repo.

For example, I get the following TLS fingerprints from tls-handshake.pcapng:

$ python pcap.py --file ~/git/ext/ja4/pcap/tls-handshake.pcapng | sort | uniq -c | sort -nr
     54 t13d1516h2_8daaf6152771_e5627efa2ab1
      5 t13d1515h2_8daaf6152771_f37e75b10bcc
      3 t13d1516h1_8daaf6152771_e5627efa2ab1
      1 t13d1517h1_8daaf6152771_6cdcb247c39b
      1 t13d151400_8daaf6152771_de4a06bb82e3

With ja4.py I get:

$ python ja4.py --ja4 ~/git/ext/ja4/pcap/tls-handshake.pcapng | grep -E -o 't\w{9}_\w{12}_\w{12}' | sort | uniq -c | sort -nr
     49 t13d1516h2_8daaf6152771_e5627efa2ab1
      5 t13d1515h2_8daaf6152771_f37e75b10bcc
      3 t13d1516h1_8daaf6152771_e5627efa2ab1
      1 t13d1517h1_8daaf6152771_6cdcb247c39b
      1 t13d151400_8daaf6152771_de4a06bb82e3

With tshark (TShark (Wireshark) 4.2.6, Git commit fca52ffc018f) I get:

$ tshark -r  ~/git/ext/ja4/pcap/tls-handshake.pcapng -Y 'tls.handshake.type == 1' -Tfields -e 'tls.handshake.ja4' | grep '^t' | sort | uniq -c | sort -nr
     54 t13d1516h2_8daaf6152771_e5627efa2ab1
      5 t13d1515h2_8daaf6152771_f37e75b10bcc
      3 t13d1516h1_8daaf6152771_e5627efa2ab1
      1 t13d1517h1_8daaf6152771_6cdcb247c39b
      1 t13d151400_8daaf6152771_de4a06bb82e3

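Looking into this a bit further, I realised the caching functionality in common.py is keyed on streams. So if a stream contains more than one fingerprint, does the earlier one get overwritten in the cache? That would account for the five fingerprints missing from the ja4.py output (49 vs. 54 for t13d1516h2_8daaf6152771_e5627efa2ab1). Example stream: [image]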

I was able to resolve this locally by hacking together a change that uses a tuple containing the stream and frame number as the cache key, but this probably isn't suitable because it results in multiple outputs for a stream, instead of multiple fingerprints inside a single stream output.
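
To illustrate the overwrite and the local workaround, here is a minimal sketch. It is not the actual common.py code: the function names, the `(stream, frame, ja4)` tuple layout, and the frame numbers are all made up for the example; only the fingerprint strings come from the output above.

```python
# Hypothetical sketch of the two caching strategies; not the real common.py.

def cache_by_stream(packets):
    """Key the cache on the stream number only."""
    cache = {}
    for stream, frame, ja4 in packets:
        # A second ClientHello in the same stream replaces the first one.
        cache[stream] = ja4
    return cache


def cache_by_stream_and_frame(packets):
    """Key the cache on a (stream, frame) tuple, as in the local workaround."""
    cache = {}
    for stream, frame, ja4 in packets:
        # Every ClientHello gets its own entry, so nothing is overwritten,
        # but one stream now produces several separate entries.
        cache[(stream, frame)] = ja4
    return cache


# Illustrative data: stream 0 carries two ClientHellos (frame numbers invented).
packets = [
    (0, 4, "t13d1516h2_8daaf6152771_e5627efa2ab1"),
    (0, 9, "t13d1516h2_8daaf6152771_e5627efa2ab1"),
    (1, 12, "t13d1515h2_8daaf6152771_f37e75b10bcc"),
]

print(len(cache_by_stream(packets)))            # one entry per stream
print(len(cache_by_stream_and_frame(packets)))  # one entry per ClientHello
```

With stream-only keys the second fingerprint in stream 0 is lost; with tuple keys it survives, at the cost of splitting one stream across multiple cache entries.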

elpy1 avatar Jul 29 '24 04:07 elpy1

Thanks for bringing this up! We should add any additional JA4s seen in streams to the output as JA4.2, etc. like how we do with JA4X I think. Would that work?
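
Roughly what that could look like, as a sketch only: the `add_ja4` helper, the dictionary layout, and the exact key names (`JA4`, `JA4.2`, ...) are assumptions for illustration, not the current ja4.py behaviour.

```python
# Hypothetical sketch: keep one entry per stream and number extra JA4s.

def add_ja4(entry, fingerprint):
    """Store fingerprint under 'JA4', or under 'JA4.2', 'JA4.3', ...
    when the stream entry already holds an earlier fingerprint."""
    if "JA4" not in entry:
        entry["JA4"] = fingerprint
        return entry
    n = 2
    # Find the first unused numbered key.
    while f"JA4.{n}" in entry:
        n += 1
    entry[f"JA4.{n}"] = fingerprint
    return entry


# Illustrative stream entry; the 'stream' key is made up for the example.
entry = {"stream": 0}
add_ja4(entry, "t13d1516h2_8daaf6152771_e5627efa2ab1")
add_ja4(entry, "t13d1516h1_8daaf6152771_e5627efa2ab1")
print(entry)
```

This keeps a single output per stream while still exposing every fingerprint seen in it.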

john-althouse avatar Aug 05 '24 15:08 john-althouse

Considering the core functionality currently involves extracting fingerprints from each stream, that makes sense to me.

I'm simply grepping for the JA4 pattern, so it doesn't matter where it is in the output for my use-case. Thanks.

elpy1 avatar Aug 06 '24 03:08 elpy1

Thanks all!

elpy1 avatar Feb 20 '25 06:02 elpy1