Fix missing TLS packets and recognizing TLS encrypted data
I noticed that scapy seemed to skip packets when parsing tlsv1.2 messages from pcap files. I could not find a relevant issue and investigated it myself. It seems that scapy was returning the last tls message in a sequence when tls was fragmented across tcp packets. Independently, I noticed that tls messages wireshark calls "Encrypted Handshake Messages" were not recognized by scapy. Since these messages aren't indicated by any outside field, I modified tls records to consider the message content encrypted if a message fails to parse.
I'm opening this PR as a draft since I haven't done testing outside of my use case. I'm not sure what implications outside of tls handshakes these changes have.
Codecov Report
Merging #4015 (25a0296) into master (047be32) will decrease coverage by
29.09%. The diff coverage is7.14%.
@@ Coverage Diff @@
## master #4015 +/- ##
===========================================
- Coverage 81.93% 52.85% -29.09%
===========================================
Files 328 328
Lines 75404 75425 +21
===========================================
- Hits 61783 39865 -21918
- Misses 13621 35560 +21939
| Impacted Files | Coverage Δ | |
|---|---|---|
| scapy/layers/tls/record.py | 20.26% <0.00%> (-71.61%) |
:arrow_down: |
| scapy/layers/tls/session.py | 35.36% <0.00%> (-52.83%) |
:arrow_down: |
| scapy/layers/tls/crypto/cipher_aead.py | 38.15% <100.00%> (-50.00%) |
:arrow_down: |
Thanks for your interest in Scapy. Could you share an example that triggers the issue? For example by adding a pcap file to a github comment.
When running the included test with the included pcap file with master, I observe scapy parsing: the client hello, the server hello done, 2 hello requests?, raw handshake data, and finally application data. When using my commit, I observe scapy parsing: the client hello, the server hello & handshake certificate & key exchange & hello done, the client key exchange & change cipher spec & encrypted handshake data, the server change cipher spec & encrypted handshake data, and finally application data. issue.zip I think tls packets reassembled over tcp were being dropped since only the last tls packet parsed was returned in session.py. The record.py change helps identify handshake messages with encrypted content, but I'm not sure if it has other consequences.
There's indeed an issue with TLS and tcp reassembly, but that should be handled by fixing TCP sessions (which is currently somewhat broken because we dissect then rebuild packets), rather than modifying TLS.
Perhaps the first issue would be fixed with a refactoring of TCP sessions, but as-is, as far as I can tell, TLS tcp_reassemble does not return the correct value.
The behavior I noticed was that tcp_reassemble does not return a packet until all available bytes are able to be deserialized. When the bytes were able to be deserialized, it would only return the last packet in the list. My changes to session.py still ensure all bytes must be deserialized until a packet is returned, but it returns the first packet in the list such that packets are not dropped and are instead accessible from the payload of the first packet.
The first change to record.py ensures TLS encrypted data is represented by an encrypted class instead of as raw data.
The second change to record.py and cipher_aead.py is addressing a cryptographic error I received when dealing with a specific encryption mechanism (which I'd have to look back at to identify). IIRC self._tls_auth_decrypt wasn't returning a predictable tuple with some encryption mechanism and I couldn't not find the route cause.