possible bug displaying TCP segments
The Problem
While analysing an outgoing TCP stream, I noticed unexpected sequences of \0 bytes appearing in the tcpflow output. These null byte sequences were observed in the middle of an otherwise valid data stream.
Here is how it appears when viewed with less. I verified using xxd that these symbols are indeed \0 bytes:
GLOBAL_USER_CACHE_INVALIDATE11633_fe31e7f08d14de2f075ae2bffb88e9384a734c00ca77f72722301d504644826ded816371c8abb61dcd51f51a0fff6b21770cf3f5dc42e6998daa193f1ee7a95a1730232e68aa463c26fa0bb5b190dbc1f9d9b8ecedd5db08bb7c005b96fd5d2db930f794b60e15599aa01ea21fcd5fd103400463035943127dfc6c03331579f31cb1c8aa20bcb41a9c406509303e89ca35792dd4615e5d8f83a1459fef4afc59fddc4e00ce54acf0b9fe779ce4e000514f2fe5c7953a7529a0f5a390604d8778ab6d8db561d61057c03bb195d7147aaf6d911095d940dba7b2d670b1c6^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@
I successfully captured the traffic using both tcpdump and tcpflow, and compared the problematic frame. Interestingly, neither tcpdump nor tshark shows these null byte sequences. The frame in question is somewhat unusual — it contains 11,264 bytes of data.
I'm attaching the tshark output of that frame for reference: frame.txt.gz
The null sequence in tcpflow output starts exactly where tshark writes "Data [truncated]: ".
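For anyone who wants to script the xxd check, here is a minimal Python sketch that lists long runs of \0 bytes in a flow file together with their offsets. The function name `find_null_runs` and the 16-byte threshold are assumptions for illustration and are not part of tcpflow. It only flags candidates: a binary payload can legitimately contain zero runs, although in a hex-text stream like the one above they are unexpected.

```python
import re
from typing import List, Tuple


def find_null_runs(data: bytes, min_len: int = 16) -> List[Tuple[int, int]]:
    """Return (offset, length) for every run of at least min_len NUL bytes.

    Hypothetical helper: min_len is an arbitrary threshold chosen to skip
    short zero runs that can occur in ordinary data.
    """
    pattern = re.compile(rb"\x00{%d,}" % min_len)
    return [(m.start(), m.end() - m.start()) for m in pattern.finditer(data)]


if __name__ == "__main__":
    import sys

    # Usage: python find_null_runs.py <tcpflow-output-file>
    with open(sys.argv[1], "rb") as fh:
        for offset, length in find_null_runs(fh.read()):
            print(f"offset={offset} length={length}")
```

The reported offsets can then be compared against the position in the stream where tshark prints "Data [truncated]: ".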
Is this repeatable? Do the TCP and IP checksums in this packet validate?
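On the checksum question, a rough sketch of the standard Internet checksum (RFC 1071) in Python may help when checking a header by hand. The helper names are made up for illustration. It covers the IPv4 header check only; the TCP checksum uses the same arithmetic but also covers a pseudo-header (source address, destination address, protocol, TCP length), which is not shown here. Note also that when outgoing traffic is captured on the sending host with checksum offloading enabled, the captured checksums are often wrong even though the packets on the wire are fine, so a mismatch alone may not prove corruption.

```python
def internet_checksum(data: bytes) -> int:
    """Ones' complement of the ones' complement sum of 16-bit words (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    # Fold any carries above 16 bits back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def ipv4_header_is_valid(header: bytes) -> bool:
    """A header that includes its own checksum field must sum to zero."""
    return internet_checksum(header) == 0


if __name__ == "__main__":
    # Sample 20-byte IPv4 header; its checksum field holds 0xb861.
    sample = bytes.fromhex("45000073000040004011b861c0a80001c0a800c7")
    print(ipv4_header_is_valid(sample))
```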
Yes, it is reproducible, though it happens at random points during a run. I had to build a distributed setup to reproduce the issue.
Okay. I guess it is a bug in some weird boundary case. Since tshark and tcpdump are still under active development and tcpflow really is not, perhaps this would be a good time to archive the project and turn it over to someone else.
Sorry to hear this issue was the last straw. Thank you for your project and your contributions to Computer Science; I hope this project will be adopted by someone.
FWIW, I also stumbled upon this bug. As a workaround I had to capture the traffic with tcpdump and then run tcpflow on the capture, but tcpflow was still immensely useful! I was debugging a fairly large 16 GB HTTP request, and stitching the body together manually from the tcpdump data would have been slow and tedious. tcpflow did it in seconds. Thanks! :)
I'm glad you find the program useful. I am curious — does wireshark/tshark not implement this functionality?
@simsong I tried both Wireshark and tshark with the pcap file from tcpdump. In Wireshark, Follow TCP Stream and export to file worked for a ~100 MB file, but Wireshark ran out of memory after 7 hours trying to open the 16 GB file, and I stopped tshark after 18 hours. Without tcpflow, I don't think I could have completed the task.
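To illustrate why a streaming pass copes with a capture of this size where loading the whole file does not, here is a small Python sketch that walks a classic pcap file one record at a time, so memory use stays at roughly one packet. This is not how tcpflow is implemented; `iter_pcap_records` is a hypothetical name, and the sketch assumes the classic pcap format (not pcapng).

```python
import struct
from typing import BinaryIO, Iterator, Tuple

# Classic pcap magic numbers as they appear on disk, mapped to byte order.
_MAGICS = {
    b"\xd4\xc3\xb2\xa1": "<",  # little-endian, microsecond timestamps
    b"\xa1\xb2\xc3\xd4": ">",  # big-endian, microsecond timestamps
    b"\x4d\x3c\xb2\xa1": "<",  # little-endian, nanosecond timestamps
    b"\xa1\xb2\x3c\x4d": ">",  # big-endian, nanosecond timestamps
}


def iter_pcap_records(stream: BinaryIO) -> Iterator[Tuple[int, int, bytes]]:
    """Yield (captured_len, original_len, data) for each record in turn."""
    header = stream.read(24)  # the global file header is 24 bytes
    if len(header) < 24 or header[:4] not in _MAGICS:
        raise ValueError("not a classic pcap file")
    record = struct.Struct(_MAGICS[header[:4]] + "IIII")
    while True:
        raw = stream.read(record.size)
        if len(raw) < record.size:
            return  # clean end of file, or a cut-off record header
        _ts_sec, _ts_frac, incl_len, orig_len = record.unpack(raw)
        data = stream.read(incl_len)
        if len(data) < incl_len:
            return  # file ends in the middle of a packet
        yield incl_len, orig_len, data


if __name__ == "__main__":
    import sys

    # Usage: python iter_pcap.py <capture.pcap>
    # Prints the index and size of unusually large frames.
    with open(sys.argv[1], "rb") as fh:
        for index, (incl, orig, _data) in enumerate(iter_pcap_records(fh), 1):
            if orig > 1514:
                print(f"frame {index}: captured={incl} original={orig}")
```

The 1514-byte threshold in the usage example is an assumption (a typical Ethernet frame without offloading); it is just a way to locate frames like the 11,264-byte one reported above.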
Wow. Thanks.