.findx file not written in real-time
When I do a live capture on an interface I see that tcpflow writes the output/data file every second, but the .findx file (command line option -I) just stays at 0 bytes:
vagrant@vagrant:~$ ls -lh --full-time /tmp/
-rw-r--r-- 1 root root 6 2021-04-22 15:54:10.565581023 +0000 010.000.002.002.54150-010.000.002.015.00022
-rw-r--r-- 1 root root 0 2021-04-22 15:54:03.633581023 +0000 010.000.002.002.54150-010.000.002.015.00022.findx
-rw-r--r-- 1 root root 1.7K 2021-04-22 15:54:27.377581023 +0000 010.000.002.002.56520-010.000.002.015.00022
-rw-r--r-- 1 root root 0 2021-04-22 15:54:10.553581023 +0000 010.000.002.002.56520-010.000.002.015.00022.findx
-rw-r--r-- 1 root root 264 2021-04-22 15:54:10.565581023 +0000 010.000.002.015.00022-010.000.002.002.54150
-rw-r--r-- 1 root root 0 2021-04-22 15:54:03.633581023 +0000 010.000.002.015.00022-010.000.002.002.54150.findx
-rw-r--r-- 1 root root 1.7K 2021-04-22 15:54:27.377581023 +0000 010.000.002.015.00022-010.000.002.002.56520
-rw-r--r-- 1 root root 0 2021-04-22 15:54:10.553581023 +0000 010.000.002.015.00022-010.000.002.002.56520.findx
It seems the file is only written once I stop the program. This makes it impossible for me to get timestamps for the data I'm interested in while the capture is running.
I observe this behaviour with both version 1.4.5 and 1.6.1. Is this a bug or done on purpose?
You'll need to look at the code. My suspicion is that the .findx file is written when the TCP stream is closed. Do you have a pcap file that you can distribute that replicates the problem, or can you replicate it with one of the pcap files at https://digitalcorpora.org/ ?
This problem only occurs during live capture where my captured session stays open. I just ran: sudo tcpflow -i eth0 -o /tmp/ -I
Trying to reproduce this with a pcap is impractical.
I saw in the code that the index file is sorted before closing, so I guess that's the issue. Another issue is that the index entries are only sorted; duplicates (due to TCP retransmissions, etc.) are not removed.
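To illustrate what I mean by sorting plus duplicate removal, here is a minimal post-processing sketch in Python. It is not tcpflow's code. It assumes each index line has the form `offset|timestamp|length`, which is an assumption on my part and should be checked against the actual .findx output. The function name `sort_and_dedupe` is hypothetical. Retransmissions are assumed to show up as entries with the same offset and length but a later timestamp, so only the first-seen entry per (offset, length) pair is kept.

```python
def sort_and_dedupe(lines):
    """Sort index lines by byte offset and drop retransmission duplicates.

    Assumed (hypothetical) line format: "offset|timestamp|length".
    For each (offset, length) pair, the first entry in input order is kept.
    """
    first_seen = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        offset, timestamp, length = line.split("|")
        key = (int(offset), int(length))
        # setdefault keeps the timestamp of the first occurrence only
        first_seen.setdefault(key, timestamp)
    # Keys are unique, so sorting the items orders by (offset, length)
    return [f"{off}|{ts}|{ln}" for (off, ln), ts in sorted(first_seen.items())]
```

This could be run over a finished .findx file as a workaround, independent of any change in tcpflow itself.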
Well, it looks like you have a bunch of SSH sessions. Are they long-lived? The file should be written when the session is closed.
You could probably add an option to disable the sorting and write the index incrementally, if that's something you need.
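As a rough illustration of the incremental approach (in Python rather than tcpflow's C++, and not based on the actual source), each index entry is written and flushed as soon as it is recorded, with no sorting at close. The class name `IncrementalIndexWriter` and the `offset|timestamp|length` line format are assumptions made for this sketch.

```python
import io


class IncrementalIndexWriter:
    """Write one index line per recorded segment and flush immediately.

    Hypothetical sketch: the line format "offset|timestamp|length" is assumed.
    """

    def __init__(self, stream):
        self.stream = stream

    def record(self, offset, timestamp, length):
        self.stream.write(f"{offset}|{timestamp}|{length}\n")
        # Flush so that other processes reading the file see the entry now
        self.stream.flush()


# Usage with an in-memory stream; a real file opened in append mode works the same way
buf = io.StringIO()
writer = IncrementalIndexWriter(buf)
writer.record(264, "1619106850.565581", 100)
writer.record(0, "1619106843.633581", 264)
```

The trade-off is that entries land in arrival order rather than offset order, so any consumer of the file has to sort (and, if needed, deduplicate) them itself.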