With live traffic, expired flows are not properly exported by ndpiReader
This is the command I have run on my Ubuntu machine:
sudo /usr/local/ndpi/bin/ndpiReader -i ens192 -k output.json -K json -m 300
and the traffic statistics are:

Traffic statistics:
    Ethernet bytes:        4920005186 (includes ethernet CRC/IFC/trailer)
    Discarded bytes:       2733978
    IP packets:            5412175 of 5454144 packets total
    IP bytes:              4790112986 (avg pkt size 878 bytes)
    Unique flows:          5044
    TCP Packets:           1222888
    UDP Packets:           4185122
    VLAN Packets:          4044765
    MPLS Packets:          0
    PPPoE Packets:         0
    Fragmented Packets:    35195
    Max Packet size:       1480
    Packet Len < 64:       1232145
    Packet Len 64-128:     58350
    Packet Len 128-256:    47131
    Packet Len 256-1024:   1569684
    Packet Len 1024-1500:  2504865
    Packet Len > 1500:     0
    nDPI throughput:       18.02 K pps / 124.95 Mb/sec
    Analysis begin:        14/Nov/2024 11:31:20
    Analysis end:          14/Nov/2024 11:36:21
    Traffic throughput:    18.02 K pps / 124.95 Mb/sec
    Traffic duration:      300.415 sec
    Guessed flow protos:   16308
    DPI Packets (TCP):     73704 (3.54 pkts/flow)
    DPI Packets (UDP):     26584 (2.18 pkts/flow)
    DPI Packets (other):   855 (1.00 pkts/flow)
    Confidence: Unknown              712 (flows)
    Confidence: Match by port        13295 (flows)
    Confidence: DPI (partial)        11 (flows)
    Confidence: DPI (partial cache)  2973 (flows)
    Confidence: DPI (cache)          250 (flows)
    Confidence: DPI                  16609 (flows)
    Confidence: Match by IP          40 (flows)

Traffic statistics:
    Ethernet bytes:        3616799543 (includes ethernet CRC/IFC/trailer)
    Discarded bytes:       2861397
    IP packets:            4245007 of 4289513 packets total
    IP bytes:              3514919375 (avg pkt size 819 bytes)
    Unique flows:          4581
    TCP Packets:           275961
    UDP Packets:           3966062
    VLAN Packets:          3189357
    MPLS Packets:          0
    PPPoE Packets:         0
    Fragmented Packets:    38254
    Max Packet size:       1480
    Packet Len < 64:       997130
    Packet Len 64-128:     41737
    Packet Len 128-256:    31039
    Packet Len 256-1024:   1449851
    Packet Len 1024-1500:  1725250
    Packet Len > 1500:     0
    nDPI throughput:       17.31 K pps / 112.50 Mb/sec
    Analysis begin:        14/Nov/2024 11:36:21
    Analysis end:          14/Nov/2024 11:40:26
    Traffic throughput:    17.31 K pps / 112.50 Mb/sec
    Traffic duration:      245.276 sec
    Guessed flow protos:   12800
    DPI Packets (TCP):     58152 (3.59 pkts/flow)
    DPI Packets (UDP):     19676 (2.15 pkts/flow)
    DPI Packets (other):   496 (1.00 pkts/flow)
    Confidence: Unknown              457 (flows)
    Confidence: Match by port        10700 (flows)
    Confidence: DPI (partial)        19 (flows)
    Confidence: DPI (partial cache)  2052 (flows)
    Confidence: DPI (cache)          246 (flows)
    Confidence: DPI                  12345 (flows)
    Confidence: Match by IP          42 (flows)
The total number of IP packets across both intervals is 9657182 (5412175 + 4245007).
But the sum of src2dst_packets and dst2src_packets reported in the JSON file is 3613812. I would like to know why there is such a significant drop in the number of packets.
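For anyone who wants to reproduce this comparison, here is a minimal Python sketch. It assumes the file written by `-k output.json -K json` contains one JSON object per line, and that the `src2dst_packets` and `dst2src_packets` counters may sit either at the top level or inside a nested object, so it searches each record recursively. The function names (`sum_flow_packets`, `missing_fraction`, `report`) are made up for this example and are not part of nDPI.

```python
import json


def _find_values(obj, key):
    """Yield every numeric value stored under `key`, at any nesting depth."""
    if isinstance(obj, dict):
        for k, v in obj.items():
            if k == key and isinstance(v, (int, float)):
                yield v
            else:
                yield from _find_values(v, key)
    elif isinstance(obj, list):
        for item in obj:
            yield from _find_values(item, key)


def sum_flow_packets(lines):
    """Sum src2dst_packets + dst2src_packets over JSON-lines flow records.

    Assumption: one JSON object per non-empty line.
    """
    total = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        for key in ("src2dst_packets", "dst2src_packets"):
            total += sum(_find_values(record, key))
    return total


def missing_fraction(exported, expected):
    """Fraction of expected packets that does not appear in the export."""
    return 1.0 - exported / expected


def report(path, expected):
    """Print exported vs. expected packet counts for a flow export file."""
    with open(path) as fh:
        exported = sum_flow_packets(fh)
    print(f"exported={exported} expected={expected} "
          f"missing={missing_fraction(exported, expected):.1%}")
```

Calling `report("output.json", 5412175 + 4245007)` compares the export against the sum of the "IP packets" counters from the two statistics blocks above.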
I confirm there is an issue with realtime traffic and expired flows: they are not properly exported and accounted for. As a workaround, you can try to process the traffic offline:
tcpdump -i ens192 -w traffic.pcap
ndpiReader -i traffic.pcap -k output.json -K json
Note: ndpiReader is just an example integration and is not meant to be used for live traffic processing. In addition, the JSON/CSV output is not well tested AFAIR.
But the number of unique flows in the traffic statistics matches the number of flows written to the JSON file. So why are there fewer packets when the flow counts match?
As I said, accounting of expired flows is wrong. This PR https://github.com/ntop/nDPI/pull/2622 should fix the flow number statistic
Updated the title of this discussion
My issue is that when I use a split analysis duration of 30 seconds, the number of packets and bytes matches the traffic statistics captured by nDPI. But when I start the same packet capture on the same interface with a split analysis duration of 300 seconds, the number of flows still matches, but there is a drop of around 40% in packets and bytes.
With current master, statistics are in sync: when you have a mismatch in packet counters, you also have a mismatch in flow numbers. BTW, the reported values are now "correct". The core issue is not about split analysis but about expired flows in general with live traffic: they are not exported.
I have also used tcpdump on a live interface alongside nDPI: I created a pcapng file using tcpdump and a JSON file using nDPI simultaneously. After that, I analyzed the pcapng file using nDPI and created a JSON file in which the number of packets matched the number of packets in the traffic statistics. So I would like to know: if nDPI provides correct traffic stats when processing a pcap file, what is the issue with the live interface?
You are confirming what I already said: with an offline trace, stats and export are correct; with live traffic, the export is wrong (expired flows are missing).