With live traffic, expired flows are not properly exported by ndpiReader

Open pasitushar opened this issue 1 year ago • 9 comments

This is the command I have run on my Ubuntu machine:

sudo /usr/local/ndpi/bin/ndpiReader -i ens192 -k output.json -K json -m 300

and the traffic statistics are:

Traffic statistics:
    Ethernet bytes: 4920005186 (includes ethernet CRC/IFC/trailer)
    Discarded bytes: 2733978
    IP packets: 5412175 of 5454144 packets total
    IP bytes: 4790112986 (avg pkt size 878 bytes)
    Unique flows: 5044
    TCP Packets: 1222888
    UDP Packets: 4185122
    VLAN Packets: 4044765
    MPLS Packets: 0
    PPPoE Packets: 0
    Fragmented Packets: 35195
    Max Packet size: 1480
    Packet Len < 64: 1232145
    Packet Len 64-128: 58350
    Packet Len 128-256: 47131
    Packet Len 256-1024: 1569684
    Packet Len 1024-1500: 2504865
    Packet Len > 1500: 0
    nDPI throughput: 18.02 K pps / 124.95 Mb/sec
    Analysis begin: 14/Nov/2024 11:31:20
    Analysis end: 14/Nov/2024 11:36:21
    Traffic throughput: 18.02 K pps / 124.95 Mb/sec
    Traffic duration: 300.415 sec
    Guessed flow protos: 16308
    DPI Packets (TCP): 73704 (3.54 pkts/flow)
    DPI Packets (UDP): 26584 (2.18 pkts/flow)
    DPI Packets (other): 855 (1.00 pkts/flow)
    Confidence: Unknown 712 (flows)
    Confidence: Match by port 13295 (flows)
    Confidence: DPI (partial) 11 (flows)
    Confidence: DPI (partial cache) 2973 (flows)
    Confidence: DPI (cache) 250 (flows)
    Confidence: DPI 16609 (flows)
    Confidence: Match by IP 40 (flows)

Traffic statistics:
    Ethernet bytes: 3616799543 (includes ethernet CRC/IFC/trailer)
    Discarded bytes: 2861397
    IP packets: 4245007 of 4289513 packets total
    IP bytes: 3514919375 (avg pkt size 819 bytes)
    Unique flows: 4581
    TCP Packets: 275961
    UDP Packets: 3966062
    VLAN Packets: 3189357
    MPLS Packets: 0
    PPPoE Packets: 0
    Fragmented Packets: 38254
    Max Packet size: 1480
    Packet Len < 64: 997130
    Packet Len 64-128: 41737
    Packet Len 128-256: 31039
    Packet Len 256-1024: 1449851
    Packet Len 1024-1500: 1725250
    Packet Len > 1500: 0
    nDPI throughput: 17.31 K pps / 112.50 Mb/sec
    Analysis begin: 14/Nov/2024 11:36:21
    Analysis end: 14/Nov/2024 11:40:26
    Traffic throughput: 17.31 K pps / 112.50 Mb/sec
    Traffic duration: 245.276 sec
    Guessed flow protos: 12800
    DPI Packets (TCP): 58152 (3.59 pkts/flow)
    DPI Packets (UDP): 19676 (2.15 pkts/flow)
    DPI Packets (other): 496 (1.00 pkts/flow)
    Confidence: Unknown 457 (flows)
    Confidence: Match by port 10700 (flows)
    Confidence: DPI (partial) 19 (flows)
    Confidence: DPI (partial cache) 2052 (flows)
    Confidence: DPI (cache) 246 (flows)
    Confidence: DPI 12345 (flows)
    Confidence: Match by IP 42 (flows)

The total number of IP packets is 9657182 (5412175 + 4245007).

But the sum of src2dst_packets and dst2src_packets reported in the JSON file is 3613812. Why is there such a significant drop in the number of packets?
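
For anyone wanting to reproduce this comparison, here is a minimal Python sketch of how the exported counters could be summed. It assumes the file written by -k holds one JSON object per flow per line, and it searches for src2dst_packets and dst2src_packets at any nesting depth because the exact layout is not shown in this thread. The function names are hypothetical helpers, not part of nDPI.

```python
import json

# Counter names taken from the report above; where they sit inside each
# flow record is an assumption, so they are looked up at any depth.
PACKET_KEYS = ("src2dst_packets", "dst2src_packets")


def sum_packet_counters(obj):
    """Recursively add up every src2dst_packets / dst2src_packets value."""
    total = 0
    if isinstance(obj, dict):
        for key, value in obj.items():
            if key in PACKET_KEYS and isinstance(value, int):
                total += value
            else:
                total += sum_packet_counters(value)
    elif isinstance(obj, list):
        for item in obj:
            total += sum_packet_counters(item)
    return total


def exported_packets(lines):
    """Sum packet counters over JSON lines (assumed: one flow per line)."""
    total = 0
    for line in lines:
        line = line.strip()
        if line:  # skip blank lines
            total += sum_packet_counters(json.loads(line))
    return total


def missing_fraction(seen, exported):
    """Fraction of packets seen on the wire that are absent from the export."""
    return (seen - exported) / seen
```

Passing an open handle on output.json to exported_packets gives the exported total. With the figures quoted above, missing_fraction(9657182, 3613812) comes to roughly 0.63, meaning about 63% of the IP packets are not accounted for in the export.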

pasitushar avatar Nov 14 '24 11:11 pasitushar

I confirm there is an issue with realtime traffic and expired flows: they are not properly exported and accounted for. As a workaround, you can try to process the traffic offline:

tcpdump -i ens192 -w traffic.pcap
ndpiReader -i traffic.pcap -k output.json -K json

IvanNardi avatar Nov 14 '24 20:11 IvanNardi

Note: ndpiReader is just an example integration and is not meant to be used for live traffic processing. In addition, the JSON/CSV output is not well tested AFAIR.

utoni avatar Nov 14 '24 23:11 utoni

But the number of unique flows in the Traffic statistics matches the number of flows written to the JSON file, so why are there fewer packets when the flow counts match?

pasitushar avatar Nov 18 '24 09:11 pasitushar

As I said, the accounting of expired flows is wrong. This PR https://github.com/ntop/nDPI/pull/2622 should fix the flow number statistic.

IvanNardi avatar Nov 18 '24 10:11 IvanNardi

Updated the title of this discussion

IvanNardi avatar Nov 18 '24 12:11 IvanNardi

My issue is that when I use a split analysis duration of 30 seconds, the number of packets and bytes matches the traffic statistics captured by nDPI. But when I start the same packet capture on the same interface with a split analysis duration of 300 seconds, the number of flows captured matches, yet there is a drop of around 40% in packets and bytes.

pasitushar avatar Nov 19 '24 07:11 pasitushar

With current master, statistics are in sync: when you have a mismatch in packet counters, you also have a mismatch in flow numbers. BTW, the reported values are now "correct". The core issue is not about split analysis but about expired flows in general with live traffic: they are not exported.

IvanNardi avatar Nov 19 '24 14:11 IvanNardi

I have also used tcpdump on a live interface alongside nDPI: I created a pcapng file using tcpdump and a JSON file using nDPI simultaneously. After that, I analyzed the pcapng file using nDPI and created a JSON file in which the number of packets matched the number of packets in the Traffic statistics. So, if nDPI provides the correct traffic stats when it processes a pcap file, what is the issue with the live interface?

pasitushar avatar Nov 20 '24 09:11 pasitushar

You are confirming what I already said: with an offline trace, stats and export are correct; with live traffic, the export is wrong (expired flows are missing).

IvanNardi avatar Nov 20 '24 10:11 IvanNardi