vyatta-dataplane
vyatta-dataplane copied to clipboard
DPI related bug-fixes
@shubham-cdot could you tell us why you had to remove the zero data length check from dpi_ndpi_session_first_packet?
Are there some applications which are not being identified? Is this particular to 3.4?
Can you give us details of, "app detection is given up for TCP sessions in some cases if the first packet of session sent to nDPI is not a SYN packet"?
I'm trying to understand why this wasn't an issue for us before, and I'd appreciate any information which we could use to improve our internal DPI testing.
Thanks.
@pjaitken In ndpi_detection_process_packet function (file: ndpi_main.c):
if((flow->num_processed_pkts == 1) && (ret.master_protocol == NDPI_PROTOCOL_UNKNOWN) && (ret.app_protocol == NDPI_PROTOCOL_UNKNOWN) && flow->packet.tcp && (flow->packet.tcp->syn == 0) && (flow->guessed_protocol_id == 0)) { u_int8_t protocol_was_guessed;
/*
This is a TCP flow
- whose first packet is NOT a SYN
- no protocol has been detected
We don't see how future packets can match anything
hence we giveup here
*/
ret = ndpi_detection_giveup(ndpi_str, flow, 0, &protocol_was_guessed);
}
The above checking was first added to nDPI via commit id e4f01976a and is present since nDPI version 3.0:
commit e4f01976a66f1943bde7b253b62430d36c6d9e74 Author: Luca [email protected] Date: Thu Aug 30 11:10:30 2018 +0200
Added missing categorization when giveup/guess is called
Added optimization for TCP flows that do not start with a SYN packet: early giveup is performed
Code cleanup
As per our understanding, app detection will be given up for a TCP flow as per above checking in case of TCP flows for which nDPI is unable to detect the app protocol using the first packet sent to nDPI after the SYN packet, by applying both:
- protocol guessing (for example based on IP address, TCP ports etc.) and
- protocol dissectors
This would be more likely to happen for protocols running on non-standard ports for which multiple packets of the flow are required for app identification.
We tried to simulate this scenario but currently unable to do so due to limitations of test setup at our end. For more info, please refer to packet handling in ndpi_detection_process_packet. Thanks!
@shubham-cdot, I'm copying some internal feedback from @dfawcus.
@pjaitken What if we remove the check for UDP packets from dpi.c:dpi_get_data_len() ? How will it impact the DPI feature? The issue might arise for some protocol detection (TCP based) running on non standard ports. The nDPI might give up if data_len == 0 packets are skipped.
We can use ARRAY_SIZE() to find num of engines rather than static entry , however can you elaborate "there's only one place to add additional DPI engines in future." This is true if we use num of engines = 2 ?
Regards, S
What if we remove the check for UDP packets from dpi.c:dpi_get_data_len() ? How will it impact the DPI feature? The issue might arise for some protocol detection (TCP based) running on non standard ports. The nDPI might give up if data_len == 0 packets are skipped.
My point is that the code I referred to needs a mechanism to force some packets to be dropped. The code originally made use of indicating the length was zero to achieve that. If now a length of zero can not be used to signal that the packet should be dropped, then some other mechanism needs to be provided.
I am not suggesting disallowing a length zero packet to processed, I am suggesting that you need to add a alternate mechanism to force some packets to be dropped.
@subhajit-cdot, I mean that engines[] and engines_len should be set in only one place (ie, in a new function) rather than in multiple places throughout the code. If a new engine DPI engine is added then it will only need to be added in that one function rather than in multiple places.
My point is that the code I referred to needs a mechanism to force some packets to be dropped. The code originally made use of indicating the length was zero to achieve that. If now a length of zero can not be used to signal that the packet should be dropped, then some other mechanism needs to be provided.
In order to distinguish invalid data length case from 0 (valid) data length, we can do one of the following:
- add an error flag to dpi.c:dpi_get_data_len() as an output argument to indicate invalid data length case
- add new function just to check for invalid data length
In case of option 2, the checking for invalid data length will be moved from dpi_get_data_len() to the new function.
@pjaitken @dfawcus Few queries:
a) In the original DANOS 2009 code, the check for data_len != 0 is done only in ndpi.c:dpi_ndpi_session_first_packet() i.e. for the first packet of session. Will this check not be required for subsequent packets of the session?
b) In dpi.c:dpi_get_data_len(), data length is validated only for UDP. Will similar check not be required for TCP?
Hello @pjaitken Can you please update about the current status of the two pending bug fixes (data_len and engines)? Thanks Subhajit
@shubham-cdot, @subhajit-cdot,
The data_len != 0 check should be removed. All valid packets should be sent to the DPI engine.
Should packets with an invalid length be sent to the DPI engine or not? If the DPI engine validates packets itself, and does not overrun the packet or crash, then we can simply send all traffic into the engine. If we don't send invalid length packets to the DPI engine, then should further packets be sent to the engine after an invalid length packet has been blocked?
If we agree to block invalid length packets, then a new "invalid length" check should be added - for all packets (not just the first) and for all protocols (not just UDP). It seems sensible to add the invalid length check to dpi_get_data_len and return an output argument. It would be useful to return the length of other transport protocols too, though this could be done later.
The DPI statistics should only count packets which are being sent to the DPI engine.
Regarding engines, engines[] and engines_len should be set in a new function so this is done in only one place rather than in multiple places throughout the code.
Thanks.