community-id-spec
Community ID spec appears to rely on underlying application implementation for src/dst direction determination
Hey,
While researching some of our data, we found that two separate applications that saw the same flow each had their own interpretation of src/dst ordering.
You'll have to excuse the formatting but in application 1 (the data is fake but the example has happened), the flow was seen as:
src,srcp,dst,dstp
aaa,65000,bbb,443
While in application 2 it was reversed:
src,srcp,dst,dstp
bbb,443,aaa,65000
Looking at the spec itself, we appear to leave the ordering of the src/dst combination up to the individual application and if we had implemented community ID in both applications, we would have ended up with a different community ID for the same flow.
I would suggest that the spec is updated to specify how the src/dst combination should be calculated/sorted in order to ensure a consistent community ID is generated for the same flow (regardless of what the underlying application thinks the direction is).
Thanks, Conor
Thanks for flagging, Conor. I looked over the spec as well as the three implementations (reference, Suricata, and Zeek). The spec indeed does not mention address/port ordering — a clear omission that we'll need to fix.
However, all three implementations actually do apply an address/port ordering function. I modeled the reference implementation's on Zeek's addr_port_canon_lt(), and @victorjulien mirrored the reference implementation in Suricata. So whatever's going on here is a bit more subtle — a corner case, byte ordering, some kind of mid-flow problem, etc. Are you able to share more details about the offending flow?
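For readers following along, here is a minimal sketch of what such an ordering step might look like. It is not the code from any of the three implementations: the function name `canonical_endpoints` and the example addresses are made up for illustration, and the comparison rule (lower address first, lower port as the tie-breaker, addresses compared as network-order bytes) is an assumption about the general idea rather than a quote from the spec.

```python
import ipaddress


def canonical_endpoints(src, sport, dst, dport):
    """Return the flow endpoints in a direction-independent order.

    Hypothetical helper: the endpoint with the smaller (address, port)
    pair is placed first, so both directions of a flow map to the same
    tuple. Addresses are compared as packed network-order bytes.
    """
    a = (ipaddress.ip_address(src).packed, sport)
    b = (ipaddress.ip_address(dst).packed, dport)
    if a <= b:
        return (src, sport, dst, dport)
    # The endpoints were reported in the opposite direction, so swap them.
    return (dst, dport, src, sport)


# The same flow as reported by two applications, in opposite directions.
seen_by_app1 = canonical_endpoints("10.0.0.2", 65000, "10.0.0.1", 443)
seen_by_app2 = canonical_endpoints("10.0.0.1", 443, "10.0.0.2", 65000)
```

With an ordering step like this applied before hashing, both applications would feed the same tuple into the ID computation regardless of which side each one considers the source.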
I did see some references in the scripts, but my day-to-day knowledge of Bro & Suricata scripting left me unsure whether this was happening or not. I can do some further debugging on the flows, but that will happen outside this issue.
Thanks! Conor
Sure thing, thanks Conor. I'll leave this ticket open for the time being ... I'm sure as we get more experience with producing the ID, we'll get more such feedback.