suricata
suricata copied to clipboard
pgsql/logger: open json object from logger fn - v3
Previous PR: https://github.com/OISF/suricata/pull/10997
Changes from last PR:
- remove changes to
template.rs
as per Jason's analysis
Link to redmine ticket: https://redmine.openinfosecfoundation.org/issues/6983 https://redmine.openinfosecfoundation.org/issues/7000
Provide values to any of the below to override the defaults.
SV_BRANCH=https://github.com/OISF/suricata-verify/pull/1810
I don't understand the tx_id explanation, can you expand on it? What was happening before? What is the tx_id space? 0-N or 1-N?
I don't understand the tx_id explanation, can you expand on it? What was happening before? What is the tx_id space? 0-N or 1-N?
Pgsql's tx_id
space is 1-N.
When Suricata does app-layer event logging, it uses JsonPgsqlLogger
, which will use data from output-tx
, it starts from tx_id, as that's the value that it gets from AppLayerParserState, from log_id
.
When Suricata does alert event logging and adds app-layer metadata, the tx_id
used comes from the PacketAlert, whose info comes from AppLayerParserGetTransactionInspectId
, and that one seems to have discrepancy, so the metadata logging, without this fix, will always look for the alert's tx_id+1, and either log the incorrect tx data, or nothing.
If, when creating Pgsql Events, I call CreateEveHeaderWithTx
instead of CreateEveHeader
, I see the tx_id
discrepancy in the Pgsql EVE logs. I think that's because the logging function won't actually use the tx_id
passed for calling the parser logging function, and will only add that to the event header.
I hope this all make sense. I could update the commit message for bug 7000 to be something like:
pgsql: tx_id doesn't need to be adjusted
PgsqlTransaction.tx_id space is 1-N. Alerts are generated using the actual
tx_id from the parser, so, if tx_id == 1, PacketAlert.tx_id == 1.
Without this fix, the EVE alert metadata for Pgsql will always log tx_id+1, and
thus either log the wrong tx, or no tx at all.
When creating a new PgsqlTransaction, we first increment PgsqlState's
tx_id and then pass that to the new tx's tx_id. As Pgsql txs are not created
first with id 0, there's no need to adjust when freeing or searching for txs by
id.
Extra: there is still a discrepancy between PgsqlState's `tx_id` and
AppLayerState's log_id, as that is initialized as zero.
Bug #7000
About that extra part - feels like something that should be documented.
I'm initially curious why other protocols aren't having the same issue, or maybe they are. As this pattern used throughout.
Observation.
In master branch without this patch:
- by default, the
alert
record DOES NOT get thepgsql
metadata - with
--simulate-ips
, thealert
records DOES get thepgsql
metadata
With this patch:
- by default, the
alert
record DOES get thepgsql
metadata - with
--simulate-ips
, thealert
record DOES NOT get thepgsql
metadata
While I haven't yet, this should be looked at in another protocol as well to see if its a common pattern.
Information:
ERROR: QA failed on SURI_TLPW2_autofp_suri_time.
field | baseline | test | % |
---|---|---|---|
SURI_TLPW2_autofp_stats_chk | |||
.uptime | 138 | 146 | 105.8% |
Pipeline 20450
Observation.
In master branch without this patch:
- by default, the
alert
record DOES NOT get thepgsql
metadata- with
--simulate-ips
, thealert
records DOES get thepgsql
metadataWith this patch:
- by default, the
alert
record DOES get thepgsql
metadata- with
--simulate-ips
, thealert
record DOES NOT get thepgsql
metadataWhile I haven't yet, this should be looked at in another protocol as well to see if its a common pattern.
I vaguely remember a discussion where something connected was mentioned - by you, I think? Where we'd see different results between IDS/IPS...
The tx_id
for detection, in IDS, is recovered by AppLayerParserGetTransactionInspectId
, which takes direction into consideration. Logging doesn't, so uses something different. I'm curious with what happens with IPS.
I will try to craft SV tests for other protocols, having versions with --simulate-ips
(shall also add one of those for pgsql, too).
I'm initially curious why other protocols aren't having the same issue, or maybe they are. As this pattern used throughout.
Maybe stateless parsers won't be affected by this? Afaik we don't have this fix for the alert metadata for all app-protos, yet - might help in us not seeing this?
Wonder if the fact that we don't add the tx_id
to the EVE header could help in masking the issue, or more likely how protocol transactions differ. Or none of the above. Just hope I'm not making us waste time in something that isn't important >__<'
I vaguely remember a discussion where something connected was mentioned - by you, I think? Where we'd see different results between IDS/IPS...
There are differences. Detection is run earlier that can lead to some differing results, such as packet logged, what transaction data might be available - but from adding some prints its clear the transaction is there, just the indexing is off.
However, I don't think the app-layer should need to worry if we're in IDS or IPS mode, that should be done more inside the engine, and this is where things seem to be off here.
Closing this as I'm still investigating. I want to submit the next PR once we have a deeper understanding of what's going on, and whether the fix should include other protocols - or follow a different approach, entirely.