feat: support writing audit events to stdout for log
Fixes: #3101
Allows writing audit logs to stdout if using log sink with no file specified
It looks like this when enabled along side existing logs to stdout:
2024-05-26T16:43:10-04:00 INFO finished unary call with code OK {"server": "grpc", "grpc.start_time": "2024-05-26T16:43:10-04:00", "system": "grpc", "span.kind": "server", "grpc.service": "flipt.Flipt", "grpc.method": "UpdateFlag", "peer.address": "127.0.0.1:51395", "grpc.code": "OK", "grpc.time_ms": 8.519}
{"version":"0.1","type":"flag","action":"updated","metadata":{"actor":{"authentication":"none","ip":"127.0.0.1"}},"payload":{"description":"Manages the rollout of updates to our IoT devices","enabled":false,"key":"iot-deviceUpdateControl","name":"iot-deviceUpdateControl","namespace_key":"default"},"timestamp":"2024-05-26T16:43:10-04:00"}
2024-05-26T16:43:10-04:00 INFO finished unary call with code OK {"server": "grpc", "grpc.start_time": "2024-05-26T16:43:10-04:00", "system": "grpc", "span.kind": "server", "grpc.service": "flipt.Flipt", "grpc.method": "GetFlag", "peer.address": "127.0.0.1:51395", "grpc.code": "OK", "grpc.time_ms": 0.857}
Im not 💯 on if this is the functionality we want though as its hard to parse visually with both types of logs (audit and application) intertwined like this.
I guess one argument could be that if you're setting audit logs to stdout then you'd likely either:
- send application logs to a file or
- set the application log level to something high like ERROR so that they arent as chatty and would make audit logs easier to spot/parse
Another thing we could support is allowing the user to configure which file handle to write both application logs and audit logs to (either STDOUT or STDERR), so they could separate the logs that way if they didn't want to / couldn't write to a file for either?
wdyt @erka @GeorgeMac ?
We have not forgotten that logging could have encoding set to json which will provide a bit different output
{"L":"DEBUG","T":"2024-05-27T15:44:49+03:00","M":"using driver","server":"grpc","driver":"sqlite3"}
{"L":"DEBUG","T":"2024-05-27T15:44:49+03:00","M":"migrations up to date","server":"grpc"}
{"L":"DEBUG","T":"2024-05-27T15:44:49+03:00","M":"constructing builder","server":"grpc","prepared_statements":true}
{"L":"DEBUG","T":"2024-05-27T15:44:49+03:00","M":"database driver configured","server":"grpc","driver":"sqlite3"}
I really like @GeorgeMac idea to reuse zap somehow and have level set to AUDIT.
Where did we end up on the stance of using the existing zap logger instead?
As in, instead of using a JSON encoder over STDOUT, just adapt audit events into fields on the zap logger and add a distinguising field like
type == auditso that a log aggregator can then distinguish these?Was it just the leveling debate? I vaguely remember we thought about just having distinct level for the audit events.
Also, looking at zap's encoder config: https://pkg.go.dev/go.uber.org/[email protected]/zapcore#EncoderConfig
It seems like you can set the
levelKeyto empty string to omit it from output. Or it looks like there might be a way (via aLevelEncoderperhaps?) to get our ownlevel = auditin there somehow.
if we reuse the existing zap logger then we still have the issue of which zap level audit events are logged at.
Do you mean we use a new instance of zap logger that is always say set to log at the lowest level and then we simply exclue that levelKey in the output so that its empty?
I did some more research and it looks like its still not really possible for us to define our own log levels (at least not in a supported way), see: https://github.com/uber-go/zap/issues/1277#issuecomment-1548366070
Do you mean we use a new instance of zap logger that is always say set to log at the lowest level and then we simply exclue that levelKey in the output so that its empty?
I think this is an option, but speaking without trying it myself.
I think the main thing though is that whatever the approach is, a nice experience would be for the log output to match the encoding of the existing logger. So if its console, then audit logs come out in console too, if its JSON, they all come out in JSON. And that there is a way to distinguish, through structure logging, an audit event from a standard log line.
I've now tried switching to use a new zap logger for this functionality, seems promising:
{"T":"2024-06-05T10:37:59-04:00","M":"audit","event":{"version":"0.1","type":"flag","action":"updated","metadata":{"actor":{"authentication":"none","ip":"127.0.0.1"}},"payload":{"description":"Testing newest framework of the week","enabled":true,"key":"beta-new-ui-framework","name":"beta-new-ui-framework","namespace_key":"default"},"timestamp":"2024-06-05T10:37:59-04:00"}}
I've now tried switching to use a new zap logger for this functionality, seems promising:
{"T":"2024-06-05T10:37:59-04:00","M":"audit","event":{"version":"0.1","type":"flag","action":"updated","metadata":{"actor":{"authentication":"none","ip":"127.0.0.1"}},"payload":{"description":"Testing newest framework of the week","enabled":true,"key":"beta-new-ui-framework","name":"beta-new-ui-framework","namespace_key":"default"},"timestamp":"2024-06-05T10:37:59-04:00"}}
The only problem is that when written to a file, it breaks backward compatibility with our previous audit log format, which simply marshalls the event like:
{"version":"0.1","type":"flag","action":"updated","metadata":{"actor":{"authentication":"none","ip":"127.0.0.1 "}},"payload":{"description":"Manages the rollout of updates to our IoT devices","enabled":true,"key":"iot-deviceUpdateControl","name":"iot-deviceUpdateControl","namespace_key":"default"},"timestamp":"2024-06-05T10:51:24-04:00"}
I got it a bit closer when using file based logging:
{
"event": {
"version": "0.1",
"type": "flag",
"action": "updated",
"metadata": {
"actor": {
"authentication": "none",
"ip": "127.0.0.1"
}
},
"payload": {
"description": "Testing newest framework of the week",
"enabled": false,
"key": "beta-new-ui-framework",
"name": "beta-new-ui-framework",
"namespace_key": "default"
},
"timestamp": "2024-06-05T11:25:27-04:00"
}
}
I think this is good enough IMO
Here's what the console log looks like:
2024-06-05T11:37:56-04:00 AUDIT {"event": {"version":"0.1","type":"flag","action":"updated","metadata":{"actor":{"authentication":"none","ip":"127.0.0.1"}},"payload":{"description":"Manages the rollout of updates to our IoT devices","enabled":false,"key":"iot-deviceUpdateControl","name":"iot-deviceUpdateControl","namespace_key":"default"},"timestamp":"2024-06-05T11:37:56-04:00"}}
ok, I think this is ready for another round of comments, im going to be adding tests as well
/cc @erka @GeorgeMac
@markphelps this will be a breaking change for people My audit configuration
audit:
sinks:
log:
enabled: true
file: audit.log
Audit log for v1.40.2 sends
{"version":"0.1","type":"flag","action":"created","metadata":{"actor":{"authentication":"none"}},"payload":{"description":"","enabled":false,"key":"adf","name":"adf","namespace_key":"default"},"timestamp":"2024-06-06T16:41:21+03:00"}
And this PR will send
{"event": {"version":"0.1","type":"flag","action":"updated","metadata":{"actor":{"authentication":"none","ip":"::1"}},"payload":{"description":"","enabled":false,"key":"adf","name":"adfasdf","namespace_key":"default"},"timestamp":"2024-06-06T16:42:28+03:00"}}