flipt icon indicating copy to clipboard operation
flipt copied to clipboard

feat: support writing audit events to stdout for log

Open markphelps opened this issue 1 year ago • 3 comments

Fixes: #3101

Allows writing audit logs to stdout if using log sink with no file specified

It looks like this when enabled along side existing logs to stdout:

2024-05-26T16:43:10-04:00	INFO	finished unary call with code OK	{"server": "grpc", "grpc.start_time": "2024-05-26T16:43:10-04:00", "system": "grpc", "span.kind": "server", "grpc.service": "flipt.Flipt", "grpc.method": "UpdateFlag", "peer.address": "127.0.0.1:51395", "grpc.code": "OK", "grpc.time_ms": 8.519}
{"version":"0.1","type":"flag","action":"updated","metadata":{"actor":{"authentication":"none","ip":"127.0.0.1"}},"payload":{"description":"Manages the rollout of updates to our IoT devices","enabled":false,"key":"iot-deviceUpdateControl","name":"iot-deviceUpdateControl","namespace_key":"default"},"timestamp":"2024-05-26T16:43:10-04:00"}
2024-05-26T16:43:10-04:00	INFO	finished unary call with code OK	{"server": "grpc", "grpc.start_time": "2024-05-26T16:43:10-04:00", "system": "grpc", "span.kind": "server", "grpc.service": "flipt.Flipt", "grpc.method": "GetFlag", "peer.address": "127.0.0.1:51395", "grpc.code": "OK", "grpc.time_ms": 0.857}

Im not 💯 on if this is the functionality we want though as its hard to parse visually with both types of logs (audit and application) intertwined like this.

I guess one argument could be that if you're setting audit logs to stdout then you'd likely either:

  1. send application logs to a file or
  2. set the application log level to something high like ERROR so that they arent as chatty and would make audit logs easier to spot/parse

Another thing we could support is allowing the user to configure which file handle to write both application logs and audit logs to (either STDOUT or STDERR), so they could separate the logs that way if they didn't want to / couldn't write to a file for either?

wdyt @erka @GeorgeMac ?

markphelps avatar May 26 '24 20:05 markphelps

We have not forgotten that logging could have encoding set to json which will provide a bit different output

{"L":"DEBUG","T":"2024-05-27T15:44:49+03:00","M":"using driver","server":"grpc","driver":"sqlite3"}
{"L":"DEBUG","T":"2024-05-27T15:44:49+03:00","M":"migrations up to date","server":"grpc"}
{"L":"DEBUG","T":"2024-05-27T15:44:49+03:00","M":"constructing builder","server":"grpc","prepared_statements":true}
{"L":"DEBUG","T":"2024-05-27T15:44:49+03:00","M":"database driver configured","server":"grpc","driver":"sqlite3"}

I really like @GeorgeMac idea to reuse zap somehow and have level set to AUDIT.

erka avatar May 27 '24 12:05 erka

Where did we end up on the stance of using the existing zap logger instead?

As in, instead of using a JSON encoder over STDOUT, just adapt audit events into fields on the zap logger and add a distinguising field like type == audit so that a log aggregator can then distinguish these?

Was it just the leveling debate? I vaguely remember we thought about just having distinct level for the audit events.

Also, looking at zap's encoder config: https://pkg.go.dev/go.uber.org/[email protected]/zapcore#EncoderConfig

It seems like you can set the levelKey to empty string to omit it from output. Or it looks like there might be a way (via a LevelEncoder perhaps?) to get our own level = audit in there somehow.

if we reuse the existing zap logger then we still have the issue of which zap level audit events are logged at.

Do you mean we use a new instance of zap logger that is always say set to log at the lowest level and then we simply exclue that levelKey in the output so that its empty?

I did some more research and it looks like its still not really possible for us to define our own log levels (at least not in a supported way), see: https://github.com/uber-go/zap/issues/1277#issuecomment-1548366070

markphelps avatar May 27 '24 22:05 markphelps

Do you mean we use a new instance of zap logger that is always say set to log at the lowest level and then we simply exclue that levelKey in the output so that its empty?

I think this is an option, but speaking without trying it myself.

I think the main thing though is that whatever the approach is, a nice experience would be for the log output to match the encoding of the existing logger. So if its console, then audit logs come out in console too, if its JSON, they all come out in JSON. And that there is a way to distinguish, through structure logging, an audit event from a standard log line.

GeorgeMac avatar May 28 '24 11:05 GeorgeMac

I've now tried switching to use a new zap logger for this functionality, seems promising:

{"T":"2024-06-05T10:37:59-04:00","M":"audit","event":{"version":"0.1","type":"flag","action":"updated","metadata":{"actor":{"authentication":"none","ip":"127.0.0.1"}},"payload":{"description":"Testing newest framework of the week","enabled":true,"key":"beta-new-ui-framework","name":"beta-new-ui-framework","namespace_key":"default"},"timestamp":"2024-06-05T10:37:59-04:00"}}

markphelps avatar Jun 05 '24 14:06 markphelps

I've now tried switching to use a new zap logger for this functionality, seems promising:

{"T":"2024-06-05T10:37:59-04:00","M":"audit","event":{"version":"0.1","type":"flag","action":"updated","metadata":{"actor":{"authentication":"none","ip":"127.0.0.1"}},"payload":{"description":"Testing newest framework of the week","enabled":true,"key":"beta-new-ui-framework","name":"beta-new-ui-framework","namespace_key":"default"},"timestamp":"2024-06-05T10:37:59-04:00"}}

The only problem is that when written to a file, it breaks backward compatibility with our previous audit log format, which simply marshalls the event like:

{"version":"0.1","type":"flag","action":"updated","metadata":{"actor":{"authentication":"none","ip":"127.0.0.1 "}},"payload":{"description":"Manages the rollout of updates to our IoT devices","enabled":true,"key":"iot-deviceUpdateControl","name":"iot-deviceUpdateControl","namespace_key":"default"},"timestamp":"2024-06-05T10:51:24-04:00"}

markphelps avatar Jun 05 '24 14:06 markphelps

I got it a bit closer when using file based logging:

{
  "event": {
    "version": "0.1",
    "type": "flag",
    "action": "updated",
    "metadata": {
      "actor": {
        "authentication": "none",
        "ip": "127.0.0.1"
      }
    },
    "payload": {
      "description": "Testing newest framework of the week",
      "enabled": false,
      "key": "beta-new-ui-framework",
      "name": "beta-new-ui-framework",
      "namespace_key": "default"
    },
    "timestamp": "2024-06-05T11:25:27-04:00"
  }
}

I think this is good enough IMO

markphelps avatar Jun 05 '24 15:06 markphelps

Here's what the console log looks like:

2024-06-05T11:37:56-04:00	AUDIT	{"event": {"version":"0.1","type":"flag","action":"updated","metadata":{"actor":{"authentication":"none","ip":"127.0.0.1"}},"payload":{"description":"Manages the rollout of updates to our IoT devices","enabled":false,"key":"iot-deviceUpdateControl","name":"iot-deviceUpdateControl","namespace_key":"default"},"timestamp":"2024-06-05T11:37:56-04:00"}}

markphelps avatar Jun 05 '24 15:06 markphelps

ok, I think this is ready for another round of comments, im going to be adding tests as well

/cc @erka @GeorgeMac

markphelps avatar Jun 05 '24 17:06 markphelps

@markphelps this will be a breaking change for people My audit configuration

audit:
  sinks:
    log:
      enabled: true
      file: audit.log

Audit log for v1.40.2 sends

{"version":"0.1","type":"flag","action":"created","metadata":{"actor":{"authentication":"none"}},"payload":{"description":"","enabled":false,"key":"adf","name":"adf","namespace_key":"default"},"timestamp":"2024-06-06T16:41:21+03:00"}

And this PR will send

{"event": {"version":"0.1","type":"flag","action":"updated","metadata":{"actor":{"authentication":"none","ip":"::1"}},"payload":{"description":"","enabled":false,"key":"adf","name":"adfasdf","namespace_key":"default"},"timestamp":"2024-06-06T16:42:28+03:00"}}

erka avatar Jun 06 '24 13:06 erka