jetstack-secure
jetstack-secure copied to clipboard
[VC-35738] Log with klog to stdout and stderr in JSON format
I've settled on using klog and kubernetes component-base module for the reasons given in the comments in the logs.go file.
I think this logging UX is most likely to be familiar to Kubernetes platform admins who are likely to be deploying and debugging the venafi-kubernetes-agent.
I gave up on Slog because it seemed to have unfamiliar severity / levels and I couldn't get the klog client-go logs to have the right levels with slog.
Using klog allows us to debug problems in client-go.
For example, you can view the API requests and responses using --vmodule=round_tripper=6.
In followup PRs I will convert the existing log lines to use context derived logr loggers.
$ go test ./pkg/logs/... -v -run TestLogs -count 1
=== RUN TestLogs
=== RUN TestLogs/help
logs_test.go:134: FLAGS
-h
logs_test.go:136: STDOUT
pflag: help requested
logs_test.go:137: STDERR
Usage of test-logs:
--logging-format string Sets the log format. Permitted formats: "json", "text". (default "json")
-v, --v Level number for the log level verbosity
--vmodule pattern=N,... comma-separated list of pattern=N settings for file-filtered logging (only works for text log format)
logs_test.go:142: ERROR: exit status 2
=== RUN TestLogs/unrecognized-flag
logs_test.go:134: FLAGS
--foo
logs_test.go:136: STDOUT
unknown flag: --foo
logs_test.go:137: STDERR
unknown flag: --foo
Usage of test-logs:
--logging-format string Sets the log format. Permitted formats: "json", "text". (default "json")
-v, --v Level number for the log level verbosity
--vmodule pattern=N,... comma-separated list of pattern=N settings for file-filtered logging (only works for text log format)
logs_test.go:142: ERROR: exit status 2
=== RUN TestLogs/logging-format-unrecognized
logs_test.go:134: FLAGS
--logging-format=foo
logs_test.go:136: STDOUT
logs_test.go:137: STDERR
Error in logging configuration: format: Invalid value: "foo": Unsupported log format
logs_test.go:142: ERROR: exit status 2
=== RUN TestLogs/original-defaults
logs_test.go:134: FLAGS
logs_test.go:136: STDOUT
logs_test.go:137: STDERR
vCert: 2024/10/17 08:10:12 log Print
vCert: 2024/10/17 08:10:12 INFO slog Info
vCert: 2024/10/17 08:10:12 WARN slog Warn
vCert: 2024/10/17 08:10:12 ERROR slog Error
I1017 08:10:12.337242 29346 logs_test.go:46] klog Info
W1017 08:10:12.337247 29346 logs_test.go:47] klog Warning
E1017 08:10:12.337254 29346 logs_test.go:48] "klog Error" err="fake-error"
I1017 08:10:12.337259 29346 logs_test.go:49] "klog InfoS" key="value"
E1017 08:10:12.337268 29346 logs_test.go:52] "Contextual error" err="fake-error" logger="foo" key="value"
=== RUN TestLogs/modified-defaults
logs_test.go:134: FLAGS
logs_test.go:136: STDOUT
{"ts":1729149012343.6404,"caller":"log/log.go:245","msg":"log Print","v":0}
{"ts":1729149012343.672,"caller":"logs/logs_test.go:43","msg":"slog Info","v":0}
{"ts":1729149012343.6807,"caller":"logs/logs_test.go:44","msg":"slog Warn","v":0}
{"ts":1729149012343.7136,"caller":"logs/logs_test.go:46","msg":"klog Info","v":0}
{"ts":1729149012343.7205,"caller":"logs/logs_test.go:47","msg":"klog Warning","v":0}
{"ts":1729149012343.7432,"caller":"logs/logs_test.go:49","msg":"klog InfoS","v":0,"key":"value"}
logs_test.go:137: STDERR
{"ts":1729149012343.6902,"caller":"logs/logs_test.go:45","msg":"slog Error"}
{"ts":1729149012343.7363,"caller":"logs/logs_test.go:48","msg":"klog Error","err":"fake-error"}
{"ts":1729149012343.759,"logger":"foo","caller":"logs/logs_test.go:52","msg":"Contextual error","key":"value","err":"fake-error"}
=== RUN TestLogs/logging-format-json
logs_test.go:134: FLAGS
--logging-format=json
logs_test.go:136: STDOUT
{"ts":1729149012349.122,"caller":"log/log.go:245","msg":"log Print","v":0}
{"ts":1729149012349.1545,"caller":"logs/logs_test.go:43","msg":"slog Info","v":0}
{"ts":1729149012349.1628,"caller":"logs/logs_test.go:44","msg":"slog Warn","v":0}
{"ts":1729149012349.1802,"caller":"logs/logs_test.go:46","msg":"klog Info","v":0}
{"ts":1729149012349.189,"caller":"logs/logs_test.go:47","msg":"klog Warning","v":0}
{"ts":1729149012349.2192,"caller":"logs/logs_test.go:49","msg":"klog InfoS","v":0,"key":"value"}
logs_test.go:137: STDERR
{"ts":1729149012349.1685,"caller":"logs/logs_test.go:45","msg":"slog Error"}
{"ts":1729149012349.206,"caller":"logs/logs_test.go:48","msg":"klog Error","err":"fake-error"}
{"ts":1729149012349.236,"logger":"foo","caller":"logs/logs_test.go:52","msg":"Contextual error","key":"value","err":"fake-error"}
=== RUN TestLogs/log-json-split-stream-false
logs_test.go:134: FLAGS
--logging-format=json --log-json-split-stream=false
logs_test.go:136: STDOUT
logs_test.go:137: STDERR
{"ts":1729149012355.1396,"caller":"log/log.go:245","msg":"log Print","v":0}
{"ts":1729149012355.1697,"caller":"logs/logs_test.go:43","msg":"slog Info","v":0}
{"ts":1729149012355.177,"caller":"logs/logs_test.go:44","msg":"slog Warn","v":0}
{"ts":1729149012355.183,"caller":"logs/logs_test.go:45","msg":"slog Error"}
{"ts":1729149012355.1924,"caller":"logs/logs_test.go:46","msg":"klog Info","v":0}
{"ts":1729149012355.1992,"caller":"logs/logs_test.go:47","msg":"klog Warning","v":0}
{"ts":1729149012355.203,"caller":"logs/logs_test.go:48","msg":"klog Error","err":"fake-error"}
{"ts":1729149012355.2168,"caller":"logs/logs_test.go:49","msg":"klog InfoS","v":0,"key":"value"}
{"ts":1729149012355.2212,"logger":"foo","caller":"logs/logs_test.go:52","msg":"Contextual error","key":"value","err":"fake-error"}
=== RUN TestLogs/logging-format-text
logs_test.go:134: FLAGS
--logging-format=text
logs_test.go:136: STDOUT
I1017 08:10:12.360983 29371 log.go:245] log Print
I1017 08:10:12.361035 29371 logs_test.go:43] "slog Info"
I1017 08:10:12.361060 29371 logs_test.go:46] klog Info
I1017 08:10:12.361079 29371 logs_test.go:49] "klog InfoS" key="value"
logs_test.go:137: STDERR
W1017 08:10:12.361052 29371 logs_test.go:44] "slog Warn"
E1017 08:10:12.361056 29371 logs_test.go:45] "slog Error"
W1017 08:10:12.361063 29371 logs_test.go:47] klog Warning
E1017 08:10:12.361073 29371 logs_test.go:48] "klog Error" err="fake-error"
E1017 08:10:12.361083 29371 logs_test.go:52] "Contextual error" err="fake-error" logger="foo" key="value"
=== RUN TestLogs/log-text-split-stream-false
logs_test.go:134: FLAGS
--logging-format=text --log-text-split-stream=false
logs_test.go:136: STDOUT
logs_test.go:137: STDERR
I1017 08:10:12.367297 29377 log.go:245] log Print
I1017 08:10:12.367352 29377 logs_test.go:43] "slog Info"
W1017 08:10:12.367358 29377 logs_test.go:44] "slog Warn"
E1017 08:10:12.367361 29377 logs_test.go:45] "slog Error"
I1017 08:10:12.367364 29377 logs_test.go:46] klog Info
W1017 08:10:12.367367 29377 logs_test.go:47] klog Warning
E1017 08:10:12.367382 29377 logs_test.go:48] "klog Error" err="fake-error"
I1017 08:10:12.367402 29377 logs_test.go:49] "klog InfoS" key="value"
E1017 08:10:12.367406 29377 logs_test.go:52] "Contextual error" err="fake-error" logger="foo" key="value"
=== RUN TestLogs/v-level-3
logs_test.go:134: FLAGS
--v=3
logs_test.go:136: STDOUT
{"ts":1729149012373.1528,"caller":"log/log.go:245","msg":"log Print","v":0}
{"ts":1729149012373.1838,"caller":"logs/logs_test.go:43","msg":"slog Info","v":0}
{"ts":1729149012373.1895,"caller":"logs/logs_test.go:44","msg":"slog Warn","v":0}
{"ts":1729149012373.2012,"caller":"logs/logs_test.go:46","msg":"klog Info","v":0}
{"ts":1729149012373.207,"caller":"logs/logs_test.go:47","msg":"klog Warning","v":0}
{"ts":1729149012373.217,"caller":"logs/logs_test.go:49","msg":"klog InfoS","v":0,"key":"value"}
{"ts":1729149012373.2212,"logger":"foo","caller":"logs/logs_test.go:51","msg":"Contextual Info Level 3","v":3,"key":"value"}
logs_test.go:137: STDERR
{"ts":1729149012373.194,"caller":"logs/logs_test.go:45","msg":"slog Error"}
{"ts":1729149012373.2112,"caller":"logs/logs_test.go:48","msg":"klog Error","err":"fake-error"}
{"ts":1729149012373.225,"logger":"foo","caller":"logs/logs_test.go:52","msg":"Contextual error","key":"value","err":"fake-error"}
=== RUN TestLogs/vmodule-level-3
logs_test.go:134: FLAGS
--logging-format=text --vmodule=logs_test=3
logs_test.go:136: STDOUT
I1017 08:10:12.379068 29389 log.go:245] log Print
I1017 08:10:12.379865 29389 logs_test.go:43] "slog Info"
I1017 08:10:12.379892 29389 logs_test.go:46] klog Info
I1017 08:10:12.379906 29389 logs_test.go:49] "klog InfoS" key="value"
I1017 08:10:12.379911 29389 logs_test.go:51] "Contextual Info Level 3" logger="foo" key="value"
logs_test.go:137: STDERR
W1017 08:10:12.379884 29389 logs_test.go:44] "slog Warn"
E1017 08:10:12.379888 29389 logs_test.go:45] "slog Error"
W1017 08:10:12.379894 29389 logs_test.go:47] klog Warning
E1017 08:10:12.379900 29389 logs_test.go:48] "klog Error" err="fake-error"
E1017 08:10:12.379915 29389 logs_test.go:52] "Contextual error" err="fake-error" logger="foo" key="value"
--- PASS: TestLogs (0.07s)
--- PASS: TestLogs/help (0.01s)
--- PASS: TestLogs/unrecognized-flag (0.01s)
--- PASS: TestLogs/logging-format-unrecognized (0.01s)
--- PASS: TestLogs/original-defaults (0.01s)
--- PASS: TestLogs/modified-defaults (0.01s)
--- PASS: TestLogs/logging-format-json (0.01s)
--- PASS: TestLogs/log-json-split-stream-false (0.01s)
--- PASS: TestLogs/logging-format-text (0.01s)
--- PASS: TestLogs/log-text-split-stream-false (0.01s)
--- PASS: TestLogs/v-level-3 (0.01s)
--- PASS: TestLogs/vmodule-level-3 (0.01s)
PASS
ok github.com/jetstack/preflight/pkg/logs 0.077s
$ ./hack/e2e/test.sh
...
{
"ts": 1729108594511.0774,
"caller": "log/log.go:245",
"msg": "successfully gathered 30 items from \"k8s/replicasets\" datagatherer",
"v": 0
}
{
"ts": 1729108594511.1047,
"caller": "log/log.go:245",
"msg": "Posting data to: ",
"v": 0
}
{"ts":1729108595121.0984,"caller":"log/log.go:245","msg":"Data sent successfully.","v":0}
Hi, does cert-manager and the other components use klog? Do they use the conventions that this MR is adopting?
Hey Richard, great job with adding unit tests around the logs. I really appreciate the effort. Thanks to you, I can just run go test -v to check what the various logging outputs are is great. Well done!
Does cert-manager and the other components use klog? Do they use the conventions that this MR is adopting?
To the question of consistency:
- The library convention is klog for older projects such as cert-manager.
- For newer projects, the convention is to use klogr as we prefer using structured logging
- The default format convention seems to be klog's text format.
For that reason, the choice of klogr would make more sense in the Venafi Kubernetes Agent as it would bring structured logging. My first suggestion would be for us to start with klogr since klogr and klog don't have the same flags, and that would reduce the chance for people realizing they can't use --v anymore. We don't need to re-structure the logs right away, we can do that later (if ever).
Note that I found in https://github.com/kubernetes/klog/commit/44eadc3e71f44b0c1d05d418f56b9ad2a26dbb22 that slog is preferred over klogr since last year:
The klogr package doesn't get updated. It shouldn't be used anymore. To make that clearer, it gets marked as "deprecated".
My second suggestion, based on the fact that klogr is being phased out, would be to use std lib's log/slog and use klog as the backend. WDYT @wallrj?
Full picture:
| project | lib | flag | values | default** | supports json format |
|---|---|---|---|---|---|
| cert-manager | klog | -v, --v |
0 to 9 | info, 2 | yes with --logging-format=json, default to klog's text format |
| cert-manager webhook | klog | -v, --v |
0 to 9 | info, 2 | yes with --logging-format=json, default to klog's text format |
| openshift-routes | klogr | -v, --log-level |
0 to 9 | info, 1 | no, uses klog's text format |
| csi-driver | klogr | -v, --log-level |
0 to 9 | info, 1 | no, uses klog's text format |
| trust-manager | klogr | -v, --log-level |
0 to 9 | info, 1 | yes with --log-format=json, defaults to klog's text format |
| approver-policy | klogr | -v, --log-level |
0 to 9 | info, 1 | yes with --log-format=json, defaults to klog's text format |
| venafi-enhanced-issuer | zap | -zap-log-level |
debug, info, error, 0 to 9 | info | yes with --zap-encoder=json, defaults to zap's text format |
| Agent (before this PR) | log + klog* | none | no, uses log's and klogs' text formats | ||
| Agent (with this PR) | klog | -v, --v |
0 to 9 | ? | yes by default, can use klog's text format using --logging-format=text |
| venctl | slog*** | --log-level |
debug, info, warn, error | info | yes with --log-format=json, defaults to slog's text format |
| vcert | log | --verbose |
true, false | false | no, uses log's text format |
* log is the standard library's log package. It is used for the agent's code, while klog is used by the client-go library that the agent uses.
** For example, "info, 4" means that the default severity is "info" and the default level for the "info" severity is 4. The level is always for the "info" level.
*** "slog" and "log" belongs to the standard library (log/slog).
Hi, yep, consistency was exactly my concern, thanks for the detailed response. In my opinion it's favourable to have a smaller set to make updates easier down the line, even if that smaller set doesn't include the current best practice yet. If this project were to adopt the conventions of the majority of the other projects then we would have more consistency, and could figure out how/when to migrate them en-masse. Alternatively there may be an argument to update to the latest/greatest now, it depends on the details.
It sounds like you think that log/slog would be preferable to klog at this moment in this project?
@wallrj I've reviewed your PR and made some small adjustments:
-
In d9afd9508612005ca77ebfd556538c1076e59da4, I moved everything to slog (it was pretty easy). We can later migrate every
logs.Log.Printfwithslog.Info. -
In 33f5a12756a22395a02fb8fdf9c22702fb2549f5, I added assertions to the stderr and stdout outputs of each test case so that we can track the changes I have made when I moved to slog.
-
In d9afd9508612005ca77ebfd556538c1076e59da4, I've added a "source" to indicate whether a log line originates from VCert or from the agent itself.
-
In 4428a242dcc630ec93f2c594fe2350d07483c0e1, I've made sure the
logs.Log.Fatalflogs are showing with theERRORseverity in slog. -
In 6752d49, I removed
--vand replaced it with--log-levelto match the long form of-vthat most of our components uses (approver-policy, trust-manager, openshift-routes, and csi-driver). The help now looks like this:+ -v, --log-level Level number for the log level verbosity - -v, --v Level number for the log level verbosity
Regarding the flags:
It seems we (the cert-manager project) have initiated the convention of the flags -v and --log-level. It has nothing to do with klogr; klogr itself doesn't have a convention around the flag names. Example: https://github.com/cert-manager/openshift-routes/blob/0da68aed69ef66c8c9cefde45d855410813a6b3c/internal/cmd/app/options/options.go#L114-L116
I propose that we ~only expose the -v flag~ expose --log-level instead of --v since that's the long form flag that is the most common across our components (approver-policy, trust-manager, openshift-routes, and csi-driver).
Regarding the slog severity problem:
I gave up on Slog because it seemed to have unfamiliar severity / levels and I couldn't get the klog client-go logs to have the right levels with slog.
I've investigated and was able to configure an slog logger in d9afd9508612005ca77ebfd556538c1076e59da4. To avoid lots of unrelated changes, the agent logs are still using the standard library's *log.Logger.
Regarding the output format:
I found that none of our projects currently output JSON by default. You mentioned the Kubernetes logging conventions, which say that the JSON format must be enabled with --logging-format, which means that if not passed, the output shouldn't be JSON:
Requires passing
--logging-format=jsonto enable.
@wallrj IMO we should set the default output to "text"... WDYT? I haven't reverted to "text". I'll let you decide.
@wallrj Feel free to merge this PR once you are happy with it. I'll approve it now.
@wallrj IMO we should set the default output to "text"... WDYT? I haven't reverted to "text". I'll let you decide.
I agree that text format by default is more consistent with other TLSPK components. Done.
@wallrj Feel free to merge this PR once you are happy with it. I'll approve it now.
I'll merge this after the 1.2.0 release because I'd like to finish the other logging improvements before releasing to customers.