etcd
etcd copied to clipboard
Client: fix panics if no logger is provided to fileutil and transport.
Fixes panics in client pkg in transport that can happen if nil logger is provided. Also fixes panic with nil logger in fileutil, which is used by transport.
Hi Nick,
Is there a repro using etcd client showing this problem? We rather assume callers responsibility to provide zap.NewNop(), rather making sure every log-line tests the logger for nullability.
Codecov Report
Merging #13947 (bded06a) into main (ff1569f) will decrease coverage by
0.35%. The diff coverage isn/a.
@@ Coverage Diff @@
## main #13947 +/- ##
==========================================
- Coverage 72.66% 72.30% -0.36%
==========================================
Files 469 469
Lines 38414 38416 +2
==========================================
- Hits 27912 27776 -136
- Misses 8746 8860 +114
- Partials 1756 1780 +24
| Flag | Coverage Δ | |
|---|---|---|
| all | 72.30% <ø> (-0.36%) |
:arrow_down: |
Flags with carried forward coverage won't be shown. Click here to find out more.
| Impacted Files | Coverage Δ | |
|---|---|---|
| server/proxy/grpcproxy/register.go | 69.76% <0.00%> (-20.94%) |
:arrow_down: |
| server/etcdserver/api/rafthttp/peer_status.go | 87.87% <0.00%> (-12.13%) |
:arrow_down: |
| server/etcdserver/api/rafthttp/peer.go | 87.01% <0.00%> (-8.45%) |
:arrow_down: |
| client/pkg/v3/tlsutil/tlsutil.go | 83.33% <0.00%> (-8.34%) |
:arrow_down: |
| server/storage/mvcc/watchable_store.go | 85.86% <0.00%> (-8.34%) |
:arrow_down: |
| client/v3/concurrency/session.go | 88.63% <0.00%> (-4.55%) |
:arrow_down: |
| server/etcdserver/cluster_util.go | 70.35% <0.00%> (-3.17%) |
:arrow_down: |
| server/etcdserver/api/v3election/election.go | 66.66% <0.00%> (-2.78%) |
:arrow_down: |
| client/v3/experimental/recipes/key.go | 75.34% <0.00%> (-2.74%) |
:arrow_down: |
| client/v3/retry_interceptor.go | 64.54% <0.00%> (-2.73%) |
:arrow_down: |
| ... and 17 more |
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact),ø = not affected,? = missing dataPowered by Codecov. Last update ff1569f...bded06a. Read the comment docs.
Hi! Yes, we had a problem where we had a panic. And there is clearly a nil-logger check missing in on place. So should I just keep tests as they were? Personally, I feel like nil logger tests are overkill too, but panics aren't good either.
Hi! Yes, we had a problem where we had a panic.
Was it etcd or the other customer of the library ?
Could you, please, share a stack-trace when it happened ?
And there is clearly a nil-logger check missing in on place. So should I just keep tests as they were? Personally, I feel like nil logger tests are overkill too, but panics aren't good either.
I would go for panic. It's easy to miss that 'directory was not removed' because the logger was not given if we would return an err. panic() should trigger immediate developer attention to provide proper zap.NewNop() logger.
Thanks for the PR, which basically makes sense. Please resolve the following two comments:
- Please rebase this PR, and it should can resolve the pipeline failures;
- Please organize the test cases using sub tests so as to remove the duplicate code. There are lots of good examples in the repo, such as server_test.go#L1910-L1964
After second thought, I would agree with @ptabor 's comments (copy-pasted below). We should try to keep this repo as simple/clean as possible.
- We rather assume callers responsibility to provide zap.NewNop();
- panic() should trigger immediate developer attention to provide proper zap.NewNop() logger.
After second thought, I would agree with @ptabor 's comments (copy-pasted below). We should try to keep this repo as simple/clean as possible.
- We rather assume callers responsibility to provide zap.NewNop();
- panic() should trigger immediate developer attention to provide proper zap.NewNop() logger.
My problem with this approach is that panic will only happen occasionally when trying to report error. Caller might not even expect that function will panic as it can succeed without an error. In function SelfCert checking logger nil is already done in most call places, only 2 are not checking.
I think if we want to push the responsibility to set the caller to always provide the logger, panic should always happen if logger is not provided.
I think if we want to push the responsibility to set the caller to always provide the logger, panic should always happen if logger is not provided.
Agree. I would be fine with either of the constructs in the first lines of the function:
if (lg == nil) {
panic(fmt.Error("TouchDirAll() requires logger"));
}
shorter:
lg.Debug("Executing TouchDirAll", zap.String("dir", "dir")) // also asserts lg != nil
I think if we want to push the responsibility to set the caller to always provide the logger, panic should always happen if logger is not provided.
Agree. I would be fine with either of the constructs in the first lines of the function:
if (lg == nil) { panic(fmt.Error("TouchDirAll() requires logger")); }
2. shorter:lg.Debug("Executing TouchDirAll", zap.String("dir", "dir")) // also asserts lg != nil
@ahrtr @serathius Do you have opinion which of the pattern we should prefer. I would go for (2) unless the function is expected to be executed >O(100) times per second, that makes it both: performance important and very verbose.
@ahrtr @serathius Do you have opinion which of the pattern we should prefer. I would go for (2) unless the function is expected to be executed >O(100) times per second, that makes it both: performance important and very verbose.
Can we follow the similar way as what bbolt does? Example:
_assert(lg != nil, "nil log isn't allowed")
// _assert will panic with a given formatted message if the given condition is false.
func _assert(condition bool, msg string, v ...interface{}) {
if !condition {
panic(fmt.Sprintf("assertion failed: "+msg, v...))
}
}
FYI. db.go#L1227 tx.go#L142
Like the assert approach as it is explicit. Calling debug might be removed by mistake as it doesn't imply that we are checking for logger.
@nickexported Could you update this PR per issuecomment-1108246247 ?
@ahrtr @serathius Do you have opinion which of the pattern we should prefer. I would go for (2) unless the function is expected to be executed >O(100) times per second, that makes it both: performance important and very verbose.
Can we follow the similar way as what bbolt does? Example:
_assert(lg != nil, "nil log isn't allowed") // _assert will panic with a given formatted message if the given condition is false. func _assert(condition bool, msg string, v ...interface{}) { if !condition { panic(fmt.Sprintf("assertion failed: "+msg, v...)) } }FYI. db.go#L1227 tx.go#L142
SGTM.
I would consider adding verify.Assert to this package: https://github.com/etcd-io/etcd/tree/main/client/pkg/verify or introducing tiny: https://github.com/etcd-io/etcd/tree/main/client/pkg/assert
so the code would become verify.Assert(lg != nil, "nil log isn't allowed"),
ideally Assert should support formatting: verify.Assert(lg, "expected %v got %v", a, ,b)
@nickexported Please let me know if you need help, and I am happy to guide you to finish this PR.
I would consider adding verify.Assert to this package: https://github.com/etcd-io/etcd/tree/main/client/pkg/verify or introducing tiny: https://github.com/etcd-io/etcd/tree/main/client/pkg/assert
so the code would become
verify.Assert(lg != nil, "nil log isn't allowed"), ideally Assert should support formatting:verify.Assert(lg, "expected %v got %v", a, ,b)
The only comment is probably we need to perform the assertion in production/runtime environment, which means it shouldn't depend on the verification switch on/off.
@ahrtr thanks, I'l make fixes as described here and come back
Please use the function verify.Assert, see also 4022#discussion_r872293596
@nickexported Please resolve review comments and conflicts.
Since no response from the author for a long time, so I resolved it in https://github.com/etcd-io/etcd/pull/14455