etcd icon indicating copy to clipboard operation
etcd copied to clipboard

Client: fix panics if no logger is provided to fileutil and transport.

Open nickexported opened this issue 3 years ago • 18 comments

Fixes panics in client pkg in transport that can happen if nil logger is provided. Also fixes panic with nil logger in fileutil, which is used by transport.

nickexported avatar Apr 15 '22 11:04 nickexported

Hi Nick,

Is there a repro using etcd client showing this problem? We rather assume callers responsibility to provide zap.NewNop(), rather making sure every log-line tests the logger for nullability.

ptabor avatar Apr 16 '22 18:04 ptabor

Codecov Report

Merging #13947 (bded06a) into main (ff1569f) will decrease coverage by 0.35%. The diff coverage is n/a.

@@            Coverage Diff             @@
##             main   #13947      +/-   ##
==========================================
- Coverage   72.66%   72.30%   -0.36%     
==========================================
  Files         469      469              
  Lines       38414    38416       +2     
==========================================
- Hits        27912    27776     -136     
- Misses       8746     8860     +114     
- Partials     1756     1780      +24     
Flag Coverage Δ
all 72.30% <ø> (-0.36%) :arrow_down:

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
server/proxy/grpcproxy/register.go 69.76% <0.00%> (-20.94%) :arrow_down:
server/etcdserver/api/rafthttp/peer_status.go 87.87% <0.00%> (-12.13%) :arrow_down:
server/etcdserver/api/rafthttp/peer.go 87.01% <0.00%> (-8.45%) :arrow_down:
client/pkg/v3/tlsutil/tlsutil.go 83.33% <0.00%> (-8.34%) :arrow_down:
server/storage/mvcc/watchable_store.go 85.86% <0.00%> (-8.34%) :arrow_down:
client/v3/concurrency/session.go 88.63% <0.00%> (-4.55%) :arrow_down:
server/etcdserver/cluster_util.go 70.35% <0.00%> (-3.17%) :arrow_down:
server/etcdserver/api/v3election/election.go 66.66% <0.00%> (-2.78%) :arrow_down:
client/v3/experimental/recipes/key.go 75.34% <0.00%> (-2.74%) :arrow_down:
client/v3/retry_interceptor.go 64.54% <0.00%> (-2.73%) :arrow_down:
... and 17 more

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update ff1569f...bded06a. Read the comment docs.

codecov-commenter avatar Apr 16 '22 20:04 codecov-commenter

Hi! Yes, we had a problem where we had a panic. And there is clearly a nil-logger check missing in on place. So should I just keep tests as they were? Personally, I feel like nil logger tests are overkill too, but panics aren't good either.

nickexported avatar Apr 18 '22 09:04 nickexported

Hi! Yes, we had a problem where we had a panic.

Was it etcd or the other customer of the library ? Could you, please, share a stack-trace when it happened ?

And there is clearly a nil-logger check missing in on place. So should I just keep tests as they were? Personally, I feel like nil logger tests are overkill too, but panics aren't good either.

I would go for panic. It's easy to miss that 'directory was not removed' because the logger was not given if we would return an err. panic() should trigger immediate developer attention to provide proper zap.NewNop() logger.

ptabor avatar Apr 19 '22 14:04 ptabor

Thanks for the PR, which basically makes sense. Please resolve the following two comments:

  1. Please rebase this PR, and it should can resolve the pipeline failures;
  2. Please organize the test cases using sub tests so as to remove the duplicate code. There are lots of good examples in the repo, such as server_test.go#L1910-L1964

ahrtr avatar Apr 20 '22 05:04 ahrtr

After second thought, I would agree with @ptabor 's comments (copy-pasted below). We should try to keep this repo as simple/clean as possible.

  1. We rather assume callers responsibility to provide zap.NewNop();
  2. panic() should trigger immediate developer attention to provide proper zap.NewNop() logger.

ahrtr avatar Apr 20 '22 05:04 ahrtr

After second thought, I would agree with @ptabor 's comments (copy-pasted below). We should try to keep this repo as simple/clean as possible.

  1. We rather assume callers responsibility to provide zap.NewNop();
  2. panic() should trigger immediate developer attention to provide proper zap.NewNop() logger.

My problem with this approach is that panic will only happen occasionally when trying to report error. Caller might not even expect that function will panic as it can succeed without an error. In function SelfCert checking logger nil is already done in most call places, only 2 are not checking.

I think if we want to push the responsibility to set the caller to always provide the logger, panic should always happen if logger is not provided.

serathius avatar Apr 20 '22 08:04 serathius

I think if we want to push the responsibility to set the caller to always provide the logger, panic should always happen if logger is not provided.

Agree. I would be fine with either of the constructs in the first lines of the function:

if (lg == nil) {
  panic(fmt.Error("TouchDirAll() requires logger"));
}

shorter:

lg.Debug("Executing TouchDirAll", zap.String("dir", "dir"))  // also asserts lg != nil

ptabor avatar Apr 20 '22 08:04 ptabor

I think if we want to push the responsibility to set the caller to always provide the logger, panic should always happen if logger is not provided.

Agree. I would be fine with either of the constructs in the first lines of the function:

if (lg == nil) { panic(fmt.Error("TouchDirAll() requires logger")); }


2. shorter:

lg.Debug("Executing TouchDirAll", zap.String("dir", "dir")) // also asserts lg != nil

@ahrtr @serathius Do you have opinion which of the pattern we should prefer. I would go for (2) unless the function is expected to be executed >O(100) times per second, that makes it both: performance important and very verbose.

ptabor avatar Apr 25 '22 08:04 ptabor

@ahrtr @serathius Do you have opinion which of the pattern we should prefer. I would go for (2) unless the function is expected to be executed >O(100) times per second, that makes it both: performance important and very verbose.

Can we follow the similar way as what bbolt does? Example:

_assert(lg != nil, "nil log isn't allowed")

// _assert will panic with a given formatted message if the given condition is false.
func _assert(condition bool, msg string, v ...interface{}) {
	if !condition {
		panic(fmt.Sprintf("assertion failed: "+msg, v...))
	}
}

FYI. db.go#L1227 tx.go#L142

ahrtr avatar Apr 25 '22 08:04 ahrtr

Like the assert approach as it is explicit. Calling debug might be removed by mistake as it doesn't imply that we are checking for logger.

serathius avatar Apr 25 '22 08:04 serathius

@nickexported Could you update this PR per issuecomment-1108246247 ?

ahrtr avatar Apr 27 '22 10:04 ahrtr

@ahrtr @serathius Do you have opinion which of the pattern we should prefer. I would go for (2) unless the function is expected to be executed >O(100) times per second, that makes it both: performance important and very verbose.

Can we follow the similar way as what bbolt does? Example:

_assert(lg != nil, "nil log isn't allowed")

// _assert will panic with a given formatted message if the given condition is false.
func _assert(condition bool, msg string, v ...interface{}) {
	if !condition {
		panic(fmt.Sprintf("assertion failed: "+msg, v...))
	}
}

FYI. db.go#L1227 tx.go#L142

SGTM.

I would consider adding verify.Assert to this package: https://github.com/etcd-io/etcd/tree/main/client/pkg/verify or introducing tiny: https://github.com/etcd-io/etcd/tree/main/client/pkg/assert

so the code would become verify.Assert(lg != nil, "nil log isn't allowed"), ideally Assert should support formatting: verify.Assert(lg, "expected %v got %v", a, ,b)

ptabor avatar Apr 29 '22 08:04 ptabor

@nickexported Please let me know if you need help, and I am happy to guide you to finish this PR.

ahrtr avatar Apr 29 '22 08:04 ahrtr

I would consider adding verify.Assert to this package: https://github.com/etcd-io/etcd/tree/main/client/pkg/verify or introducing tiny: https://github.com/etcd-io/etcd/tree/main/client/pkg/assert

so the code would become verify.Assert(lg != nil, "nil log isn't allowed"), ideally Assert should support formatting: verify.Assert(lg, "expected %v got %v", a, ,b)

The only comment is probably we need to perform the assertion in production/runtime environment, which means it shouldn't depend on the verification switch on/off.

ahrtr avatar Apr 29 '22 08:04 ahrtr

@ahrtr thanks, I'l make fixes as described here and come back

nickexported avatar Apr 29 '22 12:04 nickexported

Please use the function verify.Assert, see also 4022#discussion_r872293596

ahrtr avatar May 13 '22 11:05 ahrtr

@nickexported Please resolve review comments and conflicts.

ahrtr avatar Aug 08 '22 22:08 ahrtr

Since no response from the author for a long time, so I resolved it in https://github.com/etcd-io/etcd/pull/14455

ahrtr avatar Sep 13 '22 11:09 ahrtr