panic: failed to listen on 127.0.0.1:47768: listen tcp 127.0.0.1:47768: bind: address already in use

Open rootulp opened this issue 2 years ago • 4 comments

I'm observing a test flake in CI run: https://github.com/celestiaorg/celestia-core/runs/7750400729?check_suite_focus=true

panic: failed to listen on 127.0.0.1:47768: listen tcp 127.0.0.1:47768: bind: address already in use

goroutine 1 [running]:
github.com/tendermint/tendermint/rpc/jsonrpc.setup()
	/home/runner/work/celestia-core/celestia-core/rpc/jsonrpc/jsonrpc_test.go:130 +0xd9c
github.com/tendermint/tendermint/rpc/jsonrpc.TestMain(0x0)
	/home/runner/work/celestia-core/celestia-core/rpc/jsonrpc/jsonrpc_test.go:90 +0x2a
main.main()
	_testmain.go:103 +0x365
FAIL	github.com/tendermint/tendermint/rpc/jsonrpc	0.018s

rootulp, Aug 09 '22 16:08

This is a super common bug that can even occur locally, but it is mainly due to the resource restrictions of CI when multiple tests are run.

I'm not really sure we will fix this here. Perhaps we should move this upstream?

evan-forbes, Nov 14 '22 23:11

Unfortunately the logs for the occurrence in the issue description have already expired.

  1. I don't see a similar issue already in tendermint/tendermint.
  2. I don't see any changes in celestia-core's jsonrpc_test.go that would make this error occur more often than in tendermint's jsonrpc_test.go, but I wonder if this is related to how tests are split into groups and run in parallel. We may be able to force all tests that invoke this line to run serially in the same test group.
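
Separately from grouping the tests, a common way to sidestep this class of collision is to not depend on a fixed port at all. The panic names a specific port (47768), so any other process or test binary holding that port at the same moment makes the bind fail. The sketch below is an assumption about a possible workaround, not what jsonrpc_test.go does: `listenOnFreePort` is a hypothetical helper that listens on port 0 so the OS picks an unused port.

```go
package main

import (
	"fmt"
	"net"
)

// listenOnFreePort is a hypothetical helper (not part of tendermint or
// celestia-core). Listening on port 0 asks the OS to assign a currently
// unused port, so concurrent test binaries do not compete for one number.
func listenOnFreePort() (net.Listener, int, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return nil, 0, err
	}
	// The listener's address now carries the port the OS actually chose.
	port := ln.Addr().(*net.TCPAddr).Port
	return ln, port, nil
}

func main() {
	a, portA, err := listenOnFreePort()
	if err != nil {
		panic(err)
	}
	defer a.Close()

	b, portB, err := listenOnFreePort()
	if err != nil {
		panic(err)
	}
	defer b.Close()

	// Both listeners are open at the same time, so the ports must differ.
	fmt.Println("ports differ:", portA != portB)
}
```

If the tests must keep a fixed port, running packages one at a time with `go test -p 1` is another option, at the cost of a slower CI run.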

rootulp, Nov 15 '22 18:11

> I don't see a similar issue already in tendermint/tendermint

This error doesn't just occur in the RPC tests, though. It, or very similar errors, occurs everywhere all the time, both here and upstream. Here are a few examples. That's why I suspect that it is related to how we run CI.

https://github.com/tendermint/tendermint/actions/runs/3234541218/jobs/5297773051#step:6:124

https://github.com/tendermint/tendermint/actions/runs/3438160242/jobs/5733862660#step:5:121

https://github.com/tendermint/tendermint/actions/runs/3311124135/jobs/5466230834#step:6:149

evan-forbes, Nov 15 '22 18:11

Might be unrelated, but this is a very common error for node runners. It usually means an instance is already running on that port on localhost.
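
For anyone who wants to see the failure mode in isolation, here is a minimal sketch that reproduces it: it binds a loopback port, then tries to bind the same address again while the first listener is still open. The helper names `bindTwice` and `isAddrInUse` are made up for illustration, and the `syscall.EADDRINUSE` check assumes a Unix-like system such as the Linux CI runner in the trace.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"syscall"
)

// isAddrInUse reports whether err wraps the EADDRINUSE errno.
// Assumption: Unix-like OS; Windows reports a different error code.
func isAddrInUse(err error) bool {
	return errors.Is(err, syscall.EADDRINUSE)
}

// bindTwice listens on an OS-chosen loopback port, then tries to listen on
// the very same address while the first listener is still open. It returns
// the error from the second attempt (nil if it unexpectedly succeeded).
func bindTwice() error {
	first, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return err
	}
	defer first.Close()

	second, err := net.Listen("tcp", first.Addr().String())
	if err == nil {
		second.Close()
		return nil
	}
	return err
}

func main() {
	err := bindTwice()
	fmt.Println("second bind error:", err)
	fmt.Println("address already in use:", isAddrInUse(err))
}
```

The same check can be used in a test setup to tell a port collision apart from other listen failures before deciding whether to retry.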

mindstyle85, Nov 15 '22 18:11