
CI: integration tests errors with `text file busy`

Open lumtis opened this issue 3 years ago • 1 comments

Describe the bug

Integration tests sometimes fail with the error `text file busy` when building proto files while building the app.

Full logs:

=== RUN   TestCliWithCaching
exit status 1
    env.go:161: 
        	Error Trace:	env.go:161
        	            				cache_test.go:73
        	Error:      	Received unexpected error:
        	            	exit status 1
        	Test:       	TestCliWithCaching
        	Messages:   	build
        	            	Logs:
        	            	Cosmos SDK's version is: stargate - v0.45.5
        	            	🛠️  Building proto...
        	            	cannot build app:
        	            		error while running command /tmp/protoc645837501 -I /tmp/2507385852 --plugin /tmp/protoc-gen-ts_proto -I /home/runner/go/pkg/mod/github.com/cosmos/ibc-go/[email protected]/proto -I /home/runner/go/pkg/mod/github.com/cosmos/ibc-go/[email protected]/third_party/proto -I /home/runner/go/pkg/mod/github.com/cosmos/[email protected]/proto -I /home/runner/go/pkg/mod/github.com/cosmos/[email protected]/third_party/proto -I /home/runner/go/pkg/mod/github.com/cosmos/ibc-go/[email protected]/proto -I /home/runner/go/pkg/mod/github.com/cosmos/ibc-go/[email protected]/third_party/proto --ts_proto_out=. /home/runner/go/pkg/mod/github.com/cosmos/ibc-go/[email protected]/proto/ibc/applications/interchain_accounts/controller/v1/controller.proto /home/runner/go/pkg/mod/github.com/cosmos/ibc-go/[email protected]/proto/ibc/applications/interchain_accounts/controller/v1/query.proto --ts_proto_opt=snakeToCamel=false: : fork/exec /tmp/protoc645837501: text file busy

lumtis avatar Jul 25 '22 12:07 lumtis

The `text file busy` error seems to happen only on Linux, not on macOS, when the contents of a running binary are overwritten by another process before the binary stops running. The same doesn’t happen when the binary file is deleted.
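For reference, one well-documented way to get `text file busy` (ETXTBSY) from a fork/exec on Linux is to execute a file while a write descriptor to it is still open. The sketch below only illustrates that condition; it is not taken from the Ignite CLI code, the function name `execWhileOpenForWrite` is made up for the example, and it may behave differently on other systems or kernels.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"os/exec"
	"syscall"
)

// execWhileOpenForWrite (hypothetical helper) writes a tiny shell script to a
// temporary file and tries to run it while the file is still open for writing.
// The bool reports whether the exec attempt failed with ETXTBSY; the error
// only covers problems while preparing the file.
func execWhileOpenForWrite() (bool, error) {
	f, err := os.CreateTemp("", "busy-demo-*")
	if err != nil {
		return false, err
	}
	// Deferred calls run in reverse order: close first, then remove.
	defer os.Remove(f.Name())
	defer f.Close()

	if _, err := f.WriteString("#!/bin/sh\nexit 0\n"); err != nil {
		return false, err
	}
	if err := f.Chmod(0o755); err != nil {
		return false, err
	}

	// The file is deliberately NOT closed before it is executed.
	runErr := exec.Command(f.Name()).Run()
	return errors.Is(runErr, syscall.ETXTBSY), nil
}

func main() {
	busy, err := execWhileOpenForWrite()
	if err != nil {
		fmt.Println("setup failed:", err)
		return
	}
	fmt.Println("exec failed with ETXTBSY:", busy)
}
```

Closing the file before running it is the usual way to avoid this condition within a single process.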

This led me to think that some integration tests might be “flaky” because some other test is overwriting one of the generated binary files. The affected file seems to be the generated protoc binary, which in this issue is /tmp/protoc645837501.

The generated binaries use random suffixes, so one test should not overwrite the file of another test. See os.CreateTemp.

Even if the same random suffix is generated in different test processes, the names should not clash, because CreateTemp retries with a new suffix when the temporary file already exists.

Just in case, I checked /tmp/protoc-gen-ts_proto because it always has the same name. It is a script that runs the actual plugin binary, so it’s not likely to be the source of the issue. The protoc command runs it to generate the TypeScript code. I added the change set 4ee3e938a5f383a90e5dfe2408daea33ea6316e7 to #2674 to use random temporary folders for it, which might be a good idea.

I could replicate the issue only once, after many consecutive runs of the TestCliWithCaching test on Linux.

First I cleared the test cache:

go clean -testcache

Then I ran the same test in two separate terminals concurrently until I saw the issue:

GODEBUG=gocachetest=1 \
  go test -v -cpu 1 -count 5 -failfast -timeout 30m -run ^TestCliWithCaching$

jeronimoalbi avatar Jul 28 '22 14:07 jeronimoalbi

I think this issue can be closed. It's been two weeks since the fix was merged and the issue hasn't recurred.

jeronimoalbi avatar Aug 17 '22 16:08 jeronimoalbi