CI: integration tests fail with `text file busy`
Describe the bug
Integration tests sometimes fail with the error `text file busy` while building proto files during the app build.
Full logs:
=== RUN TestCliWithCaching
exit status 1
env.go:161:
Error Trace: env.go:161
cache_test.go:73
Error: Received unexpected error:
exit status 1
Test: TestCliWithCaching
Messages: build
Logs:
Cosmos SDK's version is: stargate - v0.45.5
🛠️ Building proto...
cannot build app:
error while running command /tmp/protoc645837501 -I /tmp/2507385852 --plugin /tmp/protoc-gen-ts_proto -I /home/runner/go/pkg/mod/github.com/cosmos/ibc-go/[email protected]/proto -I /home/runner/go/pkg/mod/github.com/cosmos/ibc-go/[email protected]/third_party/proto -I /home/runner/go/pkg/mod/github.com/cosmos/[email protected]/proto -I /home/runner/go/pkg/mod/github.com/cosmos/[email protected]/third_party/proto -I /home/runner/go/pkg/mod/github.com/cosmos/ibc-go/[email protected]/proto -I /home/runner/go/pkg/mod/github.com/cosmos/ibc-go/[email protected]/third_party/proto --ts_proto_out=. /home/runner/go/pkg/mod/github.com/cosmos/ibc-go/[email protected]/proto/ibc/applications/interchain_accounts/controller/v1/controller.proto /home/runner/go/pkg/mod/github.com/cosmos/ibc-go/[email protected]/proto/ibc/applications/interchain_accounts/controller/v1/query.proto --ts_proto_opt=snakeToCamel=false: : fork/exec /tmp/protoc645837501: text file busy
The text file busy error seems to happen only on Linux, not on macOS, when the contents of a running binary are overwritten by another process before the binary stops running. It does not happen when the binary file is deleted instead.
This led me to think that some integration tests might be flaky because another test overwrites one of the generated binary files. The affected file seems to be the generated protoc binary, which in this run is /tmp/protoc645837501.
The generated binaries use random suffixes, so one test should not overwrite another test's file. See os.CreateTemp.
Even if the same suffix is generated in different test processes, the files should not clash, because CreateTemp retries with a new name when the temporary file already exists.
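The CreateTemp behavior described above can be checked with a small standalone program. This is only a sketch: the helper name distinctTempNames and the demo directory prefix are made up for illustration, and the "protoc" pattern is chosen to mirror the file name seen in the log, not taken from the CLI source.

```go
package main

import (
	"fmt"
	"os"
)

// distinctTempNames (hypothetical helper) creates two temporary files with the
// same pattern inside a scratch directory and returns both names.
// os.CreateTemp opens each file exclusively, so an existing name is never reused.
func distinctTempNames(pattern string) (string, string, error) {
	dir, err := os.MkdirTemp("", "createtemp-demo")
	if err != nil {
		return "", "", err
	}
	// Remove the scratch directory and both files when done.
	defer os.RemoveAll(dir)

	first, err := os.CreateTemp(dir, pattern)
	if err != nil {
		return "", "", err
	}
	defer first.Close()

	second, err := os.CreateTemp(dir, pattern)
	if err != nil {
		return "", "", err
	}
	defer second.Close()

	return first.Name(), second.Name(), nil
}

func main() {
	a, b, err := distinctTempNames("protoc")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(a)
	fmt.Println(b)
	fmt.Println("distinct:", a != b)
}
```

Because the pattern contains no `*`, the random part is appended to the end of the name, which matches the shape of /tmp/protoc645837501.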
Just in case, I checked /tmp/protoc-gen-ts_proto because it always has the same name. It is a script that runs the actual plugin binary, so it is not likely to be the source of the issue. The protoc command runs it to generate the TypeScript code. I added change set 4ee3e938a5f383a90e5dfe2408daea33ea6316e7 to #2674 to use random temporary folders for it, which might be a good idea.
I could reproduce the issue only once, by running the TestCliWithCaching test on Linux, and it took a large number of consecutive runs.
First I cleared the test cache:
go clean -testcache
Then I ran the same test in two separate terminals concurrently until I saw the issue:
GODEBUG=gocachetest=1 \
go test -v -cpu 1 -count 5 -failfast -timeout 30m -run ^TestCliWithCaching$
I think this issue can be closed. It's been two weeks since the fix was merged and the issue hasn't recurred.