`TestAccessMongoDB` flakiness
Failure
Saw this fail off of `branch/v10`.
CI Logs
- https://console.cloud.google.com/cloud-build/builds/64878a7f-6064-4135-aeb7-24f155b9e04b;step=0?project=ci-account
- https://console.cloud.google.com/cloud-build/builds/61625553-453f-46e0-ad31-7a0e5495201a?project=ci-account
- https://console.cloud.google.com/cloud-build/builds/7fe827fa-b190-4db3-988e-454ed92bb2e0?project=ci-account
- https://console.cloud.google.com/cloud-build/builds;region=us-west1/b8812649-6701-4019-a687-b37cad789d22?project=ci-account
- https://console.cloud.google.com/cloud-build/builds;region=us-west1/1d1d3095-984a-4a8e-9091-bd97a8f6be9c?project=ci-account
Relevant Snippet
--- FAIL: TestAccessMongoDB/access_denied_to_specific_user_and_database (13.73s)
===================================================
OUTPUT github.com/gravitational/teleport/lib/srv/db.TestAccessMongoDB/access_denied_to_specific_user_and_database/old_server/client_without_compression
===================================================
=== RUN TestAccessMongoDB/access_denied_to_specific_user_and_database/old_server/client_without_compression
2022-06-17T15:20:41Z DEBU [DB:SERVIC] Asked check out of cycle srv/heartbeat.go:287
2022-06-17T15:20:41Z DEBU [DB:PROXY] Available databases in root.example.com: [DatabaseServer(Name=mongo, Version=10.0.0-alpha.1, Hostname=teleport.cluster.local, HostID=42b1c357-3c83-409e-a28c-6114b95346e3, Database=Database(Name=mongo, Type=self-hosted, Labels=map[echo:test]))]. db/proxyserver.go:629
2022-06-17T15:20:41Z DEBU [AUTH] ClientCertPool -> cert(root.example.com issued by root.example.com:4408681756169843927285952159426259152) auth/middleware.go:679
=== PAUSE TestAccessMongoDB/access_denied_to_specific_user_and_database/old_server/client_without_compression
=== CONT TestAccessMongoDB/access_denied_to_specific_user_and_database/old_server/client_without_compression
=== CONT TestAccessMongoDB/access_denied_to_specific_user_and_database/old_server/client_without_compression
access_test.go:829:
Error Trace: access_test.go:829
Error: Received unexpected error:
server selection error: server selection timeout, current topology: { Type: Single, Servers: [{ Addr: 127.0.0.1:46073, Type: Unknown, Average RTT: 0 }, ] }
Test: TestAccessMongoDB/access_denied_to_specific_user_and_database/old_server/client_without_compression
You can use the following branch to reproduce this issue pretty consistently: `rjones/debug-testaccessmongoDB`.
Open two terminal windows on a `t3a.xlarge` instance and run the following command in both: `while go test . -run TestAccessMongoDB -count=1 -race; do :; done`.
You will see output like the following:
$ while go test . -run TestAccessMongoDB -count=1 -race; do :; done
ok github.com/gravitational/teleport/lib/srv/db 7.338s
ok github.com/gravitational/teleport/lib/srv/db 5.154s
ok github.com/gravitational/teleport/lib/srv/db 7.112s
ok github.com/gravitational/teleport/lib/srv/db 6.478s
--> MakeTestClient: mongo.Connect
--> MakeTestClient: Sending ping...
--> HandleConnection 0...
--> HandleConnection: 1: enter
--> HandleConnection: 2: enter
--> HandleConnection: 3: enter
--> [16bb3] clientMessage: OpQuery(FullCollectionName=admin.$cmd, Query={"isMaster": {"$numberInt":"1"},"compression": [],"client": {"driver": {"name": "mongo-go-driver","version": "v1.5.3"},"os": {"type": "linux","architecture": "amd64"},"platform": "go1.18.3"}}, ReturnFieldsSelector=, NumberToSkip=0, NumberToReturn=-1, Flags=[SlaveOK])
--> [16bb3] serverMessage: OpReply(Documents=[{"ok": {"$numberInt":"1"},"maxWireVersion": {"$numberInt":"9"},"compression": ["zlib"]}], StartingFrom=0, NumberReturned=1, CursorID=0, Flags=[])
--> [b5736] clientMessage: OpMsg(Body={"isMaster": {"$numberInt":"1"},"$db": "admin"}, Documents=[], Flags=)
--> [b5736] serverMessage: OpMsg(Body={"ok": {"$numberInt":"1"},"maxWireVersion": {"$numberInt":"9"},"compression": ["zlib"]}, Documents=[], Flags=)
--> [26db0] clientMessage: OpMsg(Body={"isMaster": {"$numberInt":"1"},"$db": "admin"}, Documents=[], Flags=)
--> [26db0] serverMessage: OpMsg(Body={"compression": ["zlib"],"ok": {"$numberInt":"1"},"maxWireVersion": {"$numberInt":"9"}}, Documents=[], Flags=)
--> [f3882] clientMessage: OpMsg(Body={"isMaster": {"$numberInt":"1"},"$db": "admin"}, Documents=[], Flags=)
--> [f3882] serverMessage: OpMsg(Body={"ok": {"$numberInt":"1"},"maxWireVersion": {"$numberInt":"9"},"compression": ["zlib"]}, Documents=[], Flags=)
--> HandleConnection 0...
--> HandleConnection: 1: enter
--> [d738d] clientMessage: OpMsg(Body={"isMaster": {"$numberInt":"1"},"$db": "admin"}, Documents=[], Flags=)
--> MakeTestClient: Ping failed: server selection error: server selection timeout, current topology: { Type: Single, Servers: [{ Addr: 127.0.0.1:46535, Type: Unknown, Average RTT: 0 }, ] }
--> [d738d] serverMessage: OpMsg(Body={"ok": {"$numberInt":"1"},"maxWireVersion": {"$numberInt":"9"},"compression": ["zlib"]}, Documents=[], Flags=)
--> HandleConnection: 4
--- FAIL: TestAccessMongoDB (0.00s)
--- FAIL: TestAccessMongoDB/has_access_to_all_database_names_and_users (4.81s)
--- FAIL: TestAccessMongoDB/has_access_to_all_database_names_and_users/new_server/client_without_compression (2.06s)
access_test.go:828:
Error Trace: access_test.go:828
Error: Received unexpected error:
server selection error: server selection timeout, current topology: { Type: Single, Servers: [{ Addr: 127.0.0.1:46535, Type: Unknown, Average RTT: 0 }, ] }
Test: TestAccessMongoDB/has_access_to_all_database_names_and_users/new_server/client_without_compression
FAIL
FAIL github.com/gravitational/teleport/lib/srv/db 7.086s
FAIL
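The reproduction loop above stops at the first failure but does not record how many runs passed before it. A minimal sketch of a wrapper that does, assuming a POSIX shell; `run_until_fail` and the log path are hypothetical names and not part of the Teleport repo:

```shell
#!/bin/sh
# Hypothetical helper (not part of the Teleport repo): repeat a command until
# it exits non-zero, keep the output of the failing run, and print how many
# runs passed before the failure.
run_until_fail() {
  log_file="$1"   # file that receives the output of the most recent run
  shift           # the remaining arguments are the command to repeat
  passes=0
  # Each iteration overwrites the log, so once the loop ends the file holds
  # the output of the run that failed.
  while "$@" >"$log_file" 2>&1; do
    passes=$((passes + 1))
  done
  echo "$passes"
}
```

For the case in this issue it could be invoked as `run_until_fail /tmp/mongo-flake.log go test . -run TestAccessMongoDB -count=1 -race` in each of the two terminals. Like the original loop, it never returns if the test never fails.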
Another one here: https://console.cloud.google.com/cloud-build/builds;region=us-west1/30d35a21-a610-4941-9f59-1fad46d3f602?project=ci-account
@ibeckermayer We have a fix on master, but this test failure comes from v10. We haven't merged the fix into any production branch yet: the fix updates the Mongo driver, and we wanted to be sure it won't break our Mongo integration. @smallinsky Do you know when we could backport my Mongo changes to v10 and probably v9?
We can schedule this for the end of August, when Teleport 10.2 is going to be released. We will do additional testing to cover the Mongo driver change.
@smallinsky and @jakule: this is still making it hard to get things into v10. We're well beyond the end of August now. Do you think it's safe to get the fix into v10 yet?
@zmb3 I prepared a backport of that fix from master: https://github.com/gravitational/teleport/pull/16695 @smallinsky Please approve it when you think it can be merged.
On a v9 backport:
- https://console.cloud.google.com/cloud-build/builds/a60e4bed-cef8-4841-b991-97992558b704?project=ci-account
access_test.go:812:
Error Trace: /workspace/lib/srv/db/access_test.go:812
Error: Received unexpected error:
server selection error: server selection timeout, current topology: { Type: Unknown, Servers: [{ Addr: 127.0.0.1:41095, Type: Unknown, Average RTT: 0 }, ] }
Test: TestAccessMongoDB/old_server/client_without_compression/has_access_to_all_database_names_and_users
Same as the above comment:
https://console.cloud.google.com/cloud-build/builds/8f988b93-c1a5-433c-b96d-c30f5aa44a4f?project=ci-account
@ibeckermayer @zmb3 @smallinsky I created the backport to v9: https://github.com/gravitational/teleport/pull/18884