
`TestAccessMongoDB` flakiness

Open nklaassen opened this issue 2 years ago • 4 comments

Failure

Saw this fail off of branch/v10.

CI Logs

  • https://console.cloud.google.com/cloud-build/builds/64878a7f-6064-4135-aeb7-24f155b9e04b;step=0?project=ci-account
  • https://console.cloud.google.com/cloud-build/builds/61625553-453f-46e0-ad31-7a0e5495201a?project=ci-account
  • https://console.cloud.google.com/cloud-build/builds/7fe827fa-b190-4db3-988e-454ed92bb2e0?project=ci-account
  • https://console.cloud.google.com/cloud-build/builds;region=us-west1/b8812649-6701-4019-a687-b37cad789d22?project=ci-account
  • https://console.cloud.google.com/cloud-build/builds;region=us-west1/1d1d3095-984a-4a8e-9091-bd97a8f6be9c?project=ci-account

Relevant Snippet

    --- FAIL: TestAccessMongoDB/access_denied_to_specific_user_and_database (13.73s)
===================================================
OUTPUT github.com/gravitational/teleport/lib/srv/db.TestAccessMongoDB/access_denied_to_specific_user_and_database/old_server/client_without_compression
===================================================
=== RUN   TestAccessMongoDB/access_denied_to_specific_user_and_database/old_server/client_without_compression
2022-06-17T15:20:41Z DEBU [DB:SERVIC] Asked check out of cycle srv/heartbeat.go:287
2022-06-17T15:20:41Z DEBU [DB:PROXY]  Available databases in root.example.com: [DatabaseServer(Name=mongo, Version=10.0.0-alpha.1, Hostname=teleport.cluster.local, HostID=42b1c357-3c83-409e-a28c-6114b95346e3, Database=Database(Name=mongo, Type=self-hosted, Labels=map[echo:test]))]. db/proxyserver.go:629
2022-06-17T15:20:41Z DEBU [AUTH]      ClientCertPool -> cert(root.example.com issued by root.example.com:4408681756169843927285952159426259152) auth/middleware.go:679
=== PAUSE TestAccessMongoDB/access_denied_to_specific_user_and_database/old_server/client_without_compression
=== CONT  TestAccessMongoDB/access_denied_to_specific_user_and_database/old_server/client_without_compression
=== CONT  TestAccessMongoDB/access_denied_to_specific_user_and_database/old_server/client_without_compression
    access_test.go:829: 
        	Error Trace:	access_test.go:829
        	Error:      	Received unexpected error:
        	            	server selection error: server selection timeout, current topology: { Type: Single, Servers: [{ Addr: 127.0.0.1:46073, Type: Unknown, Average RTT: 0 }, ] }
        	Test:       	TestAccessMongoDB/access_denied_to_specific_user_and_database/old_server/client_without_compression

nklaassen avatar Jun 17 '22 17:06 nklaassen

You can use the following branch to reproduce this issue pretty consistently: rjones/debug-testaccessmongoDB.

Open two terminal windows on a t3a.xlarge instance and run the following command in both: while go test . -run TestAccessMongoDB -count=1 -race; do :; done.
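If opening two terminals is inconvenient, the same loop can be wrapped in a small helper and started twice in the background. This is only a sketch: `run_until_fail` is a hypothetical helper name, not something in the Teleport repo, and it assumes you are in the `lib/srv/db` package directory on the debug branch.

```shell
#!/usr/bin/env bash

# run_until_fail (hypothetical helper): rerun the given command until it
# exits non-zero, then report how many runs passed before the failure.
run_until_fail() {
  local passes=0
  while "$@"; do
    passes=$((passes + 1))
  done
  echo "failed after ${passes} passing runs"
}

# Usage sketch: two concurrent loops stand in for the two terminal windows.
#   run_until_fail go test . -run TestAccessMongoDB -count=1 -race &
#   run_until_fail go test . -run TestAccessMongoDB -count=1 -race &
#   wait
```

Each loop stops on its own first failure, so the pass count gives a rough sense of how often the flake shows up on a given machine.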

You will see output like the following:

$ while go test . -run TestAccessMongoDB -count=1 -race; do :; done
ok  	github.com/gravitational/teleport/lib/srv/db	7.338s
ok  	github.com/gravitational/teleport/lib/srv/db	5.154s
ok  	github.com/gravitational/teleport/lib/srv/db	7.112s
ok  	github.com/gravitational/teleport/lib/srv/db	6.478s
--> MakeTestClient: mongo.Connect
--> MakeTestClient: Sending ping...
--> HandleConnection 0...
--> HandleConnection: 1: enter
--> HandleConnection: 2: enter
--> HandleConnection: 3: enter
--> [16bb3] clientMessage: OpQuery(FullCollectionName=admin.$cmd, Query={"isMaster": {"$numberInt":"1"},"compression": [],"client": {"driver": {"name": "mongo-go-driver","version": "v1.5.3"},"os": {"type": "linux","architecture": "amd64"},"platform": "go1.18.3"}}, ReturnFieldsSelector=, NumberToSkip=0, NumberToReturn=-1, Flags=[SlaveOK])
--> [16bb3] serverMessage: OpReply(Documents=[{"ok": {"$numberInt":"1"},"maxWireVersion": {"$numberInt":"9"},"compression": ["zlib"]}], StartingFrom=0, NumberReturned=1, CursorID=0, Flags=[])
--> [b5736] clientMessage: OpMsg(Body={"isMaster": {"$numberInt":"1"},"$db": "admin"}, Documents=[], Flags=)
--> [b5736] serverMessage: OpMsg(Body={"ok": {"$numberInt":"1"},"maxWireVersion": {"$numberInt":"9"},"compression": ["zlib"]}, Documents=[], Flags=)
--> [26db0] clientMessage: OpMsg(Body={"isMaster": {"$numberInt":"1"},"$db": "admin"}, Documents=[], Flags=)
--> [26db0] serverMessage: OpMsg(Body={"compression": ["zlib"],"ok": {"$numberInt":"1"},"maxWireVersion": {"$numberInt":"9"}}, Documents=[], Flags=)
--> [f3882] clientMessage: OpMsg(Body={"isMaster": {"$numberInt":"1"},"$db": "admin"}, Documents=[], Flags=)
--> [f3882] serverMessage: OpMsg(Body={"ok": {"$numberInt":"1"},"maxWireVersion": {"$numberInt":"9"},"compression": ["zlib"]}, Documents=[], Flags=)
--> HandleConnection 0...
--> HandleConnection: 1: enter
--> [d738d] clientMessage: OpMsg(Body={"isMaster": {"$numberInt":"1"},"$db": "admin"}, Documents=[], Flags=)
--> MakeTestClient: Ping failed: server selection error: server selection timeout, current topology: { Type: Single, Servers: [{ Addr: 127.0.0.1:46535, Type: Unknown, Average RTT: 0 }, ] }
--> [d738d] serverMessage: OpMsg(Body={"ok": {"$numberInt":"1"},"maxWireVersion": {"$numberInt":"9"},"compression": ["zlib"]}, Documents=[], Flags=)
--> HandleConnection: 4
--- FAIL: TestAccessMongoDB (0.00s)
    --- FAIL: TestAccessMongoDB/has_access_to_all_database_names_and_users (4.81s)
        --- FAIL: TestAccessMongoDB/has_access_to_all_database_names_and_users/new_server/client_without_compression (2.06s)
            access_test.go:828: 
                	Error Trace:	access_test.go:828
                	Error:      	Received unexpected error:
                	            	server selection error: server selection timeout, current topology: { Type: Single, Servers: [{ Addr: 127.0.0.1:46535, Type: Unknown, Average RTT: 0 }, ] }
                	Test:       	TestAccessMongoDB/has_access_to_all_database_names_and_users/new_server/client_without_compression
FAIL
FAIL	github.com/gravitational/teleport/lib/srv/db	7.086s
FAIL

russjones avatar Jul 06 '22 17:07 russjones

Another one here: https://console.cloud.google.com/cloud-build/builds;region=us-west1/30d35a21-a610-4941-9f59-1fad46d3f602?project=ci-account

ibeckermayer avatar Aug 08 '22 17:08 ibeckermayer

@ibeckermayer We have a fix on master. This test failure comes from v10. We haven't merged the fix to any production branch. We wanted to wait to be sure that we won't break our Mongo integration as my fix updates the Mongo driver. @smallinsky Do you know when we could backport my Mongo changes to v10 and probably v9?

jakule avatar Aug 09 '22 16:08 jakule

We can schedule this for the end of August, when Teleport 10.2 is going to be released. We will do additional testing to cover the Mongo driver change.

smallinsky avatar Aug 10 '22 09:08 smallinsky

@smallinsky and @jakule: this is still making it hard to get things in v10. We're well beyond the end of August now - do you think it's safe to get the fix in v10 yet?

zmb3 avatar Sep 23 '22 20:09 zmb3

@zmb3 I prepared a backport of that fix from master: https://github.com/gravitational/teleport/pull/16695. @smallinsky Please approve it when you think it can be merged.

jakule avatar Sep 23 '22 21:09 jakule

On a v9 backport:

  • https://console.cloud.google.com/cloud-build/builds/a60e4bed-cef8-4841-b991-97992558b704?project=ci-account
    access_test.go:812: 
        	Error Trace:	/workspace/lib/srv/db/access_test.go:812
        	Error:      	Received unexpected error:
        	            	server selection error: server selection timeout, current topology: { Type: Unknown, Servers: [{ Addr: 127.0.0.1:41095, Type: Unknown, Average RTT: 0 }, ] }
        	Test:       	TestAccessMongoDB/old_server/client_without_compression/has_access_to_all_database_names_and_users

ibeckermayer avatar Oct 31 '22 17:10 ibeckermayer

Same as the above comment:

https://console.cloud.google.com/cloud-build/builds/8f988b93-c1a5-433c-b96d-c30f5aa44a4f?project=ci-account

ibeckermayer avatar Nov 28 '22 19:11 ibeckermayer

@ibeckermayer @zmb3 @smallinsky I created the backport to v9 https://github.com/gravitational/teleport/pull/18884

jakule avatar Nov 29 '22 20:11 jakule