gocql
gocql copied to clipboard
Data Race During --race Test
Probably needs proper mutexing - this one periodically causes flaking in CI for me. The items hitting it are: github.com/gocql/gocql.(*tokenAwareHostPolicy).Init() vs github.com/gocql/gocql. (*tokenAwareHostPolicy).updateReplicas()

It seems this is something that should be fixed in upstream gocql/gocql as it seems that upstream is affected too. I'll be happy to merge the fix upstream.
I've checked the code, but it seems gocql does not spawn goroutines using the policy before calling Init, so there shouldn't be any race. Do you by any chance reuse the policy instance between multiple sessions? Unfortunately I don't see from the screenshot where goroutine 61 was spawned, but it seems that it was spawned somewhere in Session.init, which is called after returning from tokenAwareHostPolicy.Init.
The issue is the assignment of the logger on line 401 during Init() can race with a use of the logger in the updateReplicas where it passes to getStrategy. The updateReplicas is getting called concurrently with the Init() because the schema event listener debouncer starts further up in the sequence, and if you do per-test schema operations, its possible to get into a state where one arrives during the race, because it's seeing stuff like the previous tests keyspace getting torn down.
s.schemaEvents = newEventDebouncer("SchemaEvents", s.handleSchemaEvent, s.logger)
In short, if a schema update arrives during startup, you can get this. I believe without the race detector, it'll probably nil-pointer as the logger wasn't assigned.
The issue is the assignment of the logger on line 401 during Init() can race with a use of the logger in the updateReplicas where it passes to getStrategy.
But what leads to that? It seems that Init is called and returns before any connections are created, so how do you reach the Session.handleEvent that ultimately leads to updateReplicas?
Init is called in NewSession:
https://github.com/scylladb/gocql/blob/6aca262598eadfc2c433ac6f790fd8b6abf86f88/session.go#L161
There are no goroutines that gocql spawns until this point.
First connection is made in Session.init:
https://github.com/scylladb/gocql/blob/6aca262598eadfc2c433ac6f790fd8b6abf86f88/session.go#L181
The only explanation I have right now is that the policy instance is used with multiple sessions. Is that the case? Or why do we reach handleEvent? Could you please provide example code that reproduces the issue?
Multiple sessions, but each only created after the previous has been .Closed()'d. However, I can't share the entire body of the application - but it seems like the thrashing during my unit tests (creating and deleting lots of keyspaces, as each test uses its own) makes this exponentially more likely.
Reusing host selection policies between multiple sessions (even after closing the old session) is not supported as the old session might still be using the policy. Adding a lock to Init won't fix the fact that the old session is still using the policy. I suggest you use a new instance of the policy every time you create a session. I guess we need to document that reusing policies is not supported and panic in tokenAwareHostPolicy.Init if the policy was already initialized.
Sounds like it’s a more along the lines of a defective design pattern - and the strategies should be passed as a factory so that each session the driver summons can get its own.
However that boat has probably sailed in terms keeping things compatible with todays world. It’d be a fairly breaking change to fix.
Get Outlook for iOShttps://aka.ms/o0ukef
From: Martin Sucha @.> Sent: Wednesday, January 26, 2022 4:30:56 PM To: scylladb/gocql @.> Cc: Steve Gray @.>; Author @.> Subject: Re: [scylladb/gocql] Data Race During --race Test (Issue #94)
Reusing host selection policies between multiple sessions (even after closing the old session) is not supported as the old session might still be using the policy. Adding a lock to Init won't fix the fact that the old session is still using the policy. I suggest you use a new instance of the policy every time you create a session. I guess we need to document that reusing policies is not supported and panic in tokenAwareHostPolicy.Init if the policy was already initialized.
— Reply to this email directly, view it on GitHubhttps://github.com/scylladb/gocql/issues/94#issuecomment-1021905431, or unsubscribehttps://github.com/notifications/unsubscribe-auth/ADPDTP46JG5ZHUB75FZGHHDUX6IKBANCNFSM5MOREHVQ. You are receiving this because you authored the thread.Message ID: @.***>
Sounds like it’s a more along the lines of a defective design pattern - and the strategies should be passed as a factory so that each session the driver summons can get its own.
However that boat has probably sailed in terms keeping things compatible with todays world. It’d be a fairly breaking change to fix.
Yes, I agree. Unfortunately the policy interface is like this for a long time and there could be external implementations, so we can't easily change it.