vitess icon indicating copy to clipboard operation
vitess copied to clipboard

Bug Report: Update VSchema not send version if there are multiple vtgates operating at the same time.

Open javamyth opened this issue 1 year ago • 5 comments

Overview of the Issue

hi,There is no version number passed here. There may be concurrency issues when updating VSchema if there are multiple vtgates operating at the same time.

20240425183642557 20240425184440471

Reproduction Steps

N/A

Binary Version

vtgate version Version: 18.0.4

Operating System and Environment details

N/A

Log Fragments

N/A

javamyth avatar Apr 25 '24 10:04 javamyth

The vschema has a timestamp in it. But more importantly it's versioned in the topology server (e.g. etcd). I'm going to close this for now as it's not clear to me what bug is being called out. If I'm missing something please let us know and we can re-open this any time. Thanks!

mattlord avatar Apr 25 '24 14:04 mattlord

The vschema has a timestamp in it. But more importantly it's versioned in the topology server (e.g. etcd). I'm going to close this for now as it's not clear to me what bug is being called out. If I'm missing something please let us know and we can re-open this any time. Thanks!

The UpdateVSchema method does not pass the version number and will be overwritten directly. This will cause it to be unsafe when multiple vtgate processes execute UpdateVSchema at the same time. Please take a look at the version of ts.globalCell.Update that passes nil, thank you.

javamyth avatar Apr 26 '24 01:04 javamyth

What is passed here is null, not version. If the data of a certain node is updated, there will be security issues. 20240425184440471 20240425183642557 20240426092442173

javamyth avatar Apr 26 '24 01:04 javamyth

Thanks! I think you are correct here and we’re not providing serializability (although we could via etcd key versions and revisions) so you could read an old version, update it, and save the updated old structure overwriting intermediate changes. Because there’s no synchronization or concurrency controls concurrent writes from multiple vtgates will have a non-deterministic result and we could lose updates. I’ll reopen and let the query serving team comment.

mattlord avatar Apr 26 '24 01:04 mattlord

We should be able to enforce serializable changes across vtgates by leveraging the topo key version. For an example of where we're doing this today, look at the topo Tablet records/keys and the TabletInfo struct used in memory — same for the topo Shard records/keys and the ShardInfo in memory structure. For example: https://github.com/vitessio/vitess/blob/f27d287e0873d139ad2f71588068256e28a16f6a/go/vt/topo/shard.go#L165-L222

Writing the Shard topo key will fail if the version provided in the Update() call is older than the key's current version.

mattlord avatar May 06 '24 13:05 mattlord