[Feature] Reduce number of queries sent to Oracle
Problem
Currently, for each statement sent to the test cluster, we send 3 queries to the oracle cluster:
- The exact query that was sent to the test cluster, to save it in the exact table
- The statement that is executed against the test cluster, saved in the statement logger table
- The statement that is executed against the oracle cluster, saved in the statement logger table
This 3-for-1 hurts performance a bit, since we could batch several statements instead. One thing helps: the oracle and test statements live in the same partition of the statement logger table, which will improve performance for gemini in the long run.
Solutions
Simple
Batch the two statements into a single batch for the oracle and test clusters (it cannot be one statement, because gemini might have a bug where we send different queries to the oracle and test clusters, and that would cause a discrepancy).
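To make the simple solution concrete, here is a minimal, language-neutral sketch in Python (names like `send_batch` and the fake session are illustrative assumptions, not gemini's real API): the two statement-logger writes for one event are grouped into a single batch, cutting the round trips to the oracle cluster in half while keeping the two statements separate inside the batch.

```python
# Hypothetical sketch of the "simple" solution: instead of two separate
# writes to the statement logger (test-cluster statement + oracle-cluster
# statement), send both in a single batch per event.

class FakeOracleSession:
    """Counts network round trips so the two approaches can be compared."""
    def __init__(self):
        self.round_trips = 0

    def execute(self, stmt):
        self.round_trips += 1      # one network call per statement

    def send_batch(self, stmts):
        self.round_trips += 1      # one network call for the whole batch


def log_unbatched(session, test_stmt, oracle_stmt):
    # Current behaviour: two separate network calls.
    session.execute(test_stmt)
    session.execute(oracle_stmt)


def log_batched(session, test_stmt, oracle_stmt):
    # The batch must still contain two distinct statements: gemini may
    # (buggily) send different queries to the test and oracle clusters,
    # and merging them into one statement would hide that discrepancy.
    session.send_batch([test_stmt, oracle_stmt])


before, after = FakeOracleSession(), FakeOracleSession()
log_unbatched(before, "INSERT ... test", "INSERT ... oracle")
log_batched(after, "INSERT ... test", "INSERT ... oracle")
print(before.round_trips, after.round_trips)  # 2 round trips vs 1
```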
A bit more complex, but better for performance
Time-based queue cleanup -> buffer a couple of seconds' worth of queries and write everything in one bigger batch -> even half a second will work: at 10k req/s this saves 5k writes to the statement logger, and the corresponding network trips.
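A rough sketch of the time-based queue, assuming a flush-on-interval design (class and parameter names are hypothetical, not taken from gemini). A fake clock simulates one second of traffic at 10k req/s, showing that a 0.5 s flush window collapses 10,000 individual writes into just 2 batches.

```python
import time


class TimedBatchQueue:
    """Buffer statements and flush them as one batch every `interval`
    seconds (illustrative sketch; the clock is injectable for testing)."""

    def __init__(self, flush_fn, interval=0.5, clock=time.monotonic):
        self.flush_fn = flush_fn
        self.interval = interval
        self.clock = clock
        self.buf = []
        self.last_flush = clock()

    def add(self, stmt):
        self.buf.append(stmt)
        if self.clock() - self.last_flush >= self.interval:
            self.flush()

    def flush(self):
        if self.buf:
            self.flush_fn(self.buf)   # one network call for the whole buffer
            self.buf = []
        self.last_flush = self.clock()


# Simulate 10k requests spread evenly over one second using a fake clock:
# with a 0.5 s window the queue flushes twice instead of writing 10k times.
batches = []
now = [0.0]
q = TimedBatchQueue(batches.append, interval=0.5, clock=lambda: now[0])
for i in range(10_000):
    now[0] = i / 10_000
    q.add(f"stmt-{i}")
q.flush()  # drain whatever is left at shutdown
print(len(batches), sum(len(b) for b in batches))
```

The trade-off versus the simple solution is that buffered statements are lost if the process crashes before a flush, which may or may not matter for a test tool's statement log.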
questions:
- how much perf gain do we expect from the simple vs the complex solution?
- how much perf gain would we get if we didn't store text/blobs in their current form, but just the start:stop offsets of our random pool (and decoded them later, when writing to the statement log)?
> - how much perf gain do we expect from the simple vs the complex solution?

In the simple solution I'm expecting double the perf, since we remove one network call (we are left with a single network call, the batched statement). For the complex one I really don't know; it will be faster, but how much is hard to say without a proof of concept and a benchmark.

> - how much perf gain would we get if we didn't store text/blobs in their current form, but just the start:stop offsets of our random pool (and decoded them later, when writing to the statement log)?

Yeah, that could work, but it's more work than this (a lot more rework of the current logger, and we have to be careful because of the partition keys, selecting them, etc.).
> In the simple solution I'm expecting double the perf, since we remove one network call (we are left with a single network call, the batched statement). For the complex one I really don't know; it will be faster, but how much is hard to say without a proof of concept and a benchmark.

OK, I see. What would be the time estimates for the simple and complex solutions (including benchmarks)?

> Yeah, that could work, but it's more work than this (a lot more rework of the current logger, and we have to be careful because of the partition keys, selecting them, etc.).

I see, let's drop that idea for now then.
Closing this issue as it was moved to Jira. Please continue the thread in https://scylladb.atlassian.net/browse/QATOOLS-105