Add property to control whether union all same database sql routes
Add property to control whether union all same database sql routes
now union all same database sql routes is default. However union all will cost much time than before. This cause an exception 'maximum statement execution time exceeded' . If query time over max_execution_time (mysql). If union all sql statements number can refer ‘max-connections-size-per-query’ or can be turn off by some property , user can adjust by themself.
eg: 1.sql routes :10. 2.max-connections-size-per-query :5 --> union all with 2 batch ,each batch contains 5 sql .
Hi @strongduanmu , Could you please hava a look, thanks~
@RaigorJiang Thank you for your remind.
Hi @lif123, can you provide your test program? According to our previous performance tests, using union all can effectively reduce the consumption of database connections and improve performance. For more details, you can refer - https://mp.weixin.qq.com/s/Mn9K-mn1P7oX0G5PorZRMw.
Therefore, there must be sufficient tests to prove that there is a performance problem with union all before we consider adding a props.
Thanks for your reply. Union all same database is a great feature which can improve performance in general. In my select case : select f1,f2,f3 from logic_table where id in (?,?,?,?,?). Union all will cost more time than execute one of them, this is clear. If your time cost > max_execution_time, then mysql will interrupt your sql execution. Unfortunately your DBA do not allow increase max_execution_time :( .So now I just shard myself on application and then execute sql to avoid this issue.
So is there any possible we can control Union logic (sql number or turn off ) ? thanks
Thank you, I will test this case.
Hi @strongduanmu is there any update ? thanks
Is there any feedback? @strongduanmu @RaigorJiang
Understanding
- Ask: Add a fine-grained control so that “same-database multi-route queries auto-merged into a single UNION ALL” can be limited by a threshold of merged sub-queries, preventing overly long SQL from hitting MySQL max_execution_time.
- Current behavior: For simple queries (no subquery/Join/Order By/Limit/Lock) routed to the same datasource, the rewriter merges actual SQLs into one UNION ALL to reduce connection usage.
Root Cause
- RouteSQLRewriteEngine always merges same-database route units via String.join(" UNION ALL ", …) to minimize same-database connections.
- max-connections-size-per-query only affects connection grouping/ConnectionMode and does not control UNION ALL or batching.
- With many shards and a low MySQL max_execution_time, the long merged SQL may run effectively serially and exceed the timeout.
Analysis
- The default strategy reduces connections and handshakes, but in high-shard and strict-timeout scenarios, merging sacrifices parallelism and can approach the sum of per-shard times.
- Introducing a “merge upper bound” keeps default compatibility while giving timeout-sensitive workloads a tunable merge size.
Conclusion (proposal + PR welcome)
- Add a global property max-union-size-per-datasource (default Integer.MAX_VALUE to keep current behavior):
- If same-database route count ≤ threshold: keep UNION ALL merge.
- If count > threshold: skip merge (or batch, each batch ≤ threshold) to regain parallelism and reduce single-SQL runtime.
- Implementation sketch:
- Add max-union-size-per-datasource to the configuration property enum with a sensible default.
- In RouteSQLRewriteEngine’s merge decision, read the property; when exceeding the threshold, either do not merge or split routeUnits into batches and merge per batch.
- Add unit tests for: under-threshold merge; over-threshold no-merge/batching; SQL text and parameter correctness.
- Document the new property, default, and usage guidance in the common properties docs (official entry: https://shardingsphere.apache.org/document/current/cn/user-manual/common-config/properties/ ).
- We warmly invite community contributors to submit a PR implementing and testing this; we’ll gladly help review and validate.