
Critical Performance Bottleneck: Optimize DataNode Creation in Table Sharding with Time-Based Strategy

Open fyeeme opened this issue 7 months ago • 1 comment

Performance Issue

In a time-based sharding scenario with hour-level granularity, we've identified a severe performance bottleneck in ShardingStandardRouteEngine.routeTables(). Throughput drops from about 10,000 TPS without sharding to 1-3 TPS when statements are routed by the sharding key.

Configuration Context

tables:
  zns_simple_delay_message:
    actualDataNodes: ds_0.zns_simple_delay_message_2025${['06'..'12']}${['01'..'31']}${['00'..'23']}
    tableStrategy:
      standard:
        shardingColumn: scheduled_time
        shardingAlgorithmName: zns_simple_delay_message_inline
shardingAlgorithms:
  zns_simple_delay_message_inline:
    type: INTERVAL
    props:
      datetime-interval-unit: HOURS
      datetime-interval-amount: 1


Root Cause

The performance degradation occurs in the following loop where a new DataNode is created for each routed table:

for (String each : routedTables) {
    result.add(new DataNode(routedDataSource, each));
}

With hour-level sharding, the configured actualDataNodes expression expands to 24 hours * 31 days * 7 months = 5,208 table names, so a single routing operation can create thousands of DataNode objects.
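As a quick check on that figure, the sketch below compares the number of names the inline expression expands to with the number of real hourly slots between 2025-06-01 and 2025-12-31. The class and method names are hypothetical and only illustrate the arithmetic; the gap comes from day suffixes such as 0631 that the expression generates for 30-day months.

```java
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;

class HourlyTableCount {

    // Names produced by the inline expression: every month/day/hour combination.
    static long expandedNames(int months, int daysPerMonth, int hoursPerDay) {
        return (long) months * daysPerMonth * hoursPerDay;
    }

    // Real hourly slots in the half-open range [from, to).
    static long realHours(LocalDateTime from, LocalDateTime to) {
        return ChronoUnit.HOURS.between(from, to);
    }

    public static void main(String[] args) {
        long expanded = expandedNames(7, 31, 24);
        long real = realHours(LocalDateTime.of(2025, 6, 1, 0, 0), LocalDateTime.of(2026, 1, 1, 0, 0));
        // Prints: 5208 expanded names, 5136 real hourly slots
        System.out.println(expanded + " expanded names, " + real + " real hourly slots");
    }
}
```

Either way the count is in the thousands, so the order of magnitude quoted above holds.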

Impact

  • Single insert performance: 1-3 TPS (with sharding) vs 10,000 TPS (without sharding)
  • Excessive object creation causing GC pressure
  • Significant latency in high-throughput scenarios

Proposed Solutions

  1. Immediate Fix: Implement DataNode object pooling

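    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Shared cache of DataNode instances, keyed by "dataSource.table".
    private static final Map<String, DataNode> DATA_NODE_CACHE = new ConcurrentHashMap<>();

    // Returns the cached DataNode for this pair, creating it on first use.
    private DataNode getOrCreateDataNode(String dataSource, String table) {
        String key = dataSource + "." + table;
        return DATA_NODE_CACHE.computeIfAbsent(key, k -> new DataNode(dataSource, table));
    }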
    
  2. Long-term Optimization:

    • Pre-calculate and cache DataNode objects for time-based sharding
    • Implement batch DataNode creation for time-interval strategies
    • Consider lazy DataNode instantiation pattern
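The caching and pre-calculation ideas above could be sketched roughly as follows. This is a minimal, self-contained illustration, not ShardingSphere code: SimpleDataNode is a hypothetical stand-in for the real DataNode class, and DataNodeCache, preload, and route are names invented for this example. It assumes DataNode instances are effectively immutable and safe to share, and that allocation really is the dominant cost, which a profiler should confirm first.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for ShardingSphere's DataNode; immutable, so instances can be shared.
final class SimpleDataNode {
    final String dataSourceName;
    final String tableName;

    SimpleDataNode(String dataSourceName, String tableName) {
        this.dataSourceName = dataSourceName;
        this.tableName = tableName;
    }
}

final class DataNodeCache {
    // dataSource -> (table -> node); nested maps avoid building a "ds.table" key per lookup.
    private final Map<String, Map<String, SimpleDataNode>> nodes = new ConcurrentHashMap<>();

    // Pre-calculate nodes for every known table, e.g. once at rule initialization.
    void preload(String dataSource, Iterable<String> tables) {
        for (String each : tables) {
            getOrCreate(dataSource, each);
        }
    }

    // Lazily creates a node on first use and returns the same instance afterwards.
    SimpleDataNode getOrCreate(String dataSource, String table) {
        return nodes.computeIfAbsent(dataSource, key -> new ConcurrentHashMap<>())
                .computeIfAbsent(table, key -> new SimpleDataNode(dataSource, table));
    }

    // Mirrors the routing loop, but reuses cached instances instead of allocating new ones.
    List<SimpleDataNode> route(String dataSource, Iterable<String> routedTables) {
        List<SimpleDataNode> result = new ArrayList<>();
        for (String each : routedTables) {
            result.add(getOrCreate(dataSource, each));
        }
        return result;
    }
}
```

The nested map is a design choice for this sketch: it trades one extra map lookup for not concatenating a key string on every call. A cache like this would also need to be invalidated or rebuilt if the sharding rule configuration changes at runtime.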

Success Metrics

  • Restore TPS to at least 50% of the non-sharding baseline (about 5,000 TPS)

fyeeme, Jun 07 '25 06:06

Even with datetime-interval-unit changed from HOURS to DAYS, TPS is still much lower than with plain MySQL JDBC.

fyeeme, Jun 07 '25 14:06