Why does actual-data-nodes work without specifying physical tables explicitly?
I have observed an unexpected behavior in ShardingSphere's table sharding configuration. My configuration doesn't explicitly specify all physical tables, yet it works correctly. I'm trying to understand why this is possible.
Here's my current configuration and sharding code:
```yaml
shardingsphere:
  datasource:
    names: ds0
    ds0:
      type: com.zaxxer.hikari.HikariDataSource
      driver-class-name: com.mysql.cj.jdbc.Driver
      url: jdbc:mysql://127.0.0.1:3306/demo?useUnicode=true&characterEncoding=UTF-8&useSSL=false&serverTimezone=GMT%2B8&allowPublicKeyRetrieval=true
      username: root
      password: '111111'
  rules:
    sharding:
      key-generate:
        snowflake-gen:
          type: SNOWFLAKE
      shardingAlgorithms:
        agreement-table-complex-algorithm:
          type: CLASS_BASED
          props:
            strategy: COMPLEX
            algorithmClassName: biz.demo.config.sharding.AgreementAlgorithm
      tables:
        agreement:
          actual-data-nodes: ds0.agreement
          table-strategy:
            complex:
              sharding-algorithm-name: agreement-table-complex-algorithm
              sharding-columns: id,create_time
  props:
    sql-show: true
```
```java
// Method from biz.demo.config.sharding.AgreementAlgorithm, which implements
// ComplexKeysShardingAlgorithm<Long> (the class referenced by the CLASS_BASED algorithm above).
@Override
public Collection<String> doSharding(Collection<String> availableTargetNames,
                                     ComplexKeysShardingValue<Long> complexKeysShardingValue) {
    // Unused here; range queries on the sharding columns are not handled.
    Map<String, Range<Long>> columnNameAndRangeValuesMap = complexKeysShardingValue.getColumnNameAndRangeValuesMap();
    Map<String, Collection<Long>> columnNameAndShardingValuesMap = complexKeysShardingValue.getColumnNameAndShardingValuesMap();
    List<String> actualTarget = new ArrayList<>();
    Snowflake snowflakeInstances = Snow.getSnowflakeInstances();
    Collection<Long> idKeyColl = columnNameAndShardingValuesMap.get("id");
    String logicTableName = complexKeysShardingValue.getLogicTableName();
    if (CollectionUtils.isNotEmpty(idKeyColl)) {
        for (Long id : idKeyColl) {
            // Recover the timestamp embedded in the snowflake id, then hash it into one of 4 suffixes.
            long generateDateTime = snowflakeInstances.getGenerateDateTime(id);
            int suffix = Math.abs(String.valueOf(generateDateTime).hashCode() % 4);
            String tb = logicTableName + "_" + suffix;
            actualTarget.add(tb);
        }
    }
    // Note: returns an empty collection when no "id" sharding values are present.
    return actualTarget;
}
```
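To make the routing concrete, here is a minimal sketch of driving this method directly in a test, not my production code. I'm assuming `ComplexKeysShardingValue` keeps its ShardingSphere 5.x constructor `(logicTableName, columnNameAndShardingValuesMap, columnNameAndRangeValuesMap)`, and the snowflake id is made up:

```java
import org.apache.shardingsphere.sharding.api.sharding.complex.ComplexKeysShardingValue;

import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public final class AgreementAlgorithmSketch {

    public static void main(String[] args) {
        // Assumes AgreementAlgorithm (from the config above) is on the classpath.
        AgreementAlgorithm algorithm = new AgreementAlgorithm();
        Map<String, Collection<Long>> shardingValues = new HashMap<>();
        shardingValues.put("id", List.of(728166679586938881L)); // hypothetical snowflake id
        ComplexKeysShardingValue<Long> value = new ComplexKeysShardingValue<>(
                "agreement", shardingValues, new HashMap<>());
        // With actual-data-nodes: ds0.agreement, availableTargetNames holds only the
        // logical name -- note that the algorithm never consults it.
        Collection<String> targets = algorithm.doSharding(List.of("agreement"), value);
        System.out.println(targets); // e.g. [agreement_1]
    }
}
```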
In my database, I have four physical tables: `agreement_0`, `agreement_1`, `agreement_2`, and `agreement_3`. However, in my configuration, I've only specified `actual-data-nodes: ds0.agreement` instead of `actual-data-nodes: ds0.agreement_$->{0..3}`.
Despite this, my sharding setup works correctly. The system successfully routes queries to the appropriate physical tables.
My questions are:
- Why does this configuration work without explicitly specifying all physical tables?
- Is this behavior intentional or a side effect of how ShardingSphere processes configurations?
- Are there any potential risks or downsides to using this simplified configuration in a production environment?
- Is there any documentation that explains this behavior?
Is there some default behavior in ShardingSphere that I'm not aware of, or is it related to my custom sharding algorithm? Thank you.
The locking issue was just a slip of the hand on my part. Can you debug the variables `actualTarget` and `availableTargetNames`? Their contents sound odd.
I have debugged the variables `availableTargetNames` and `actualTarget` as you suggested. `availableTargetNames` contains the logical table name `agreement`, and `actualTarget` contains a single physical table name, `agreement_2`. From this, it seems my custom sharding algorithm is correctly mapping the logical table to the appropriate physical table.

However, my original question was about why the configuration works even though I didn't explicitly list the physical table names in the `actual-data-nodes` property. Could it be that ShardingSphere relies on the sharding algorithm to resolve the physical table names from the logical table name, even when `actual-data-nodes` doesn't list all the tables? Is this behavior expected?
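Incidentally, to convince myself the mapping is deterministic, I also replayed the suffix computation by hand outside ShardingSphere. A sketch, assuming the Hutool `Snowflake` and my `Snow` helper from the code above; the id is made up:

```java
import cn.hutool.core.lang.Snowflake;

public final class SuffixCheck {

    public static void main(String[] args) {
        Snowflake snowflake = Snow.getSnowflakeInstances(); // my helper from the algorithm
        long id = 728166679586938881L;                      // hypothetical snowflake id
        long ts = snowflake.getGenerateDateTime(id);        // Hutool: timestamp embedded in the id
        int suffix = Math.abs(String.valueOf(ts).hashCode() % 4);
        System.out.println("agreement_" + suffix);          // e.g. agreement_2, matching actualTarget
    }
}
```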
Thank you for your help.
> Is this behavior expected?
- Currently, this feature does exist. In fact, the unit tests also rely on this feature.
- Feel free to check https://github.com/apache/shardingsphere/tree/master/test/native . In unit test files like https://github.com/apache/shardingsphere/blob/master/test/native/src/test/resources/test-native/yaml/features/sharding.yaml , `actualDataNodes` is empty; the actual tables of the target routes are generated by the logic of the custom algorithm class (see the sketch after this list).
- This feature can be traced all the way back to the ShardingSphere 5.1.0 example.
- But this feature is rather odd, because ShardingSphere Proxy obviously should not allow this. This is something that only ShardingSphere JDBC has.
- @terrymanu What do you think? I will say that I haven't tested whether ShardingSphere Proxy can do the same with DistSQL.
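For reference, a minimal JDBC-side sketch of what I mean, assuming a YAML file shaped like the linked sharding.yaml; the file path is hypothetical, and `YamlShardingSphereDataSourceFactory` is the usual ShardingSphere JDBC entry point:

```java
import org.apache.shardingsphere.driver.api.yaml.YamlShardingSphereDataSourceFactory;

import javax.sql.DataSource;
import java.io.File;

public final class EmptyDataNodesDemo {

    public static void main(String[] args) throws Exception {
        // With actualDataNodes left empty in the YAML, the route targets come solely
        // from whatever the CLASS_BASED algorithm's doSharding(...) returns.
        DataSource dataSource = YamlShardingSphereDataSourceFactory.createDataSource(
                new File("src/test/resources/sharding.yaml")); // hypothetical path
        try (var connection = dataSource.getConnection()) {
            // Issue queries against the logical table `agreement` as with any JDBC DataSource.
        }
    }
}
```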