scylla-manager
Manager: Support small_table_optimization feature
See https://github.com/scylladb/scylladb/pull/15974
There is a new API option, "small_table_optimization". Manager should set this value to true for all system tables.
The code is in https://github.com/scylladb/scylladb/pull/15974
@karol-kokoszka FYI.
With this feature:
- The user does not need to specify the token ranges to repair. It will repair all token ranges automatically.
- Users only need to send the repair REST API request to one of the nodes in the cluster. It can be any of the nodes in the cluster.
- It does not require the RF to be configured to replicate to all nodes in the cluster. This means it can work with any table as long as the amount of data is low, e.g., less than 100MiB per node.
Item 3 will allow us to use small table optimization for more tables.
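As a rough illustration of what such a request could look like from the Manager side, here is a minimal sketch that only builds the URL and sends nothing. The endpoint path /storage_service/repair_async/{keyspace}, the columnFamilies parameter, and the helper name build_repair_url are assumptions made for the example; port 10000 is the API port used elsewhere in this thread.

```python
from urllib.parse import quote, urlencode


def build_repair_url(node, keyspace, tables, port=10000):
    """Build a repair request URL with small_table_optimization enabled.

    Assumptions (not confirmed in this thread): the repair is started via
    /storage_service/repair_async/{keyspace} and the tables are passed in
    the columnFamilies query parameter. No token ranges are passed, since
    the feature repairs all ranges automatically.
    """
    params = {
        "columnFamilies": ",".join(tables),
        "small_table_optimization": "true",
    }
    # Keep commas unescaped so the table list stays readable.
    query = urlencode(params, safe=",")
    return f"http://{node}:{port}/storage_service/repair_async/{quote(keyspace, safe='')}?{query}"


url = build_repair_url("192.168.100.11", "system_auth", ["roles", "role_members"])
print(url)
```

The request would go to a single node only, which can be any node in the cluster.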
@karol-kokoszka - this should be a '3.3' item I reckon, but it's an important one, hopefully we'll be able to get all this wrapped into 2024.1 (and perhaps backport eventually to 2023.1.x!)
@asias How can SM know whether a node supports the small_table_optimization param?
I also tested that Scylla doesn't complain when it gets an unknown param in a repair API call. This creates the danger that SM sends an API call with small_table_optimization, thinking that it will repair the whole table, but the param would be silently ignored and the table won't be repaired.
This is a good observation. Scylla core should reject the unknown options.
I created an issue here:
https://github.com/scylladb/scylladb/issues/16299
and a PR here:
https://github.com/scylladb/scylladb/pull/16300
@amnonh Is there any REST API to list the supported parameters for a given REST API?
E.g., in api/api-doc/storage_service.json, can a user use the REST API to find out that parameters.id and parameters.timeout are supported?
{
   "path":"/storage_service/repair_status/",
   "operations":[
      {
         "method":"GET",
         "summary":"Query the repair status and return when the repair is finished or timeout",
         "type":"string",
         "enum":[
            "RUNNING",
            "SUCCESSFUL",
            "FAILED"
         ],
         "nickname":"repair_await_completion",
         "produces":[
            "application/json"
         ],
         "parameters":[
            {
               "name":"id",
               "description":"The repair ID to check for status",
               "required":true,
               "allowMultiple":false,
               "type":"long",
               "paramType":"query"
            },
            {
               "name":"timeout",
               "description":"Seconds to wait before the query returns even if the repair is not finished. The value -1 or not providing this parameter means no timeout",
               "required":false,
               "allowMultiple":false,
               "type":"long",
               "paramType":"query"
            }
         ]
      }
   ]
},
Yes, we document the API using swagger. The old API uses swagger 1.2, which means each part of the API is under a different URL.
With a working scylla instance you can use the swagger ui that comes with scylla
http://localhost:10000/ui/
The repair status endpoint is listed there, and the relevant swagger definition can be downloaded from: http://localhost:10000/api-doc/storage_service/
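To make the idea concrete, here is a minimal sketch of checking whether a parameter is documented for an endpoint, given an already parsed swagger 1.2 declaration. The function name supported_parameters is hypothetical, and the sample dictionary is a trimmed copy of the repair_status excerpt above wrapped in the top-level "apis" list that swagger 1.2 declarations use. Note that this only tells what the node documents; it is an assumption that the document and the node's actual behavior agree.

```python
def supported_parameters(api_declaration, path, method):
    """Return the parameter names documented for one operation.

    api_declaration is a parsed swagger 1.2 API declaration (a dict with
    an "apis" list). Returns an empty set if the path/method is not found.
    """
    for api in api_declaration.get("apis", []):
        if api.get("path") != path:
            continue
        for operation in api.get("operations", []):
            if operation.get("method") == method:
                return {p["name"] for p in operation.get("parameters", [])}
    return set()


# Trimmed sample based on the repair_status excerpt quoted in this thread.
declaration = {
    "apis": [
        {
            "path": "/storage_service/repair_status/",
            "operations": [
                {
                    "method": "GET",
                    "nickname": "repair_await_completion",
                    "parameters": [
                        {"name": "id", "required": True, "paramType": "query"},
                        {"name": "timeout", "required": False, "paramType": "query"},
                    ],
                }
            ],
        }
    ]
}

params = supported_parameters(declaration, "/storage_service/repair_status/", "GET")
print(sorted(params))
```

In practice the declaration would be fetched from the api-doc URL mentioned above, and the same lookup could be run against the repair endpoint to see whether small_table_optimization is listed.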
For now, stick with a simple rule: assume small_table_optimization is supported on 2024.2 and above. (Later, we may backport this to 2023.1.x, unsure.)
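A minimal sketch of that version rule follows. The function name and the way the version string is parsed are assumptions for illustration; it only looks at the leading "major.minor" of an enterprise-style version and treats anything it cannot parse as unsupported, so that the option is never sent to a node that might silently ignore it.

```python
import re


def supports_small_table_optimization(version):
    """Return True if the version is 2024.2 or newer.

    Only the leading "major.minor" part is inspected; suffixes such as
    patch numbers or release-candidate tags are ignored. Unparseable
    input is treated as unsupported.
    """
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        return False
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) >= (2024, 2)


for v in ["2023.1.10", "2024.1.5", "2024.2.0", "2025.1.0-rc1"]:
    print(v, supports_small_table_optimization(v))
```

If the feature is later backported to 2023.1.x, the rule would need an extra branch for the specific patch release that carries it.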
@asias A follow-up question regarding this feature:
- No token range to repair is needed by the user. It will repair all token ranges automatically.
- Users only need to send the repair rest api to one of the nodes in the cluster. It can be any of the nodes in the cluster.
- It does not require the RF to be configured to replicate to all nodes in the cluster. This means it can work with any tables as long as the amount of data is low, e.g., less than 100MiB per node.
When sending repair with small_table_optimization enabled, does Scylla respect the hosts param?
"name": "hosts",
"in": "query",
"required": false,
"type": "string",
"description": "Which hosts are to participate in this repair. Multiple hosts can be listed separated by commas."
SM uses this param to orchestrate repair on nodes from a specified DC, ignore down nodes, etc.
So does repairing a cluster with 1 node down, using small_table_optimization and hosts (with the down node excluded), go well?
Hello Michal,
The small_table_optimization is designed to repair all ranges and all nodes in the cluster. We currently do not wire the hosts and DC selection with it. It does not make much sense to use small_table_optimization while repairing only some of the DCs anyway. We can start by using small_table_optimization when none of the restrictions are specified by the user, which should be the most common case. This feature is mainly for system table repair pains like the ones we have with system_auth.
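The starting rule described above could be sketched roughly like this. The function and argument names are hypothetical, and the node_supports_it flag stands for whatever support check SM ends up using.

```python
def use_small_table_optimization(node_supports_it, hosts=None, dcs=None):
    """Decide whether to send small_table_optimization=true.

    The option is used only when the node supports it and the user gave
    no host or DC restrictions, because the feature repairs all ranges
    on all nodes and is not wired to the hosts/DC selection.
    """
    if not node_supports_it:
        return False
    return not hosts and not dcs


print(use_small_table_optimization(True))                      # no restrictions
print(use_small_table_optimization(True, hosts=["10.0.0.1"]))  # hosts restricted
print(use_small_table_optimization(True, dcs=["dc1"]))         # DC restricted
```

When the function returns False, SM would fall back to its regular token-range based repair for that table.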