scylla-manager
Scylla Manager downgrade procedure does not work
The downgrade procedure described in the docs does not really work. The problem is that it does not mention that SM won't simply start after downgrading, because it won't recognize the schema in its own DB (the one where SM stores its task definitions, etc.). This can be seen in this issue. So in case someone needs to downgrade SM, there are 3 options:
- drop the SM keyspace before restart (this way SM recreates the required schema from scratch, but it loses all task definitions, their progress, etc.)
- try to undo schema changes manually before restart (not advised, not always possible)
- make an SM DB snapshot before the upgrade and restore it before starting SM (IMHO way too much work in comparison to recreating tasks)
So realistically speaking, the easiest and safest option is the first one (drop and recreate tasks) and it should be mentioned in the docs.
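For the third option, taking the snapshot itself is the cheap part. The following is only a minimal sketch, assuming the SM DB is a Scylla node reachable by a local `nodetool` and that SM uses a keyspace named `scylla_manager`; the snapshot tag is a hypothetical name. By default it only prints the command it would run.

```shell
#!/bin/sh
set -eu

# Assumptions (not taken from the issue): keyspace name and snapshot tag.
SM_KEYSPACE="${SM_KEYSPACE:-scylla_manager}"
SNAPSHOT_TAG="${SNAPSHOT_TAG:-pre-sm-upgrade}"
# DRY_RUN=1 (default) prints the command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

snapshot_sm_keyspace() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ nodetool snapshot -t $SNAPSHOT_TAG $SM_KEYSPACE"
  else
    # Takes a tagged snapshot of the SM keyspace on this node.
    nodetool snapshot -t "$SNAPSHOT_TAG" "$SM_KEYSPACE"
  fi
}

snapshot_sm_keyspace
```

Restoring that snapshot before starting the downgraded SM is the laborious part, and it is not covered by this sketch.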
Related #3400
Idea: what if we added something like a DESCRIBE SCHEMA command to SM, so that the user could run it to get a simple bash script that recreates all SM tasks? This could make the downgrade procedure way simpler than it is right now (but this wouldn't really be in SM 3.2 scope, as it already contains many different and important tasks).
What do you think, @tzach @karol-kokoszka?
I'm not sure if I follow the idea. Could you please elaborate? What does the downgrade look like then?
Downgrade is not so important. It may just be useful in a situation like the one we had a few months ago, with RC0 released instead of 3.1.
I'm not sure if I follow the idea. Could you please elaborate?
The example output of this command would be:
sctool backup -L s3:loc --cron @weekly --retention 3 --retention-days 2
sctool repair --cron @weekly --intensity 0 --parallel 3
What does the downgrade look like then?
So the SM downgrade would look like this (skipping the unaffected steps):
- get the output of this command from SM
- stop SM
- drop the SM keyspace from its DB
- start SM
- run command output to recreate dropped tasks
So that the user does not have to recreate all tasks by hand.
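The steps above could be scripted roughly as follows. This is a sketch under stated assumptions, not a tested procedure: the service name `scylla-manager`, the keyspace name `scylla_manager`, a `cqlsh` that connects to the SM DB with its defaults, and the file `recreate-tasks.sh` (holding the saved output of the proposed command, which does not exist yet) are all assumptions. By default it only prints the commands.

```shell
#!/bin/sh
set -eu

# Assumptions (not taken from the issue): service, keyspace and file names.
SM_SERVICE="${SM_SERVICE:-scylla-manager}"
SM_KEYSPACE="${SM_KEYSPACE:-scylla_manager}"
# Hypothetical file with the saved output of the proposed export command.
TASKS_SCRIPT="${TASKS_SCRIPT:-recreate-tasks.sh}"
# DRY_RUN=1 (default) prints each command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

downgrade_sm() {
  # Step 1 (saving the task-recreation script) must already be done.
  run sudo systemctl stop "$SM_SERVICE"
  run cqlsh -e "DROP KEYSPACE IF EXISTS $SM_KEYSPACE;"
  # The package downgrade itself goes here, as described in the docs.
  run sudo systemctl start "$SM_SERVICE"
  # Replay the saved sctool commands to recreate the dropped tasks.
  run sh "$TASKS_SCRIPT"
}

downgrade_sm
```

Running it with `DRY_RUN=0` would execute the commands for real, including the keyspace drop, so the printed plan should be reviewed first.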