Before flag in restore
Output of restic version
restic 0.9.5 compiled with go1.12.4 on linux/amd64
What should restic do differently? Which functionality do you think we should add?
It would be nice to have a before (and maybe after) date flag in the restore command.
What are you trying to do?
The use case: you know when bad files or bugs appeared, and you want the latest snapshot before a specific date instead of just the latest.
Did restic help you or make you happy in any way?
Very :D
Since the restore command requires a snapshot ID, how can one programmatically obtain the ID of a snapshot made on a given date?
@giskou Can't you just 1) run restic snapshots, 2) in the list of snapshots, locate the last one before the date you are concerned about, 3) restore that snapshot? It's just one extra command to run.
@rawtaz If restic is used from a script or pipeline that is given the restore date as a parameter, what is the recommended way to do 2) programmatically?
@eprigorodov In the list of snapshots you have the dates and the snapshot IDs. There are many different ways to do it; here's one using awk:
bash-3.2$ cat apa.txt
e953e6fb 2020-05-11 04:29:58 foo.local /what/ever
746f1121 2020-05-12 04:01:01 foo.local /what/ever
98d3115e 2020-05-13 03:02:19 foo.local /what/ever
4ca4fdcd 2020-05-14 03:24:39 foo.local /what/ever
6d25cfbd 2020-05-18 00:33:02 foo.local /what/ever
bash-3.2$ awk '/^[[:space:]]*$/{next} $2=="2020-05-14"{print line} {line=$1}' apa.txt
98d3115e
bash-3.2$
There are probably cleaner ways, but this one is simple enough that you can use it in a script if you want (obviously while handling the case where you don't get any snapshot ID out of it).
I think this use case is rather rare. Most people restore manually, and those that have it scripted somehow either have some input from a user after showing them a list or offering them some search that can yield a snapshot ID in another way.
EDIT: The above grabs the last snapshot ID before the date you provided. If you just want the snapshot ID for a given date, that's even simpler. Also note that it won't work if there's multiple lines of paths for the snapshot you want, it will have to be adjusted for that. This is just a basic example.
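For comparison, here is a minimal Python sketch of the same idea. It is not part of the original comment. It assumes the text listing has the shape shown above (an 8-character hex ID followed by a date), and the names `ROW` and `last_snapshot_before` are made up for illustration. Unlike the awk one-liner, it compares dates, so it still returns a result when no snapshot exists on the requested date itself, and it skips lines that do not start with an ID (headers, blank lines, extra path lines).

```python
import re
from datetime import date

# Sample listing copied from the thread (ID, date, time, host, path).
LISTING = """\
e953e6fb 2020-05-11 04:29:58 foo.local /what/ever
746f1121 2020-05-12 04:01:01 foo.local /what/ever
98d3115e 2020-05-13 03:02:19 foo.local /what/ever
4ca4fdcd 2020-05-14 03:24:39 foo.local /what/ever
6d25cfbd 2020-05-18 00:33:02 foo.local /what/ever
"""

# Assumption: a snapshot row starts with an 8-character hex ID and a date.
ROW = re.compile(r"^\s*([0-9a-f]{8})\s+(\d{4}-\d{2}-\d{2})\s")


def last_snapshot_before(listing, cutoff):
    """Return the ID of the newest snapshot dated strictly before `cutoff`, or None."""
    best_id, best_day = None, None
    for line in listing.splitlines():
        match = ROW.match(line)
        if not match:
            continue  # header, blank line, or a continuation line with extra paths
        snap_id, day = match.group(1), date.fromisoformat(match.group(2))
        # ">=" lets a later row on the same day win (listing assumed oldest first).
        if day < cutoff and (best_day is None or day >= best_day):
            best_id, best_day = snap_id, day
    return best_id


print(last_snapshot_before(LISTING, date(2020, 5, 14)))  # 98d3115e
print(last_snapshot_before(LISTING, date(2020, 5, 16)))  # 4ca4fdcd
print(last_snapshot_before(LISTING, date(2020, 5, 1)))   # None
```

A calling script still has to handle the `None` case, which corresponds to the "no snapshot ID" situation mentioned above.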
> Since the restore command requires a snapshot ID, how can one programmatically obtain the ID of a snapshot made on a given date?
@eprigorodov that's my main issue. Such a feature would not require an ID. It's more or less the same as latest, but with a specific date.
> In the list of snapshots you have the dates and the snapshot IDs. There are many different ways to do it; here's one using awk:
@rawtaz Of course something like this is possible, but it can get complicated very fast.
> There are probably cleaner ways, but this one is simple enough that you can use it in a script if you want (obviously while handling the case where you don't get any snapshot ID out of it).
The cleanest way would be for restic to handle this internally, so that every run, on every shell version, with no extra tools (awk, sed, grep), gives exactly the same results.
> I think this use case is rather rare. Most people restore manually, and those that have it scripted somehow either have some input from a user after showing them a list or offering them some search that can yield a snapshot ID in another way.
Fair enough. I still think that such a feature would not be that hard to implement and would benefit at least some users.
> EDIT: The above grabs the last snapshot ID before the date you provided. If you just want the snapshot ID for a given date, that's even simpler. Also note that it won't work if there's multiple lines of paths for the snapshot you want, it will have to be adjusted for that. This is just a basic example.
As you can see, such an approach has a few "gotchas".
I think you are overcomplicating it and that restic shouldn't cater to every single use case. If anything, it would become too complex. Do few things and do them well. But that's just my opinion. FWIW I don't think the proposed solution is very complex at all; it's literally one single command to get you 99% of the way. If you add a couple more lines you're done. And having to adjust and run different commands on different platforms is already something you have to do anyway, even if you were to use a built-in feature like this in restic, so you already crossed that boundary.
@rawtaz Point-in-time restore is a primary use case for keeping multiple backup snapshots in general. If all restore operations only used the "latest" snapshot, there would be no need for the whole restic machinery.
And manual restore is only feasible while the data to be restored is as simple as a single directory. Consider repeating the "awk" approach for related snapshots from several hosts. Think of restore scenarios that have further steps after extracting files, like a database import. Automation becomes necessary in any infrastructure larger than a personal space and/or when multiple operators are involved.
Compare the snippet above with the following possible way:
restic restore latest --before 2020-01-01 --target ...
Would you say that this feature is not needed at all or just that it has low priority?
Technically, this change should be simple: just handling a new --before command line filter in addition to the existing --tag, --host, and --path filters, and then changing a single line in FindLatestSnapshot() in snapshot_find.go:
if snapshot.Time.Before(latest) || !snapshot.Time.Before(before) {
return nil
}
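To make the effect of that condition concrete, here is a small Python model of the selection loop. It is only an illustration of the logic, not restic code: the function name `latest_before` and the tuple layout are hypothetical, and the times are taken from the listing earlier in the thread without any time zone.

```python
from datetime import datetime

# Snapshot times from the listing earlier in the thread, deliberately unsorted.
SNAPSHOTS = [
    ("746f1121", datetime(2020, 5, 12, 4, 1, 1)),
    ("6d25cfbd", datetime(2020, 5, 18, 0, 33, 2)),
    ("98d3115e", datetime(2020, 5, 13, 3, 2, 19)),
    ("e953e6fb", datetime(2020, 5, 11, 4, 29, 58)),
    ("4ca4fdcd", datetime(2020, 5, 14, 3, 24, 39)),
]


def latest_before(snapshots, before):
    """Return the ID of the newest snapshot taken strictly before `before`, or None."""
    latest_id, latest_time = None, datetime.min
    for snap_id, taken in snapshots:
        # Same shape as the proposed condition: skip a snapshot that is older
        # than the current candidate or that is not strictly before the cutoff.
        if taken < latest_time or not taken < before:
            continue
        latest_id, latest_time = snap_id, taken
    return latest_id


print(latest_before(SNAPSHOTS, datetime(2020, 5, 14)))  # 98d3115e
print(latest_before(SNAPSHOTS, datetime(2020, 1, 1)))   # None
```

The second call shows the case a real implementation would have to report as an error: no snapshot exists before the cutoff.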
Sorry for the late reply, been busy.
> Would you say that this feature is not needed at all or just that it has low priority?
Needed or not isn't something that can be answered, as the concept of needed is relative and individual. I think it's a nice to have rather than a need to have, if that answers your question. People restoring manually don't need this, and those that automate/integrate their restores will have written scripts/integrations already, and will have absolutely no problem grabbing the list of snapshots and picking the one they're interested in (or, as I think I mentioned before, will already have a UI for it anyway). It's a pretty specific use case.
Regarding priority, it's currently not realistic that a core developer will write it, as there are a lot of other, more important things to focus that time on; that's almost certainly not going to happen in the foreseeable future. Possibly a PR would be considered though, if written properly :)
I tested a workaround with a helper script, as suggested in the comment above: Python script gist. Usage:
restic snapshots --json | python3 find_snapshot.py <before|after> YYYY-MM-DD.HH:MM+HH:MM
Two details can make the task too complex for an inline awk program: the semantics of the input date, and time zones.
- As mentioned in the same comment, the requested restore date can be interpreted in several ways:
  - the latest snapshot taken before the given date,
  - the snapshot taken exactly on the given date,
  - the earliest snapshot taken after the given date.
  At least our ops were not able to tell whether the preferred mode would be "latest before" or "earliest after"; they need both. The exact-date case might be rare due to backup retention.
- Backup and restore may take place in different time zones.
  To the credit of restic, it saves snapshot timestamps with time zone information.
  When restoring, the operator can (and most often will) enter the desired restore date without indicating any time zone. In that case the only reasonable assumption is the current local time zone. Even in the same location, the time zone of the requested date may differ from the time zone of the snapshot (due to DST).
The helper script should consider these details when comparing points in time.
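The following is a minimal sketch of how such a comparison could look. It is not the gist referenced above, and it uses a plain ISO date format rather than the gist's argument format. It assumes that each entry of the `restic snapshots --json` output carries a `short_id` and an RFC 3339 `time` field, possibly with more than six fractional digits. The sample data (including the UTC offsets) and the helper names `parse_snapshot_time`, `parse_requested`, and `pick` are made up for illustration.

```python
import re
from datetime import datetime, timedelta, timezone

# Made-up sample in the assumed shape of `restic snapshots --json` output;
# only "short_id" and "time" are used, and the UTC offsets are invented.
SNAPSHOTS = [
    {"short_id": "98d3115e", "time": "2020-05-13T03:02:19.123456789+02:00"},
    {"short_id": "4ca4fdcd", "time": "2020-05-14T03:24:39.5+02:00"},
    {"short_id": "6d25cfbd", "time": "2020-05-18T00:33:02Z"},
]

_TS = re.compile(
    r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d+))?(Z|[+-]\d{2}:\d{2})$"
)


def parse_snapshot_time(text):
    """Parse an RFC 3339 timestamp, truncating the fraction to microseconds."""
    match = _TS.match(text)
    if match is None:
        raise ValueError(f"unexpected timestamp: {text!r}")
    base, fraction, zone = match.groups()
    micros = (fraction or "")[:6].ljust(6, "0")
    if zone == "Z":
        zone = "+00:00"
    return datetime.fromisoformat(f"{base}.{micros}{zone}")


def parse_requested(text, assume_tz=None):
    """Parse operator input; a value without an offset is taken as local time
    unless `assume_tz` is given."""
    when = datetime.fromisoformat(text)
    if when.tzinfo is not None:
        return when
    if assume_tz is not None:
        return when.replace(tzinfo=assume_tz)
    return when.astimezone()  # a naive value is interpreted in the system zone


def pick(snapshots, when, mode):
    """Return the short ID of the latest snapshot before `when` (mode "before")
    or the earliest one after it (mode "after"); None if nothing qualifies."""
    timed = sorted((parse_snapshot_time(s["time"]), s["short_id"]) for s in snapshots)
    if mode == "before":
        hits = [sid for taken, sid in timed if taken < when]
        return hits[-1] if hits else None
    if mode == "after":
        hits = [sid for taken, sid in timed if taken > when]
        return hits[0] if hits else None
    raise ValueError(f"unknown mode: {mode!r}")


utc = timezone.utc
plus2 = timezone(timedelta(hours=2))
print(pick(SNAPSHOTS, parse_requested("2020-05-14T02:00", utc), "before"))    # 4ca4fdcd
print(pick(SNAPSHOTS, parse_requested("2020-05-14T02:00", plus2), "before"))  # 98d3115e
print(pick(SNAPSHOTS, parse_requested("2020-05-14T02:00", plus2), "after"))   # 4ca4fdcd
```

The first two calls pass the same wall-clock input but select different snapshots, because the assumed time zone changes which instant is meant. Comparing time-zone-aware values on both sides avoids that ambiguity.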