
Before flag in restore

Open giskou opened this issue 6 years ago • 10 comments

Output of restic version

restic 0.9.5 compiled with go1.12.4 on linux/amd64

What should restic do differently? Which functionality do you think we should add?

It would be nice to have a before (and maybe after) date flag in the restore command.

What are you trying to do?

The use case is that you know when bad files/bugs appeared and you want the latest snapshot before a specific date, instead of just the latest.

Did restic help you or made you happy in any way?

Very :D

giskou avatar Nov 06 '19 14:11 giskou

Since the restore command requires a snapshot ID, how can one programmatically obtain the ID of a snapshot made on a given date?

eprigorodov avatar May 19 '20 21:05 eprigorodov

@giskou Can't you just 1) run the snapshots command, 2) locate the last snapshot in the list before the date you are concerned about, and 3) restore that snapshot? It's just one extra command to run.

rawtaz avatar May 19 '20 22:05 rawtaz

@rawtaz If restic is used from a script or pipeline that is given the restore date as a parameter, what is the recommended way to do 2) programmatically?

eprigorodov avatar May 19 '20 22:05 eprigorodov

@eprigorodov In the list of snapshots you have the dates and the snapshot IDs. There are many different ways to do it; here's one using awk:

bash-3.2$ cat apa.txt 
e953e6fb  2020-05-11 04:29:58  foo.local                /what/ever

746f1121  2020-05-12 04:01:01  foo.local                /what/ever

98d3115e  2020-05-13 03:02:19  foo.local                /what/ever

4ca4fdcd  2020-05-14 03:24:39  foo.local                /what/ever

6d25cfbd  2020-05-18 00:33:02  foo.local                /what/ever

bash-3.2$ cat apa.txt | awk '/^\s*$/{next;} $2=="2020-05-14"{print line}{line=$1}'
98d3115e

bash-3.2$

There are probably cleaner ways, but this one is simple enough that you can use it in a script if you want (obviously while handling the case where you don't get any snapshot ID out of it).

I think this use case is rather rare. Most people restore manually, and those that have it scripted somehow either have some input from a user after showing them a list or offering them some search that can yield a snapshot ID in another way.

EDIT: The above grabs the last snapshot ID before the date you provided. If you just want the snapshot ID for a given date, that's even simpler. Also note that it won't work if there's multiple lines of paths for the snapshot you want, it will have to be adjusted for that. This is just a basic example.
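As a complement to the awk one-liner, here is a minimal Python sketch of the same idea. It compares timestamps instead of matching the date string, so it still returns a result when no snapshot exists on the requested day (the awk pattern `$2=="2020-05-14"` prints nothing in that case). The function name `last_snapshot_before` is hypothetical, and the sketch assumes each snapshot row starts with the ID followed by a date and a time, as in the listing above; rows that do not fit that shape (blank lines, extra path lines) are skipped.

```python
from datetime import datetime


def last_snapshot_before(listing: str, cutoff: datetime):
    """Return the ID of the newest snapshot strictly older than cutoff, or None."""
    best = None  # (time, snapshot_id) of the best candidate so far
    for line in listing.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # blank line, or a continuation line holding only a path
        try:
            taken = datetime.strptime(fields[1] + " " + fields[2], "%Y-%m-%d %H:%M:%S")
        except ValueError:
            continue  # header or any other row without a timestamp
        if taken < cutoff and (best is None or taken > best[0]):
            best = (taken, fields[0])
    return best[1] if best else None


listing = """\
e953e6fb  2020-05-11 04:29:58  foo.local                /what/ever

746f1121  2020-05-12 04:01:01  foo.local                /what/ever

98d3115e  2020-05-13 03:02:19  foo.local                /what/ever

4ca4fdcd  2020-05-14 03:24:39  foo.local                /what/ever

6d25cfbd  2020-05-18 00:33:02  foo.local                /what/ever
"""

print(last_snapshot_before(listing, datetime(2020, 5, 14)))  # 98d3115e
print(last_snapshot_before(listing, datetime(2020, 5, 16)))  # 4ca4fdcd
print(last_snapshot_before(listing, datetime(2020, 5, 1)))   # None
```

The `None` result is the "no snapshot ID" case mentioned above, which a calling script would still have to handle.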

rawtaz avatar May 19 '20 22:05 rawtaz

Since the restore command requires a snapshot ID, how can one programmatically obtain the ID of a snapshot made on a given date?

@eprigorodov that's my main issue. Such a feature will not require an id. It's more or less the same as latest but with a specific date.

In the list of snapshots you have the dates and the snapshot IDs. There are many different ways to do it; here's one using awk:

@rawtaz Of course something like this is possible, but it can get complicated very fast.

There are probably cleaner ways, but this one is simple enough that you can use it in a script if you want (obviously while handling the case where you don't get any snapshot ID out of it).

The cleanest way would be for restic to handle this internally, so that every run, on every shell version, with no extra tools (awk, sed, grep), gives exactly the same results.

I think this use case is rather rare. Most people restore manually, and those that have it scripted somehow either have some input from a user after showing them a list or offering them some search that can yield a snapshot ID in another way.

Fair enough. I still think that such a feature will not be that hard to implement and will benefit at least some.

EDIT: The above grabs the last snapshot ID before the date you provided. If you just want the snapshot ID for a given date, that's even simpler. Also note that it won't work if there's multiple lines of paths for the snapshot you want, it will have to be adjusted for that. This is just a basic example.

As you can see such an approach has a few "gotchas"

giskou avatar May 20 '20 10:05 giskou

I think you are overcomplicating it, and restic shouldn't cater to every single use case; if anything, it would become too complex. Do a few things and do them well. But that's just my opinion. FWIW I don't think the proposed solution is very complex at all: it's literally one single command to get you 99% of the way, and a couple more lines finish the job. Having to adjust and run different commands on different platforms is something you have to do anyway, even with a built-in feature like this in restic, so you have already crossed that boundary.

rawtaz avatar May 20 '20 10:05 rawtaz

@rawtaz Point-in-time restore is a primary use case for keeping multiple backup snapshots in general. If all restore operations only used the "latest" snapshot, there would be no need for the whole restic machinery.

And manual restore is only feasible while the data to be restored is as simple as a single directory. Consider repeating the "awk" approach for related snapshots from several hosts. Think of restore scenarios that have further steps after extracting files, like a database import. Automation becomes necessary in any infrastructure larger than a personal space and/or when multiple operators are involved.

Compare the snippet above with the following possible way:

restic restore latest --before 2020-01-01 --target ...

Would you say that this feature is not needed at all or just that it has low priority?

eprigorodov avatar May 20 '20 11:05 eprigorodov

Technically, this change should be simple: handle a new --before command-line filter in addition to the existing --tag, --host and --path filters, and then change a single line in snapshot_find.go/FindLatestSnapshot():

		if snapshot.Time.Before(latest) || !snapshot.Time.Before(before) {
			return nil
		}
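To make the effect of that condition easier to see outside the restic code base, here is a small Python sketch of the same selection rule. It is only an illustration, not restic's implementation: `Snapshot` and `find_latest_before` are hypothetical names, and the loop simply skips a snapshot when it is older than the current candidate or when it is not strictly before the cutoff.

```python
from datetime import datetime
from typing import Iterable, NamedTuple, Optional


class Snapshot(NamedTuple):
    """Hypothetical stand-in for a snapshot record (ID and creation time)."""
    id: str
    time: datetime


def find_latest_before(snapshots: Iterable[Snapshot], before: datetime) -> Optional[Snapshot]:
    latest: Optional[Snapshot] = None
    for snap in snapshots:
        # Mirrors the condition above: skip when older than the current
        # candidate, or when not strictly before the cutoff.
        if (latest is not None and snap.time < latest.time) or not snap.time < before:
            continue
        latest = snap
    return latest


# Input order does not matter; times are taken from the listing earlier in the thread.
snaps = [
    Snapshot("4ca4fdcd", datetime(2020, 5, 14, 3, 24, 39)),
    Snapshot("e953e6fb", datetime(2020, 5, 11, 4, 29, 58)),
    Snapshot("98d3115e", datetime(2020, 5, 13, 3, 2, 19)),
]
print(find_latest_before(snaps, datetime(2020, 5, 14)).id)  # 98d3115e
print(find_latest_before(snaps, datetime(2020, 5, 1)))      # None
```

Because the comparison is strict, a snapshot taken exactly at the cutoff is excluded; whether that is the desired boundary behaviour is a design choice the flag would have to document.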

eprigorodov avatar May 27 '20 14:05 eprigorodov

Sorry for the late reply, been busy.

Would you say that this feature is not needed at all or just that it has low priority?

Whether it is needed isn't something that can be answered, as the concept of "needed" is relative and individual. I think it's a nice-to-have rather than a need-to-have, if that answers your question. People restoring manually don't need this, and those that automate/integrate their restores will have written scripts/integrations already, and will have no problem grabbing the list of snapshots and picking the one they're interested in (or, as I think I mentioned before, will already have a UI for it anyway). It's a pretty specific use case.

Regarding priority, it's currently not realistic that a core developer will write it, as there are a lot of more important things to focus that time on; it's almost certainly not going to happen in the foreseeable future. A PR would possibly be considered though, if written properly :)

rawtaz avatar May 27 '20 18:05 rawtaz

I tested a workaround with a helper script, as suggested in the comment above: Python script gist. Usage:

restic snapshots --json | python3 find_snapshot.py <before|after> YYYY-MM-DD.HH:MM+HH:MM
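For readers without access to the gist, the input side of such a helper could look roughly like the sketch below. It is not the gist's code. It assumes that `restic snapshots --json` prints a JSON array whose objects carry a `time` field (an RFC 3339 timestamp, possibly with nanosecond precision) and a `short_id` field; the function names and the sample timestamps are made up for illustration.

```python
import json
import re
from datetime import datetime


def parse_restic_time(value: str) -> datetime:
    """Parse an RFC 3339 timestamp, trimming the fraction to microseconds."""
    value = re.sub(r"Z$", "+00:00", value)
    # Keep at most six fractional digits and pad shorter fractions, because
    # older versions of datetime.fromisoformat only accept 3 or 6 digits.
    value = re.sub(r"\.(\d{1,6})\d*", lambda m: "." + m.group(1).ljust(6, "0"), value)
    return datetime.fromisoformat(value)


def load_snapshots(raw: str):
    """Return (time, short_id) pairs sorted oldest first."""
    entries = json.loads(raw)
    return sorted((parse_restic_time(e["time"]), e["short_id"]) for e in entries)


# Sample input with assumed field names and invented timestamps.
raw = """[
  {"time": "2020-05-13T03:02:19.123456789+02:00", "short_id": "98d3115e"},
  {"time": "2020-05-11T04:29:58.5+02:00", "short_id": "e953e6fb"},
  {"time": "2020-05-14T01:24:39Z", "short_id": "4ca4fdcd"}
]"""

for taken, short_id in load_snapshots(raw):
    print(short_id, taken.isoformat())
```

Parsing into time-zone-aware `datetime` values up front means later comparisons work on absolute instants rather than on date strings.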

Two details can make the task too complex for an inline awk program: the semantics of the input date, and time zones.

  1. As mentioned in the same comment, the requested restore date can be interpreted in several ways:
  • latest snapshot taken before the given date,
  • snapshot taken exactly on the given date,
  • earliest snapshot taken after the given date.

At least our ops were not able to say whether the preferred mode would be "latest before" or "earliest after"; they need both. The exact-date case might be rare due to backup retention.

  2. Backup and restore may take place in different time zones.

To restic's credit, it saves snapshot timestamps with time zone information. When restoring, the operator can (and most often will) enter the desired restore date without indicating any time zone. In that case the only reasonable assumption is the current local time zone. Even in the same location, the time zone of the requested date may differ from the time zone of the snapshot (due to DST). The helper script should consider these details when comparing points in time.
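The two points above can be sketched together in a few lines of Python. This is an illustration under stated assumptions, not the gist's code: `as_aware` and `pick` are hypothetical names, snapshots are given as (time-zone-aware datetime, ID) pairs, and the +02:00 offset attached to the thread's sample times is invented. A date entered without a zone is interpreted in the local zone via `datetime.astimezone()`, and all comparisons are then made on absolute instants.

```python
from bisect import bisect_left, bisect_right
from datetime import datetime, timedelta, timezone


def as_aware(moment: datetime) -> datetime:
    """Attach the local zone to a naive datetime; leave aware ones untouched."""
    if moment.tzinfo is None:
        return moment.astimezone()  # naive input is treated as local time
    return moment


def pick(snapshots, mode: str, when: datetime):
    """Select an ID from (aware datetime, id) pairs; mode is 'before' or 'after'."""
    ordered = sorted(snapshots)
    times = [taken for taken, _ in ordered]
    when = as_aware(when)
    if mode == "before":
        i = bisect_left(times, when) - 1  # last entry strictly earlier
        return ordered[i][1] if i >= 0 else None
    if mode == "after":
        i = bisect_right(times, when)  # first entry strictly later
        return ordered[i][1] if i < len(ordered) else None
    raise ValueError(f"unknown mode: {mode!r}")


cest = timezone(timedelta(hours=2))  # assumed offset, for illustration only
snaps = [
    (datetime(2020, 5, 13, 3, 2, 19, tzinfo=cest), "98d3115e"),
    (datetime(2020, 5, 14, 3, 24, 39, tzinfo=cest), "4ca4fdcd"),
    (datetime(2020, 5, 18, 0, 33, 2, tzinfo=cest), "6d25cfbd"),
]

wall = datetime(2020, 5, 14, 2, 0)  # same wall-clock input, two interpretations
print(pick(snaps, "before", wall.replace(tzinfo=timezone.utc)))  # 4ca4fdcd
print(pick(snaps, "before", wall.replace(tzinfo=cest)))          # 98d3115e
print(pick(snaps, "after", wall.replace(tzinfo=cest)))           # 4ca4fdcd
```

The first two calls show why the zone of the requested date matters: the same wall-clock input selects different snapshots depending on how it is interpreted.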

eprigorodov avatar Dec 18 '20 12:12 eprigorodov