[FINAL] feat: Canister backup and restore

Open AlexandraZapuc opened this issue 2 years ago • 9 comments

Adds a new API to the management canister that facilitates backup and restore operations. Design doc: backup and restore

Note to reviewers: opened MR to request feedback from the interface spec owners.

AlexandraZapuc avatar Dec 01 '23 11:12 AlexandraZapuc

Will this require a stopped canister, or will it work even with outstanding callbacks?

nomeata avatar Dec 01 '23 11:12 nomeata

> Will this require a stopped canister, or will it work even with outstanding callbacks?

The design doc contains this note: "While it is not explicitly enforced, creating a snapshot only after stopping the canister is recommended. This follows the same principle as upgrading a canister, as making sense of the callbacks may not be possible." I'd suggest adding this recommendation as a note to the spec, too.
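
For illustration, the recommended order (stop, snapshot, restart) could be sketched as below. This is only a sketch under assumptions: `ManagementCanister` and `RecordingClient` are hypothetical stand-ins rather than a real agent library, `take_snapshot` is reduced to a single argument, and restarting the canister afterwards is my assumption, not something the design doc states.

```python
from typing import List, Protocol


class ManagementCanister(Protocol):
    """Hypothetical client wrapper; not a real library interface."""

    def stop_canister(self, canister_id: str) -> None: ...
    def start_canister(self, canister_id: str) -> None: ...
    def take_snapshot(self, canister_id: str) -> str: ...


def snapshot_while_stopped(mgmt: ManagementCanister, canister_id: str) -> str:
    """Stop the canister, take a snapshot, then start the canister again."""
    mgmt.stop_canister(canister_id)
    try:
        return mgmt.take_snapshot(canister_id)
    finally:
        # Assumption: the caller wants the canister running again,
        # even if taking the snapshot failed.
        mgmt.start_canister(canister_id)


class RecordingClient:
    """In-memory fake that only records the order of calls."""

    def __init__(self) -> None:
        self.calls: List[str] = []

    def stop_canister(self, canister_id: str) -> None:
        self.calls.append("stop_canister")

    def start_canister(self, canister_id: str) -> None:
        self.calls.append("start_canister")

    def take_snapshot(self, canister_id: str) -> str:
        self.calls.append("take_snapshot")
        return "snap-1"  # made-up snapshot ID


client = RecordingClient()
snapshot_id = snapshot_while_stopped(client, "example-canister")
print(client.calls, snapshot_id)
```

Since stopping is not enforced, a caller could also call `take_snapshot` directly; the wrapper only encodes the recommendation.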

mraszyk avatar Dec 05 '23 07:12 mraszyk

> Will this require a stopped canister, or will it work even with outstanding callbacks?

As discussed with @ielashi and @bogwar, we will not force the canister to be stopped, but similar to install_code, that would be the recommendation.

> > Will this require a stopped canister, or will it work even with outstanding callbacks?
>
> The design doc contains this note: "While it is not explicitly enforced, creating a snapshot only after stopping the canister is recommended. This follows the same principle as upgrading a canister, as making sense of the callbacks may not be possible." I'd suggest adding this recommendation as a note to the spec, too.

@mraszyk sure, will do!

AlexandraZapuc avatar Dec 05 '23 11:12 AlexandraZapuc

> As discussed with @ielashi and @bogwar, we will not force the canister to be stopped, but similar to install_code, that would be the recommendation.

I think it is indeed a good choice to allow taking snapshots even with open call contexts, in order to cover disaster scenarios. But it might be good to expose a flag that controls whether this is enabled (maybe even disable it by default).

oggy-dfin avatar Dec 05 '23 14:12 oggy-dfin

I had an offline discussion with @mraszyk. We agreed on the following API:

  take_snapshot: (record {
    canister_id: canister_id;
    replace_snapshot: opt snapshot_id;
  }) -> (snapshot_id);
  load_snapshot: (record {
    canister_id: canister_id;
    snapshot_id: snapshot_id;
  }) -> ();
  list_snapshots: (record { canister_id: canister_id }) -> (vec snapshot);
  delete_snapshot: (record { snapshot_id: snapshot_id }) -> ();

Change: add an optional snapshot ID field, replace_snapshot, to take_snapshot. The snapshot with the specified ID will be deleted if the canister has reached the maximum number of snapshots allowed (the limit is currently set to 1).

I will additionally rebase on master and update the semantics, which will include the changes to reserve_cycles.

AlexandraZapuc avatar Dec 07 '23 11:12 AlexandraZapuc

Let me also point out that in my opinion if the new argument

replace_snapshot: opt snapshot_id

is null and there's no available slot for the new snapshot, then take_snapshot should fail.
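
To make the proposed behaviour concrete, here is a toy in-memory model. It is a sketch under assumptions, not the spec semantics: `SnapshotStore`, `SnapshotLimitReached`, the integer IDs, and the rejection of an unknown `replace_snapshot` are all made up for illustration. Only the limit of 1, the replace-when-full rule, and the fail-when-full-and-null rule come from this thread.

```python
from itertools import count
from typing import Dict, Optional

MAX_SNAPSHOTS = 1  # limit quoted earlier in the thread


class SnapshotLimitReached(Exception):
    """Hypothetical error for 'no free slot and nothing to replace'."""


class SnapshotStore:
    """Toy model of the snapshots held for a single canister."""

    def __init__(self, limit: int = MAX_SNAPSHOTS) -> None:
        self.limit = limit
        self.snapshots: Dict[int, bytes] = {}
        self._ids = count(1)  # made-up ID scheme

    def take_snapshot(self, state: bytes,
                      replace_snapshot: Optional[int] = None) -> int:
        # Assumption: naming a snapshot that does not exist is an error.
        if replace_snapshot is not None and replace_snapshot not in self.snapshots:
            raise KeyError(f"unknown snapshot {replace_snapshot}")
        if len(self.snapshots) >= self.limit:
            if replace_snapshot is None:
                # Proposed rule: fail when no slot is free and nothing is replaced.
                raise SnapshotLimitReached("no slot available")
            # Rule from the API change: the named snapshot is deleted
            # when the limit has been reached.
            del self.snapshots[replace_snapshot]
        new_id = next(self._ids)
        self.snapshots[new_id] = state
        return new_id


store = SnapshotStore()
first = store.take_snapshot(b"v1")
try:
    store.take_snapshot(b"v2")  # slot is taken, nothing named for replacement
except SnapshotLimitReached as err:
    print("rejected:", err)
second = store.take_snapshot(b"v2", replace_snapshot=first)
print(sorted(store.snapshots) == [second])
```

The model deletes the old snapshot before storing the new one. Whether the real implementation would do it in that order, or delete only after the new snapshot has been created successfully, is not settled here.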

mraszyk avatar Dec 07 '23 15:12 mraszyk

Let's not check in the didc binary, please.

mraszyk avatar Dec 21 '23 07:12 mraszyk

Could you please also update the textual sections on canister history in the spec to mention that loading a snapshot is recorded as well?

mraszyk avatar Jan 15 '24 14:01 mraszyk