[FINAL] feat: Canister backup and restore
Adds a new API to the management canister that facilitates backup and restore operations. Design doc: backup and restore
Note to reviewers: opened MR to request feedback from the interface spec owners.
Will this require a stopped canister, or will it work even with outstanding callbacks?
In the design doc, there's a note that "While it is not explicitly enforced, creating a snapshot only after stopping the canister is recommended. This follows the same principle as upgrading a canister, as making sense of the callbacks may not be possible." I'd suggest adding this recommendation as a note to the spec, too.
As discussed with @ielashi and @bogwar, we will not force the canister to be stopped, but similar to install_code, that would be the recommendation.
@mraszyk sure, will do!
I think it is indeed a good choice to allow taking snapshots even with open call contexts, in order to cover disaster scenarios. But it might be good to expose a flag that controls whether this is enabled (and maybe even disable it by default).
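To make the recommendation in this thread concrete (stop the canister first, as with install_code), here is a minimal sketch from a controller's point of view. It uses an in-memory stand-in, not a real agent or SDK: `FakeManagementCanister` and `snapshot_while_stopped` are hypothetical names, and restarting the canister afterwards is an assumption of the sketch, not something the spec requires.

```python
class FakeManagementCanister:
    """Hypothetical in-memory stand-in for the management canister."""

    def __init__(self):
        self.status = {}     # canister_id -> "running" | "stopped"
        self.snapshots = []  # (canister_id, status when the snapshot was taken)

    def stop_canister(self, canister_id):
        self.status[canister_id] = "stopped"

    def start_canister(self, canister_id):
        self.status[canister_id] = "running"

    def take_snapshot(self, canister_id):
        # Not enforced: a snapshot of a running canister is also accepted.
        current = self.status.get(canister_id, "running")
        self.snapshots.append((canister_id, current))
        return len(self.snapshots)  # toy snapshot id


def snapshot_while_stopped(mgmt, canister_id):
    """Stop the canister, take the snapshot, then start it again."""
    mgmt.stop_canister(canister_id)
    try:
        return mgmt.take_snapshot(canister_id)
    finally:
        # Assumption: the caller wants the canister running again afterwards.
        mgmt.start_canister(canister_id)


mgmt = FakeManagementCanister()
mgmt.start_canister("canister-a")
snapshot_id = snapshot_while_stopped(mgmt, "canister-a")
```

Stopping first means there are no outstanding callbacks captured in the snapshot, which is the concern raised for the running case.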
I had an offline discussion with @mraszyk. We agreed on the following API:
take_snapshot: (record {
  canister_id: canister_id;
  replace_snapshot: opt snapshot_id;
}) -> (snapshot_id);
load_snapshot: (record {
  canister_id: canister_id;
  snapshot_id: snapshot_id;
}) -> ();
list_snapshots: (record {canister_id : canister_id}) -> (vec snapshot);
delete_snapshot: (record {snapshot_id : snapshot_id}) -> ();
Change: add an optional snapshot ID field to take_snapshot. The snapshot with the specified ID will be deleted if the canister has reached the maximum number of snapshots allowed (the limit is currently set to 1).
I will additionally rebase on master and update the semantics, which will include the changes to reserve_cycles.
Let me also point out that in my opinion if the new argument replace_snapshot: opt snapshot_id is null and there's no available slot for the new snapshot, then take_snapshot should fail.
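The slot behaviour proposed in this thread can be sketched as a small in-memory model. This is only an illustration of the discussed semantics, not the replica implementation: `SnapshotStore`, `SnapshotError`, and the `state` parameter are hypothetical, and rejecting an unknown `replace_snapshot` ID is an assumption the thread does not settle.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Dict, Optional

MAX_SNAPSHOTS_PER_CANISTER = 1  # limit mentioned in the discussion


class SnapshotError(Exception):
    """Raised when a snapshot operation is rejected."""


@dataclass
class SnapshotStore:
    # canister_id -> {snapshot_id: captured state}
    snapshots: Dict[str, Dict[int, bytes]] = field(default_factory=dict)
    _ids: count = field(default_factory=lambda: count(1))

    def take_snapshot(self, canister_id: str, state: bytes,
                      replace_snapshot: Optional[int] = None) -> int:
        slots = self.snapshots.setdefault(canister_id, {})
        # Assumption: naming a snapshot that does not exist is an error.
        if replace_snapshot is not None and replace_snapshot not in slots:
            raise SnapshotError("unknown snapshot to replace")
        if len(slots) >= MAX_SNAPSHOTS_PER_CANISTER:
            # No free slot: fail unless the caller named a snapshot to replace.
            if replace_snapshot is None:
                raise SnapshotError("no free snapshot slot")
            del slots[replace_snapshot]
        new_id = next(self._ids)
        slots[new_id] = state
        return new_id


store = SnapshotStore()
first = store.take_snapshot("canister-a", b"v1")
second = store.take_snapshot("canister-a", b"v2", replace_snapshot=first)
remaining = list(store.snapshots["canister-a"])
```

With the limit at 1, a second call without `replace_snapshot` raises `SnapshotError`, which matches the suggestion that take_snapshot should fail when the argument is null and no slot is available.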
Let's not check in the didc binary, please.
Could you please also update the textual sections on canister history in the spec to mention that loading snapshots is also recorded?