`lxd recover` loses the expiration dates of snapshots.
I just went through a successful lxd recover which reimported my container and its snapshots from the intact zpool. Snapshots are taken on a schedule:
# lxc config show -e ganymede | grep snapshot
snapshots.expiry: 3d
snapshots.schedule: '@daily, @startup'
However, after lxd recover brought those snapshots back, they had lost their EXPIRES AT field:
# lxc info ganymede | sed -n '/^Snapshots:$/,$ p'
Snapshots:
+---------+----------------------+----------------------+----------+
| NAME | TAKEN AT | EXPIRES AT | STATEFUL |
+---------+----------------------+----------------------+----------+
| snap222 | 2024/05/07 05:28 UTC | | NO |
+---------+----------------------+----------------------+----------+
| snap223 | 2024/05/08 05:28 UTC | | NO |
+---------+----------------------+----------------------+----------+
| snap224 | 2024/05/09 05:28 UTC | | NO |
+---------+----------------------+----------------------+----------+
| snap225 | 2024/05/09 22:21 UTC | 2024/05/12 22:21 UTC | NO |
+---------+----------------------+----------------------+----------+
In the above, snap225 was taken after the lxd recover.
However, the instance's backup.yaml should have had this information; that is where it learned about the TAKEN AT field. That said, it seems the recovery has now overwritten backup.yaml with bogus values:
# LD_LIBRARY_PATH=/snap/lxd/current/lib/:/snap/lxd/current/lib/x86_64-linux-gnu/ nsenter --mount=/run/snapd/ns/lxd.mnt sed -n '/^volume_snapshots:$/,$ p' /var/snap/lxd/common/lxd/storage-pools/default/containers/ganymede/backup.yaml
volume_snapshots:
- name: snap222
description: ""
content_type: filesystem
created_at: 0001-01-01T00:00:00Z
expires_at: 0001-01-01T00:00:00Z
config:
volatile.uuid: a951c197-c11f-4fcc-a76e-ec575b99e305
- name: snap223
description: ""
content_type: filesystem
created_at: 0001-01-01T00:00:00Z
expires_at: 0001-01-01T00:00:00Z
config:
volatile.uuid: 2b8f7c0c-92da-4170-af35-c5033ec6b89c
- name: snap224
description: ""
content_type: filesystem
created_at: 0001-01-01T00:00:00Z
expires_at: 0001-01-01T00:00:00Z
config:
volatile.uuid: dba273a7-2980-46e4-afc3-ddb7ec617171
- name: snap225
description: ""
content_type: filesystem
created_at: 2024-05-09T22:21:56.762923236Z
expires_at: 0001-01-01T00:00:00Z
config:
volatile.uuid: 93c48d05-4de3-492b-8e6c-f3ce2e3e9c63
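For comparison, it would be worth checking the instance-level snapshots section of the same file, since (as the reproducer further below shows) that is where the correct timestamps normally live. An untested sketch, assuming pool: is the next top-level key after snapshots:, as in the dump below:
# LD_LIBRARY_PATH=/snap/lxd/current/lib/:/snap/lxd/current/lib/x86_64-linux-gnu/ nsenter --mount=/run/snapd/ns/lxd.mnt sed -n '/^snapshots:$/,/^pool:$/ p' /var/snap/lxd/common/lxd/storage-pools/default/containers/ganymede/backup.yaml | grep -E 'name: snap|expires_at:'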
Additional information:
# snap list lxd
Name Version Rev Tracking Publisher Notes
lxd 5.21.1-d46c406 28460 5.21/stable canonical✓ -
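As a stopgap, the lost expiry can presumably be re-applied per snapshot through the API, since the snapshot PUT endpoint accepts expires_at. A hedged sketch (the timestamp is illustrative: TAKEN AT plus the 3d of snapshots.expiry):
# lxc query -X PUT --data '{"expires_at": "2024-05-10T05:28:00Z"}' /1.0/instances/ganymede/snapshots/snap222
# lxc info ganymede | grep snap222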
In fact, looking at the snap225 section of backup.yaml (added post-recovery), it seems that even during normal operation LXD doesn't save the right expires_at in the volume_snapshots section. Note that 0001-01-01T00:00:00Z is Go's zero time.Time, suggesting the field is simply never populated.
Here's an easy reproducer for the previous comment, where I said backup.yaml doesn't contain a valid expires_at value:
$ lxc launch images:alpine/edge c1 -c snapshots.expiry=1d
$ lxc snapshot c1
$ sudo LD_LIBRARY_PATH=/snap/lxd/current/lib/:/snap/lxd/current/lib/x86_64-linux-gnu/ nsenter --mount=/run/snapd/ns/lxd.mnt sed -n '/^volume_snapshots:$/,$ p' /var/snap/lxd/common/lxd/storage-pools/default/containers/c1/backup.yaml
volume_snapshots:
- name: snap0
description: ""
content_type: filesystem
created_at: 2024-08-23T20:38:04.166424212Z
expires_at: 0001-01-01T00:00:00Z
config:
volatile.uuid: 80b83e4b-482b-49a4-b83e-4e965ce51265
While clearly LXD itself is aware of the snapshot expiry:
$ lxc info c1 | sed -n '/^Snapshots:/,$ p'
Snapshots:
+-------+----------------------+----------------------+----------+
| NAME | TAKEN AT | EXPIRES AT | STATEFUL |
+-------+----------------------+----------------------+----------+
| snap0 | 2024/08/23 16:38 EDT | 2024/08/24 16:38 EDT | NO |
+-------+----------------------+----------------------+----------+
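The same can be confirmed against the raw API rather than the CLI table (sketch; lxc query pretty-prints the JSON, so grep catches the field on its own line):
$ lxc query /1.0/instances/c1/snapshots/snap0 | grep expires_at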
It appears that the instance snapshot's expiry date is stored correctly, which is why LXD itself is aware of it (see the snippet below). However, lxd recover uses the volume_snapshots expiry, which is zeroed. Furthermore, volume snapshot expiry is configured separately from instance snapshot expiry (lxc storage volume set default container/c1 snapshots.expiry=1d). I'm not sure what the intended behaviour is for volume snapshot expiry dates, i.e. should they match the instance snapshot expiry dates? Or should lxd recover look at the instance snapshot expiry date rather than the volume snapshot one? cc @tomponline
$ sudo LD_LIBRARY_PATH=/snap/lxd/current/lib/:/snap/lxd/current/lib/x86_64-linux-gnu/ nsenter --mount=/run/snapd/ns/lxd.mnt sed -n '/^snapshots:$/,$ p' /var/snap/lxd/common/lxd/storage-pools/default/containers/c1/backup.yaml
snapshots:
- architecture: x86_64
config:
image.architecture: amd64
image.description: Alpine edge amd64 (20240823_0018)
image.os: Alpine
image.release: edge
image.requirements.secureboot: "false"
image.serial: "20240823_0018"
image.type: squashfs
image.variant: default
snapshots.expiry: 1d
volatile.base_image: 3aab2d4b12a5bf88b798fe02cf361349cb9cd5648c89789bfac96f1cdce1d32c
volatile.cloud-init.instance-id: 5dd2a543-b181-4dce-8e0c-4202173421b7
volatile.eth0.host_name: vethdc246f41
volatile.eth0.hwaddr: 00:16:3e:9f:79:4b
volatile.idmap.base: "0"
volatile.idmap.current: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
volatile.idmap.next: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
volatile.last_state.idmap: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
volatile.last_state.power: RUNNING
volatile.uuid: ad4f73ac-50e7-4e92-87a1-82a05e928157
volatile.uuid.generation: ad4f73ac-50e7-4e92-87a1-82a05e928157
created_at: 2024-08-23T22:01:05.356755362Z
expires_at: 2024-08-24T22:01:05.353609739Z
devices: {}
ephemeral: false
expanded_config:
image.architecture: amd64
image.description: Alpine edge amd64 (20240823_0018)
image.os: Alpine
image.release: edge
image.requirements.secureboot: "false"
image.serial: "20240823_0018"
image.type: squashfs
image.variant: default
snapshots.expiry: 1d
volatile.base_image: 3aab2d4b12a5bf88b798fe02cf361349cb9cd5648c89789bfac96f1cdce1d32c
volatile.cloud-init.instance-id: 5dd2a543-b181-4dce-8e0c-4202173421b7
volatile.eth0.host_name: vethdc246f41
volatile.eth0.hwaddr: 00:16:3e:9f:79:4b
volatile.idmap.base: "0"
volatile.idmap.current: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
volatile.idmap.next: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
volatile.last_state.idmap: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
volatile.last_state.power: RUNNING
volatile.uuid: ad4f73ac-50e7-4e92-87a1-82a05e928157
volatile.uuid.generation: ad4f73ac-50e7-4e92-87a1-82a05e928157
expanded_devices:
eth0:
name: eth0
network: lxdbr1
type: nic
root:
path: /
pool: default
type: disk
last_used_at: 0001-01-01T00:00:00Z
name: snap0
profiles:
- default
stateful: false
size: -1
pool:
name: default
description: ""
driver: zfs
status: Created
config:
size: 9GiB
source: /var/snap/lxd/common/lxd/disks/default.img
zfs.pool_name: default
used_by: []
locations:
- none
profiles:
- name: default
description: Default LXD profile
config: {}
devices:
eth0:
name: eth0
network: lxdbr1
type: nic
root:
path: /
pool: default
type: disk
used_by: []
volume:
name: c1
description: ""
type: container
pool: default
content_type: filesystem
project: default
location: none
created_at: 2024-08-23T22:01:04.354122974Z
config:
volatile.uuid: 50aa66e6-dca7-4676-94e5-ee292e098d7f
used_by: []
volume_snapshots:
- name: snap0
description: ""
content_type: filesystem
created_at: 2024-08-23T22:01:05.356755362Z
expires_at: 0001-01-01T00:00:00Z
config:
volatile.uuid: 69e6db4c-f32c-49ff-b79b-08f388a8fe9c
As found by @kadinsayani, the volume snapshot created when snapshotting an instance has no expiry set:
$ lxc launch images:alpine/edge c1 -c snapshots.expiry=1d
$ lxc snapshot c1
$ lxc storage volume show default container/c1/snap0
name: snap0
description: ""
content_type: filesystem
created_at: 2024-08-26T13:36:28.408175831Z
expires_at: 0001-01-01T00:00:00Z
config:
volatile.uuid: 705af216-a8cb-4494-b8ca-dda67a8d1dd2
# or more simply
$ lxc storage volume get default container/c1/snap0 --property expires_at
0001-01-01 00:00:00 +0000 UTC
However, the instance snapshot has one:
$ lxc info c1 | sed -n '/^Snapshots:/,$ p'
Snapshots:
+-------+----------------------+----------------------+----------+
| NAME | TAKEN AT | EXPIRES AT | STATEFUL |
+-------+----------------------+----------------------+----------+
| snap0 | 2024/08/26 09:36 EDT | 2024/08/27 09:36 EDT | NO |
+-------+----------------------+----------------------+----------+
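A follow-up experiment that might narrow this down (untested sketch, reusing the volume-level key mentioned earlier): set snapshots.expiry on the volume itself and see whether the next volume snapshot then gets a real expires_at:
$ lxc storage volume set default container/c1 snapshots.expiry=1d
$ lxc snapshot c1
$ lxc storage volume get default container/c1/snap1 --property expires_at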
Could it be due to how snapshots are cleaned up? Maybe instance snapshots are cleaned up in a different pass than volume ones?
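One way to test that hypothesis (untested sketch; 5M means five minutes in LXD's expiry syntax): give a snapshot a very short expiry, confirm the volume-side expires_at is still zeroed, and then see whether the periodic prune task removes it anyway:
$ lxc launch images:alpine/edge c2 -c snapshots.expiry=5M
$ lxc snapshot c2
$ lxc storage volume get default container/c2/snap0 --property expires_at
$ sleep 600 && lxc info c2 | sed -n '/^Snapshots:/,$ p'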