vitess-operator

Ensure that binary logs for PITR are in a shared directory

Open mattlord opened this issue 1 year ago • 4 comments

When executing the vtctldclient RestoreFromBackup --restore-to-pos <value> command, the vttablet process in the vttablet container within the vttablet pod (in the RestoreFromBackup tabletmanager RPC) restores the full backup using the configured backup engine (e.g. xtrabackup). The backup is restored within the VTDATAROOT (specifically /vt/vtdataroot/vt_<tabletUID>/ for the mysql data), which is shared by all containers within the pod. vttablet orchestrates this in conjunction with the mysqlctld process that's running inside the mysqld container within the same vttablet pod. In the end there is a running mysqld instance inside the mysqld container that is from the restored full backup.

Once the full backup is in place and the mysqld process is running, the vttablet process uses the OS tmp dir, /tmp, to restore the binary logs from the backup (via the builtinbackupengine) for subsequent application. However, /tmp is not a shared mount point within the pod, so when mysqlbinlog subsequently tries to read the binary logs from within the mysqld container, it cannot find them in its own container's /tmp directory and it fails with an error.
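The failure mode described above can be illustrated with a small simulation. This is only a sketch, not Vitess code: each container is modeled as a mapping from mount point to backing host directory, where /tmp is private per container and /vt/vtdataroot is backed by the same volume. The restore subdirectory and the binlog.000001 file name are hypothetical and used only for illustration.

```python
import os
import tempfile


def resolve(mounts, container_path):
    """Translate a container path to its backing directory on the host.

    `mounts` maps a container mount point to a host directory; the longest
    matching mount point wins.
    """
    matches = [m for m in mounts
               if container_path == m or container_path.startswith(m + "/")]
    mount = max(matches, key=len)
    relative = os.path.relpath(container_path, mount)
    return os.path.normpath(os.path.join(mounts[mount], relative))


def write_file(mounts, container_path, data):
    """Write a file as seen from inside the given container."""
    host_path = resolve(mounts, container_path)
    os.makedirs(os.path.dirname(host_path), exist_ok=True)
    with open(host_path, "wb") as f:
        f.write(data)


def is_visible(mounts, container_path):
    """Report whether the path exists from the given container's point of view."""
    return os.path.exists(resolve(mounts, container_path))


def demo():
    with tempfile.TemporaryDirectory() as host:
        shared = os.path.join(host, "vtdataroot-volume")
        # Each container has its own /tmp, but both mount the same VTDATAROOT.
        vttablet = {"/tmp": os.path.join(host, "vttablet-tmp"),
                    "/vt/vtdataroot": shared}
        mysqld = {"/tmp": os.path.join(host, "mysqld-tmp"),
                  "/vt/vtdataroot": shared}

        # vttablet restores a binary log into /tmp and into the shared mount
        # (hypothetical paths).
        write_file(vttablet, "/tmp/restore/binlog.000001", b"events")
        write_file(vttablet, "/vt/vtdataroot/restore/binlog.000001", b"events")

        # What a process inside the mysqld container (e.g. mysqlbinlog) can see.
        return (is_visible(mysqld, "/tmp/restore/binlog.000001"),
                is_visible(mysqld, "/vt/vtdataroot/restore/binlog.000001"))
```

In this model, the file written under /tmp by the vttablet container is invisible from the mysqld container, while the one under the shared mount is readable from both, which is why the binary logs need to land in a shared directory.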

vtctldclient

https://github.com/vitessio/vitess/blob/3ae5cf7e690e560dd5630119215bcc3f5ecf31c8/go/cmd/vtctldclient/command/backups.go#L227-L263

vtctld[server]

https://github.com/vitessio/vitess/blob/3ae5cf7e690e560dd5630119215bcc3f5ecf31c8/go/vt/vtctl/grpcvtctldserver/server.go#L3260-L3286

vttablet

https://github.com/vitessio/vitess/blob/3ae5cf7e690e560dd5630119215bcc3f5ecf31c8/go/vt/vttablet/tabletmanager/rpc_backup.go#L173-L193

https://github.com/vitessio/vitess/blob/3ae5cf7e690e560dd5630119215bcc3f5ecf31c8/go/vt/vttablet/tabletmanager/restore.go#L191-L273

mysqlctld (rather than mysqlctl; it runs in the mysqld container)

https://github.com/vitessio/vitess/blob/3ae5cf7e690e560dd5630119215bcc3f5ecf31c8/go/vt/mysqlctl/backup.go#L364-L487

https://github.com/vitessio/vitess/blob/3ae5cf7e690e560dd5630119215bcc3f5ecf31c8/go/vt/mysqlctl/builtinbackupengine.go#L995-L1060

vttablet builtinbackupengine

https://github.com/vitessio/vitess/blob/3ae5cf7e690e560dd5630119215bcc3f5ecf31c8/go/vt/mysqlctl/builtinbackupengine.go#L995-L1060

Related issues and PRs:

  • https://github.com/vitessio/vitess/pull/15451
  • https://github.com/vitessio/vitess/issues/14765
  • https://github.com/vitessio/vitess/issues/15452

mattlord, Mar 12 '24 15:03

Nice! Should we also provide the flag in yaml files?

Yeah. I think this does it. https://github.com/planetscale/vitess-operator/pull/541/commits/e2e5e8b1e749b6f550ee72e77dce2f8ef326408d

mattlord, Mar 16 '24 04:03

Yeah. I think this does it.

How is the value being set, and to what specific value?

shlomi-noach, Mar 17 '24 17:03

How is the value being set, and to what specific value?

The user would specify the flag and value in their cluster yaml definition using the extraFlags parameter, just as they do for mysqld flags. If they don't specify a value then we enforce the default within the operator.
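As a rough sketch of that behavior (not the operator's actual code), the user's extraFlags can be thought of as being layered over operator defaults, so that a default applies only when the user has not set the flag. The flag name example-restore-path and the default path below are hypothetical placeholders, since the real flag name and value are not given in this thread.

```python
# Hypothetical operator-side default; the real flag name and path may differ.
DEFAULT_FLAGS = {"example-restore-path": "/vt/vtdataroot/restore"}


def effective_flags(user_extra_flags, defaults=DEFAULT_FLAGS):
    """Return the flags a component would be started with.

    Defaults are applied first, then any user-supplied extraFlags override them.
    """
    merged = dict(defaults)
    merged.update(user_extra_flags or {})
    return merged
```

For example, effective_flags({}) keeps the default, while effective_flags({"example-restore-path": "/custom"}) uses the user's value.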

mattlord, Mar 17 '24 17:03

How is the value being set, and to what specific value?

The user would specify the flag and value in their cluster yaml definition using the extraFlags parameter, just as they do for mysqld flags. If they don't specify a value then we enforce the default within the operator.

The flag ended up being for vttablet and vtbackup, not mysqlctld (although vtbackup is a modified mysqlctld). I will leave the mysqlctld extra flags support in place, though, as it may prove useful.

mattlord, Mar 27 '24 00:03