db2_backup role fails with error: "CTGSK3036W The output file "/mnt/blumeta0/db2/keystore/master_key_label.kdb" already exists."
Executing the db2_backup role fails because the file master_key_label.kdb already exists in /mnt/blumeta0/db2/keystore/.
The error was:
TASK [ibm.mas_devops.db2_backup : Extract Master Key Label from keystore.p12] **********************************************************************
fatal: [localhost]: FAILED! => {"changed": true, "cmd": "oc exec -it -n db2u c-mlmax-db2u-0 -- su -lc \"gsk8capicmd_64 -secretkey -extract -db '/mnt/blumeta0/db2/keystore/keystore.p12' -stashed -label 'DB2_SYSGEN_db2inst1_BLUDB_2023-07-03-08.45.30_8403CB5A' -format ascii -target '/mnt/blumeta0/db2/keystore/master_key_label.kdb'\" db2inst1\n", "delta": "0:00:02.635918", "end": "2023-07-27 16:50:28.673852", "msg": "non-zero return code", "rc": 233, "start": "2023-07-27 16:50:26.037934", "stderr": "Defaulted container \"db2u\" out of: db2u, init-labels (init), init-kernel (init)\nUnable to use a TTY - input is not a terminal or the right kind of file\ncommand terminated with exit code 233", "stderr_lines": ["Defaulted container \"db2u\" out of: db2u, init-labels (init), init-kernel (init)", "Unable to use a TTY - input is not a terminal or the right kind of file", "command terminated with exit code 233"], "stdout": "CTGSK3036W The output file \"/mnt/blumeta0/db2/keystore/master_key_label.kdb\" already exists.", "stdout_lines": ["CTGSK3036W The output file \"/mnt/blumeta0/db2/keystore/master_key_label.kdb\" already exists."]}
To work around this issue, I renamed the existing file:
cd /mnt/blumeta0/db2/keystore/
mv master_key_label.kdb master_key_label.kdb_bkp
Then I reran the playbook, and it completed.
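For anyone without a shell open inside the pod, the same rename can probably be done from outside with `oc exec`. This is a minimal sketch, not something from the role: the helper name `rename_stale_key_file` is hypothetical, the default namespace and pod name are taken from the log above, and the command runs as the container's default user, so permissions may need adjusting in your environment.

```shell
# Hypothetical helper: rename a leftover master_key_label.kdb so the next
# backup run can extract the key again. The default namespace and pod name
# are assumptions taken from the log above; adjust them for your cluster.
rename_stale_key_file() {
  ns="${1:-db2u}"
  pod="${2:-c-mlmax-db2u-0}"
  keystore_dir="/mnt/blumeta0/db2/keystore"
  # No -it here: a TTY is not needed when the command is not interactive.
  # The -f test makes the rename a no-op when no stale file is present.
  oc exec -n "$ns" "$pod" -- bash -c \
    "if [ -f '$keystore_dir/master_key_label.kdb' ]; then mv '$keystore_dir/master_key_label.kdb' '$keystore_dir/master_key_label.kdb_bkp'; fi"
}
```

Calling `rename_stale_key_file db2u c-mlmax-db2u-0` before rerunning the playbook should have the same effect as the manual `mv`.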
For reference, the playbook is:
- hosts: "localhost"
  any_errors_fatal: true
  vars:
    db2_namespace: "db2u"
    db2_dbname: "BLUDB"
    db2_backup_dir: "/home/matthieu/db2backup"
    db2_backup_instance_name: "mlmax"
  roles:
    - ibm.mas_devops.db2_backup
@mattlrx The role already deletes the master_key_label.kdb file at the end of the role (https://github.com/ibm-mas/ansible-devops/blob/master/ibm/mas_devops/roles/db2_backup/tasks/main.yml#L136), so I can only assume that a previous attempt to run the backup didn't complete?
That is correct, it failed previously. I am aware of two people who ran into this issue.
I can also confirm that when it ran successfully, the file was removed.
@mattlrx thanks for the feedback. I think we can add a task similar to https://github.com/ibm-mas/ansible-devops/blob/master/ibm/mas_devops/roles/db2_backup/tasks/main.yml#L136 at the start of the role to clean up any existing files. Then, when a previous backup has failed, the next run of the backup role can clean up before it starts.
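As a rough illustration of what such a start-of-role cleanup could run, here is a sketch in shell. It is not the role's actual task: the function name `cleanup_stale_key_file` is hypothetical, the default namespace and pod name come from the log earlier in this issue, and the `su -lc ... db2inst1` form simply mirrors the failing extract command.

```shell
# Hypothetical pre-backup cleanup: remove any master_key_label.kdb left
# behind by a previous failed run. Defaults are assumptions from the log
# in this issue; pass your own namespace and pod name as arguments.
cleanup_stale_key_file() {
  ns="${1:-db2u}"
  pod="${2:-c-mlmax-db2u-0}"
  key_file="/mnt/blumeta0/db2/keystore/master_key_label.kdb"
  # rm -f exits 0 even when the file is absent, so the step is safe to run
  # on every backup, not only after a failure.
  oc exec -n "$ns" "$pod" -- su -lc "rm -f '$key_file'" db2inst1
}
```

Because the removal is idempotent, it can run unconditionally at the start of each backup without checking whether the previous run failed.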
You may also want to delete old backups remaining on the pod, because they can cause the pod to exceed its ephemeral storage limit. I run these commands after an unsuccessful db2 backup:
oc exec -it -n db2u c-db2w-shared-db2u-0 -- bash -c "sudo rm -rf /mnt/blumeta0/home/db2inst1/db_backup"
oc exec -it -n db2u c-db2w-shared-db2u-0 -- bash -c "sudo rm -rf /mnt/blumeta0/db2/keystore/master_key_label.kdb"
The db2_backup role has since been deleted. There is now a supported backup/restore action inside the main db2 role, which works basically the same way.