workspace backup: avoid "Not unique" error
When using the explicit workspace backup facilities, a glitch sometimes makes usage difficult (it requires manual attention) and makes writing scripts harder:
$ ocrd workspace backup restore 3ab9097
Exception: Not unique, could be
.backup/3ab90974b9733d6fcbcd150b9ec0c79cac88975f6292c30c60ddeacf8ab51bfb.1631089673.327542
.backup/3ab90974b9733d6fcbcd150b9ec0c79cac88975f6292c30c60ddeacf8ab51bfb.1631171905.949088
You then have to pass the exact sub-ID from here on (by copying it from the printed string).
AFAICS, this bad state arises from the following conditions:
- there already are at least two backups (say 3ab9097 and 1eb4c60)
- you request a restore of a backup (say 3ab9097) that is not the last one (1eb4c60) but is identical to the current state of the workspace
- the restore operation first adds a new backup of the current state, which duplicates the existing 3ab9097 backup:
INFO ocrd.workspace_backup.add - Backing up to .backup/3ab90974b9733d6fcbcd150b9ec0c79cac88975f6292c30c60ddeacf8ab51bfb.1631171905.949088/mets.xml
INFO ocrd.workspace_backup.restore - .backup/1eb4c60dbca3306711c76858022bf544467608c12c11f4d3651be58500ed07b0.1631166420.772904/mets.xml
Under different circumstances, the implementation already suppresses making a duplicate (No changes since last backup).
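To make the sequence above concrete, here is a toy model of it. This is only a sketch: `Backup`, `add_backup` and `resolve` are hypothetical stand-ins, not the actual OCR-D code, and the assumption that the existing check only looks at the newest backup is my reading of the behaviour described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backup:
    # Hypothetical stand-in for a backup entry: checksum plus creation time.
    chksum: str
    timestamp: float


def add_backup(backups, chksum, timestamp):
    """Toy 'add': skip only if the newest backup has the same checksum (assumed)."""
    if backups and backups[0].chksum == chksum:
        return backups  # corresponds to "No changes since last backup"
    return [Backup(chksum, timestamp)] + backups  # newest first


def resolve(backups, prefix):
    """Toy lookup by checksum prefix; fails if the prefix is ambiguous."""
    matches = [b for b in backups if b.chksum.startswith(prefix)]
    if not matches:
        raise KeyError("No such backup: %s" % prefix)
    if len(matches) > 1:
        raise ValueError("Not unique, could be %s" % matches)
    return matches[0]


# Two existing backups, newest first; the workspace currently equals the older one.
backups = [Backup("1eb4c60", 2.0), Backup("3ab9097", 1.0)]
# Restoring first backs up the current state, which duplicates 3ab9097.
backups = add_backup(backups, "3ab9097", 3.0)
duplicates = [b for b in backups if b.chksum == "3ab9097"]
```

After this, `resolve(backups, "3ab9097")` raises the "Not unique" error, because two entries share the same checksum and differ only in their timestamp.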
IMO the problem is here: https://github.com/OCR-D/core/blob/d4a853964ce468a2c0eb6be2004c15441fbd206f/ocrd/ocrd/workspace_backup.py#L77-L79
Should probably be something like:
if next((bak for bak in backups or [] if bak.chksum == chksum), False):
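The same idea can be written with `any()`, which reads a bit more directly. The following is a minimal, self-contained sketch of that check; the `Backup` tuple and the helper name `already_backed_up` are hypothetical and only illustrate the condition, not how it would be wired into `workspace_backup.py`.

```python
from collections import namedtuple

# Hypothetical stand-in for a backup entry; only `chksum` matters for the check.
Backup = namedtuple("Backup", ["chksum", "timestamp"])


def already_backed_up(backups, chksum):
    """Return True if any existing backup (not just the newest) has this checksum."""
    # `backups or []` tolerates a missing (None) or empty backup list.
    return any(bak.chksum == chksum for bak in backups or [])


backups = [Backup("1eb4c60", 2.0), Backup("3ab9097", 1.0)]
```

With such a check, adding a backup of a state identical to 3ab9097 would be skipped even though 1eb4c60 is the most recent backup, so no second 3ab9097 entry would be created.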