hepdata
hepdata copied to clipboard
records: make `do_finalise` more robust and tolerate multiple commit messages
A problem was recently reported with version 2 of a record which failed to display because of multiple (10!) entries in the record_version_commit_message
database table with the same recid
and version
. There were new Sentry events: MultipleResultsFound: Multiple rows were found when exactly one was required
from the line:
https://github.com/HEPData/hepdata/blob/ef2ea51e490856f27daf13b55f0f5f1f0483a0c3/hepdata/modules/records/api.py#L327
This situation could conceivably arise if do_finalise
fails (for example, due to problems connecting to the database) after the commit message has been added to the database, but before setting hep_submission.overall_status = "finished"
. Eventually, after 10 duplicate commit messages had been added, it seems that the do_finalise
function failed when reindexing.
The do_finalise
function should be made more robust to catch possible exceptions due to temporary problems connecting to the database or OpenSearch, for example, by using db.session.rollback()
to rollback changes to the database and by catching reindexing exceptions (similar to the try
/except
used for the admin_indexer
immediately below). An exception should return an error to the user requesting that they try to finalise the record later. This is better than partially finalising the record with some steps missing.
Although it should not happen, the get_commit_message
function could also be made more robust to tolerate multiple commit messages with the same recid
and version
, for example, by replacing the lines:
https://github.com/HEPData/hepdata/blob/ef2ea51e490856f27daf13b55f0f5f1f0483a0c3/hepdata/modules/records/api.py#L323-L327
with:
commit_message_query = RecordVersionCommitMessage.query \
.filter_by(version=ctx["version"], recid=recid).order_by(RecordVersionCommitMessage.id.desc())
if commit_message_query.count() > 0:
commit_message = commit_message_query.first()
to get the most recent commit message in case there are multiple entries in the database.