Duplicate submissions slip into the DB
From time to time, duplicate submissions appear to be slipping into the DB.
For example, the logs show the following:
[10/Apr/2017 02:50:43] <username> SC 18.0 TA #<uid> NS=18 S=0 (total: 336226.39639)
[10/Apr/2017 02:50:43] <username> A <locale> <uid> /<locale>/<project>/<resource> <Translation>
[10/Apr/2017 02:50:43] <username> SC 18.0 TA #<uid> NS=18 S=0 (total: 336226.39639)
[10/Apr/2017 02:50:43] <username> A <locale> <uid> /<locale>/<project>/<resource> <Translation>
So the same translation addition from the same user is being reported twice for unit `uid`. This is a duplicated submission that shouldn't happen.
In a production DB one can retrieve a list of units with duplicated submissions as follows (it might take a while to complete the query):
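-- MIN(id) picks one representative row per group; selecting a bare id
-- is rejected by databases that enforce strict GROUP BY rules.
SELECT MIN(id) AS first_id, unit_id, creation_time, submitter_id, field, COUNT(*)
FROM pootle_app_submission
GROUP BY unit_id, creation_time, submitter_id, field
HAVING COUNT(*) > 1;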
Note this also lists rows which look like duplicates but are legit: e.g. quickly muting multiple quality checks for the same unit. Most of the time, duplicates refer to multiple consecutive unit submissions from the same user though (changes to state or target, or multiple suggestions).
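To inspect every row in a duplicated group rather than one row per group, the grouping can be joined back against the table. The following is only a sketch: it runs against an in-memory SQLite table that is a simplified stand-in for `pootle_app_submission` (the real table has more columns), and the sample rows and the `duplicated_submission_ids` helper are made up for illustration.

```python
import sqlite3

# Simplified stand-in for the submissions table (assumption: only the
# columns used by the grouping are modelled here).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE pootle_app_submission (
        id INTEGER PRIMARY KEY,
        unit_id INTEGER,
        creation_time TEXT,
        submitter_id INTEGER,
        field INTEGER
    )
    """
)
# Made-up sample data: rows 1 and 2 share the same grouping key.
rows = [
    (1, 100, "2017-04-10 02:50:43", 7, 3),
    (2, 100, "2017-04-10 02:50:43", 7, 3),
    (3, 101, "2017-04-10 02:51:10", 7, 3),
]
conn.executemany("INSERT INTO pootle_app_submission VALUES (?, ?, ?, ?, ?)", rows)


def duplicated_submission_ids(conn):
    """Return ids of all rows whose (unit, time, submitter, field) key repeats."""
    query = """
        SELECT s.id
        FROM pootle_app_submission AS s
        JOIN (
            SELECT unit_id, creation_time, submitter_id, field
            FROM pootle_app_submission
            GROUP BY unit_id, creation_time, submitter_id, field
            HAVING COUNT(*) > 1
        ) AS d
          ON s.unit_id = d.unit_id
         AND s.creation_time = d.creation_time
         AND s.submitter_id = d.submitter_id
         AND s.field = d.field
        ORDER BY s.id
    """
    return [row[0] for row in conn.execute(query)]


print(duplicated_submission_ids(conn))  # [1, 2]
```

Note the join uses plain equality, so rows with NULL in any of the four columns would not be matched; that may or may not matter for the real data.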
We should ensure illegitimate duplicates cannot be created by:
- setting a DB constraint on columns (`creation_time`, `submitter_id`, `unit_id`, `field` might not be enough).
- avoiding any unit submissions from the UI when there is an ongoing submission.
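As a rough illustration of why a constraint on those four columns alone might not be enough, here is a sketch using an in-memory SQLite table. The table name, the `quality_check_id` column, the index name, and the `try_insert` helper are all assumptions made for the demo, not the actual schema or code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE submission (
        id INTEGER PRIMARY KEY,
        unit_id INTEGER NOT NULL,
        creation_time TEXT NOT NULL,
        submitter_id INTEGER NOT NULL,
        field INTEGER NOT NULL,
        quality_check_id INTEGER  -- hypothetical: which check was muted
    )
    """
)
# The constraint under discussion: unique over the four columns only.
conn.execute(
    """
    CREATE UNIQUE INDEX uniq_submission
    ON submission (creation_time, submitter_id, unit_id, field)
    """
)


def try_insert(conn, row):
    """Insert a row; return False if the unique index rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO submission "
                "(unit_id, creation_time, submitter_id, field, quality_check_id) "
                "VALUES (?, ?, ?, ?, ?)",
                row,
            )
        return True
    except sqlite3.IntegrityError:
        return False


ts = "2017-04-10 02:50:43"
results = [
    try_insert(conn, (100, ts, 7, 3, None)),  # translation: accepted
    try_insert(conn, (100, ts, 7, 3, None)),  # exact duplicate: rejected
    try_insert(conn, (200, ts, 7, 9, 1)),     # mute check 1: accepted
    try_insert(conn, (200, ts, 7, 9, 2)),     # mute check 2, same second: rejected
]
print(results)  # [True, False, True, False]
```

The last insert is the legitimate case mentioned earlier (muting several quality checks on one unit within the same second), and the four-column index rejects it too. A real constraint would probably need an extra discriminating column or a finer-grained timestamp.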