fromthepage
double-clicking page save duplicates subjects due to race condition
From Jon at WWP:
I recorded a video to show you exactly what the problem is. Basically we are seeing subjects being duplicated and one will get categorized under People while the other doesn't, but both seem to be referencing the same tagged reference in the same document.
https://www.screencast.com/t/Jjhsds1pi
Here are the links to the 2 subjects from the video:
https://fromthepage.com/woodruff/woodruffpapers/article/32050866 https://fromthepage.com/woodruff/woodruffpapers/article/32050865
There was actually another duplicate subject that looks like the same issue. For John K. Orme, I went in and deleted the spaces in his name on the page that was linked by both subjects and resaved the transcription, and the reference no longer shows up in the uncategorized instance.
If you look at the timestamps on the versions tab for both of these articles, they were both created at the same minute. To me, that rules out any sort of user error/change/mistake (i.e. linking the word with spaces and then redoing it to fix it), and seems to support the theory @benwbrum had, which is that maybe the user clicked the "save" button twice in quick succession and we kicked off the process that checks the page for links and creates the subjects if they are new (asking the user to categorize them in the process) twice. What's odd to me is that the subject with the higher ID -- 32050866 -- is the more correct one (categorized, linked on two pages), but maybe that makes sense.
At any rate, we should try to keep this (sort of) race condition from happening. I think Jon can fix it by deleting this one: https://fromthepage.com/woodruff/woodruffpapers/article/32050865 We're about 80% sure it won't delete the link to the other article from the page (but please confirm afterwards).
From email to Jon:
I've done some additional prowling through the logs, and verified that this user is indeed hitting save twice, introducing a race condition that creates any new articles twice. We need to add code to intercept second clicks like this, since in some cases she's waiting a few seconds before hitting 'save' again -- it's not like she's double-clicking. (Saving page transcriptions with linked subjects is one of our most compute-intensive processes -- is anyone complaining about system performance on the WWP team?)
I'm afraid that adding the code to avoid this race condition is likely to take quite a while. In the meantime, could you ask NatalieH to be a bit more patient with slow page saves?
I think you can delete any uncategorized subjects you think are duplicates.
Our logs for this interaction:
Advisory locking or pessimistic locking of the pages record might address this; I'm also researching another approach I've seen in big Java Servlet apps, in which the requestor simply attaches to a running (identical) HTTP request.
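To make the locking idea concrete, here is a minimal sketch in plain Ruby. It is not FromThePage code: `SubjectStore` and `find_or_create` are hypothetical names, the array stands in for the articles table, and a `Mutex` stands in for a database lock (in a Rails app the equivalent would probably be something like ActiveRecord's `with_lock` on the page record). The point is only that the "does this subject exist?" check and the create have to happen inside the same critical section.

```ruby
# In-memory stand-in for the articles table (hypothetical, for illustration).
class SubjectStore
  attr_reader :subjects

  def initialize
    @subjects = []
    @lock = Mutex.new
  end

  # The existence check and the create share one critical section, so a
  # second identical save waits, then finds the subject the first one made.
  def find_or_create(title)
    @lock.synchronize do
      existing = @subjects.find { |s| s[:title] == title }
      return existing if existing

      sleep 0.01 # simulate the slow link-processing step
      subject = { id: @subjects.size + 1, title: title }
      @subjects << subject
      subject
    end
  end
end

store = SubjectStore.new
# Two "saves" of the same page arriving at nearly the same time.
threads = 2.times.map { Thread.new { store.find_or_create("John K. Orme") } }
results = threads.map(&:value)
puts "subjects created: #{store.subjects.size}"
```

Without the `synchronize` block, both threads could pass the existence check before either one creates the subject, which matches the duplicate pair seen in the report.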
It seems like the best/easiest way to fix this is by disabling the submit button after it's been pressed. https://stackoverflow.com/questions/50707985/preventing-multiple-record-creation-with-multiple-clicks-rails
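A rough sketch of what the disabled-button markup could look like, rendered here with Ruby's standard-library ERB so it runs on its own. The template, field names, and `render_save_form` helper are hypothetical, not our actual view. It assumes rails-ujs (or jquery_ujs) is loaded on the page, since that is what acts on the `data-disable-with` attribute; in a real Rails view the same attribute would normally come from `submit_tag "Save", data: { disable_with: "Saving..." }`.

```ruby
require "erb"

# Hypothetical template for the page-save form; names are illustrative only.
SAVE_FORM = ERB.new(<<~HTML)
  <form action="<%= action %>" method="post">
    <textarea name="transcription"><%= ERB::Util.html_escape(text) %></textarea>
    <input type="submit" name="save" value="Save"
           data-disable-with="Saving...">
  </form>
HTML

# Renders the form; with rails-ujs on the page, the button is disabled and
# relabelled "Saving..." as soon as the form is submitted.
def render_save_form(action:, text:)
  SAVE_FORM.result_with_hash(action: action, text: text)
end

puts render_save_form(action: "/pages/1", text: "John K. Orme")
```

Because the button stays disabled while the request is in flight, this should also cover the case where the user waits a few seconds and clicks again, not just a true double-click. It does nothing for requests that bypass the browser, so it complements server-side locking but does not replace it.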
Looks like a good plan.