Approve Import over MED1000
A search over the Top Medical 1000 reveals 400 whitelisted DOIs; the complete list is here: https://github.com/wpoa/OA-signalling/blob/master/cite_doi_analysis/med1000_whitelisted.csv
The top 20 articles with the most whitelisted DOIs are (in ascending order):
[(u':Bacteria', 4),
(u':Synesthesia', 5),
(u':Lead poisoning', 5),
(u':Acupuncture', 5),
(u':Cognitive behavioral therapy', 5),
(u':Attention deficit hyperactivity disorder', 5),
(u':Mental disorder', 6),
(u':Major depressive disorder', 6),
(u':Tuberculosis', 6),
(u':Autism', 6),
(u':Savant syndrome', 6),
(u':Psoriasis', 7),
(u':Circumcision', 7),
(u':Fungus', 7),
(u':Suicide', 7),
(u':HIV/AIDS', 8),
(u':Influenza', 9),
(u':Epigenetics', 9),
(u':Chikungunya', 10),
(u':Malaria', 18)]
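For reference, a ranking like the one above could be produced along these lines. This is only a sketch: it assumes the whitelisted data can be read as (article title, DOI) pairs, which may not match the actual layout of the CSV, and `top_articles` is a hypothetical helper name, not something in the repository.

```python
from collections import Counter

def top_articles(pairs, n=20):
    """Return the n articles with the most whitelisted DOIs, lowest count first.

    `pairs` is assumed to be an iterable of (article_title, doi) tuples.
    """
    # dict.fromkeys drops repeated (title, doi) pairs while keeping input order,
    # so a DOI cited twice on the same article is only counted once.
    unique_pairs = dict.fromkeys(pairs)
    counts = Counter(title for title, _doi in unique_pairs)
    # most_common gives highest first; re-sort ascending to match the list above.
    return sorted(counts.most_common(n), key=lambda item: item[1])

# Placeholder data, not real DOIs.
sample = [
    ("Malaria", "10.1234/a"),
    ("Malaria", "10.1234/b"),
    ("Influenza", "10.1234/c"),
    ("Malaria", "10.1234/a"),  # repeated pair, ignored
]
ranking = top_articles(sample)
```

With the placeholder data, `ranking` lists Influenza (1 DOI) before Malaria (2 DOIs).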
Shall we start, given the current state of the importer and the bugs that exist?
Let's do only Malaria for the time being. I have imported some of these articles already - a good occasion to think about duplicate detection and how to keep track of things in general.
Yes, I went and exhaustively did the Malaria page. Quite a lot of them worked! Right now, if there is a duplicate, the program knows about it, but the user is just redirected to the successful upload page (or the error reporting page [btw, I fixed a Unicode-related bug there that you might have seen]). Does the user need to know if they attempted to upload a duplicate? And if so, how?
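One possible way to make duplicates visible to the user, sketched under assumptions: the importer keeps a registry keyed by a normalized DOI and returns a distinct status that the web page can display. `normalize_doi`, `ImportRegistry`, and the status strings are hypothetical names for illustration, not the importer's actual code. The lowercasing relies on DOIs being case-insensitive.

```python
# Common ways a DOI gets written; stripped so the same DOI always maps to one key.
_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)

def normalize_doi(doi):
    """Lowercase the DOI and strip a leading resolver URL or 'doi:' label."""
    key = doi.strip().lower()
    for prefix in _PREFIXES:
        if key.startswith(prefix):
            key = key[len(prefix):]
            break
    return key

class ImportRegistry:
    """In-memory record of DOIs already handled (a real importer would persist this)."""

    def __init__(self):
        self._seen = set()

    def submit(self, doi):
        """Return 'duplicate' if the DOI was seen before, else record it and return 'uploaded'."""
        key = normalize_doi(doi)
        if key in self._seen:
            return "duplicate"
        self._seen.add(key)
        return "uploaded"

# Placeholder DOI, not a real article.
registry = ImportRegistry()
first = registry.submit("10.1234/Example.1")
second = registry.submit("doi:10.1234/example.1")
```

Here `first` is `"uploaded"` and `second` is `"duplicate"`, so the page shown after submission could say "already imported" instead of silently reusing the success page.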
And I think the way I would like to expose the server data is to first have a page of DOI prefixes and then a page for each DOI we've uploaded. We should also have a landing page for each Wikipedia page that our DOIs could relate to.
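The three kinds of pages described above could be backed by two simple indexes, roughly as follows. This is a sketch only: it assumes the server can list its uploads as (DOI, Wikipedia title) pairs, and `build_indexes` is a hypothetical name. The DOI prefix is taken as everything before the first slash.

```python
from collections import defaultdict

def build_indexes(uploads):
    """Group uploaded DOIs by DOI prefix and by related Wikipedia article.

    `uploads` is assumed to be an iterable of (doi, wikipedia_title) pairs.
    Returns two dicts mapping a key to a sorted list of DOIs.
    """
    by_prefix = defaultdict(set)
    by_article = defaultdict(set)
    for doi, title in uploads:
        prefix = doi.split("/", 1)[0]  # e.g. "10.1234" from "10.1234/a"
        by_prefix[prefix].add(doi)
        by_article[title].add(doi)
    return (
        {k: sorted(v) for k, v in by_prefix.items()},
        {k: sorted(v) for k, v in by_article.items()},
    )

# Placeholder data, not real DOIs.
uploads = [
    ("10.1234/a", "Malaria"),
    ("10.1234/b", "Malaria"),
    ("10.5678/c", "Influenza"),
    ("10.1234/a", "Chikungunya"),
]
by_prefix, by_article = build_indexes(uploads)
```

The keys of `by_prefix` would feed the prefix listing page, each DOI in its values would get its own page, and the keys of `by_article` would become the per-article landing pages.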