abbrv.jabref.org

Move data to wiki data

Open tobiasdiez opened this issue 5 years ago • 12 comments

Wikidata also contains quite a few abbreviations: https://query.wikidata.org/embed.html#SELECT%20DISTINCT%20%3Fname%20%3FISO_4_abbreviation%20WHERE%20%7B%0A%20%20%3Fjournal%20wdt%3AP31%20wd%3AQ5633421%3B%0A%20%20%20%20%20%20%20%20%20%20%20wdt%3AP1160%20%3FISO_4_abbreviation.%0A%20%20%0A%20%20SERVICE%20wikibase%3Alabel%20%7B%0A%20%20%20%20bd%3AserviceParam%20wikibase%3Alanguage%20%22en%22.%0A%20%20%20%20%3Fjournal%20rdfs%3Alabel%20%3Fname.%0A%20%20%7D%0A%7D

query

SELECT DISTINCT ?name ?ISO_4_abbreviation WHERE {
  ?journal wdt:P31 wd:Q5633421;
           wdt:P1160 ?ISO_4_abbreviation.
  
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "en".
    ?journal rdfs:label ?name.
  }
}
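
For anyone who wants to run the query above from a script rather than the embed page, here is a minimal Python sketch. It assumes the public SPARQL endpoint at https://query.wikidata.org/sparql and the standard SPARQL JSON result format. The function names and the User-Agent string are made up for illustration and are not part of this repository.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the public Wikidata SPARQL endpoint.
ENDPOINT = "https://query.wikidata.org/sparql"

# Same query as in the issue text.
QUERY = """
SELECT DISTINCT ?name ?ISO_4_abbreviation WHERE {
  ?journal wdt:P31 wd:Q5633421;
           wdt:P1160 ?ISO_4_abbreviation.
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "en".
    ?journal rdfs:label ?name.
  }
}
"""


def parse_bindings(payload):
    """Turn a SPARQL JSON result into a list of (name, abbreviation) pairs."""
    pairs = []
    for row in payload["results"]["bindings"]:
        name = row.get("name", {}).get("value")
        abbreviation = row.get("ISO_4_abbreviation", {}).get("value")
        # Skip rows where either variable is unbound.
        if name and abbreviation:
            pairs.append((name, abbreviation))
    return pairs


def fetch_abbreviations(query=QUERY, endpoint=ENDPOINT):
    """Run the query over HTTP (hypothetical helper; needs network access)."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query, "format": "json"})
    request = urllib.request.Request(
        url,
        headers={
            "Accept": "application/sparql-results+json",
            # Placeholder value; replace with a string identifying your tool.
            "User-Agent": "abbrv-wikidata-sketch/0.1 (example)",
        },
    )
    with urllib.request.urlopen(request, timeout=120) as response:
        return parse_bindings(json.load(response))
```

Calling `fetch_abbreviations()` should return the (name, abbreviation) pairs; the parsing step is kept separate so it can be tried on a saved response without network access.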

tobiasdiez avatar Dec 19 '19 17:12 tobiasdiez

Hi, I have a couple of questions regarding this issue. Could you explain further what is meant by 'Add abbreviations from wiki data'? I'm hoping to better understand what is required before taking on this issue.

cmgoodall avatar Oct 16 '23 02:10 cmgoodall

After thinking about this more, I'm actually of the opinion that we should just retire our custom abbreviation collection and migrate to Wikidata. That way, we can profit from other people's work and, conversely, give back to a larger community. So in particular:

  • Push all the data that we currently have to Wikidata: https://www.wikidata.org/wiki/Wikidata:Data_donation#How_to_add_data_to_Wikidata
  • Replace the JournalListMvGenerator in the main JabRef repo with a generator that gets the abbreviations of all journals from Wikidata

@JabRef/developers any opinions?

tobiasdiez avatar Oct 16 '23 16:10 tobiasdiez

In general, this is a good idea.

  • [ ] How much coverage does Wikidata currently have in comparison to our list? - In Python, this should be easily doable IMHO.
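
A rough sketch of how such a coverage check could look in Python. It assumes that the local lists are CSV files with the full journal name in the first column, and that the Wikidata names have already been collected into a plain list. `normalize`, `load_local_titles`, and `coverage` are hypothetical helper names, and matching on case-folded titles is only one possible choice.

```python
import csv


def normalize(title):
    """Case-fold and collapse whitespace so trivial differences do not count as misses."""
    return " ".join(title.casefold().split())


def load_local_titles(csv_path, delimiter=","):
    """Read full journal names from the first column of a CSV file (assumed layout)."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        return {normalize(row[0]) for row in csv.reader(handle, delimiter=delimiter) if row}


def coverage(local_titles, wikidata_titles):
    """Fraction of local titles that also appear among the Wikidata titles."""
    local = {normalize(title) for title in local_titles}
    remote = {normalize(title) for title in wikidata_titles}
    if not local:
        return 0.0
    return len(local & remote) / len(local)
```

Exact title matching will undercount when the same journal is spelled differently in the two sources, so the resulting number is better read as a lower bound.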

Does Wikidata distinguish between data sources? We distinguish between IEEE, MathSciNet, and others (https://github.com/JabRef/abbrv.jabref.org/blob/main/journals/README.md). I am not sure whether there are overlapping abbreviations and whether users want to choose their list (we had that a few years ago).
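
Whether the sources actually overlap could be checked with something like the sketch below. It assumes each source list has already been loaded into a dict that maps full journal names to abbreviations. `find_conflicts` and the source labels are invented for the example.

```python
from collections import defaultdict


def find_conflicts(sources):
    """Return journals for which at least two sources give different abbreviations.

    `sources` maps a source label to a dict of {full journal name: abbreviation}.
    The result maps the case-folded journal name to {source label: abbreviation}.
    """
    seen = defaultdict(dict)
    for source, entries in sources.items():
        for name, abbreviation in entries.items():
            seen[name.casefold()][source] = abbreviation
    return {
        name: by_source
        for name, by_source in seen.items()
        if len(set(by_source.values())) > 1
    }


# Made-up entries, only to show the shape of the input.
example = {
    "source_a": {"Journal of Examples": "J. Examples"},
    "source_b": {"Journal of Examples": "J Examples", "Other Journal": "Other J."},
}
conflicts = find_conflicts(example)
```

If the result is empty for the real lists, keeping the sources separate matters less for a migration.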

We only have the copyright for some of the data. Most of the data comes from external sources. For instance, we update MathSciNet automatically: https://github.com/JabRef/abbrv.jabref.org/blob/main/scripts/update_mathscinet.py

  • [ ] If Wikidata has at least 80% coverage, I would opt for just closing this repository. If Wikidata has less coverage, this needs more thought.

The process would be:

  1. Thoroughly identify each data source.
  2. Contact each data source and ask them to put their data on Wikidata.
  3. Keep working to ensure that the external sources remain available in Wikidata.

koppor avatar Oct 20 '23 07:10 koppor