Create dataset loader for Abbrev Dataset
Adding a Dataset
- Name: Abbrev Dataset
- Description: The Abbrev dataset was made available by Stevenson et al. (2009). It consists of acronyms and their long forms from Medline abstracts that were initially presented by Liu et al. (2001). The dataset is automatically re-created by identifying each acronym's long form in Medline and replacing it with its acronym. The dataset consists of three subsets containing 100, 200, and 300 instances, respectively.
- Task: SPAN_CLASS
- Paper: https://aclanthology.org/W09-1309
- Data: https://nlp.cs.vcu.edu/data.html
- License: ?
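
For anyone picking this up, the re-creation step in the description (find the long form, substitute the acronym) could be sketched roughly as below. This is only an illustration and not the authors' pipeline: the function name `replace_long_form`, the case-insensitive matching, and the "blood pressure" / "BP" example are all assumptions made for the sketch and are not taken from the dataset. The returned character offsets are the kind of span a SPAN_CLASS loader would likely need.

```python
import re
from typing import List, Tuple


def replace_long_form(text: str, long_form: str, acronym: str) -> Tuple[str, List[Tuple[int, int]]]:
    """Hypothetical helper: swap every occurrence of `long_form` for `acronym`.

    Returns the rewritten text plus the (start, end) character offsets of each
    inserted acronym in the rewritten text.
    """
    # Assumption: matching is literal and case-insensitive.
    pattern = re.compile(re.escape(long_form), flags=re.IGNORECASE)
    pieces: List[str] = []
    spans: List[Tuple[int, int]] = []
    cursor = 0   # position in the original text
    out_len = 0  # length of the rewritten text so far
    for match in pattern.finditer(text):
        before = text[cursor:match.start()]
        pieces.append(before)
        out_len += len(before)
        spans.append((out_len, out_len + len(acronym)))
        pieces.append(acronym)
        out_len += len(acronym)
        cursor = match.end()
    pieces.append(text[cursor:])
    return "".join(pieces), spans


# Illustrative sentence, not drawn from the Abbrev data.
sentence = "Patients with blood pressure above normal had blood pressure checks."
new_text, spans = replace_long_form(sentence, "blood pressure", "BP")
```

Each span indexes into the rewritten text, so `new_text[start:end]` gives back the acronym for every `(start, end)` pair.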
#self-assign
There seems to be a problem with this dataset's download link. It would be nice if someone else could check and confirm whether this link is accessible.
- http://nlp.shef.ac.uk/BioWSD/downloads/corpora/index.html
cc: @jason-fries
@sugatoray Looks like the link is down for me too. Maybe check again later? Otherwise, apologies!
@jason-fries Thank you for checking it out and confirming.
@sugatoray @jason-fries Any chance the link is up?
@hakunanatasha No. It is not working.
@sugatoray @hakunanatasha Looks like all the files are captured here! And the dataset is licensed under the GNU General Public License.
https://web.archive.org/web/20141225030543/http://nlp.shef.ac.uk/BioWSD/downloads/corpora/index.html
Hi @sugatoray, we're wrapping up the last commits for the hackathon this week, so if you have made any progress here, please push a PR ASAP. Let us know how we can help!
@jason-fries I am unassigning myself. I will not be able to finish this by the deadline.