biomedical Create dataset loader for Abbrev Dataset

Adding a Dataset

Name: Abbrev Dataset
Description: The Abbrev dataset is made available by Stevenson et al. (2009). It consists of the acronyms and long forms from Medline abstracts that were initially presented by Liu et al. (2001). The dataset is automatically re-created by identifying the acronyms' long forms in Medline and replacing each long form with its acronym. The dataset consists of three subsets containing 100, 200 and 300 instances respectively.
Task: SPAN_CLASS
Paper: https://aclanthology.org/W09-1309
Data: https://nlp.cs.vcu.edu/data.html
License: ?
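
For anyone picking this up, here is a rough sketch of the re-creation step the description mentions (find a long form, swap in its acronym, and keep the span so it can be labelled for SPAN_CLASS). This is only an illustration, not the procedure Stevenson et al. actually used: the function name `replace_long_form`, the case-insensitive whole-phrase matching, and the example sentence and acronym pair are all assumptions made up for this comment.

```python
import re


def replace_long_form(text, long_form, acronym):
    """Replace whole-phrase matches of `long_form` with `acronym`.

    Hypothetical helper: returns the rewritten text and the character
    spans (start, end) of each inserted acronym in the rewritten text.
    Matching is case-insensitive, which is an assumption of this sketch.
    """
    pattern = re.compile(r"\b" + re.escape(long_form) + r"\b", re.IGNORECASE)
    pieces = []   # chunks of the output text, joined at the end
    spans = []    # (start, end) offsets of each acronym in the output
    cursor = 0    # position in the input text consumed so far
    out_len = 0   # length of the output text built so far

    for match in pattern.finditer(text):
        before = text[cursor:match.start()]
        pieces.append(before)
        out_len += len(before)
        spans.append((out_len, out_len + len(acronym)))
        pieces.append(acronym)
        out_len += len(acronym)
        cursor = match.end()

    pieces.append(text[cursor:])
    return "".join(pieces), spans


# Made-up example sentence, not taken from the dataset.
new_text, spans = replace_long_form(
    "The blood pressure was measured; Blood Pressure rose.",
    "blood pressure",
    "BP",
)
print(new_text)
print(spans)
```

Tracking the offsets while rewriting, rather than searching for the acronym afterwards, avoids mislabelling acronyms that were already present in the original text.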

Mar 22 '22 00:03 jason-fries

#self-assign

Mar 31 '22 08:03 sugatoray

There seems to be a problem with this dataset's download link. It would be nice if someone else could check and confirm whether this link is accessible.

http://nlp.shef.ac.uk/BioWSD/downloads/corpora/index.html

cc: @jason-fries

Mar 31 '22 09:03 sugatoray

@sugatoray Looks like the link is down for me too. Might check again later? Otherwise, apologies!

Apr 07 '22 22:04 jason-fries

@jason-fries Thank you for checking it out and confirming.

Apr 07 '22 22:04 sugatoray

@sugatoray @jason-fries any chance the link is up?

Apr 09 '22 21:04 hakunanatasha

@hakunanatasha No. It is not working.

Apr 09 '22 23:04 sugatoray

@sugatoray @hakunanatasha Looks like all the files are captured here! And the dataset is under the GNU General Public License.

https://web.archive.org/web/20141225030543/http://nlp.shef.ac.uk/BioWSD/downloads/corpora/index.html

Apr 10 '22 01:04 jason-fries

Hi @sugatoray, we're wrapping up the last commits for the hackathon this week, so if you have made any progress here please push a PR ASAP. Let us know how we can help!

Apr 19 '22 19:04 jason-fries

@jason-fries I am unassigning myself. I will not be able to finish this by the deadline.

Apr 22 '22 04:04 sugatoray