biomedical
Create dataset loader for BioNLP-ST 2019 CRAFT-CA
Adding a Dataset
- Name: BioNLP-ST 2019 CRAFT-CA
- Description: None provided
- Task: NER|COREF
- Paper: https://aclanthology.org/D19-5725/
- Data: https://github.com/UCDenver-ccp/CRAFT/releases/tag/v3.1.3
- License: CC BY 3.0
#self-assign
Hi @davidstap, could you let us know if you are still working on this so we can update our project board? Please let us know the status by Friday, April 8. You can respond to this comment or ping us on Slack or Discord.
No worries if you are not finished but still intend to work on this!
#self-assign
I would like to unblock this issue.
#self-assign
According to the GitHub repository linked in the paper, this is the description of the dataset. Is this sufficient information?
A collection of 97 articles from the PubMed Central Open Access subset, each of which has been annotated along a number of different axes spanning structural, coreference, and concept annotation
@jason-fries There is already a CRAFT dataloader in #60. I'm wondering whether this one is different in any way.
Hi @shamikbose, you could double-check whether the #60 dataloader uses the same format as this dataset. If this is a separate dataset or a subset with additional information, we should implement its own dataloader.
#60 is a much broader dataset containing all versions of CRAFT. v3.1.3 is an update that fixes some missing or malformed information in v3.1.2, as mentioned in this comment: https://github.com/UCDenver-ccp/craft-shared-tasks/issues/1#issuecomment-508314708
I've released an update to the CRAFT corpus that includes the fix to address the issue you reported. Please update to CRAFT v3.1.3, and to the 0.1.2 version of this project. Or, if you are running the evaluations via Docker, please use the ucdenverccp/craft-eval:3.1.3_0.1.2 container which is now available on DockerHub.
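If this does end up as its own dataloader, one option is to pin it to the v3.1.3 release instead of tracking every CRAFT version as #60 does. Below is a minimal sketch of that pinning. The constant and function names are hypothetical and not taken from this repo or from the CRAFT tooling; only the release URL and the Docker image tag come from this thread.

```python
# Hypothetical version pins for a CRAFT-CA loader (names are illustrative only).
CRAFT_VERSION = "3.1.3"   # corpus release referenced in this issue
EVAL_VERSION = "0.1.2"    # craft-shared-tasks version quoted in the comment above


def release_url(craft_version: str) -> str:
    """Build the GitHub release page URL for a given CRAFT version."""
    return f"https://github.com/UCDenver-ccp/CRAFT/releases/tag/v{craft_version}"


def eval_image(craft_version: str, eval_version: str) -> str:
    """Build the DockerHub image tag in the <craft>_<eval> form quoted above."""
    return f"ucdenverccp/craft-eval:{craft_version}_{eval_version}"


if __name__ == "__main__":
    print(release_url(CRAFT_VERSION))
    print(eval_image(CRAFT_VERSION, EVAL_VERSION))
```

Keeping both versions in one place would make it easier to bump them together if another corpus fix is released.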