Create dataset loader for BioNLP-ST 2019 CRAFT-CA

Open jason-fries opened this issue 2 years ago • 9 comments

Adding a Dataset

  • Name: BioNLP-ST 2019 CRAFT-CA
  • Description: None provided
  • Task: NER|COREF
  • Paper: https://aclanthology.org/D19-5725/
  • Data: https://github.com/UCDenver-ccp/CRAFT/releases/tag/v3.1.3
  • License: CC BY 3.0

jason-fries, Mar 22 '22 00:03

#self-assign

davidstap, Mar 31 '22 16:03

Hi @davidstap, could you let us know if you are still working on this so we can update our project board? Please let us know the status by Friday, April 8. You can respond to this comment or ping us on Slack or Discord.

No worries if you are not finished but still intend to work on this!

jason-fries, Apr 07 '22 22:04

#self-assign

barthfab, Apr 11 '22 11:04

I would like to unblock this issue.

barthfab, Apr 13 '22 11:04

#self-assign

shamikbose, Apr 17 '22 10:04

According to the GitHub repository linked in the paper, this is the description of the dataset. Is this sufficient information?

A collection of 97 articles from the PubMed Central Open Access subset, each of which has been annotated along a number of different axes spanning structural, coreference, and concept annotation

shamikbose, Apr 17 '22 22:04

@jason-fries There is already a CRAFT dataloader in #60. I am wondering whether this one is different in any way.

shamikbose, Apr 18 '22 22:04

Hi @shamikbose, could you double-check whether the #60 dataloader uses the same format as this dataset? If this is a separate dataset, or a subset with potentially additional information, we should implement a dedicated dataloader for it.

jason-fries, Apr 19 '22 22:04

#60 is a much broader dataset containing all versions of CRAFT. Version 3.1.3 is an update that fixes some missing or malformed information in 3.1.2, as mentioned in this comment: https://github.com/UCDenver-ccp/craft-shared-tasks/issues/1#issuecomment-508314708

I've released an update to the CRAFT corpus that includes the fix to address the issue you reported. Please update to CRAFT v3.1.3, and to the 0.1.2 version of this project. Or, if you are running the evaluations via Docker, please use the ucdenverccp/craft-eval:3.1.3_0.1.2 container which is now available on DockerHub.

shamikbose, Apr 20 '22 00:04