
Tools for curating biomedical training data for large-scale language modeling

Results 180 biomedical issues

Closes #502. This is a QnA dataset that supports two languages, en and es, so there are two subsets containing the same questions: `head_qa_en` and `head_qa_es`. I also implemented a...

Please name your PR after the issue it closes. You can use the following line: "Closes #ISSUE-NUMBER" where you replace the ISSUE-NUMBER with the one corresponding to your dataset. If...

- **Name:** AIMed

### Checkbox
- [x] Confirm that this PR is linked to the dataset issue.
- [x] Create the dataloader script `biodatasets/my_dataset/my_dataset.py` (please use only lowercase and underscore...

## Adding a Dataset
- **Name:** MedSTS
- **Description:** 1,068 sentence pairs annotated by two medical experts with semantic similarity scores of 0-5 (low to high similarity).
- **Task:** STS...

High
Private
Semantic Textual Similarity
New Dataset

## Adding a Dataset
- **Name:** *ShAReCLEF 2013 Task 2*
- **Description:** *The dataset for Tasks 1 and 2 consists of de-identified clinical free-text notes from the MIMIC II database,...

High
DUA
NER
NED/Normalization
New Dataset

## Adding a Dataset
- **Name:** ShAReCLEF 2013 Task 1
- **Description:** *None provided*
- **Task:** NER, NED
- **Paper:** https://pubmed.ncbi.nlm.nih.gov/25147248/
- **Data:** https://physionet.org/content/shareclefehealth2013/1.0/
- **License:** DUA-NC

High
DUA
English
NER
NED/Normalization
New Dataset

## Adding a Dataset
- **Name:** n2c2 2014 - Deidentification & Heart Disease
- **Description:** *None provided*
- **Task:** NER, DOC_CLASS
- **Paper:** https://pubmed.ncbi.nlm.nih.gov/26225918/
- **Data:** https://portal.dbmi.hms.harvard.edu/projects/n2c2-nlp/
- **License:** DUA-C/NC

XML
DUA
English
NER

## Adding a Dataset
- **Name:** *PlantNorm*
- **Description:** *Named entity disambiguation dataset for plants from PubMed abstracts*
- **Task:** *NER, NED*
- **Paper:** [*A method for named entity normalization...

## Adding a Dataset
- **Name:** *plant-disease*
- **Description:** *Dataset with tagged plant/disease entities, as well as relations on how the plants affect the tagged diseases*
- **Task:** *NER, RE*...

## Adding a Dataset
- **Name:** *PPR Plant Phenotype Relation corpus*
- **Description:** *Dataset with plant and phenotype mentions, as well as relations of how plants/plant extracts affect the phenotypes*...