BioFLAIR
BioFLAIR: Pretrained Pooled Contextualized Embeddings for Biomedical Sequence Labeling Tasks
This repository provides the code for fine-tuning BioFLAIR, a pretrained pooled contextualized embedding model for biomedical sequence labeling tasks such as named entity recognition (NER). For details, please refer to our paper, "BioFLAIR: Pretrained Pooled Contextualized Embeddings for Biomedical Sequence Labeling Tasks".
Installation
BioFLAIR is built using FLAIR. Check out their repo for more information.
$ pip install flair
$ git clone https://github.com/shreyashub/BioFLAIR.git
Datasets
We provide pre-processed versions of the following benchmark datasets:
- NCBI
- BC5CDR (complete, chemicals, diseases)
- JNLPBA
- Species-800
- LINNAEUS
Fine-Tuning
To fine-tune on a dataset, set data_folder = 'data/ner/DATASET_NAME' in fine_tune.py, replacing DATASET_NAME with the dataset you want to use, then run fine_tune.py.
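As a rough illustration of the data_folder setting, the sketch below builds the path for a chosen dataset and shows one way such a folder could be loaded with FLAIR's ColumnCorpus. It is not taken from fine_tune.py: the helper names dataset_folder and load_corpus are hypothetical, the folder name passed in must match whatever directories actually exist under data/ner in your clone, and the two-column layout (token, NER tag) is an assumption about the pre-processed files.

```python
from pathlib import Path


def dataset_folder(name: str, root: str = "data/ner") -> str:
    """Return a value suitable for data_folder in fine_tune.py.

    Hypothetical helper: `name` should be the directory name of a dataset
    under `root` (check your clone for the exact spelling).
    """
    if not name or "/" in name or "\\" in name:
        raise ValueError(f"expected a single folder name, got {name!r}")
    # as_posix() keeps forward slashes on every platform.
    return (Path(root) / name).as_posix()


def load_corpus(data_folder: str):
    """Load a column-formatted NER corpus with FLAIR (requires `pip install flair`).

    Assumption: each line holds a token in column 0 and its NER tag in
    column 1. Adjust the mapping if the files are laid out differently.
    """
    # Imported lazily so the path helper above works without flair installed.
    from flair.datasets import ColumnCorpus

    columns = {0: "text", 1: "ner"}
    # With no explicit file names, ColumnCorpus looks for train/dev/test
    # files inside data_folder.
    return ColumnCorpus(data_folder, columns)
```

For example, dataset_folder("NCBI") yields 'data/ner/NCBI', which can be pasted into fine_tune.py or passed to load_corpus to inspect the data before training.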
Citation
@article{sharma2019bioflair,
title={BioFLAIR: Pretrained Pooled Contextualized Embeddings for Biomedical Sequence Labeling Tasks},
author={Sharma, Shreyas and Daniel Jr, Ron},
journal={arXiv preprint arXiv:1908.05760},
year={2019}
}
Contact
Please email your questions or comments to Shreyas Sharma ([email protected]).