NCBI disease corpus

Open proline827 opened this issue 10 years ago • 7 comments

http://www.ncbi.nlm.nih.gov/CBBresearch/Dogan/DISEASE/

The NCBI disease corpus is the latest version of a biomedical disease-related corpus used in biomedical research. The biocreative_ppi corpus doesn't currently work.

proline827 avatar May 27 '15 19:05 proline827

Suggested NLTK name: 'ncbidis'

proline827 avatar May 27 '15 19:05 proline827

Sorry for the long delay @proline827. Will this corpus work with an existing corpus reader?

stevenbird avatar Sep 06 '15 01:09 stevenbird

@ewan-klein – do you have experience with the biocreative_ppi corpus?

stevenbird avatar Sep 06 '15 01:09 stevenbird

Only in the mists of history :frowning:. There must be other people out there with more up-to-date experience.

ewan-klein avatar Sep 06 '15 12:09 ewan-klein

Is this available as one of the options? Or is the only thing I can do to try loading the corpus with an existing corpus reader?

jmwenda avatar May 05 '16 18:05 jmwenda

Is there any update on this?

udaraweerasinghege avatar Jun 11 '17 19:06 udaraweerasinghege

I'm happy to consider a pull request. Note the file size limit imposed by GitHub.

stevenbird avatar Jun 11 '17 23:06 stevenbird