
How can we create a dataset in CoNLL-2003 format? I have clinical data and want to label entities such as symptom and cause (like Name and Location in your example). How can I convert plain text such as "fever, cold" to CoNLL-2003?

Open purohitvivek8 opened this issue 6 years ago • 3 comments

purohitvivek8 avatar Apr 30 '18 09:04 purohitvivek8

Did you figure it out?

jaysinghr avatar Jun 11 '19 12:06 jaysinghr

@purohitvivek8 @Anticsss

CoNLL-2003 format:

https://www.clips.uantwerpen.be/conll2003/ner/

  • "Software and Data" section explains the format.

https://sites.google.com/site/ermasoftware/getting-started/ne-tagging-conll2003-data

  • This explains the format with an example.
  • It also mentions the DOCSTART line.

As far as I understand, though, NeuroNER doesn't need the POS and chunk tags: https://github.com/Franck-Dernoncourt/NeuroNER/blob/master/neuroner/dataset.py#L46

token = str(line[0])
label = str(line[-1])

This reads only the word and Named Entity label.
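
To make that concrete for the clinical case in the question, here is a minimal sketch of producing two-column lines (token, then label) from already-tokenized text. It is not NeuroNER code: the function names `to_conll_lines` and `parse_conll_line`, the `SYMPTOM` label, and the span format are hypothetical, and it assumes space-separated columns, BIO-style labels, and a blank line between sentences.

```python
def to_conll_lines(tokens, spans):
    """Build 'token label' lines for one sentence.

    tokens: list of token strings.
    spans:  list of (start, end_exclusive, label) over token indices
            (hypothetical annotation format, chosen for this sketch).
    """
    labels = ["O"] * len(tokens)  # tokens outside any entity get "O"
    for start, end, label in spans:
        labels[start] = "B-" + label          # first token of the entity
        for i in range(start + 1, end):
            labels[i] = "I-" + label          # remaining tokens of the entity
    return [f"{tok} {lab}" for tok, lab in zip(tokens, labels)]


def parse_conll_line(line):
    """Mirror the quoted snippet: keep the first and last columns only."""
    parts = line.strip().split(" ")
    return str(parts[0]), str(parts[-1])


if __name__ == "__main__":
    tokens = ["Patient", "has", "high", "fever", "and", "cold"]
    spans = [(2, 4, "SYMPTOM"), (5, 6, "SYMPTOM")]
    for line in to_conll_lines(tokens, spans):
        print(line)
    print()  # blank line ends the sentence
```

Because only the first and last columns are read, a full four-column CoNLL-2003 line (word, POS tag, chunk tag, NE label) and a two-column line like the ones above would yield the same token and label.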

kaushikacharya avatar May 19 '20 17:05 kaushikacharya

Thanks Kaushik, that's correct, NeuroNER reads only the word and Named Entity label.

Franck-Dernoncourt avatar May 19 '20 17:05 Franck-Dernoncourt