
builtin entity recognition supersedes NER

Open j2l opened this issue 7 years ago • 3 comments

Describe the bug It looks like the entity is set using builtin entity recognition only. Therefore, you can't extract what you want using regex.

To Reproduce NER: [image] Using the example, {{hashtag}} converts to #proudtobeaxa

Expected behavior It should extract proudtobeaxa as %hashtag%, since the group in the NER regex doesn't include "#".
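The difference between the full match and the capture group can be shown in plain JavaScript, outside nlp.js. This is only a minimal sketch: the pattern `/#(\w+)/` and the sample sentence are assumptions, since the actual NER regex appears only in the image above.

```javascript
// Assumed pattern: the real NER regex from the issue is not available as text.
const hashtagRegex = /#(\w+)/;
const match = hashtagRegex.exec('We are #proudtobeaxa today');

const fullMatch = match[0]; // whole match, including the "#"
const groupOnly = match[1]; // first capture group, without the "#"
console.log(fullMatch, groupOnly);
```

The reported behavior corresponds to `fullMatch`, while the expected behavior corresponds to `groupOnly`.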

Additional question Is there any way to extract 2 groups from a regex like /\b\#(\w+)[, ]\#(\w+)\b/ig into %hashtag1% and %hashtag2%?
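As a rough illustration of what is being asked, the sketch below maps each capture group to its own name in plain JavaScript. It does not use any nlp.js API; `extractGroups` is a hypothetical helper and the sample sentence is made up. The leading `\b` from the pattern above is dropped here, because `\b` does not match between a space and "#" (both are non-word characters), so the original pattern would fail on hashtags preceded by whitespace.

```javascript
// Hypothetical helper, not part of nlp.js: assigns capture group i+1 to names[i].
function extractGroups(text, regex, names) {
  const found = regex.exec(text);
  if (!found) return null;
  const result = {};
  names.forEach((name, i) => {
    result[name] = found[i + 1];
  });
  return result;
}

// Simplified variant of the pattern from the question (no \b, no g flag).
const pairRegex = /#(\w+)[, ]#(\w+)/i;
const entities = extractGroups(
  'Great day #proudtobeaxa #nlp',
  pairRegex,
  ['hashtag1', 'hashtag2']
);
console.log(entities);
```

Each name receives the text of one group, which is the behavior the question asks for with %hashtag1% and %hashtag2%.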

j2l avatar Oct 14 '18 16:10 j2l

Hello. Confirmed! About the additional question, there is an open issue: https://github.com/axa-group/nlp.js/issues/76. I have been thinking about how to implement it, and I think I have an idea of how to do it with the least code impact.

jesus-seijas-sp avatar Oct 15 '18 11:10 jesus-seijas-sp

Great! Thanks.

j2l avatar Oct 15 '18 12:10 j2l

Works in v4, btw (not the additional question).

@jesus-seijas-sp what was your idea? Maybe I can support implementing it :-) entityname_X is not really working because it is already used for "same entity multiple times".

Apollon77 avatar Aug 08 '22 21:08 Apollon77

Closing due to inactivity. Please, re-open if you think the topic is still alive.

aigloss avatar Nov 25 '22 09:11 aigloss