nlp.js

Isolate specific entity

Open jechazelle opened this issue 7 years ago • 2 comments

I would like to know how I can isolate a specific entity. I don't know if it's a bug, but I would like to isolate the entities in my intent with this pattern:

'%BOOK% %PAGE_START% %PARAGRAPH_START%'

In the result, PAGE_START and PARAGRAPH_START each appear twice:

...
"intent": "[BOOK] search_paragraph",
"domain": "default",
"score": 0.9987136557407928,
"entities": [
  { "start": 0, "end": 2, "len": 3, "levenshtein": 0, "accuracy": 1, "option": "DAILY_PLANET", "sourceText": "Daily", "entity": "BOOK", "utteranceText": "dai" },
  { "start": 4, "end": 4, "len": 1, "levenshtein": 0, "accuracy": 1, "option": "1", "sourceText": "2", "entity": "PAGE_START", "utteranceText": "2" },
  { "start": 6, "end": 6, "len": 1, "levenshtein": 0, "accuracy": 1, "option": "1", "sourceText": "3", "entity": "PAGE_START", "utteranceText": "3" },
  { "start": 4, "end": 4, "len": 1, "levenshtein": 0, "accuracy": 1, "option": "1", "sourceText": "2", "entity": "PARAGRAPH_START", "utteranceText": "2" },
  { "start": 6, "end": 6, "len": 1, "levenshtein": 0, "accuracy": 1, "option": "1", "sourceText": "3", "entity": "PARAGRAPH_START", "utteranceText": "3" }
],
...

I would like to have only 3 entities in the response (without the duplicated PAGE_START and PARAGRAPH_START):

  • BOOK (value: 'DAILY_PLANET'...)
  • PAGE_START (value: 1, 2, 3, 4...)
  • PARAGRAPH_START (value: 1, 2, 3, 4...)

How can I get that, please? Is it a bug?
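
One possible client-side workaround, sketched here under the assumption that the entities always occur in the utterance in the same order as in the pattern (book, then page, then paragraph): post-process the `entities` array and let each text span be claimed by only one entity. The helper `isolateEntities` and its `order` parameter are hypothetical and not part of nlp.js.

```javascript
// Hypothetical helper, not part of nlp.js: keeps one match per entity name,
// assuming entities appear in the utterance in the same order as in the pattern.
function isolateEntities(entities, order) {
  const usedStarts = new Set();
  const result = [];
  for (const name of order) {
    // Leftmost match for this entity whose span has not been claimed yet.
    const candidate = entities
      .filter((e) => e.entity === name && !usedStarts.has(e.start))
      .sort((a, b) => a.start - b.start)[0];
    if (candidate) {
      usedStarts.add(candidate.start);
      result.push(candidate);
    }
  }
  return result;
}

// Trimmed copy of the entities from the response above.
const entities = [
  { start: 0, end: 2, entity: 'BOOK', sourceText: 'Daily', utteranceText: 'dai' },
  { start: 4, end: 4, entity: 'PAGE_START', sourceText: '2', utteranceText: '2' },
  { start: 6, end: 6, entity: 'PAGE_START', sourceText: '3', utteranceText: '3' },
  { start: 4, end: 4, entity: 'PARAGRAPH_START', sourceText: '2', utteranceText: '2' },
  { start: 6, end: 6, entity: 'PARAGRAPH_START', sourceText: '3', utteranceText: '3' },
];

const isolated = isolateEntities(entities, ['BOOK', 'PAGE_START', 'PARAGRAPH_START']);
console.log(isolated.map((e) => `${e.entity}=${e.utteranceText}`).join(' '));
```

This only hides the duplicates after the fact; it does not change how the entities are recognized, and it breaks down if a user gives the paragraph before the page.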

jechazelle, Oct 05 '18 08:10

Hi Jérémie, I have a related problem with the builtin ER. In your case, how do you disambiguate between a page number and a paragraph number? Maybe regexes like p[. ](\d+) and §(\d+), to keep numbers from being recognized as both page and paragraph? It could work if you explain to users that § is the paragraph sign in French (SHIFT+! on AZERTY keyboards). Can you share your lines or XLS?
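
To illustrate the idea, here is a minimal sketch in plain JavaScript, outside of nlp.js, of how the two prefixed patterns keep the numbers apart. The helper name `extractPageAndParagraph` and the sample utterance are made up for the example.

```javascript
// Prefix-based patterns from the suggestion above: "p." or "p " marks a page,
// "§" marks a paragraph. The helper name is hypothetical.
const PAGE_RE = /p[. ](\d+)/;
const PARAGRAPH_RE = /§(\d+)/;

function extractPageAndParagraph(utterance) {
  const page = utterance.match(PAGE_RE);
  const paragraph = utterance.match(PARAGRAPH_RE);
  return {
    // Capture group 1 holds the digits that follow the prefix.
    page: page ? Number(page[1]) : null,
    paragraph: paragraph ? Number(paragraph[1]) : null,
  };
}

console.log(extractPageAndParagraph('Daily p.2 §3'));
```

A bare number without a prefix matches neither pattern, so it can no longer be claimed by both entities. Note that `p[. ]` as written also matches a word ending in "p" followed by a space and a number; adding a word boundary (`\bp[. ]`) would be one way to tighten it.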

j2l, Oct 15 '18 15:10

In v4 the response should already be different, but still not what you need ... I am preparing a PR for it, also in connection with #1174 ... but most likely the PR needs to wait until my other 4 PRs are merged ... The code is starting to overlap, so otherwise it turns into merge hell.

Apollon77, Aug 12 '22 07:08

Closing due to inactivity. Please, re-open if you think the topic is still alive.

aigloss, Nov 25 '22 09:11