Natural-Language-Processing-Tutorials

Entities are not being recognised

Open oldmonkABA opened this issue 6 years ago • 4 comments

When I run the code, I get the following output:

docx = nlp(u"I am looking for an Italian Restaurant where I can eat")
for word in docx.ents:
    print("value",word.text,"entity",word.label_,"start",word.start_char,"end",word.end_char)

('value', u'Italian', 'entity', u'NORP', 'start', 20, 'end', 27)

print(interpreter.parse(u"I am looking for an Italian Restaurant where I can eat"))

{u'entities': [], u'intent': {u'confidence': '0.7245936400661538', u'name': u'restaurant_search'}, 'text': u'I am looking for an Italian Restaurant where I can eat', u'intent_ranking': [{u'confidence': '0.7245936400661538', u'name': u'restaurant_search'}, {u'confidence': '0.16613318075824324', u'name': u'affirm'}, {u'confidence': '0.061131622985489784', u'name': u'greet'}, {u'confidence': '0.04814155619011318', u'name': u'goodbye'}]}

print(interpreter.parse(u"I want an African Spot to eat"))

{u'entities': [], u'intent': {u'confidence': '0.6742354477482855', u'name': u'restaurant_search'}, 'text': u'I want an African Spot to eat', u'intent_ranking': [{u'confidence': '0.6742354477482855', u'name': u'restaurant_search'}, {u'confidence': '0.12795773626363155', u'name': u'affirm'}, {u'confidence': '0.1248807660919913', u'name': u'goodbye'}, {u'confidence': '0.07292604989609185', u'name': u'greet'}]}

print(interpreter.parse(u"Good morning World"))

{u'entities': [], u'intent': {u'confidence': '0.3928691488396195', u'name': u'greet'}, 'text': u'Good morning World', u'intent_ranking': [{u'confidence': '0.3928691488396195', u'name': u'greet'}, {u'confidence': '0.2737002194915276', u'name': u'goodbye'}, {u'confidence': '0.17752522806694152', u'name': u'affirm'}, {u'confidence': '0.15590540360191174', u'name': u'restaurant_search'}]}

Below is the full code:

from rasa_nlu.training_data import load_data
from rasa_nlu.config import RasaNLUModelConfig
from rasa_nlu.model import Trainer
from rasa_nlu import config

# Loading DataSet
train_data = load_data('./data/data.json')

# Config Backend using Sklearn and Spacy
trainer = Trainer(config.load("config.yaml"))

# Training Data
trainer.train(train_data)

# Returns the directory the model is stored in (Create a folder to store model in)
model_directory = trainer.persist('./projects/')

import spacy
nlp = spacy.load('en')

docx = nlp(u"I am looking for an Italian Restaurant where I can eat")
for word in docx.ents:
    print("value",word.text,"entity",word.label_,"start",word.start_char,"end",word.end_char)

from rasa_nlu.model import Metadata, Interpreter

# where `model_directory` points to the folder the model is persisted in
interpreter = Interpreter.load(model_directory)

# Prediction of Intent
print(interpreter.parse(u"I am looking for an Italian Restaurant where I can eat"))
print(interpreter.parse(u"I want an African Spot to eat"))
print(interpreter.parse(u"Good morning World"))

oldmonkABA, Sep 10 '18 11:09

From what I can see, intent classification is working well; the missing entities are probably due to the spaCy/Rasa config or to the training data. Please try these options:
#1 Add more examples to your training data
#2 Use a more detailed pipeline in the config.yml file (spacy_sklearn)
#3 Try the mitie backend
Hope it helps
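For option #1, the entity extractor can only learn from examples that carry entity annotations, not just an intent. Below is a minimal sketch of one annotated example in the Rasa NLU JSON training format. The entity name `cuisine` and the helper `annotate` are assumptions made for illustration; they are not taken from the tutorial's data.json.

```python
import json

def annotate(text, intent, spans):
    """Build one training example.

    `spans` maps a (hypothetical) entity name to the substring of `text`
    that should be labelled with it.
    """
    entities = []
    for entity, value in spans.items():
        start = text.find(value)
        if start == -1:
            raise ValueError("%r not found in %r" % (value, text))
        entities.append({
            "start": start,              # character offset, inclusive
            "end": start + len(value),   # character offset, exclusive
            "value": value,
            "entity": entity,
        })
    return {"text": text, "intent": intent, "entities": entities}

example = annotate(
    "I am looking for an Italian Restaurant where I can eat",
    "restaurant_search",
    {"cuisine": "Italian"},  # assumed entity name
)

# Wrap the example the way a Rasa NLU JSON training file is laid out.
training_data = {"rasa_nlu_data": {"common_examples": [example]}}
print(json.dumps(training_data, indent=2))
```

The offsets follow the same convention as spaCy's `start_char` and `end_char`, so `text[start:end]` returns the labelled value.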

Jcharis, Sep 11 '18 09:09

Hi Jcharis, I have the same problem. When I run your code, I get a result like this:

`{ 'intent': { 'name': 'restaurant_search', 'confidence': 0.6966200345414107 }, 'entities': [], 'intent_ranking': [ { 'name': 'restaurant_search', 'confidence': 0.6966200345414107 }, { 'name': 'affirm', 'confidence': 0.19163192173218538 }, { 'name': 'goodbye', 'confidence': 0.05679537002111616 }, { 'name': 'greet', 'confidence': 0.054952673705287634 } ], 'text': 'I am looking for an Italian Restaurant where I can eat' }`

Entities are not recognized. Could you help me?

rmiaouh, Mar 25 '19 14:03

Hello, when you run spaCy on its own, without the training data above, is it able to recognize the entities? If yes, then you can use spaCy to recognize them, use its output to fill in your training dataset, and try again. Alternatively, you can use the NLU GUI to help you create the dataset with the entities and use that. Sometimes you may have to add some default entity labels to your training dataset. Hope it helps
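The idea of using spaCy's output to fill the training dataset could be sketched as below. To keep the snippet runnable without spaCy or a language model installed, `FakeSpan` is a hypothetical stand-in that exposes only the four span attributes printed in the original post; with spaCy available, `nlp(text).ents` would be passed instead. The helper `ents_to_rasa` and the `NORP` to `cuisine` mapping are likewise assumptions for illustration.

```python
from collections import namedtuple

# Hypothetical stand-in for a spaCy Span, limited to the attributes used here.
FakeSpan = namedtuple("FakeSpan", ["text", "label_", "start_char", "end_char"])

def ents_to_rasa(ents, label_map=None):
    """Convert spaCy-style entity spans to Rasa NLU entity dicts.

    `label_map` optionally renames spaCy labels to the entity names
    used in the training data; unmapped labels are kept unchanged.
    """
    label_map = label_map or {}
    return [
        {
            "start": e.start_char,
            "end": e.end_char,
            "value": e.text,
            "entity": label_map.get(e.label_, e.label_),
        }
        for e in ents
    ]

text = "I am looking for an Italian Restaurant where I can eat"
# The span spaCy reported for this sentence in the original post.
ents = [FakeSpan("Italian", "NORP", 20, 27)]

example = {
    "text": text,
    "intent": "restaurant_search",
    "entities": ents_to_rasa(ents, {"NORP": "cuisine"}),  # assumed mapping
}
print(example)
```

Each generated example can then be appended to the `common_examples` list of the JSON training file before retraining.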

Thanks



Jcharis, Mar 26 '19 12:03

Thanks for your answer. I found a solution: the problem came from my pipeline file (as you said).

Here is the pipeline I use:

language: "en"

pipeline:
- name: "tokenizer_whitespace"
- name: "ner_crf"
- name: "ner_synonyms"
- name: "intent_featurizer_count_vectors"
- name: "intent_classifier_tensorflow_embedding"
  "epochs": 500
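A quick way to check whether a pipeline like the one above contains any entity-related components is to scan the component names. The sketch below assumes the pipeline has already been read into a list of dicts (for example with a YAML parser) and relies on the `ner_` naming prefix seen in this config; the helper `ner_components` is hypothetical and not part of rasa_nlu. Note that `ner_synonyms` maps synonyms rather than extracting entities, so `ner_crf` is the component doing the extraction here, and as far as I know it still needs annotated entities in the training data to learn from.

```python
def ner_components(pipeline):
    """Return the names of pipeline components whose name starts with 'ner_'."""
    return [c["name"] for c in pipeline if c.get("name", "").startswith("ner_")]

# The pipeline from the comment above, written out as Python data.
pipeline = [
    {"name": "tokenizer_whitespace"},
    {"name": "ner_crf"},
    {"name": "ner_synonyms"},
    {"name": "intent_featurizer_count_vectors"},
    {"name": "intent_classifier_tensorflow_embedding", "epochs": 500},
]

found = ner_components(pipeline)
if not found:
    print("No ner_* component: the 'entities' list will always be empty.")
else:
    print("Entity-related components:", found)
```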

rmiaouh, Mar 27 '19 16:03