
Use context to obtain different outputs from similar inputs

Open msolefonte opened this issue 5 years ago • 4 comments

Is your feature request related to a problem? Please describe. This is not related to any problem. I am fairly sure this is a missing feature, but perhaps I have just overlooked it.

Describe the solution you'd like: I would like the NLP Manager to support contexts, similar to how they work in Dialogflow. In some situations, as in real life, you want to interpret the same input differently depending on the previous input, for example when the input is an answer to a previous question. Let's see it with an example:

>>> I would like to create a new account. -> registerNewUser(), context = []
<<< Perfect. Can you give me your email? -> Answer to registerNewUser(). 
// {key: accountCreation, lifespan: 2} is pushed to the context. Lifespan is reduced with 
// each interaction till disappearing.
>>> [email protected]
// By default, a message containing only '%email%' should not result in any output, but, 
// if the context "accountCreation" is active, the result is different.
<<< Thank you for your answer. Proceeding.

I think Google has a neat example here: https://cloud.google.com/dialogflow/docs/contexts-overview

Context should be usable not only as a requirement but also as a recommendation (a priority boost during classification) or as a rejection (some results should not be returned under some contexts). I do not think NLP.js needs to store the context itself; the lifespan could be handled by the user. However, the manager.process() function could accept an array of active context keys.

manager.process('en', '[email protected]', ['accountCreation'])

Of course, something similar should be introduced to the trainer.
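
If the lifespan is left to the user, as suggested above, the bookkeeping could look roughly like the following. This is a minimal sketch only: `ContextTracker` and its methods are hypothetical names and are not part of NLP.js.

```javascript
// Hypothetical helper, not part of NLP.js: tracks active context keys and
// expires each one after a fixed number of interactions.
class ContextTracker {
  constructor() {
    this.lifespans = new Map(); // context key -> remaining interactions
  }

  // Activate (or refresh) a context for `lifespan` further interactions.
  push(key, lifespan) {
    this.lifespans.set(key, lifespan);
  }

  // Call once per user message, after that message has been processed.
  tick() {
    for (const [key, remaining] of this.lifespans) {
      if (remaining <= 1) this.lifespans.delete(key);
      else this.lifespans.set(key, remaining - 1);
    }
  }

  // Keys that are still active, in insertion order.
  activeKeys() {
    return [...this.lifespans.keys()];
  }
}

const tracker = new ContextTracker();
tracker.push('accountCreation', 2);
console.log(tracker.activeKeys().length); // 1 (accountCreation is active)
tracker.tick();
tracker.tick();
console.log(tracker.activeKeys().length); // 0 (expired after two interactions)
```

The array returned by `activeKeys()` is what the proposed third argument of manager.process() would receive.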

Describe alternatives you've considered: The clearest alternative is to create an intent for the data you want to receive and check the context inside the callback. For example, a message containing %email% should result in the invocation of the function email(). Then, inside email(), the context is checked to decide which action to take. A sketch in JavaScript, where the handler functions are placeholders:

function email(address, context) {
  if (context.includes('accountCreation')) accountCreationEmailReceived(address);
  else if (context.includes('login')) loginEmailReceived(address);
  else fallbackFunction(); // or nothing
}

The problem with this approach is that it is far harder to maintain, as it is less readable and less natural. Also, sometimes you want rejections or affinities to apply, so that the same input produces different results depending on the context, like this:

>>> Hello. My email is %email%. // No context
<<< hello()
>>> Hello. My email is %email% // Context = [accountCreation]
<<< accountCreationEmailReceived(email)
>>> Hello. // No context
<<< hello()
>>> Hello // Context = [alreadyGreeted]
<<< pass

These behaviours cannot be implemented without proper use of context. You can still work around it, but that is, without any doubt, a poor patch.

Additional context: Again, I think that Dialogflow implements a perfect solution for the problem: https://cloud.google.com/dialogflow/docs/contexts-overview

msolefonte avatar Aug 18 '20 21:08 msolefonte

I agree with the need for something similar to DF's contexts. Based on what I understand from the docs, domains could be the way to implement this behavior in NLP.js. "Contexts" do exist as a building block as well, but they seem to be more of a session storage from what I can tell.

I opened another issue 10 days ago @ https://github.com/axa-group/nlp.js/issues/542 where I ask for clarifications on how Domains could be used.

I would be very happy to contribute to the project through documentation once I learn how these concepts (contexts and domains) are intended to be used.

luddilo avatar Aug 21 '20 09:08 luddilo

Hello, Domain and Context (understood as the DialogFlow one) are not the same. A Domain is a functional way to organize the intents, but intents in different domains must be different. Example: you can have one domain for Finances, another for HR, another for Travel... Each domain contains utterances, but there must be no intersection between domains: if you have the same or similar utterances in different intents, you will confuse the classifier.

The training complexity is calculated as the number of features multiplied by the number of intents and by the total number of utterances, where a feature is each distinct stem present in the training data. Suppose that you have 10.000 intents with 5 utterances per intent and a total of 1.500 features. Then the training complexity is 10.000 (intents) * 50.000 (utterances per intent * intents) * 1.500 (features) = 750.000.000.000.

Now suppose that you are able to split this problem into 4 smaller domains with 2.500 intents each. This also decreases the number of features, because different domains contain different wording (the wording for HR is not the same as for finances). Let's say that the features are reduced to 1.250 for each domain. Then the complexity of each domain is 2.500 (intents) * 12.500 (utterances per intent * intents) * 1.250 (features) = 39.062.500.000. If we multiply this number by 4, we obtain 156.250.000.000, which is clearly less than 750.000.000.000.

But this is not enough: now we have 4 different classifiers. For each utterance that we must classify, how do we decide which of these 4 classifiers is the correct one? With another classifier that classifies into domains instead of into intents. This classifier has these numbers: 4 intents (one per domain), 50.000 utterances and 1.500 features => 300.000.000. If we sum 156.250.000.000 + 300.000.000 we obtain 156.550.000.000. That is about 5 times simpler than the original problem, so we can expect it to train about 5 times faster.
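
The arithmetic above can be reproduced with a few lines of JavaScript. This only restates the cost model described in this comment (intents * total utterances * features); `trainingCost` is a name chosen for illustration, not something NLP.js exposes.

```javascript
// Cost model from the comment above: intents * total utterances * features.
function trainingCost(intents, utterancesPerIntent, features) {
  const totalUtterances = intents * utterancesPerIntent;
  return intents * totalUtterances * features;
}

// One classifier over everything: 10,000 intents, 5 utterances each.
const monolithic = trainingCost(10000, 5, 1500); // 750,000,000,000

// One of the 4 domains: 2,500 intents, 5 utterances each, 1,250 features.
const perDomain = trainingCost(2500, 5, 1250); // 39,062,500,000

// Domain classifier: 4 classes over all 50,000 utterances, 1,500 features.
const domainClassifier = 4 * 50000 * 1500; // 300,000,000

const split = 4 * perDomain + domainClassifier; // 156,550,000,000

console.log((monolithic / split).toFixed(2)); // 4.79
```

The ratio is about 4.8, which is the "about 5 times" quoted above.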

So Domains are useful to organize your intents and to improve training performance when you have a huge problem.

On the other hand, we have the DialogFlow Contexts. The concept is "ok, we have these intents, but if the state of the application is X then we should only take these ones into account". Example: you have 1000 intents, one of which is "true" (when the user says "yes", "ok" or something similar) and another is "false" (when the user says "no", "nope", "nah" or something similar). Then, in a given state of the application, you can decide that only those 2 intents are taken into account.

This behaviour is not implemented in NLP.js, because it is intended to be an NLP library, not a chatbot engine, so it does not store state and does not orchestrate a multi-turn conversation. For this, I usually use Microsoft Bot Framework.

If you want to implement this for NLP.js and make a PR, I suggest that you take a look at this: https://github.com/axa-group/nlp.js/blob/master/packages/nlu/src/nlu.js#L300 This function filters the valid intents, considering as valid those intents that contain at least one feature in common with the input utterance. This is done to filter out false positives caused by the bias. A similar function can be implemented based on an allow list, so that you obtain a score of 0 for the other intents.
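
As a rough illustration of the allow-list idea, here is a minimal sketch. It is not the NLP.js implementation: `applyAllowList` is a hypothetical name, and the `{ intent, score }` shape of the classification entries is an assumption made for the example.

```javascript
// Hypothetical sketch, not the NLP.js code: give a score of 0 to every
// intent outside the allow list, then re-sort by score.
// Assumes each classification looks like { intent: string, score: number }.
function applyAllowList(classifications, allowList) {
  // No allow list means no filtering.
  if (!allowList || allowList.length === 0) return classifications;
  const allowed = new Set(allowList);
  return classifications
    .map((c) => (allowed.has(c.intent) ? c : { ...c, score: 0 }))
    .sort((a, b) => b.score - a.score);
}

// Made-up scores for illustration.
const scores = [
  { intent: 'greetings.hello', score: 0.7 },
  { intent: 'answer.true', score: 0.2 },
  { intent: 'answer.false', score: 0.1 },
];

const filtered = applyAllowList(scores, ['answer.true', 'answer.false']);
console.log(filtered[0].intent); // answer.true
```

The intents outside the list stay in the result with a score of 0 instead of being dropped, which matches the "score of 0 for the other intents" wording above.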

Then you'll have to put this new function into the pipeline, right here: https://github.com/axa-group/nlp.js/blob/master/packages/nlu/src/nlu.js#L125

And well... implement something so that the NLU receives this allow list in the process: https://github.com/axa-group/nlp.js/blob/master/packages/nlu/src/nlu.js#L412

The NLU process is called from the DomainManager process, so this allow list should also be provided here: https://github.com/axa-group/nlp.js/blob/master/packages/nlu/src/domain-manager.js#L368

The DomainManager is called from the NluManager, here: https://github.com/axa-group/nlp.js/blob/master/packages/nlu/src/nlu-manager.js#L314 Then there will be a problem: as there are multiple domains, if trainByDomain is activated you'll also have to filter out, at this point, the domains not related to the intents of your allow list.

Then you'll have to go to the NLP class to modify the process: https://github.com/axa-group/nlp.js/blob/master/packages/nlp/src/nlp.js#L460 so that when it calls the NluManager, the allow list is provided. Also, this NLP process must receive the allow list; I suggest using the context variable for this communication.

And well... then when you call nlp.process, you'll need something to orchestrate the contexts, the states and the allow list... Such a class does not exist in NLP.js.
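
To make the missing piece concrete, here is a minimal sketch of such an orchestrator. Everything in it is hypothetical: the class, its method names and the state names are invented for illustration, and `processFn` is a stand-in for whatever call ends up invoking the NLP process. It follows the suggestion above of carrying the allow list inside the context object.

```javascript
// Hypothetical orchestrator; no such class exists in NLP.js.
class ConversationOrchestrator {
  // processFn(locale, utterance, context) stands in for the NLP call.
  // allowListsByState maps an application state to the intents valid in it.
  constructor(processFn, allowListsByState, initialState = 'idle') {
    this.processFn = processFn;
    this.allowListsByState = allowListsByState;
    this.state = initialState;
  }

  setState(state) {
    this.state = state;
  }

  handle(locale, utterance) {
    // undefined means "no restriction" for states without an allow list.
    const allowList = this.allowListsByState[this.state];
    return this.processFn(locale, utterance, { allowList });
  }
}

// Stub that only echoes what it received, in place of a trained model.
const stubProcess = (locale, utterance, context) => ({ utterance, context });

const orchestrator = new ConversationOrchestrator(stubProcess, {
  awaitingConfirmation: ['answer.true', 'answer.false'],
});

orchestrator.setState('awaitingConfirmation');
console.log(orchestrator.handle('en', 'yes').context.allowList.length); // 2
```

How the state changes between turns (for example, after the bot asks a yes/no question) is left to the application, which is the part a chatbot framework would normally provide.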

Kind Regards,

jesus-seijas-sp avatar Aug 21 '20 10:08 jesus-seijas-sp

Thanks @jesus-seijas-sp for the detailed explanation. More generally, THANKS for your HUGE work on this project!

I agree that NLP domains and DialogFlow contexts are quite different things:

  • NLP domains are a policy to split a multi-domain intent classifier into small closed-domain classifiers. That "divide et impera" approach is great IMHO, and it differentiates NLP.js from competitors.

However, it seems to me that domains have been removed in the current NLP.js version 4. If so, why?

BTW, I proposed (privately, via mail) a CR to get a probability rank from the domain classifier. If it makes sense, I'd propose it again here as a CR.

  • DialogFlow contexts: these are a way to manage the states of a conversation in Dialogflow. BTW, I open-sourced my own small dialog manager, NaifJs, which is in a very rough alpha release (please forgive my bad JS programming skills), and I just thought about an integration with NLP.js:

Thanks giorgio

solyarisoftware avatar Sep 25 '20 12:09 solyarisoftware

@jesus-seijas-sp, thank you for the detailed explanation. I am using NLP.js and looking to add context support to my chatbot.

Let's say the code you suggested has been added to NLP.js.

Then the library will be able to filter out some intents based on the active context. E.g. if the active context is "A", then don't consider an intent with input context "B".

I have added a use case below. I would like to know if NLP.js can also help here.

Consider that I have added two intents in my bot:

  1. Input context: "A", Training Phrase: "hi, how are you?"
  2. Input context: NULL, Training Phrase: "hello, how are you doing?"

Now the user says "hello, how are you doing?" and the active context is "A". How do we choose which intent to trigger?

  • The 1st intent has the same input context "A" but is not an exact match (context matching score is high, NLP matching score is low).
  • The 2nd intent doesn't have an input context but is an exact match of the input (context matching score is low, NLP matching score is high).

What if someone wanted the result to take both scores into account? Is it possible to achieve this using NLP.js? Or can this be added to NLP.js, similar to what you have suggested?
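
One way to picture "considering both scores" is to scale the NLP score by a bonus when the intent's input context is active. The sketch below is purely illustrative: `rankWithContext`, the `inputContext` field, the boost value and the scores are all made up for this example, and nothing here is an NLP.js feature.

```javascript
// Hypothetical ranking, not an NLP.js feature: multiply the NLP score by a
// bonus when the intent's input context is currently active.
const CONTEXT_BOOST = 0.5; // arbitrary tuning value chosen for illustration

function rankWithContext(candidates, activeContexts) {
  const active = new Set(activeContexts);
  return candidates
    .map((c) => {
      const matches = c.inputContext !== null && active.has(c.inputContext);
      return { ...c, combined: c.score * (matches ? 1 + CONTEXT_BOOST : 1) };
    })
    .sort((a, b) => b.combined - a.combined);
}

// The two intents from the use case above, with made-up NLP scores.
const ranked = rankWithContext(
  [
    { intent: 'intent1', inputContext: 'A', score: 0.6 },
    { intent: 'intent2', inputContext: null, score: 1.0 },
  ],
  ['A'],
);

console.log(ranked[0].intent); // intent2
```

With these numbers the exact match still wins; a larger boost, or a closer NLP score for the first intent, would flip the result, so the choice of boost is the actual design decision.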

Edit: It looks like the allowList feature has already been added by you. Can you help with this issue: https://github.com/axa-group/nlp.js/issues/953

shubhamtibra avatar Jul 15 '21 04:07 shubhamtibra

Closing due to inactivity. Please, re-open if you think the topic is still alive.

aigloss avatar Dec 07 '22 09:12 aigloss