snips-nlu
Low probability. How to debug/improve?
I've been reading through all the issues I could find and the two most notable findings are:
That said, I'm not quite sure how to validate the results of `cross-val-metrics`.
I did read the wiki articles but still struggle to make sense of them. I do have `parsing_errors` (quite a lot, actually) but don't know how to improve the dataset based on them.
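To make the question more concrete, here is a minimal sketch of how I imagine the `parsing_errors` could be grouped to see which intents get confused with each other. It assumes each error is a dict with `expected_output` and `nlu_output` entries that each carry an `intent` with an `intentName`; I have not verified that shape against the metrics library, and the sample data and the `summarize_parsing_errors` helper are made up for illustration.

```python
from collections import Counter


def summarize_parsing_errors(parsing_errors):
    """Count (expected intent, predicted intent) pairs among parsing errors.

    Assumed (unverified) shape of each error:
    {"expected_output": {"intent": {"intentName": ...}},
     "nlu_output": {"intent": {"intentName": ...}}}
    """
    confusions = Counter()
    for error in parsing_errors:
        expected_intent = error.get("expected_output", {}).get("intent") or {}
        predicted_intent = error.get("nlu_output", {}).get("intent") or {}
        pair = (expected_intent.get("intentName"),
                predicted_intent.get("intentName"))
        confusions[pair] += 1
    # Most frequent confusions first.
    return confusions.most_common()


# Hypothetical sample data, not taken from my real dataset.
sample_errors = [
    {"expected_output": {"intent": {"intentName": "lightsOn"}},
     "nlu_output": {"intent": {"intentName": "tvOn"}}},
    {"expected_output": {"intent": {"intentName": "lightsOn"}},
     "nlu_output": {"intent": {"intentName": "tvOn"}}},
    {"expected_output": {"intent": {"intentName": "lightsOn"}},
     "nlu_output": {"intent": {"intentName": "lightsOn"}}},
]

for (expected, predicted), count in summarize_parsing_errors(sample_errors):
    print(f"{expected} -> {predicted}: {count}")
```

Pairs where both names match would presumably be slot errors rather than intent errors, so the pairs that differ are the ones where the training sentences of the two intents may be too similar.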
Removing sentences also did not help and I've been rather picky about what to add.
My dataset is an export from dialogflow (I wrote a converter script which supports intents and entities). Within dialogflow I made sure there are no validation errors, and the same query gives me a much higher confidence score than I get with snips. I assume the calculation approach is very different (though that is probably hard to tell due to the closed-source nature of dialogflow).
Here are some confidence results I get:
Dialogflow
Query | In dataset | Confidence |
---|---|---|
ceiling lights on | yes | 1 |
tv lights on | no | 0.79 |
tv on | no | 0.47 * |
* I only just now tried this query and I am not sure how to feel about that result ':D
Snips
Query | In dataset | Confidence |
---|---|---|
ceiling lights on | yes | 1 |
tv lights on | no | 0.45 |
tv on | no | 1 |
The second entry worries me. A confidence below `.5` is quite bad for a query so similar to one within the dataset. With a previous `.fit` it got as low as `.32`.
Here is:
I hope someone can help, and sorry if I forgot something important.