nlp.js
How do I train the system for improving Language.guess and SentimentAnalyzer.getSentiment?
Hello,
I have been trying out the library and it works nicely. My main problems are that the sentiment is not always correct and, more importantly, that the language guesser fails fairly often. I read in other tickets that it's possible to train the system to improve the results for both.
How would it work?
As an example, say I have the following corpora:
- En: https://github.com/axa-group/nlp.js/blob/master/examples/13-languages/corpora/corpus-en.json
- It: https://github.com/axa-group/nlp.js/blob/master/examples/13-languages/corpora/corpus-it.json
And I have the following snippet as base code for language guessing:
const { Language } = require('node-nlp');
const language = new Language();
const guess = language.guess(
  'When the night has come And the land is dark And the moon is the only light we see',
);
console.log(guess[0]);
and the following for sentiment analysis:
const { SentimentAnalyzer } = require('node-nlp');
const sentiment = new SentimentAnalyzer({ language: 'en' });
sentiment
  .getSentiment('I like cats')
  .then(result => console.log(result));
How can I train them so that "I donot know" doesn't get guessed as "Italian", "It is wonderful" doesn't get guessed as "German", and the sentiment better matches the actual text?
Maybe it's not even possible to train it? 🤔