nlp.js

How do I train the system for improving Language.guess and SentimentAnalyzer.getSentiment?

Open • AlbertoMeQ opened this issue 2 years ago • 1 comment

Hello,

I have been trying out the library and it works nicely. The main problems are that the sentiment is not always correct and, more importantly, that the language guesser fails quite often. I read in other tickets that it's possible to train the system to improve the results for both.

How would it work?

As an example, say I have the following corpora:

  • En: https://github.com/axa-group/nlp.js/blob/master/examples/13-languages/corpora/corpus-en.json
  • It: https://github.com/axa-group/nlp.js/blob/master/examples/13-languages/corpora/corpus-it.json
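To make the starting point concrete, here is a minimal sketch of how the sentences could be pulled out of such corpora as (locale, text) pairs. It assumes each corpus file has a top-level `locale` and a `data` array whose entries carry an `utterances` list; `sampleCorpus` and `corpusToSentences` are hypothetical names used only for illustration and are not part of nlp.js.

```javascript
// Hypothetical inline sample that mirrors the assumed layout of the linked
// corpus files (top-level locale, data array with intents and utterances).
const sampleCorpus = {
  name: 'Corpus',
  locale: 'en-US',
  data: [
    {
      intent: 'agent.age',
      utterances: ['your age', 'how old are you'],
      answers: ["I'm very young"],
    },
    { intent: 'agent.bad', utterances: ['you are bad'] },
  ],
};

// Flatten a corpus into { locale, text } pairs, one per utterance.
// The locale is reduced to its two-letter prefix ('en-US' -> 'en').
function corpusToSentences(corpus) {
  const locale = String(corpus.locale || '').slice(0, 2).toLowerCase();
  const entries = Array.isArray(corpus.data) ? corpus.data : [];
  return entries.flatMap((entry) =>
    (entry.utterances || []).map((text) => ({ locale, text }))
  );
}

const sentences = corpusToSentences(sampleCorpus);
console.log(sentences);
```

Whether the guesser or the sentiment analyzer can actually consume such pairs for training is exactly what I am unsure about.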

And I have the following snippet as base code for language guessing:

const { Language } = require('node-nlp');

const language = new Language();
const guess = language.guess(
  'When the night has come And the land is dark And the moon is the only light we see',
);
console.log(guess[0]);

and the following for sentiment analysis:

const { SentimentAnalyzer } = require('node-nlp');

const sentiment = new SentimentAnalyzer({ language: 'en' });
sentiment
    .getSentiment('I like cats')
    .then(result => console.log(result));

How could I train them so that "I donot know" isn't guessed as Italian, "It is wonderful" isn't guessed as German, and the sentiment matches the actual text better?

AlbertoMeQ • Jun 07 '22 08:06

Maybe it's not even possible to train it? 🤔

AlbertoMeQ • Jun 21 '22 17:06