
NLP with Tensorflow

Open Tpool1 opened this issue 3 years ago • 7 comments

Could I implement NLP technology with Tensorflow to make Jarvis more natural? For instance, make it where you do not have to type the exact name of a plugin every time you use it. Also, I could perhaps even add like a sarcasm or emotion detection system that other developers could tap into for future plugins.

Tpool1 avatar May 23 '21 21:05 Tpool1

Sure!

Just would like to point out dev-branch. evamy did implement NLP using snips: https://github.com/sukeesh/Jarvis/blob/dev/jarviscli/language/snips.py

The dev branch also introduces the concept of an exchangeable 'LanguageParser'. Basically, it would look like this:

class LanguageParserTensorFlow:
    def train(self, plugins):
        # analyze all available plugins
        pass

    def identify_action(self, action: str):
        # return the correct plugin for the action
        pass

Then in main (https://github.com/sukeesh/Jarvis/blob/dev/jarviscli/main.py#L17), instantiate whichever language parser you would like to use.
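To make the interface concrete, here is a minimal toy implementation of the same 'LanguageParser' shape. The class name, the keyword-matching strategy, and the idea of passing plugin names as plain strings are all my assumptions for illustration, not Jarvis's actual API; a real TensorFlow-backed parser would replace the lookup in identify_action with a trained model.

```python
# Hypothetical sketch of an exchangeable LanguageParser (not Jarvis's real API).
class LanguageParserKeyword:
    """Toy parser: picks the plugin whose name appears anywhere in the input."""

    def train(self, plugins):
        # "Training" here just records the available plugin names.
        self.plugins = {name.lower(): name for name in plugins}

    def identify_action(self, action: str):
        # Return the first registered plugin whose name occurs in the input.
        for word in action.lower().split():
            if word in self.plugins:
                return self.plugins[word]
        return None
```

With this shape, main could construct LanguageParserKeyword or LanguageParserTensorFlow interchangeably, since both expose the same train/identify_action pair.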

I hope this new dev feature helps. If you run into any issues, please tell me ;).

pnhofmann avatar May 25 '21 07:05 pnhofmann

Got it! Thank you.

Tpool1 avatar May 25 '21 16:05 Tpool1

One thing to be careful of: the parser does not have control over what is passed in as the s value given with the command. In its current form, Jarvis does not care what the input is, as long as the command can be matched at the start and the string s can be built from the remaining piece.

For example, today the Snips NLU parser does a great job of identifying the command even if it is written backwards: if you write hello say instead of say hello, it will still identify it as the say command, but it cannot then split the values into command = say and s = hello.
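The prefix-based split described above can be sketched roughly like this (a hypothetical helper, not Jarvis's actual implementation), which shows why "hello say" defeats it even when the intent is recognized:

```python
# Hypothetical illustration of prefix-based command splitting.
def split_command(text: str, commands):
    """Split input into (command, s) only when the command comes first."""
    first, _, rest = text.partition(" ")
    if first in commands:
        return first, rest
    # An NLU parser might still recognize the intent in "hello say",
    # but the argument s can no longer be recovered by prefix matching.
    return None, text
```

So even a perfect intent classifier would still need a separate slot-filling step to extract s from free-form word order.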

Does that make sense ?

I think that is a bigger issue that needs to be solved before we come anywhere close to solving natural language inputs...

antiDigest avatar May 25 '21 19:05 antiDigest

@antiDigest Yes, thank you for that.

Tpool1 avatar May 25 '21 20:05 Tpool1

Would it be okay to use this dataset to train the model? It is a portion of Project Gutenberg, a collection of public domain books. It might have to be stored in the repo. I have checked the README page, and it appears to be free to use as long as its contents are not changed and not commercialized.

Tpool1 avatar May 27 '21 14:05 Tpool1

Would it be okay to use this dataset to train the model?

Surprisingly complicated question. But to make things short: I guess yes.

Longer answer: strictly speaking, "not changed and not commercialized" is incompatible with the GPL, which is the license of some dependencies. But I wouldn't consider everything a "derived work", since the trained model is data, and you could use your own model trained on whatever data you want (?). Then again, I would probably prefer a separate repo.

pnhofmann avatar May 30 '21 16:05 pnhofmann

Got it. Thanks.

Tpool1 avatar May 30 '21 22:05 Tpool1