Jarvis
NLP with TensorFlow
Could I implement NLP with TensorFlow to make Jarvis more natural? For instance, make it so you do not have to type the exact name of a plugin every time you use it. I could perhaps also add a sarcasm or emotion detection system that other developers could tap into for future plugins.
Sure!
Just to point out: on the dev branch, evamy already implemented NLP using Snips: https://github.com/sukeesh/Jarvis/blob/dev/jarviscli/language/snips.py
The dev branch also introduces the concept of an exchangeable 'LanguageParser'. Basically it would look like this:
class LanguageParserTensorFlow:
    def train(self, plugins):
        # analyze all available plugins
        pass

    def identify_action(self, action: str):
        # return correct plugin for action
        pass
And in main (https://github.com/sukeesh/Jarvis/blob/dev/jarviscli/main.py#L17) choose which language parser you would like to use.
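As a minimal sketch of what the exchangeable-parser idea could look like in practice: everything below except the name `LanguageParserTensorFlow` is an assumption for illustration (the `get_name` method, the `EchoPlugin` class, and `build_parser` are hypothetical, not the actual Jarvis dev-branch API), but it shows the shape of "train on the available plugins, then resolve an input string to a plugin".

```python
# Hypothetical sketch: a trivial keyword-based LanguageParser and how main
# might select between parser implementations. Only the LanguageParser
# concept comes from the dev branch; the rest is illustrative.

class LanguageParserKeyword:
    def train(self, plugins):
        # map each plugin's name to the plugin itself
        self.lookup = {plugin.get_name(): plugin for plugin in plugins}

    def identify_action(self, action: str):
        # return the plugin whose name is the first word of the input,
        # or None if nothing matches
        tokens = action.split()
        if not tokens:
            return None
        return self.lookup.get(tokens[0])


class EchoPlugin:
    # stand-in for a real Jarvis plugin
    def get_name(self):
        return "say"


def build_parser():
    # in main, this is where you would pick the implementation,
    # e.g. LanguageParserTensorFlow() instead of LanguageParserKeyword()
    parser = LanguageParserKeyword()
    parser.train([EchoPlugin()])
    return parser


parser = build_parser()
plugin = parser.identify_action("say hello")
print(plugin.get_name())  # "say"
```

Because both classes expose the same `train` / `identify_action` surface, swapping the keyword lookup for a TensorFlow model would only change the line that constructs the parser.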
I hope this new dev feature helps. If you run into any issues, please let me know ;).
Got it! Thank you.
One thing to be careful of:
The parser does not have control over what is being input in terms of the string s given with the command. In its current form, Jarvis does not care what the input is, as long as the command can be matched at the start and the string s can be built from the remaining piece. For example, today the Snips NLU parser does a great job of identifying the command even if it is written backwards: if you write "hello say" instead of "say hello", it will still identify it as the "say" command, but it cannot then split the values into command = say and s = hello.
Does that make sense?
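To make that gap concrete, here is a toy illustration (not the actual Snips or Jarvis code; the command set and function names are invented): detecting which command appears in the input tolerates reordering, while splitting out the argument string s only works when the command comes first.

```python
# Toy illustration of the command-vs-argument problem described above.
# COMMANDS, identify_command, and split_command are all hypothetical.

COMMANDS = {"say", "weather", "calculate"}


def identify_command(text: str):
    # detection tolerates reordering: any known word anywhere matches
    for word in text.split():
        if word in COMMANDS:
            return word
    return None


def split_command(text: str):
    # strict split: only succeeds when the command is the first token,
    # mirroring how Jarvis currently builds s from the remaining piece
    tokens = text.split()
    if tokens and tokens[0] in COMMANDS:
        return tokens[0], " ".join(tokens[1:])
    return None, None


print(identify_command("hello say"))  # "say" -- found despite reordering
print(split_command("say hello"))     # ("say", "hello")
print(split_command("hello say"))     # (None, None) -- the split fails
```

The last line is exactly the failure mode described above: the command is identifiable, but command and s cannot be disentangled from reordered input.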
I think that is a bigger issue that needs to be solved before we come anywhere close to solving natural language inputs...
@antiDigest Yes, thank you for that.
Would it be okay to use this dataset to train the model? It is a portion of Project Gutenberg, a collection of public domain books. It might have to be stored in the repo. I have checked the README page and it looks to be free to use as long as its contents are not changed and it is not commercialized.
> Would it be okay to use this dataset to train the model?
Surprisingly complicated question. But to make things short; I guess: yes.
Longer answer: strictly speaking, "not changed and not commercialized" is incompatible with the GPL, which is the license of some dependencies. But I wouldn't consider everything a "derived work", since the trained model is data and you could use your own model trained with whatever data you want (?). Then again, I would probably prefer a separate repo.
Got it. Thanks.