sempre
Not able to parse natural language
I have completed the installation and, following the tutorial, tried parsing utterances such as california, the golden state. Now we are trying to use the emnlp2013 grammar file with the command: ./run @mode=simple-freebase-nocache @sparqlserver=localhost:3001 -Grammar.inPaths freebase/data/emnlp2013.grammar. But I am not able to get any logical forms. Do we need to include a lexicon file as well, as was done in the tutorial? If yes, where do we find one? If not, please suggest something to help me move forward.
Thanks in advance.
The command ./pull-dependencies freebase should pull the lexicons to lib/fb_data/7/ (the unintuitive name is a bit unfortunate). Then use the following flags:
-UnaryLexicon.unaryLexiconFilePath lib/fb_data/7/unaryInfoStringAndAlignment.txt -BinaryLexicon.binaryLexiconFilesPath lib/fb_data/7/binaryInfoStringAndAlignment.txt
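Before passing those flags, it may help to confirm that the two lexicon files actually landed where expected. The following is only a minimal sketch: check_lexicons is a hypothetical helper name (not part of SEMPRE), and it assumes you run it from the root of the checkout so that the relative path lib/fb_data/7 resolves.

```shell
# Hypothetical helper (not part of SEMPRE): report whether the two
# lexicon files named in the flags above exist and are non-empty.
check_lexicons() {
  lex_dir="$1"
  status=0
  for name in unaryInfoStringAndAlignment.txt binaryInfoStringAndAlignment.txt; do
    if [ -s "$lex_dir/$name" ]; then
      echo "found: $lex_dir/$name"
    else
      echo "missing: $lex_dir/$name"
      status=1
    fi
  done
  return "$status"
}

# Assumes the current directory is the repository root.
check_lexicons "lib/fb_data/7" || echo "lexicons not found; try ./pull-dependencies freebase first"
```

If either file is reported missing, the lexicon flags would point at nothing, so rerunning the pull step is the first thing to try.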
If this does not work, modify the 'simple-freebase-nocache' mode in the run script (around line 496). Change sparqlOpts to freebaseOpts (which will load sparqlOpts and set a few other opts), and then comment out the two lines that set the lexicons to /dev/null:
addMode('simple-freebase-nocache', ....
...
freebaseOpts, # instead of sparqlOpts
...
# remove o('UnaryLexicon.unaryLexiconFilePath', '/dev/null') since freebaseOpts already sets the lexicon
# remove o('BinaryLexicon.binaryLexiconFilesPath', '/dev/null') likewise
...
nil) })
I tried the following command:
./run @mode=simple-freebase-nocache @sparqlserver=localhost:3001 -Grammar.inPaths freebase/data/emnlp2013.grammar -SimpleLexicon.inPaths freebase/data/tutorial-freebase.lexicon -UnaryLexicon.unaryLexiconFilePath lib/fb_data/7/unaryInfoStringAndAlignment.txt -BinaryLexicon.binaryLexiconFilesPath lib/fb_data/7/binaryInfoStringAndAlignment.txt
Is the grammar file that I added correct? I am still not able to get logical forms for the lexemes listed in the two files that you mentioned. Is there anything else that I need to add to the command, or another lexicon file? And for which natural language questions would I get the correct logical forms?
Thanks
After some digging, I think I found the issue. The two lexicon files are for unaries (e.g., state / city) and binaries (e.g., locatedIn), but not entities (e.g., California / Sacramento). Since the set of entities in Freebase is huge, a cache server is required for looking up entities. This is not available in the "simple-freebase-nocache" mode.
There are also two more missing arguments to the command:
-LanguageAnalyzer.languageAnalyzer corenlp.CoreNLPAnalyzer \
-Grammar.tags webquestions exact bridge join inject
The first loads the CoreNLP parser, which is required for some grammar rules with POS / NER tags. (It only works if you did ./pull-dependencies corenlp and ant corenlp first.) The second turns on the relevant "when" statements in the grammar file. I got this list of grammar tags from the run script (Line 271).
But running with the two additional options above will still give you an error when the Lexicon class is trying to access the cached entities. This is a bit beyond my knowledge of the repo, but I can try to dig for the answer later.
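For reference, here is a sketch of what the combined invocation would look like with every flag mentioned in this thread in one place. It only prints the command (a dry run) rather than executing it. build_sempre_cmd is a hypothetical helper name, the flag order is my assumption, and the -SimpleLexicon.inPaths option from the earlier attempt is left out. As noted above, actually running this is still expected to fail at the cached entity lookup.

```shell
# Hypothetical helper (not part of SEMPRE): print the combined command
# instead of running it, so the flags can be inspected first.
build_sempre_cmd() {
  lex_dir="${1:-lib/fb_data/7}"
  # printf reuses the '%s ' format for each argument in turn.
  printf '%s ' \
    ./run @mode=simple-freebase-nocache @sparqlserver=localhost:3001 \
    -Grammar.inPaths freebase/data/emnlp2013.grammar \
    -Grammar.tags webquestions exact bridge join inject \
    -LanguageAnalyzer.languageAnalyzer corenlp.CoreNLPAnalyzer \
    -UnaryLexicon.unaryLexiconFilePath "$lex_dir/unaryInfoStringAndAlignment.txt" \
    -BinaryLexicon.binaryLexiconFilesPath "$lex_dir/binaryInfoStringAndAlignment.txt"
  printf '\n'
}

build_sempre_cmd
```

Printing first makes it easy to spot a mistyped path or a missing flag before starting a long run.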