
How can I get "vocab.txt"?

Open • danlutan opened this issue 7 years ago • 1 comment

How can I get "vocab.txt"? When I run the code, "make_dictionary()" creates a "vocab.txt", but the file contains only two words, "begin" and "end". As a result, in "parse_one_file()" the "map" call fails with "KeyError: '@entity133'".

Traceback (most recent call last):
  File "run.py", line 42, in <module>
    train.main(save_path, params)
  File "/home/root1/tld/ga-reader-master/ga-reader-master/train.py", line 27, in main
    data = dp.preprocess(dataset, no_training_set=False, use_chars=use_chars)
  File "/home/root1/tld/ga-reader-master/ga-reader-master/utils/DataPreprocessor.py", line 43, in preprocess
    training = self.parse_all_files(question_dir + "/training", dictionary, use_chars)
  File "/home/root1/tld/ga-reader-master/ga-reader-master/utils/DataPreprocessor.py", line 162, in parse_all_files
    questions =[ self.parse_one_file(f, dictionary, use_chars) + (f,) for f in all_files]
  File "/home/root1/tld/ga-reader-master/ga-reader-master/utils/DataPreprocessor.py", line 142, in parse_one_file
    qry_words = map(lambda w:w_dict[w], qry_raw)
  File "/home/root1/tld/ga-reader-master/ga-reader-master/utils/DataPreprocessor.py", line 142, in <lambda>
    qry_words = map(lambda w:w_dict[w], qry_raw)
KeyError: '@entity133'

danlutan • Sep 10 '17 11:09

make_dictionary() should add all the types in the dataset to the vocabulary. It is strange that you only get begin and end in it. Maybe you can try deleting the vocab.txt and rerunning, so that it creates a new vocabulary file?
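The delete-and-rerun step could be sketched as below. This is only an illustration: it assumes vocab.txt holds one entry per line, which this thread does not confirm, and both the helper name `remove_stale_vocab` and the `min_entries` threshold are hypothetical rather than part of ga-reader.

```python
import os


def remove_stale_vocab(vocab_path, min_entries=3):
    """Delete vocab_path if it has fewer than min_entries non-empty lines.

    Returns True if the file was deleted, False otherwise.
    Assumes one vocabulary entry per line (an assumed format).
    """
    if not os.path.exists(vocab_path):
        return False
    with open(vocab_path, encoding="utf-8") as f:
        n_entries = sum(1 for line in f if line.strip())
    if n_entries < min_entries:
        # A near-empty vocabulary is probably left over from a failed run,
        # so remove it and let the next run rebuild it.
        os.remove(vocab_path)
        return True
    return False
```

If the call returns True, rerunning training should cause the vocabulary file to be rebuilt from the question files.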

If that doesn't work, I would try putting a breakpoint in make_dictionary() and check if the question files are being read properly.
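As an alternative to a breakpoint, a small check can list the tokens that would trigger the KeyError. This is a sketch under assumptions: `find_missing_tokens` is a hypothetical helper, not part of ga-reader, and the toy data only imitates the `w_dict[w]` lookup shown in the traceback.

```python
def find_missing_tokens(tokens, w_dict):
    """Return tokens absent from w_dict, in first-seen order, without duplicates."""
    seen = set()
    missing = []
    for w in tokens:
        if w not in w_dict and w not in seen:
            seen.add(w)
            missing.append(w)
    return missing


# Toy data standing in for a nearly empty vocabulary and one query.
w_dict = {"begin": 0, "end": 1}
qry_raw = ["begin", "@entity133", "said", "@entity133", "end"]
print(find_missing_tokens(qry_raw, w_dict))
```

If almost every token comes back as missing, the dictionary was probably built before the question files were read, which points to a wrong data path or a stale vocab.txt.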

bdhingra • Sep 10 '17 16:09