deep-qa
Question on '/tmp/trec-merged.txt'
I have installed Theano and most of the other dependencies through WinPython. I then ran the first step from the README:
To build the required train/dev/test sets in the suitable format for the network run: $ sh run_build_datasets.sh
I ran it with "os.system('run_build_datasets.sh')" from the main directory of the project and got the following output:
jacana-qa-naacl2013-data-results/train.xml
outdir TRAIN
Traceback (most recent call last):
File "parse.py", line 178, in <module>
qids, questions, answers, labels = load_data(all_fname)
File "parse.py", line 15, in load_data
lines = open(fname).readlines()
IOError: [Errno 2] No such file or directory: '/tmp/trec-merged.txt'
Vocab size 17022
embeddings/aquaint+wiki.txt.gz.ndim=50.bin
vocab_size, layer1_size 2470719 50
. . . . . . . . . . . . . . . . . . . . . . . . . done
Words found in wor2vec embeddings 16201
ndim 50
Using zero vector as random
random_words_count 821
(17023L, 50L)
TRAIN\emb_aquaint+wiki.txt.gz.ndim=50.bin.npy
Vocab size 56952
embeddings/aquaint+wiki.txt.gz.ndim=50.bin
vocab_size, layer1_size 2470719 50
. . . . . . . . . . . . . . . . . . . . . . . . . done
Words found in wor2vec embeddings 51250
ndim 50
Using zero vector as random
random_words_count 5702
(56953L, 50L)
TRAIN-ALL\emb_aquaint+wiki.txt.gz.ndim=50.bin.npy
bash: make: command not found
bash: make: command not found
The first problem is the missing '/tmp/trec-merged.txt', which I could not find anywhere. What is '/tmp/trec-merged.txt'? Is it included in the downloaded zip file, or how do I create it?
The two lines of code right after /tmp/trec-merged.txt create it:
    files = ' '.join([train, dev, test])
    subprocess.call("/bin/cat {} > {}".format(files, all_fname), shell=True)
train, dev and test are the three files from the TREC data; they are merged into a single file, trec-merged.txt, in /tmp.
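The output above shows Windows-style paths, so it may be that /bin/cat or /tmp is not available in that environment and the merge never happens. Below is a minimal sketch of a portable alternative that does the same concatenation in pure Python. It assumes the same train, dev and test path variables; merge_files is a hypothetical helper, not part of deep-qa.

```python
import os
import shutil
import tempfile


def merge_files(paths, out_path):
    """Concatenate the files in `paths` into `out_path`, like `cat a b c > out`."""
    with open(out_path, "wb") as out:
        for path in paths:
            with open(path, "rb") as src:
                # Stream each input file into the output, in the order given.
                shutil.copyfileobj(src, out)
    return out_path


# Hypothetical usage in place of the /bin/cat call (names assumed from the thread):
# all_fname = os.path.join(tempfile.gettempdir(), "trec-merged.txt")
# merge_files([train, dev, test], all_fname)
```

Using tempfile.gettempdir() instead of a hard-coded /tmp lets Python pick a temporary directory that exists on the current platform.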