deepdive
Error running the smoke example
When running the "deepdive run" command, it reports:
2016-12-20 10:30:26.823709 ERROR: extra data after last expected column
2016-12-20 10:30:26.823776 CONTEXT: COPY person_has_cancer, line 1: "1 \N \N"
The schema of table person_has_cancer is:
Column    | Type    | Modifiers | Storage | Stats target | Description
----------+---------+-----------+---------+--------------+-------------
person_id | bigint  |           | plain   |              |
id        | bigint  |           | plain   |              |
label     | boolean |           | plain   |              |
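The "extra data after last expected column" error means a row had more fields than the COPY expected. A minimal sketch for locating such rows before loading, assuming the input is plain tab-separated text; the function name find_bad_rows and the expected_columns parameter are hypothetical and not part of DeepDive:

```python
def find_bad_rows(lines, expected_columns):
    """Return (line_number, field_count) for every row whose number of
    tab-separated fields differs from expected_columns."""
    bad = []
    for lineno, line in enumerate(lines, start=1):
        # Strip only the line terminator so that empty trailing fields are kept.
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) != expected_columns:
            bad.append((lineno, len(fields)))
    return bad


# Hypothetical usage; the expected count depends on the columns the COPY targets:
# with open("person_has_cancer.tsv") as f:
#     for lineno, count in find_bad_rows(f, expected_columns=2):
#         print(f"line {lineno}: {count} fields")
```

This only reports mismatches. Whether the data file or the table schema should change is a separate question.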
I removed the last column from both person_has_cancer.tsv and person_smoke.tsv. The above error disappeared, but the following error occurred:
2016-12-20 11:29:24.266355 + sampler-dw gibbs -w /dev/fd/63 -v /dev/fd/62 -f /dev/fd/61 -m factorgraph/meta -o weights -l 0 -i 1000 --alpha 0.01
2016-12-20 11:29:24.266373 ++ find -L factorgraph/factors -type f -exec pbzip2 -c -d -k '{}' +
2016-12-20 11:29:24.282508 pbzip2: *ERROR: File [factorgraph/variables/person_has_cancer/variables.part-2.bin.bz2] is NOT a valid bzip2! Skipping...
2016-12-20 11:29:24.282627 -------------------------------------------
2016-12-20 11:29:24.282648 pbzip2: *ERROR: File [factorgraph/variables/person_has_cancer/variables.part-3.bin.bz2] is NOT a valid bzip2! Skipping...
2016-12-20 11:29:24.282666 -------------------------------------------
2016-12-20 11:29:24.282910 pbzip2: *ERROR: File [factorgraph/variables/person_smokes/variables.part-2.bin.bz2] is NOT a valid bzip2! Skipping...
2016-12-20 11:29:24.283018 -------------------------------------------
2016-12-20 11:29:24.283069 pbzip2: *ERROR: File [factorgraph/variables/person_smokes/variables.part-3.bin.bz2] is NOT a valid bzip2! Skipping...
2016-12-20 11:29:24.283109 -------------------------------------------
2016-12-20 11:29:24.291100 PARSE ERROR:
2016-12-20 11:29:24.291156 Required argument missing: n_samples_per_learning_epoch
2016-12-20 11:29:24.291170
2016-12-20 11:29:24.291187 Brief USAGE:
2016-12-20 11:29:24.291267 sampler-dw gibbs [--learn_non_evidence] ... [--sample_evidence] ...
2016-12-20 11:29:24.291354 [-q] ... [--regularization
After adding "--n_samples_per_learning_epoch 3" to the parameter list, the program runs smoothly.
The data files, configuration, and documentation may need updating.
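The pbzip2 messages in the log above name files that it could not read as bzip2. A minimal sketch for listing such files with Python's standard bz2 module, assuming the files sit under a directory such as the factorgraph path seen in the log; the helper names is_valid_bz2 and find_invalid_bz2 are hypothetical, and the check may not match pbzip2's own validation exactly:

```python
import bz2
from pathlib import Path


def is_valid_bz2(path):
    """Return True if the file has a bzip2 header and decompresses fully."""
    try:
        with open(path, "rb") as raw:
            # Every bzip2 stream starts with the magic bytes "BZh".
            if raw.read(3) != b"BZh":
                return False
        with bz2.open(path, "rb") as stream:
            # Read in 1 MiB chunks so large files are not held in memory.
            while stream.read(1 << 20):
                pass
        return True
    except (OSError, EOFError):
        # OSError: invalid data stream; EOFError: truncated stream.
        return False


def find_invalid_bz2(root):
    """Return the sorted paths of all *.bz2 files under root that fail the check."""
    return sorted(str(p) for p in Path(root).rglob("*.bz2") if not is_valid_bz2(p))


# Hypothetical usage, run from the directory that contains factorgraph/:
# for path in find_invalid_bz2("factorgraph"):
#     print("not a valid bzip2 file:", path)
```

An empty or truncated file fails this check, which would be consistent with partially written variable files, though the log alone does not confirm the cause.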
@rudaoshi The person_has_cancer.id column indicates that the table was created by an old version of DeepDive (say, v0.8), and that schema is no longer compatible with the examples in the latest git repo. We are about to release the next version; until then, you can build from git by running make build; make install. The examples should then work. FYI, with the latest build, the schema is:
Table "public.person_has_cancer"
Column | Type | Modifiers
---------------+------------------+-----------
person_id | bigint |
dd_label | boolean |
dd_truthiness | double precision |