
error in running smoke example

Open rudaoshi opened this issue 8 years ago • 3 comments

When running the "deepdive run" command, it reports:

2016-12-20 10:30:26.823709 ERROR: extra data after last expected column
2016-12-20 10:30:26.823776 CONTEXT: COPY person_has_cancer, line 1: "1 \N \N"

The schema of table person_has_cancer is:

  Column   |  Type   | Modifiers | Storage | Stats target | Description
-----------+---------+-----------+---------+--------------+-------------
 person_id | bigint  |           | plain   |              |
 id        | bigint  |           | plain   |              |
 label     | boolean |           | plain   |              |

rudaoshi • Dec 20 '16 02:12

I removed the last column of both person_has_cancer.tsv and person_smoke.tsv. The above error disappeared, but the following error occurs:
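For anyone who wants to reproduce that workaround, here is a minimal Python sketch of dropping the last column from a TSV file. It assumes the fields are tab-separated and that no field contains an embedded tab or newline; the function names `drop_last_column` and `drop_last_column_in_file` are made up for illustration and are not part of DeepDive. It only mirrors the manual edit described above, which later replies suggest is not the real fix.

```python
def drop_last_column(line: str) -> str:
    """Return a TSV line with its final tab-separated field removed."""
    body = line.rstrip("\n")
    fields = body.split("\t")
    # Leave single-field lines unchanged so that no row becomes empty.
    if len(fields) > 1:
        fields = fields[:-1]
    return "\t".join(fields)


def drop_last_column_in_file(src_path: str, dst_path: str) -> int:
    """Copy src_path to dst_path without the last column; return the row count."""
    rows = 0
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            dst.write(drop_last_column(line) + "\n")
            rows += 1
    return rows
```

Writing to a separate output path keeps the original data file intact, so the change is easy to undo.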

2016-12-20 11:29:24.266355 + sampler-dw gibbs -w /dev/fd/63 -v /dev/fd/62 -f /dev/fd/61 -m factorgraph/meta -o weights -l 0 -i 1000 --alpha 0.01
2016-12-20 11:29:24.266373 ++ find -L factorgraph/factors -type f -exec pbzip2 -c -d -k '{}' +
2016-12-20 11:29:24.282508 pbzip2: *ERROR: File [factorgraph/variables/person_has_cancer/variables.part-2.bin.bz2] is NOT a valid bzip2! Skipping...
2016-12-20 11:29:24.282627 -------------------------------------------
2016-12-20 11:29:24.282648 pbzip2: *ERROR: File [factorgraph/variables/person_has_cancer/variables.part-3.bin.bz2] is NOT a valid bzip2! Skipping...
2016-12-20 11:29:24.282666 -------------------------------------------
2016-12-20 11:29:24.282910 pbzip2: *ERROR: File [factorgraph/variables/person_smokes/variables.part-2.bin.bz2] is NOT a valid bzip2! Skipping...
2016-12-20 11:29:24.283018 -------------------------------------------
2016-12-20 11:29:24.283069 pbzip2: *ERROR: File [factorgraph/variables/person_smokes/variables.part-3.bin.bz2] is NOT a valid bzip2! Skipping...
2016-12-20 11:29:24.283109 -------------------------------------------
2016-12-20 11:29:24.291100 PARSE ERROR:
2016-12-20 11:29:24.291156 Required argument missing: n_samples_per_learning_epoch
2016-12-20 11:29:24.291170
2016-12-20 11:29:24.291187 Brief USAGE:
2016-12-20 11:29:24.291267 sampler-dw gibbs [--learn_non_evidence] ... [--sample_evidence] ...
2016-12-20 11:29:24.291354 [-q] ... [--regularization ] ... [-b
2016-12-20 11:29:24.291436 ] ... [-d ] ... [-p ] ...
2016-12-20 11:29:24.291527 [-a ] ... [-c ] ... [--burn_in ]
2016-12-20 11:29:24.291609 ... -i ... -s ... -l ... [-j
2016-12-20 11:29:24.291782 ] [-r ] [-o ] [-w ]
2016-12-20 11:29:24.291875 [-e ] [-f ] [-v ] [-m
2016-12-20 11:29:24.291955 ] [--] [--version] [-h]

rudaoshi • Dec 20 '16 03:12

After adding "--n_samples_per_learning_epoch 3" to the parameter list, the program runs smoothly.

The data files, config, and documentation may need updating.

rudaoshi • Dec 20 '16 03:12

@rudaoshi the person_has_cancer.id column indicates that the table was created by an old version of DeepDive (say, v0.8), whose schema is no longer compatible with the examples in the latest git repo. We are about to release the next version; until then, you can build from git by running make build; make install, after which the examples should work. FYI, with the latest build, here is the schema:

       Table "public.person_has_cancer"
    Column     |       Type       | Modifiers 
---------------+------------------+-----------
 person_id     | bigint           | 
 dd_label      | boolean          | 
 dd_truthiness | double precision | 
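
A quick way to tell which of the two schemas a table has is to compare its column names against the ones quoted in this thread. The sketch below is only an illustration: the column sets are taken from the two schema listings above, `classify_schema` is a hypothetical helper rather than anything DeepDive provides, and other DeepDive versions may use different column names.

```python
# Column names taken from the two schema listings in this thread (assumption:
# they are enough to tell the old layout from the new one).
LEGACY_COLUMNS = {"id", "label"}
CURRENT_COLUMNS = {"dd_label", "dd_truthiness"}


def classify_schema(columns):
    """Classify a list of column names as 'legacy', 'current', or 'unknown'."""
    names = set(columns)
    if CURRENT_COLUMNS <= names:
        return "current"
    if names & LEGACY_COLUMNS and not names & CURRENT_COLUMNS:
        return "legacy"
    return "unknown"


if __name__ == "__main__":
    print(classify_schema(["person_id", "id", "label"]))
    print(classify_schema(["person_id", "dd_label", "dd_truthiness"]))
```

The column list itself can come from wherever is convenient, for example the output of psql's \d command on the table.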

alldefector • Dec 20 '16 17:12