old Java version on run_test.sh micro ValueError: Chain results file TOGA/micro_test_out/temp/chain_results_df.tsv is empty! Abort

Open SomePersonSomeWhereInTheWorld opened this issue 1 year ago • 22 comments

Not really a bug, but running the micro test with an older version of Java produces the output below. Perhaps TOGA could detect the Java version more gracefully and/or print a warning or error message? A newer Java version works, of course.

#### STEP 4: Classify chains using gradient boosting model

Classifying chains
classify_chains: loaded dataframe of size 0
classify_chains: total number of transcripts: 0
classify_chains: 0 rows with spanning chains
classify_chains: filtered dataset contains 0 records
classify_chains: omputing additional features...
classify_chains: WARNING! The final df for classification is empty
classify_chains: df for single-exon model contains 0 records
classify_chains: df for multi-exon model contains 0 records
classify_chains: loading models at /path/to/me/TOGA/./models/se_model.dat (SE) and /path/to/me/TOGA/./models/me_model.dat (ME)
classify_chains: applying models to SE and ME datasets...
classify_chains: applying -1.0 score to the spanning chains
classify_chains: applying -2.0 score to the processed pseudogene alignments
classify_chains: number of processed pseudogene alignments: 0
classify_chains: arranging the final output
classify_chains: classification result stats:
* orthologs: 0
* paralogs: 0
* spanning chains: 0
* processed pseudogenes: 0
classify_chains: using 0.5 as a threshold to separate orthologs from paralogs
classify_chains: combining results for 0 individual transcripts
classify_chains: saving the classification to /path/to/me/TOGA/micro_test_out/temp/trans_to_chain_classes.tsv
classify_chains: found no classifiable chains for 0 transcripts
classify_chains: saving these transcripts to: /path/to/me/TOGA/micro_test_out/temp/rejected/classify_chains_rejected.txt
Chain results file /path/to/me/TOGA/micro_test_out/temp/chain_results_df.tsv is empty! Abort.
Traceback (most recent call last):
  File "/path/to/me/TOGA/./toga.py", line 1742, in <module>
    main()
  File "/path/to/me/TOGA/./toga.py", line 1738, in main
    toga_manager.run()
  File "/path/to/me/TOGA/./toga.py", line 621, in run
    self.__classify_chains()
  File "/path/to/me/TOGA/./toga.py", line 847, in __classify_chains
    check_chains_classified(self.chain_results_df)
  File "/path/to/me/TOGA/modules/sanity_check_functions.py", line 169, in check_chains_classified
    raise ValueError(msg)
ValueError: Chain results file /path/to/me/TOGA/micro_test_out/temp/chain_results_df.tsv is empty! Abort
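
The Java version check suggested above could look roughly like the following. This is only a sketch, not code from TOGA: `parse_java_major`, `check_java` and `MIN_JAVA_MAJOR` are hypothetical names, and the minimum version of 11 is an assumption, since this issue does not say which Java version is actually required. It relies on `java -version` printing a line such as `openjdk version "11.0.2"` or `java version "1.8.0_292"`.

```python
import re
import shutil
import subprocess
import sys

# Assumption: the real minimum Java version is not stated in this issue.
MIN_JAVA_MAJOR = 11


def parse_java_major(version_output):
    """Return the major Java version found in `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    major = int(match.group(1))
    # Legacy scheme: "1.8.0_292" means Java 8.
    if major == 1 and match.group(2) is not None:
        return int(match.group(2))
    return major


def check_java(min_major=MIN_JAVA_MAJOR):
    """Exit with a readable message if Java is missing or too old."""
    if shutil.which("java") is None:
        sys.exit("Error: java executable not found on PATH")
    # `java -version` normally writes to stderr; read both streams to be safe.
    proc = subprocess.run(["java", "-version"], capture_output=True, text=True)
    major = parse_java_major(proc.stderr + proc.stdout)
    if major is None:
        print("Warning: could not determine the Java version", file=sys.stderr)
    elif major < min_major:
        sys.exit(f"Error: found Java {major}, but {min_major} or newer is required")
```

Calling `check_java()` once at startup would stop the run with a clear message instead of failing later with an empty chain results file.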

Edit: I also get this error from the second test on the page:

./toga.py test_input/hg38.mm10.chr11.chain test_input/hg38.genCode27.chr11.bed test_input/hg38.2bit test_input/mm10.2bit --kt --pn test -i supply/hg38.wgEncodeGencodeCompV34.isoforms.txt --nc /path/to/me/TOGA/nextflow_config_files --cb 3,5 --cjn 500 --u12 supply/hg38.U12sites.tsv --ms

#### STEP 4: Classify chains using gradient boosting model

Classifying chains
classify_chains: loaded dataframe of size 0
classify_chains: total number of transcripts: 0
classify_chains: 0 rows with spanning chains
classify_chains: filtered dataset contains 0 records
classify_chains: omputing additional features...
classify_chains: WARNING! The final df for classification is empty
classify_chains: df for single-exon model contains 0 records
classify_chains: df for multi-exon model contains 0 records
classify_chains: loading models at /path/to/me/TOGA/./models/se_model.dat (SE) and /path/to/me/TOGA/./models/me_model.dat (ME)
classify_chains: applying models to SE and ME datasets...
classify_chains: applying -1.0 score to the spanning chains
classify_chains: applying -2.0 score to the processed pseudogene alignments
classify_chains: number of processed pseudogene alignments: 0
classify_chains: arranging the final output
classify_chains: classification result stats:
* orthologs: 0
* paralogs: 0
* spanning chains: 0
* processed pseudogenes: 0
classify_chains: using 0.5 as a threshold to separate orthologs from paralogs
classify_chains: combining results for 0 individual transcripts
classify_chains: saving the classification to /path/to/me/TOGA/test/temp/trans_to_chain_classes.tsv
classify_chains: found no classifiable chains for 0 transcripts
classify_chains: saving these transcripts to: /path/to/me/TOGA/test/temp/rejected/classify_chains_rejected.txt
Chain results file /path/to/me/TOGA/test/temp/chain_results_df.tsv is empty! Abort.
Traceback (most recent call last):
  File "/path/to/me/TOGA/./toga.py", line 1742, in <module>
    main()
  File "/path/to/me/TOGA/./toga.py", line 1738, in main
    toga_manager.run()
  File "/path/to/me/TOGA/./toga.py", line 621, in run
    self.__classify_chains()
  File "/path/to/me/TOGA/./toga.py", line 847, in __classify_chains
    check_chains_classified(self.chain_results_df)
  File "/path/to/me/TOGA/modules/sanity_check_functions.py", line 169, in check_chains_classified
    raise ValueError(msg)
ValueError: Chain results file /path/to/me/TOGA/test/temp/chain_results_df.tsv is empty! Abort.

Contents of TOGA/test/temp/chain_results_df.tsv (header line only):
gene	gene_overs	chain	synt	gl_score	gl_exo	chain_len	exon_qlen	loc_exo	exon_cover	intr_cover	gene_len	ex_num	ex_fract	intr_fract	flank_cov
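
A file like this, with a header but no data rows, can be detected early with a few lines of Python. The following is a minimal sketch under the assumption that the file is tab-separated with exactly one header line; `count_data_rows` is a hypothetical helper and is not TOGA's `check_chains_classified`.

```python
def count_data_rows(tsv_path):
    """Count non-blank lines after the header line of a TSV file."""
    with open(tsv_path) as handle:
        next(handle, None)  # skip the header line, if there is one
        return sum(1 for line in handle if line.strip())
```

For the file shown here the function would return 0, which a caller could turn into a message that points at the upstream step rather than at the classification itself.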