TOGA
twobit sizes do not match
I am trying to run TOGA to transfer annotations from my well-annotated reference to several query genomes. However, on certain genomes I run into the following error:
Found 365 sequences in /home/lnunez/mendel-nas1/WGS/Cactus/Outputs/Diss/twobit/Thamnophis_elegans.2bit
Error! 2bit file: /home/lnunez/mendel-nas1/WGS/Cactus/Outputs/Diss/twobit/Thamnophis_elegans.2bit; chain_file: /home/lnunez/mendel-nas1/WGS/TOGA/Dissertation/Natrix_natrix/CM020096/temp/genome_alignment.chain Chromosome: WNA01000062.1; Sizes don't match! Size in twobit: 71766; size in chain: 71276
Traceback (most recent call last):
File "/home/lnunez/mendel-nas1/TOGA/", line 1600, in
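To list every sequence whose size disagrees, rather than only the one named in the error, the sizes recorded in the chain headers can be compared against the sizes stored in the 2bit file. The sketch below is a minimal Python illustration and not part of TOGA. It assumes a two-column name/size table has already been exported from the 2bit file (for example with UCSC's twoBitInfo), and the function names read_sizes, chain_sizes, and mismatches are hypothetical.

```python
from typing import Dict, Iterable, List, Optional, Tuple


def read_sizes(lines: Iterable[str]) -> Dict[str, int]:
    """Parse 'name<whitespace>size' lines (assumed exported from the 2bit file)."""
    sizes: Dict[str, int] = {}
    for line in lines:
        fields = line.split()
        if len(fields) >= 2:
            sizes[fields[0]] = int(fields[1])
    return sizes


def chain_sizes(lines: Iterable[str]) -> Tuple[Dict[str, int], Dict[str, int]]:
    """Collect reference (target) and query sizes from chain header lines.

    UCSC chain header layout:
    chain score tName tSize tStrand tStart tEnd qName qSize qStrand qStart qEnd id
    """
    ref: Dict[str, int] = {}
    query: Dict[str, int] = {}
    for line in lines:
        if line.startswith("chain "):
            f = line.split()
            ref[f[2]] = int(f[3])    # tName, tSize
            query[f[7]] = int(f[8])  # qName, qSize
    return ref, query


def mismatches(
    from_chain: Dict[str, int], from_twobit: Dict[str, int]
) -> List[Tuple[str, int, Optional[int]]]:
    """Return (name, chain_size, twobit_size) where the sizes differ.

    twobit_size is None when the sequence is absent from the 2bit table.
    """
    out: List[Tuple[str, int, Optional[int]]] = []
    for name, size in sorted(from_chain.items()):
        twobit_size = from_twobit.get(name)
        if twobit_size != size:
            out.append((name, size, twobit_size))
    return out
```

For a gzipped chain file, the lines can be read with gzip.open(path, "rt"). The reference sizes from chain_sizes would then be checked against the table for the reference 2bit, and the query sizes against the table for the query 2bit.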
WNA01000062.1 is one of 347 unplaced scaffolds in the reference. However, I am only interested in the 18 actual reference chromosomes. At first, I used the --limit_to_ref_chrom option to limit the runs to these specific chromosomes, like so:
./ "${path_to_chain}"/"${genome}.chain.gz" ${path_to_bed} "${path_to_2bit}"/"${ref}.2bit" "${path_to_2bit}"/"${genome}.2bit" --limit_to_ref_chrom ${chromosome} --kt --pn /home/lnunez/mendel-nas1/WGS/TOGA/Dissertation/"${genome}"/"${chromosome}" --nc ${path_to_nextflow_config_dir} --cb 10,100 --cjn 500
However, I still get the same error, even though the run is limited to a single chromosome. Is there a way to bypass this particular check that I am not seeing? I am in a time crunch, so I would greatly prefer not to regenerate the input files from scratch.
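One possible workaround that avoids regenerating the alignment is to pre-filter the chain file so it contains only chains on the reference chromosomes of interest. The Python sketch below illustrates that idea under an unconfirmed assumption: that the size check only looks at sequences that actually appear in the chain file. The function name keep_ref_chains and the set of wanted names are hypothetical.

```python
from typing import Iterable, Iterator, Set


def keep_ref_chains(lines: Iterable[str], wanted: Set[str]) -> Iterator[str]:
    """Yield only the chains whose reference name (tName) is in `wanted`.

    A chain is a header line starting with 'chain ', followed by its
    alignment block lines and a blank separator line. Anything before
    the first header (such as comment lines) is dropped.
    """
    keep = False
    for line in lines:
        if line.startswith("chain "):
            # Third field of the header is the reference (target) name.
            keep = line.split()[2] in wanted
        if keep:
            yield line
```

The filtered lines could be written to a new chain file and passed to TOGA in place of the original. This only removes chains on unwanted reference sequences. It does not correct a genuine size disagreement on a chromosome that is kept.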