cds_predict error

Open claumer opened this issue 2 years ago • 5 comments

Hello there,

I'm trying to use FINDER for the first time, on a Drosophila de novo assembly with locally stored RNA-seq data. Many parts of the pipeline seem to have worked when following the documented installation and run guidance (for instance, BRAKER has finished), but I hit an error in step 5 of the pipeline that I can't work out how to overcome:

INFO: Creating SIF file...
Traceback (most recent call last):
  File "/softwares/FINDER/Finder/finder", line 688, in <module>
    main()
  File "/softwares/FINDER/Finder/finder", line 665, in main
    findCDS( options, logger_proxy, logging_mutex )
  File "/softwares/FINDER/Finder/scripts/predictCDS.py", line 66, in findCDS
    fhr = open( options.output_assemblies_psiclass_terminal_exon_length_modified + "/combined/cds_predict/annotation.gtf", "r" )
FileNotFoundError: [Errno 2] No such file or directory: '/lustre/scratch116/tol/teams/team301/users/cl16/Drosophila/idDroSubo1_FINDER/assemblies_psiclass_modified/combined/cds_predict/annotation.gtf'

Indeed, if I go to /idDroSubo1_FINDER/assemblies_psiclass_modified/combined/cds_predict/ I see only:

-rw-r--r-- 1 cl16 team301 38219274 Jan 12 11:53 minus.fa
-rw-r--r-- 1 cl16 team301        0 Jan 12 11:53 ORFs_minus.gtf
-rw-r--r-- 1 cl16 team301        0 Jan 12 11:53 ORFs_plus.gtf

I also notice an error from running CodAn in the combined/ directory:

Traceback (most recent call last):
  File "/softwares/CODAN/CodAn-1.2/bin/codan.py", line 524, in <module>
    main()
  File "/softwares/CODAN/CodAn-1.2/bin/codan.py", line 506, in main
    codan_BOTH(options.transcripts, options.output_folder, options.model, options.cpu)
  File "/softwares/CODAN/CodAn-1.2/bin/codan.py", line 355, in codan_BOTH
    retrieveORF_BOTH(transcripts, outF+"minus.fa", outF)
  File "/softwares/CODAN/CodAn-1.2/bin/codan.py", line 147, in retrieveORF_BOTH
    record_dictP = SeqIO.index(transcripts, "fasta")
  File "/usr/lib/python3/dist-packages/Bio/SeqIO/__init__.py", line 979, in index
    return _IndexedSeqFileDict(
  File "/usr/lib/python3/dist-packages/Bio/File.py", line 350, in __init__
    raise ValueError("Duplicate key '%s'" % key)
ValueError: Duplicate key 'u000001431.46339_3_covsplit.0'
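
The ValueError above is raised by Biopython's `SeqIO.index`, which uses the first whitespace-delimited word of each FASTA header as the key and refuses repeated keys. To see which transcript IDs collide, something like the following could be run on the transcript FASTA that was handed to CodAn. This is only a minimal sketch using the standard library: the function name `duplicate_fasta_ids` and the file name in the usage comment are placeholders, not part of FINDER or CodAn, and which file CodAn actually read is an assumption to be checked against the run directory.

```python
from collections import Counter
from typing import Dict, Iterable


def duplicate_fasta_ids(lines: Iterable[str]) -> Dict[str, int]:
    """Return {record_id: count} for every FASTA ID seen more than once.

    The ID is taken as the first whitespace-delimited token after '>',
    matching how Biopython derives keys for FASTA records by default.
    """
    counts = Counter()
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            # A bare '>' header has no ID; count it under the empty string.
            counts[fields[0] if fields else ""] += 1
    return {rid: n for rid, n in counts.items() if n > 1}


# Usage (hypothetical file name; substitute the FASTA passed to CodAn):
# with open("transcripts.fa") as handle:
#     for rid, n in sorted(duplicate_fasta_ids(handle).items()):
#         print(rid, n, sep="\t")
```

An empty result would suggest the duplicate is introduced somewhere other than the file checked; a non-empty one lists the IDs that would trigger the same "Duplicate key" error.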

Can you please advise on what might have gone wrong here, and what should be done to fix it? Happy to provide any intermediate files needed to diagnose. I'm running on an HPC cluster using singularity 3.9.0.

Regards, Chris L

claumer, Jan 12 '22 12:01