
Issues during checkpoints 2/3

Open Maxim-Karpov opened this issue 1 year ago • 2 comments

Hello, I've encountered 2 different issues when running Finder on 2 separate genomes.

INFO: Creating SIF file...
/softwares/FINDER/Finder/scripts/fixOverlappingAndMergedTranscripts.py:350: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
  coverage_info[transcript_id]["bed_cov"] = np.array( temp )
/softwares/FINDER/Finder/scripts/fixOverlappingAndMergedTranscripts.py:656: RuntimeWarning: invalid value encountered in double_scalars
  ratio2 = round( np.average( coverage_2nd_portion ) / np.average( coverage_3rd_portion ), 2 )
/softwares/FINDER/Finder/scripts/fixOverlappingAndMergedTranscripts.py:657: RuntimeWarning: invalid value encountered in double_scalars
  ratio3 = round( np.average( coverage_3rd_portion ) / np.average( coverage_2nd_portion ), 2 )
/softwares/FINDER/Finder/scripts/fixOverlappingAndMergedTranscripts.py:657: RuntimeWarning: divide by zero encountered in double_scalars
  ratio3 = round( np.average( coverage_3rd_portion ) / np.average( coverage_2nd_portion ), 2 )
/softwares/FINDER/Finder/scripts/fixOverlappingAndMergedTranscripts.py:656: RuntimeWarning: divide by zero encountered in double_scalars
  ratio2 = round( np.average( coverage_2nd_portion ) / np.average( coverage_3rd_portion ), 2 )
Traceback (most recent call last):
  File "/softwares/FINDER/Finder/finder", line 688, in <module>
    main()
  File "/softwares/FINDER/Finder/finder", line 649, in main
    orchestrateGeneModelPrediction( options, logger_proxy, logging_mutex )
  File "/softwares/FINDER/Finder/finder", line 491, in orchestrateGeneModelPrediction
    fixOverlappingAndMergedTranscripts( options, logger_proxy, logging_mutex )
  File "/softwares/FINDER/Finder/scripts/fixOverlappingAndMergedTranscripts.py", line 740, in fixOverlappingAndMergedTranscripts
    exon = list( map( int, exon.split( "-" ) ) )
ValueError: invalid literal for int() with base 10: '1e+05'

The exon coordinate arrives in scientific notation ('1e+05'), which int() cannot parse directly. I believe line 740 of fixOverlappingAndMergedTranscripts.py needs to be changed from exon = list( map( int, exon.split( "-" ) ) ) to exon = list( map( int, map( float, exon.split( "-" ) ) ) ) to fix this.
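The proposed change can be checked in isolation. This is a minimal sketch, assuming exon strings have the form start-end; parse_exon is a hypothetical helper written for illustration, not a function in FINDER.

```python
from typing import List


def parse_exon(exon: str) -> List[int]:
    """Parse 'start-end' into [start, end], tolerating scientific notation.

    Hypothetical helper (not part of FINDER). Going through float() first
    lets a value such as '1e+05' be converted, where int() alone fails.
    """
    return [int(float(part)) for part in exon.split("-")]


# The original expression fails on a coordinate in scientific notation.
try:
    list(map(int, "99000-1e+05".split("-")))
except ValueError as err:
    print("original parsing fails:", err)

# The float-then-int variant accepts both plain and scientific notation.
print(parse_exon("99000-1e+05"))
print(parse_exon("10-20"))
```

This only works around the symptom at parse time. It does not address wherever the coordinate is written out as 1e+05 upstream, which may be worth fixing as well.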

INFO: Creating SIF file...
cat: /home/Maxim/software/FINDER/output/ChrsSoftMask/alignments/SRR11184196_round3_SJ.out.tab: No such file or directory
/softwares/FINDER/Finder/scripts/fixOverlappingAndMergedTranscripts.py:350: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
  coverage_info[transcript_id]["bed_cov"] = np.array( temp )
Warning: couldn't find fasta record for 'ENA_OX243811_OX243811'!
Error: no genomic sequence available (check -g option!).
/softwares/FINDER/Finder/scripts/fixOverlappingAndMergedTranscripts.py:657: RuntimeWarning: divide by zero encountered in double_scalars
  ratio3 = round( np.average( coverage_3rd_portion ) / np.average( coverage_2nd_portion ), 2 )
/softwares/FINDER/Finder/scripts/fixOverlappingAndMergedTranscripts.py:656: RuntimeWarning: invalid value encountered in double_scalars
  ratio2 = round( np.average( coverage_2nd_portion ) / np.average( coverage_3rd_portion ), 2 )
/softwares/FINDER/Finder/scripts/fixOverlappingAndMergedTranscripts.py:657: RuntimeWarning: invalid value encountered in double_scalars
  ratio3 = round( np.average( coverage_3rd_portion ) / np.average( coverage_2nd_portion ), 2 )
/softwares/FINDER/Finder/scripts/fixOverlappingAndMergedTranscripts.py:656: RuntimeWarning: divide by zero encountered in double_scalars
  ratio2 = round( np.average( coverage_2nd_portion ) / np.average( coverage_3rd_portion ), 2 )
/softwares/FINDER/Finder/scripts/fixOverlappingAndMergedTranscripts.py:655: RuntimeWarning: divide by zero encountered in double_scalars
  ratio1 = round( np.average( coverage_2nd_portion ) / np.average( coverage_1st_portion ), 2 )
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/usr/lib/python3.8/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/usr/lib/python3.8/multiprocessing/pool.py", line 48, in mapstar
    return list(map(*args))
  File "/softwares/FINDER/Finder/scripts/removeRedundantTranscripts.py", line 22, in findSubsetTranscripts
    if transcripts_fasta[transcript_i] in transcripts_fasta[transcript_j]:
KeyError: 'ENA_OX243811_OX243811.1_0_covsplit.0'
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "/softwares/FINDER/Finder/finder", line 688, in <module>
    main()
  File "/softwares/FINDER/Finder/finder", line 649, in main
    orchestrateGeneModelPrediction( options, logger_proxy, logging_mutex )
  File "/softwares/FINDER/Finder/finder", line 500, in orchestrateGeneModelPrediction
    removeRedundantTranscripts( input_gtf_filename, output_gtf_filename, options )
  File "/softwares/FINDER/Finder/scripts/removeRedundantTranscripts.py", line 85, in removeRedundantTranscripts
    results = pool.map( findSubsetTranscripts, all_inputs )
  File "/usr/lib/python3.8/multiprocessing/pool.py", line 364, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/usr/lib/python3.8/multiprocessing/pool.py", line 771, in get
    raise self._value
KeyError: 'ENA_OX243811_OX243811.1_0_covsplit.0'

I haven't found a solution for this second issue. I hope you can patch both.
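The KeyError means that a transcript ID is being looked up in transcripts_fasta without having an entry there. This may be related to the earlier "couldn't find fasta record" warning, though that is only a guess. The rough diagnostic sketch below lists such IDs before the comparison step. The helper name and toy data are hypothetical and not taken from FINDER.

```python
from typing import Dict, Iterable, List


def find_missing_transcripts(
    transcript_ids: Iterable[str], transcripts_fasta: Dict[str, str]
) -> List[str]:
    """Return the transcript IDs that have no sequence in transcripts_fasta.

    Hypothetical diagnostic helper (not part of FINDER).
    """
    return [tid for tid in transcript_ids if tid not in transcripts_fasta]


# Toy data for illustration only: one transcript has a sequence, one does not.
toy_fasta = {"chr1.1_0": "ACGTACGT"}
toy_ids = ["chr1.1_0", "ENA_OX243811_OX243811.1_0_covsplit.0"]

print(find_missing_transcripts(toy_ids, toy_fasta))
```

If the missing IDs all belong to the contig named in the warning, the contig names in the genome FASTA and in the alignments would be the first thing to compare.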

On a side note, the empty PsiCLASS output and missing combined GFF files that other people have reported could be caused by unsplit paired-end reads. A single reads.fastq needs to be split into reads_1.fastq and reads_2.fastq, for example with the SRA Toolkit's fastq-dump and its --split-files option.

Maxim-Karpov, Mar 13 '23 19:03