
Braker failing at GUSHR stage when using addUTR=on or UTR=on

Open bjhunt-git opened this issue 3 years ago • 8 comments

Hello, I was hoping I could get some advice regarding an error I encounter when trying to include UTR predictions, both as part of a full annotation run and when adding UTRs to a previous run. Braker (v2.1.6) runs fine without addUTR or UTR on, but fails while running gushr.py in either UTR mode.

The issue is that gushr.py expects a file called final_annotation.gff, which I think should be produced by the preceding GeMoMa AnnotationFinalizer command. The specified output directory exists and contains the files required for GeMoMa AnnotationFinalizer (complete_gemoma_like.gff3 etc.), but final_annotation.gff is not being created despite the command reporting a successful run (I have attached the gushr.log file; there is nothing in gushr.err). While searching for solutions I did encounter this issue, but on checking gushr.py we have the version with "score=NA" omitted. We are, however, using GeMoMa 1.6.4.

Any suggestions would be greatly appreciated.

gushr.log

bjhunt-git avatar Feb 02 '22 13:02 bjhunt-git

We are not ignoring your issue on purpose, but it will likely be months before I find time to address it. I am very sorry.

KatharinaHoff avatar Feb 02 '22 15:02 KatharinaHoff

Hello!

I just wanted to report that I am having the exact same issue. Otherwise Braker2 is running nicely.

Thanks! Dustin

DustinSokolowski avatar Apr 02 '22 21:04 DustinSokolowski

Same issue here.

Wen

kuowenhsi avatar May 10 '22 15:05 kuowenhsi

I encountered the same problem and resolved it. The cause is that GeMoMa runs out of Java heap space when it is run:

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at java.base/java.util.Arrays.copyOf(Arrays.java:3689)
    at java.base/java.util.ArrayList.grow(ArrayList.java:238)
    at java.base/java.util.ArrayList.grow(ArrayList.java:243)
    at java.base/java.util.ArrayList.add(ArrayList.java:486)
    at java.base/java.util.ArrayList.add(ArrayList.java:499)
    at projects.gemoma.GeMoMa.readCoverage(GeMoMa.java:606)
    at projects.gemoma.GeMoMa.readCoverage(GeMoMa.java:550)
    at projects.gemoma.GeMoMa.fill(GeMoMa.java:160)
    at projects.gemoma.AnnotationFinalizer.run(AnnotationFinalizer.java:486)
    at projects.gemoma.AnnotationFinalizer.run(AnnotationFinalizer.java:468)
    at de.jstacs.tools.ui.cli.CLI.run(CLI.java:427)
    at projects.gemoma.GeMoMa.main(GeMoMa.java:374)

Increasing the memory allocated to GeMoMa resolves the problem.

In my case:

GeMoMa -Xmx32g AnnotationFinalizer u=YES g=/storage1/fs1/kolsen/Active/Wen/braker2/braker_RNA/genome.fa a=complete_gemoma_like.gff3 i=introns.gff c=UNSTRANDED coverage_unstranded=coverage.bedgraph rename=NO

kuowenhsi avatar May 11 '22 06:05 kuowenhsi

I encountered the same issue. As suggested by @kuowenhsi, increasing the maximum memory allocation solved it! This technique is also mentioned in the GeMoMa documentation (http://www.jstacs.de/index.php/GeMoMa).

I modified these lines in gushr.py:

subprcs_args = [java, '-jar', jar, 'CLI', 'AnnotationFinalizer',
                    'u=YES', 'g=' + args.genome, 'a=' + gff3_file,
                    'i=' + intron_file, 'c=UNSTRANDED',
                    'coverage_unstranded=' + bed_graph, 'rename=NO',
                    'outdir=' + tmp_dir]

to

subprcs_args = [java, '-Xmx64G', '-jar', jar, 'CLI', 'AnnotationFinalizer',
                    'u=YES', 'g=' + args.genome, 'a=' + gff3_file,
                    'i=' + intron_file, 'c=UNSTRANDED',
                    'coverage_unstranded=' + bed_graph, 'rename=NO',
                    'outdir=' + tmp_dir]
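A hard-coded '-Xmx64G' will fail on machines with less RAM, so one option is to read the heap size from the environment. The following is only a minimal sketch of that idea, not part of gushr.py: the variable name GUSHR_JAVA_HEAP, the helper names heap_flag and finalizer_args, and the 16G default are all assumptions made for illustration.

```python
import os
import re


def heap_flag(default="16G", env_var="GUSHR_JAVA_HEAP"):
    """Return a JVM heap flag such as '-Xmx64G'.

    GUSHR_JAVA_HEAP is a hypothetical variable name; gushr.py and
    BRAKER do not read it. The 16G default is an arbitrary choice.
    """
    value = os.environ.get(env_var, default).strip()
    # This sketch only accepts a positive integer with a K, M or G suffix.
    if not re.fullmatch(r"[1-9][0-9]*[kKmMgG]", value):
        raise ValueError("invalid heap size: " + repr(value))
    return "-Xmx" + value


def finalizer_args(java, jar, genome, gff3_file, intron_file, bed_graph, tmp_dir):
    """Build the AnnotationFinalizer argument list with an explicit heap limit."""
    return [java, heap_flag(), '-jar', jar, 'CLI', 'AnnotationFinalizer',
            'u=YES', 'g=' + genome, 'a=' + gff3_file,
            'i=' + intron_file, 'c=UNSTRANDED',
            'coverage_unstranded=' + bed_graph, 'rename=NO',
            'outdir=' + tmp_dir]
```

With this in place, the heap size could be changed per run (for example by exporting GUSHR_JAVA_HEAP=32G before starting BRAKER) without editing the script again.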

tomomano avatar Oct 16 '22 16:10 tomomano

I am using the conda installation of BRAKER and have encountered the same error.

The original line from the conda-provided gushr.py that breaks is: subprcs_args = ['GeMoMa', java, '-jar', 'CLI', 'AnnotationFinalizer'....

The fix is to remove java, '-jar' and 'CLI', and to add the heap option: subprcs_args = ['GeMoMa', '-Xmx64G', 'AnnotationFinalizer'....
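The two install layouts discussed in this thread (a GeMoMa jar called through java, and the conda GeMoMa wrapper script) need different leading arguments. As a rough sketch, assuming the wrapper accepts -Xmx as reported by users here, a helper could pick the right prefix. The function name gemoma_prefix and its parameters are hypothetical and do not exist in gushr.py.

```python
import shutil


def gemoma_prefix(java="java", jar=None, heap="64G", wrapper="GeMoMa"):
    """Return the leading arguments for a GeMoMa call.

    If a jar path is given, call it through java with the CLI entry
    point; otherwise fall back to the conda wrapper script on PATH.
    """
    if jar is not None:
        return [java, "-Xmx" + heap, "-jar", jar, "CLI"]
    if shutil.which(wrapper) is None:
        raise FileNotFoundError(wrapper + " not found on PATH and no jar given")
    return [wrapper, "-Xmx" + heap]
```

The tool name and its key=value parameters would then be appended, e.g. gemoma_prefix(heap="32G") + ['AnnotationFinalizer', 'u=YES', ...].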

Good luck.

luciazifcakova avatar Jan 19 '23 02:01 luciazifcakova

I'm currently running into this issue too, so I appreciate the posted solutions, which I will be trying.

I am running braker.pl version 2.1.6 installed using conda.

The command:

braker.pl --species gfas_braker1 --genome=${ASM} --UTR=on --softmasking --stranded=+,- --bam=${FWD},${REV} --workingdir=02-braker-UTR --cores=16

The ERROR message printed to stderr from braker pipeline:

ERROR in file /central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/bin/braker.pl at line 10339

Failed not execute /central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/bin/python3 /central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/bin/gushr.py -b /central/groups/carnegie_poc/jurban/data/coral/combined-nanopore/annotation/canu_primary/01-braker1/01-map/Merged.forward.bam /central/groups/carnegie_poc/jurban/data/coral/combined-nanopore/annotation/canu_primary/01-braker1/01-map/Merged.reverse.bam -t /central/groups/carnegie_poc/jurban/data/coral/combined-nanopore/annotation/canu_primary/01-braker1/02-braker-UTR/augustus.hints.gtf -g /central/groups/carnegie_poc/jurban/data/coral/combined-nanopore/annotation/canu_primary/01-braker1/02-braker-UTR/genome.fa -o /central/groups/carnegie_poc/jurban/data/coral/combined-nanopore/annotation/canu_primary/01-braker1/02-braker-UTR/gushr -c 16 -s /central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/bin -a /central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/bin/ -j /central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/bin -q 2 > /central/groups/carnegie_poc/jurban/data/coral/combined-nanopore/annotation/canu_primary/01-braker1/02-braker-UTR/gushr.log 2> /central/groups/carnegie_poc/jurban/data/coral/combined-nanopore/annotation/canu_primary/01-braker1/02-braker-UTR/errors/gushr.err!

Code at line 10339:

 10333	    $cmdString .= "-t $in_gtf -g $otherfilesDir/genome.fa "
 10334	               .  "-o $out_stem -c $CPU -s $SAMTOOLS_PATH "
 10335	               .  "-a $AUGUSTUS_SCRIPTS_PATH -j $JAVA_PATH -q 2 "
 10336	               .  "> $otherfilesDir/gushr.log 2> $errorfilesDir/gushr.err";
 10337	    print LOG "\n$cmdString\n" if ( $v > 3 );
 10338	        system($cmdString) == 0 or die( "ERROR in file " . __FILE__
 10339	            . " at line " . __LINE__
 10340	            . "\nFailed not execute $cmdString!\n" );
 10341	}

The last bits in braker.log were:

#**********************************************************************************
#                               TRAINING AUGUSTUS UTR PARAMETERS                   
#**********************************************************************************
# Wed Mar  1 13:57:40 2023: Training AUGUSTUS UTR parameters
# Wed Mar  1 13:57:40 2023: Create backup of current species parameters:
cp /central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/config//species/gfas_braker1/gfas_braker1_exon_probs.pbl /central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/config//species/gfas_braker1/gfas_braker1_exon_probs.pbl.noUTR
cp /central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/config//species/gfas_braker1/gfas_braker1_igenic_probs.pbl /central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/config//species/gfas_braker1/gfas_braker1_igenic_probs.pbl.noUTR
cp /central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/config//species/gfas_braker1/gfas_braker1_intron_probs.pbl /central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/config//species/gfas_braker1/gfas_braker1_intron_probs.pbl.noUTR
# Wed Mar  1 13:57:40 2023: Running GUSHR...

/central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/bin/python3 /central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/bin/gushr.py -b /central/groups/carnegie_poc/jurban/data/coral/combined-nanopore/annotation/canu_primary/01-braker1/01-map/Merged.forward.bam /central/groups/carnegie_poc/jurban/data/coral/combined-nanopore/annotation/canu_primary/01-braker1/01-map/Merged.reverse.bam -t /central/groups/carnegie_poc/jurban/data/coral/combined-nanopore/annotation/canu_primary/01-braker1/02-braker-UTR/augustus.hints.gtf -g /central/groups/carnegie_poc/jurban/data/coral/combined-nanopore/annotation/canu_primary/01-braker1/02-braker-UTR/genome.fa -o /central/groups/carnegie_poc/jurban/data/coral/combined-nanopore/annotation/canu_primary/01-braker1/02-braker-UTR/gushr -c 16 -s /central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/bin -a /central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/bin/ -j /central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/bin -q 2 > /central/groups/carnegie_poc/jurban/data/coral/combined-nanopore/annotation/canu_primary/01-braker1/02-braker-UTR/gushr.log 2> /central/groups/carnegie_poc/jurban/data/coral/combined-nanopore/annotation/canu_primary/01-braker1/02-braker-UTR/errors/gushr.err

The contents of gushr.log (Failed to open file final_annotation.gff for reading!):

GeMoMa ERE m=./gushr-NDXOWEYYRLQX/rnaseq_merged.bam u=true c=true outdir=./gushr-NDXOWEYYRLQX/
Suceeded in executing command.
Trying to execute the following command with input from STDIN:
/central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/bin//gtf2gff.pl --out=./gushr-NDXOWEYYRLQX/complete.gff3 --gff3
Suceeded in executing command.
Done
Trying to execute the following command:
GeMoMa AnnotationFinalizer u=YES g=/central/groups/carnegie_poc/jurban/data/coral/combined-nanopore/annotation/canu_primary/01-braker1/02-braker-UTR/genome.fa a=./gushr-NDXOWEYYRLQX/complete_gemoma_like.gff3 i=./gushr-NDXOWEYYRLQX/introns.gff c=UNSTRANDED coverage_unstranded=./gushr-NDXOWEYYRLQX/coverage.bedgraph rename=NO outdir=./gushr-NDXOWEYYRLQX/
Suceeded in executing command.
Error in file /central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/bin/gushr.py at line 546: Failed to open file ./gushr-NDXOWEYYRLQX/final_annotation.gff for reading!
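The log above reports "Suceeded in executing command." for AnnotationFinalizer even though final_annotation.gff was never written, which hides the real cause. A minimal sketch of a stricter check is shown below; it is not how gushr.py's run_simple_process works, and the name run_and_require is made up for this example. It assumes Python 3.7 or newer.

```python
import os
import subprocess


def run_and_require(cmd, expected_file):
    """Run cmd and fail loudly unless expected_file was actually written."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    output = result.stdout + result.stderr
    # A JVM heap failure shows up as this exception name in the output.
    if "java.lang.OutOfMemoryError" in output:
        raise MemoryError("JVM ran out of heap space; try a larger -Xmx value:\n" + output)
    if result.returncode != 0:
        raise RuntimeError("command failed with exit code %d:\n%s"
                           % (result.returncode, output))
    # Even a clean exit is not trusted until the output file exists.
    if not os.path.isfile(expected_file):
        raise FileNotFoundError("command exited cleanly but did not create "
                                + expected_file)
    return expected_file
```

Checking for the output file as well as the exit code means the failure is reported at the GeMoMa step itself, rather than later as "Failed to open file ... for reading!".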

Here are the assumed problem lines in gushr.py (discussed above by other users):

    498 def add_utrs_to_gff3(gff3_file, bed_graph, intron_file):
    499     gff3_utr_file = tmp_dir + "final_annotation.gff"
    500     subprcs_args = ['GeMoMa', 'AnnotationFinalizer',
    501                     'u=YES', 'g=' + args.genome, 'a=' + gff3_file,
    502                     'i=' + intron_file, 'c=UNSTRANDED',
    503                     'coverage_unstranded=' + bed_graph, 'rename=NO',
    504                     'outdir=' + tmp_dir]
    505     run_simple_process(subprcs_args)
    506     return gff3_utr_file

I will be trying the following edit to line 500 (adding in '-Xmx64G',):

'GeMoMa', '-Xmx64G', 'AnnotationFinalizer'

I will report back the results.

Thanks again for the helpful discussion here.

JohnUrban avatar Mar 02 '23 15:03 JohnUrban

Some results.

I tested just re-running the GeMoMa command, which failed due to java.lang.OutOfMemoryError: Java heap space as expected:

GeMoMa AnnotationFinalizer u=YES g=/path/to/genome.fa a=./gushr-NDXOWEYYRLQX/complete_gemoma_like.gff3 i=./gushr-NDXOWEYYRLQX/introns.gff c=UNSTRANDED coverage_unstranded=./gushr-NDXOWEYYRLQX/coverage.bedgraph rename=NO outdir=./gushr-NDXOWEYYRLQX/

Adding the -Xmx with as little as 15G solved the problem (10G still failed):

GeMoMa -Xmx15G AnnotationFinalizer u=YES g=/path/to/genome.fa a=./gushr-NDXOWEYYRLQX/complete_gemoma_like.gff3 i=./gushr-NDXOWEYYRLQX/introns.gff c=UNSTRANDED coverage_unstranded=./gushr-NDXOWEYYRLQX/coverage.bedgraph rename=NO outdir=./gushr-NDXOWEYYRLQX/

...and of course it worked with larger values all the way up to 80G.
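The trial-and-error above could be automated. The sketch below is only an illustration under assumptions: smallest_working_heap is a made-up helper, and attempt stands for any function you supply that reruns the AnnotationFinalizer command with the given -Xmx value and returns True when final_annotation.gff is produced.

```python
def smallest_working_heap(attempt, candidates_gb):
    """Return the smallest heap size (in GB) for which attempt(size) is True.

    attempt is a caller-supplied callable; candidates are tried in
    ascending order and None is returned if every size fails.
    """
    for size in sorted(candidates_gb):
        if attempt(size):
            return size
    return None


if __name__ == "__main__":
    # Stand-in for a real GeMoMa run: pretend that 15 GB or more is enough.
    def fake_attempt(size_gb):
        return size_gb >= 15

    print(smallest_working_heap(fake_attempt, [5, 10, 15, 20, 40, 80]))
```

The required heap presumably depends on the genome and coverage data, so a value found for one dataset should not be assumed to carry over to another.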

JohnUrban avatar Mar 02 '23 16:03 JohnUrban