BRAKER
Braker3/GeneMark-ETP: file not found: complete.gtf, complete.id, complete_uniq.gtf
Hello,
Thank you for all the great tools coming from this team.
I gave Braker3 a shot but am currently running into an error. Below I report how I installed and ran Braker3, in case it helps reproduce the error.
I would be grateful for any guidance you can provide, and am eager to get Braker3 working in the near future, but I fully understand that you are busy. I am mainly reporting this issue in case it helps your development.
First, here is the command I used.
braker.pl --genome=${ASM} --UTR=on --stranded=+,- --bam=${FWD},${REV} --prot_seq=${PROTEINS} --workingdir=braker3 --threads=16
Second, here are the errors as reported.
This was reported to stdout/stderr.
# Fri Feb 24 08:59:35 2023: Creating directory /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3.
# Fri Feb 24 08:59:35 2023:Both protein and RNA-Seq libraries in input detected. BRAKER will be executed in ETP mode.
#*********
# Fri Feb 24 08:59:38 2023: Log information is stored in file /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/braker.log
#*********
# WARNING: Detected whitespace in fasta header of file /central/groups/carnegie_poc/jurban/software/braker2/protein/gfas1-and-hexacorallia-and-metazoan-proteins-orthoDB.fasta. This may later on cause problems! The pipeline will create a new file without spaces or "|" characters and a genome_header.map file to look up the old and new headers. This message will be suppressed from now on!
#*********
ERROR in file /home/jurban/software/braker2/braker3/BRAKER/scripts/braker.pl at line 5486
Failed to execute: /central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/bin/perl /central/groups/carnegie_poc/jurban/software/braker2/braker3/deps/genemark-etp/GeneMark-ETP/bin/gmetp.pl --cfg /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/GeneMark-ETP/etp_config.yaml --workdir /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/GeneMark-ETP --bam /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/GeneMark-ETP/etp_data/ --cores 16 --softmask 1>/central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/errors/GeneMark-ETP.stdout 2>/central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/errors/GeneMark-ETP.stderr
Failed to execute: /central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/bin/perl /central/groups/carnegie_poc/jurban/software/braker2/braker3/deps/genemark-etp/GeneMark-ETP/bin/gmetp.pl --cfg /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/GeneMark-ETP/etp_config.yaml --workdir /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/GeneMark-ETP --bam /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/GeneMark-ETP/etp_data/ --cores 16 --softmask 1>/central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/errors/GeneMark-ETP.stdout 2>/central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/errors/GeneMark-ETP.stderr
The most common problem is an expired or not present file ~/.gm_key!
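The whitespace warning in the output above says the pipeline rewrites the protein FASTA headers itself. For anyone who prefers to clean the file before the run, here is a minimal sketch. `clean_fasta_headers` is a hypothetical helper, not BRAKER's own code, and replacing each run of whitespace or "|" with an underscore is my assumption; the warning does not say which replacement BRAKER uses.

```shell
#!/bin/sh
# Hypothetical helper (not part of BRAKER): read FASTA on stdin and, on header
# lines only (those starting with ">"), replace each run of whitespace or "|"
# with a single underscore. Sequence lines are passed through unchanged.
clean_fasta_headers() {
    sed -E '/^>/ s/[[:space:]|]+/_/g'
}

# Example call (file names are placeholders):
# clean_fasta_headers < proteins.fasta > proteins.clean.fasta
```

Distinct headers can collapse to the same string after cleaning, so it may be worth checking the result for duplicates.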
This is from braker.log
#**********************************************************************************
# BRAKER CONFIGURATION
#**********************************************************************************
# BRAKER CALL: /home/jurban/software/braker2/braker3/BRAKER/scripts/braker.pl --genome=/central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/data/toy/longest.fa.masked --UTR=on --stranded=+,- --bam=/central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/data/toy/forward.bam,/central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/data/toy/reverse.bam --prot_seq=/central/groups/carnegie_poc/jurban/software/braker2/protein/gfas1-and-hexacorallia-and-metazoan-proteins-orthoDB.fasta --workingdir=braker3 --threads=16
# Fri Feb 24 08:59:35 2023: braker.pl version 3.0.0
# Fri Feb 24 08:59:35 2023: Creating directory /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3.
# Fri Feb 24 08:59:35 2023: Creating directory /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3.
# Fri Feb 24 08:59:35 2023:Both protein and RNA-Seq libraries in input detected. BRAKER will be executed in ETP mode.
#*********
# Fri Feb 24 08:59:35 2023: Configuring of BRAKER for using external tools...
# Fri Feb 24 08:59:35 2023: Trying to set $AUGUSTUS_CONFIG_PATH...
# Fri Feb 24 08:59:35 2023: Found environment variable $AUGUSTUS_CONFIG_PATH.
# Fri Feb 24 08:59:35 2023: Checking /central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/config/ as potential path for $AUGUSTUS_CONFIG_PATH.
# Fri Feb 24 08:59:35 2023: Success! Setting $AUGUSTUS_CONFIG_PATH to /central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/config/!
# Fri Feb 24 08:59:35 2023: Trying to set $AUGUSTUS_BIN_PATH...
# Fri Feb 24 08:59:35 2023: Found environment variable $AUGUSTUS_BIN_PATH.
# Fri Feb 24 08:59:35 2023: Checking /central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/bin/ as potential path for $AUGUSTUS_BIN_PATH.
# Fri Feb 24 08:59:35 2023: Success! Setting $AUGUSTUS_BIN_PATH to /central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/bin/!
# Fri Feb 24 08:59:35 2023: Trying to set $AUGUSTUS_SCRIPTS_PATH...
# Fri Feb 24 08:59:35 2023: Found environment variable $AUGUSTUS_SCRIPTS_PATH.
# Fri Feb 24 08:59:35 2023: Checking /central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/bin/ as potential path for $AUGUSTUS_SCRIPTS_PATH.
# Fri Feb 24 08:59:35 2023: Success! Setting $AUGUSTUS_SCRIPTS_PATH to /central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/bin/!
# Fri Feb 24 08:59:35 2023: Trying to set $PYTHON3_PATH...
# Fri Feb 24 08:59:35 2023: Did not find environment variable $PYTHON3_PATH.
# Fri Feb 24 08:59:35 2023: Trying to guess PYTHON3_PATH from location of python3 executable that is available in your $PATH
# Fri Feb 24 08:59:35 2023: Checking /central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/bin as potential path for $PYTHON3_PATH.
# Fri Feb 24 08:59:35 2023: Success! Setting $PYTHON3_PATH to /central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/bin!
# Fri Feb 24 08:59:35 2023: Trying to set $JAVA_PATH...
# Fri Feb 24 08:59:35 2023: Did not find environment variable $JAVA_PATH.
# Fri Feb 24 08:59:35 2023: Trying to guess JAVA_PATH from location of java executable that is available in your $PATH
# Fri Feb 24 08:59:35 2023: Checking /central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/bin as potential path for $JAVA_PATH.
# Fri Feb 24 08:59:35 2023: Success! Setting $JAVA_PATH to /central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/bin!
# Fri Feb 24 08:59:36 2023: Trying to set $GUSHR_PATH...
# Fri Feb 24 08:59:36 2023: Did not find environment variable $GUSHR_PATH.
# Fri Feb 24 08:59:36 2023: Trying to guess GUSHR_PATH from location of gushr.py executable that is available in your $PATH
# Fri Feb 24 08:59:36 2023: Checking /central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/bin as potential path for $GUSHR_PATH.
# Fri Feb 24 08:59:36 2023: Success! Setting $GUSHR_PATH to /central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/bin!
# Fri Feb 24 08:59:36 2023: Trying to set $GENEMARK_PATH...
# Fri Feb 24 08:59:36 2023: Did not find environment variable $GENEMARK_PATH.
# Fri Feb 24 08:59:36 2023: Trying to guess GENEMARK_PATH from location of gmetp.pl executable that is available in your $PATH
# Fri Feb 24 08:59:36 2023: Checking /central/groups/carnegie_poc/jurban/software/braker2/braker3/deps/genemark-etp/GeneMark-ETP/bin as potential path for $GENEMARK_PATH.
# Fri Feb 24 08:59:36 2023: Success! Setting $GENEMARK_PATH to /central/groups/carnegie_poc/jurban/software/braker2/braker3/deps/genemark-etp/GeneMark-ETP/bin!
# Fri Feb 24 08:59:36 2023: Trying to set $BAMTOOLS_PATH...
# Fri Feb 24 08:59:36 2023: Did not find environment variable $BAMTOOLS_PATH.
# Fri Feb 24 08:59:36 2023: Trying to guess BAMTOOLS_PATH from location of bamtools executable that is available in your $PATH
# Fri Feb 24 08:59:36 2023: Checking /central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/bin as potential path for $BAMTOOLS_PATH.
# Fri Feb 24 08:59:36 2023: Success! Setting $BAMTOOLS_PATH to /central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/bin!
# Fri Feb 24 08:59:36 2023: Trying to set $SAMTOOLS_PATH...
# Fri Feb 24 08:59:36 2023: Did not find environment variable $SAMTOOLS_PATH.
# Fri Feb 24 08:59:36 2023: Trying to guess SAMTOOLS_PATH from location of samtools executable that is available in your $PATH
# Fri Feb 24 08:59:36 2023: Checking /central/groups/carnegie_poc/jurban/software/braker2/braker3/deps/genemark-etp/GeneMark-ETP/tools as potential path for $SAMTOOLS_PATH.
# Fri Feb 24 08:59:36 2023: Success! Setting $SAMTOOLS_PATH to /central/groups/carnegie_poc/jurban/software/braker2/braker3/deps/genemark-etp/GeneMark-ETP/tools!
# Fri Feb 24 08:59:36 2023: Trying to set $DIAMOND_PATH...
# Fri Feb 24 08:59:36 2023: Did not find environment variable $DIAMOND_PATH.
# Fri Feb 24 08:59:36 2023: Trying to guess DIAMOND_PATH from location of diamond executable that is available in your $PATH
# Fri Feb 24 08:59:36 2023: Checking /central/groups/carnegie_poc/jurban/software/braker2/braker3/deps/genemark-etp/GeneMark-ETP/tools as potential path for $DIAMOND_PATH.
# Fri Feb 24 08:59:36 2023: Success! Setting $DIAMOND_PATH to /central/groups/carnegie_poc/jurban/software/braker2/braker3/deps/genemark-etp/GeneMark-ETP/tools!
# Fri Feb 24 08:59:36 2023: Trying to set $PROTHINT_PATH...
# Fri Feb 24 08:59:36 2023: Did not find environment variable $PROTHINT_PATH.
# Fri Feb 24 08:59:36 2023: Trying to guess PROTHINT_PATH from location of prothint.py executable that is available in your $PATH
# Fri Feb 24 08:59:36 2023: Checking /central/groups/carnegie_poc/jurban/software/braker2/deps/prothint/ProtHint-2.6.0/bin as potential path for $PROTHINT_PATH.
# Fri Feb 24 08:59:36 2023: Success! Setting $PROTHINT_PATH to /central/groups/carnegie_poc/jurban/software/braker2/deps/prothint/ProtHint-2.6.0/bin!
# Fri Feb 24 08:59:36 2023: Trying to set $TSEBRA_PATH...
# Fri Feb 24 08:59:36 2023: Did not find environment variable $TSEBRA_PATH.
# Fri Feb 24 08:59:36 2023: Trying to guess TSEBRA_PATH from location of tsebra.py executable that is available in your $PATH
# Fri Feb 24 08:59:36 2023: Checking /central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/bin as potential path for $TSEBRA_PATH.
# Fri Feb 24 08:59:36 2023: Success! Setting $TSEBRA_PATH to /central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/bin!
# Fri Feb 24 08:59:36 2023: Trying to set $CDBTOOLS_PATH...
# Fri Feb 24 08:59:36 2023: Did not find environment variable $CDBTOOLS_PATH.
# Fri Feb 24 08:59:36 2023: Trying to guess CDBTOOLS_PATH from location of cdbfasta executable that is available in your $PATH
# Fri Feb 24 08:59:36 2023: Checking /central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/bin as potential path for $CDBTOOLS_PATH.
# Fri Feb 24 08:59:36 2023: Success! Setting $CDBTOOLS_PATH to /central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/bin!
#*********
# IMPORTANT INFORMATION: no species for identifying the AUGUSTUS parameter set that will arise from this BRAKER run was set. BRAKER will create an AUGUSTUS parameter set with name Sp_1. This parameter set can be used for future BRAKER/AUGUSTUS prediction runs for the same species. It is usually not necessary to retrain AUGUSTUS with novel extrinsic data if a high quality parameter set already exists.
#*********
#**********************************************************************************
# CREATING DIRECTORY STRUCTURE
#**********************************************************************************
# Fri Feb 24 08:59:38 2023: creating file that contains citations for this BRAKER run at /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/what-to-cite.txt...
# Fri Feb 24 08:59:38 2023: create working directory /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/GeneMark-ETP.
mkdir /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/GeneMark-ETP
# Fri Feb 24 08:59:38 2023: create working directory /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/species
mkdir /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/species
# Fri Feb 24 08:59:38 2023: create working directory /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/errors
mkdir /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/errors
# Fri Feb 24 08:59:38 2023: changing into working directory /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3
cd /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3
# Fri Feb 24 08:59:38 2023: getting GC content of the genome
/central/groups/carnegie_poc/jurban/software/braker2/braker3/BRAKER/scripts/get_gc_content.py --sequences /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/data/toy/longest.fa.masked --print_sequence_length 1> /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/gc_content.out 2> /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/errors/gc_content.stderr
# Fri Feb 24 08:59:40 2023: Creating parameter template files for AUGUSTUS with new_species.pl
# Fri Feb 24 08:59:40 2023: new_species.pl will create parameter files for species Sp_1 in /central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/config//species/Sp_1
/central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/bin/perl /central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/bin/new_species.pl --species=Sp_1 --AUGUSTUS_CONFIG_PATH=/central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/config/ 1> /dev/null 2>/central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/errors/new_species.stderr
# Fri Feb 24 08:59:40 2023: check_fasta_headers(): Checking fasta headers of file /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/data/toy/longest.fa.masked
# Fri Feb 24 08:59:40 2023: check_fasta_headers(): Checking fasta headers of file /central/groups/carnegie_poc/jurban/software/braker2/protein/gfas1-and-hexacorallia-and-metazoan-proteins-orthoDB.fasta
# Fri Feb 24 08:59:40 2023: Assuming that this is not a DNA fasta file because other characters than A, T, G, C, N, a, t, g, c, n were contained. If this is supposed to be a DNA fasta file, check the content of your file! If this is supposed to be a protein fasta file, please ignore this message!
#*********
# WARNING: Detected whitespace in fasta header of file /central/groups/carnegie_poc/jurban/software/braker2/protein/gfas1-and-hexacorallia-and-metazoan-proteins-orthoDB.fasta. This may later on cause problems! The pipeline will create a new file without spaces or "|" characters and a genome_header.map file to look up the old and new headers. This message will be suppressed from now on!
#*********
# Fri Feb 24 08:59:44 2023: Assuming that this is not a protein fasta file because other characters than AaRrNnDdCcEeQqGgHhIiLlKkMmFfPpSsTtWwYyVvBbZzJjOoUuXx were contained. If this is supposed to be DNA fasta file, please ignore this message.
#**********************************************************************************
# PROCESSING HINTS
#**********************************************************************************
#**********************************************************************************
# RUNNING GENEMARK-EX
#**********************************************************************************
# Fri Feb 24 09:00:15 2023: Preparing genemark_evidence file hints from manual hints...
# Fri Feb 24 09:00:15 2023: Running GeneMark-ETP
# Fri Feb 24 09:00:15 2023: changing into GeneMark-ETP directory /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/GeneMark-ETP
cd /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/GeneMark-ETP
# Fri Feb 24 09:00:16 2023: sorting RNA-Seq BAM files
samtools sort /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/data/toy/forward.bam -o /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/GeneMark-ETP/etp_data/forward.bam 1> /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/errors/samtools.sort.forward.stdout 2> /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/errors/samtools.sort.forward.stderr
samtools sort /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/data/toy/reverse.bam -o /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/GeneMark-ETP/etp_data/reverse.bam 1> /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/errors/samtools.sort.reverse.stdout 2> /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/errors/samtools.sort.reverse.stderr
# Fri Feb 24 09:00:32 2023: Running gmetp.pl
/central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/bin/perl /central/groups/carnegie_poc/jurban/software/braker2/braker3/deps/genemark-etp/GeneMark-ETP/bin/gmetp.pl --cfg /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/GeneMark-ETP/etp_config.yaml --workdir /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/GeneMark-ETP --bam /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/GeneMark-ETP/etp_data/ --cores 16 --softmask 1>/central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/errors/GeneMark-ETP.stdout 2>/central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/errors/GeneMark-ETP.stderr
This is from GeneMark-ETP.stderr.
FASTA index file /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/GeneMark-ETP/data/genome.softmasked.fasta.fai created.
error, file not found: option --f1 complete.gtf
error on open file complete.id: No such file or directory
mv: cannot stat ‘complete_uniq.gtf’: No such file or directory
error on open file /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/GeneMark-ETP/rnaseq/hints/proteins.fa/complete.gtf: No such file or directory
error on create_regions.pl at /central/groups/carnegie_poc/jurban/software/braker2/braker3/deps/genemark-etp/GeneMark-ETP/bin/gmetp.pl line 2162.
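One way to narrow down errors like these is to check which of the files named in the messages actually exist in the GeneMark-ETP working directory. A minimal sketch, assuming a POSIX shell: `list_missing` is a hypothetical helper, not part of BRAKER or GeneMark-ETP, and the directory in the example call is only taken from the last path in the stderr output, so the other files may live elsewhere.

```shell
#!/bin/sh
# Hypothetical helper (not part of BRAKER or GeneMark-ETP):
# print every file name from the argument list that does not exist in DIR.
list_missing() {
    dir=$1
    shift
    for name in "$@"; do
        # -e is false for absent files and for dangling symlinks
        [ -e "$dir/$name" ] || printf '%s\n' "$name"
    done
}

# Example call (directory taken from the stderr above; adjust to your run):
# list_missing braker3/GeneMark-ETP/rnaseq/hints/proteins.fa \
#     complete.gtf complete.id complete_uniq.gtf
```

If all three names are printed, the step that should have produced them probably failed earlier, and GeneMark-ETP.stdout may show where.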
Third, here is how I installed it.
First, I installed dependencies with Mamba (conda) using a YML file.
mamba env create -f braker3-deps.yml
I will copy/paste the braker3-deps.yml at the very bottom.
Second, I installed GeneMark-ETP via git clone.
git clone https://github.com/gatech-genemark/GeneMark-ETP.git
Third, I cloned BRAKER and checked out the braker3 branch.
git clone https://github.com/Gaius-Augustus/BRAKER.git
cd BRAKER
git checkout braker3
Fourth, the run environment is set by:
conda activate braker3-deps2
export PATH=${BRAKER3}:${GENEMARK_ETP_BIN}:${GENEMARK_ETP_TOOLS}:${PROTHINT2}:${PATH}
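Since braker.log shows BRAKER guessing several tool locations from executables found on $PATH, a quick check after setting up the environment can confirm they all resolve. A minimal sketch, assuming a POSIX shell: `missing_tools` is a hypothetical helper, and the tool names in the example call are the ones that appear in the log above.

```shell
#!/bin/sh
# Hypothetical helper: print each named program that cannot be found on PATH.
missing_tools() {
    for tool in "$@"; do
        # command -v exits non-zero when the name does not resolve
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Example call (run after `conda activate` and the PATH export):
# missing_tools braker.pl gmetp.pl prothint.py tsebra.py samtools diamond bamtools cdbfasta
```

An empty result only means the names resolve; it does not tell you whether the resolved copy is the intended one. The log shows samtools and diamond being picked up from the GeneMark-ETP tools directory rather than from the conda environment.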
YML File
name: braker3-deps2
channels:
- eumetsat
- conda-forge
- bioconda
- defaults
dependencies:
- _libgcc_mutex=0.1=conda_forge
- _openmp_mutex=4.5=2_gnu
- alsa-lib=1.2.7.2=h166bdaf_0
- augustus=3.4.0=pl5262h5a9fe7b_2
- bamtools=2.5.1=hd03093a_10
- bedtools=2.30.0=h468198e_3
- biopython=1.81=py310h1fa729e_0
- blast=2.13.0=hf3cf87c_0
- boost-cpp=1.74.0=h75c5d50_8
- braker2=2.1.6=hdfd78af_5
- bzip2=1.0.8=h7f98852_4
- c-ares=1.18.1=h7f98852_0
- ca-certificates=2022.12.7=ha878542_0
- cairo=1.16.0=ha61ee94_1014
- cdbtools=0.99=hd03093a_7
- curl=7.87.0=h6312ad2_0
- diamond=2.1.3=hb97b32f_0
- entrez-direct=16.2=he881be0_1
- exonerate=2.4.0=h09da616_5
- expat=2.5.0=h27087fc_0
- font-ttf-dejavu-sans-mono=2.37=hab24e00_0
- font-ttf-inconsolata=3.000=h77eed37_0
- font-ttf-source-code-pro=2.038=h77eed37_0
- font-ttf-ubuntu=0.83=hab24e00_0
- fontconfig=2.14.2=h14ed4e7_0
- fonts-conda-ecosystem=1=0
- fonts-conda-forge=1=0
- freetype=2.12.1=hca18f0e_1
- gawk=5.1.0=h7f98852_0
- gemoma=1.6.4=hdfd78af_1
- genomethreader=1.7.1=h87f3376_4
- gettext=0.21.1=h27087fc_0
- gffread=0.12.7=hd03093a_1
- giflib=5.2.1=h36c2ea0_2
- glib=2.74.1=h6239696_1
- glib-tools=2.74.1=h6239696_1
- gmp=6.2.1=h58526e2_0
- graphite2=1.3.13=h58526e2_1001
- gsl=2.6=he838d99_2
- harfbuzz=5.3.0=h418a68e_0
- hisat2=2.2.1=h87f3376_4
- htslib=1.12=h9093b5e_1
- icu=70.1=h27087fc_0
- jbig=2.1=h7f98852_2003
- jpeg=9e=h0b41bf4_3
- keyutils=1.6.1=h166bdaf_0
- krb5=1.20.1=hf9c8cef_0
- lcms2=2.12=hddcbb42_0
- ld_impl_linux-64=2.40=h41732ed_0
- lerc=2.2.1=h9c3ff4c_0
- libblas=3.9.0=16_linux64_openblas
- libcblas=3.9.0=16_linux64_openblas
- libcups=2.3.3=h36d4200_3
- libcurl=7.87.0=h6312ad2_0
- libdeflate=1.7=h7f98852_5
- libedit=3.1.20191231=he28a2e2_2
- libev=4.33=h516909a_1
- libffi=3.4.2=h7f98852_5
- libgcc-ng=12.2.0=h65d4601_19
- libgfortran-ng=12.2.0=h69a702a_19
- libgfortran5=12.2.0=h337968e_19
- libglib=2.74.1=h606061b_1
- libgomp=12.2.0=h65d4601_19
- libhwloc=2.8.0=h32351e8_1
- libiconv=1.17=h166bdaf_0
- libidn2=2.3.4=h166bdaf_0
- liblapack=3.9.0=16_linux64_openblas
- libnghttp2=1.51.0=hdcd2b5c_0
- libnsl=2.0.0=h7f98852_0
- libopenblas=0.3.21=pthreads_h78a6416_3
- libpng=1.6.39=h753d276_0
- libssh2=1.10.0=haa6b8db_3
- libstdcxx-ng=12.2.0=h46fd767_19
- libtiff=4.3.0=hf544144_1
- libunistring=0.9.10=h7f98852_0
- libuuid=2.32.1=h7f98852_1000
- libwebp-base=1.2.4=h166bdaf_0
- libxcb=1.13=h7f98852_1004
- libxml2=2.9.14=h22db469_4
- libzlib=1.2.13=h166bdaf_4
- lp_solve=5.5.2.5=h14c3975_1001
- makehub=1.0.5=1
- metis=5.1.0=h58526e2_1006
- mmseqs2=13.45111=h95f258a_1
- mpfr=4.1.0=h9202a9a_1
- mysql-connector-c=6.1.11=h6eb9d5d_1007
- ncbi-vdb=3.0.2=h87f3376_0
- ncurses=6.2=h58526e2_4
- numpy=1.24.2=py310h8deb116_0
- openjdk=8.0.332=h166bdaf_0
- openssl=1.1.1t=h0b41bf4_0
- ossuuid=1.6.2=hf484d3e_1000
- pcre=8.45=h9c3ff4c_0
- pcre2=10.40=hc3806b6_0
- perl=5.26.2=h36c2ea0_1008
- perl-apache-test=1.40=pl526_1
- perl-app-cpanminus=1.7044=pl526_1
- perl-archive-tar=2.32=pl526_0
- perl-base=2.23=pl526_1
- perl-business-isbn=3.004=pl526_0
- perl-business-isbn-data=20140910.003=pl526_0
- perl-carp=1.38=pl526_3
- perl-class-data-inheritable=0.08=pl526_1
- perl-class-load=0.25=pl526_0
- perl-class-load-xs=0.10=pl526h6bb024c_2
- perl-class-method-modifiers=2.12=pl526_0
- perl-clone-choose=0.010=pl526_0
- perl-common-sense=3.74=pl526_2
- perl-compress-raw-bzip2=2.087=pl526he1b5a44_0
- perl-compress-raw-zlib=2.087=pl526hc9558a2_0
- perl-constant=1.33=pl526_1
- perl-cpan-meta=2.150010=pl526_0
- perl-cpan-meta-requirements=2.140=pl526_0
- perl-cpan-meta-yaml=0.018=pl526_0
- perl-data-dumper=2.173=pl526_0
- perl-data-optlist=0.110=pl526_2
- perl-dbi=1.642=pl526_0
- perl-devel-globaldestruction=0.14=pl526_0
- perl-devel-overloadinfo=0.005=pl526_0
- perl-devel-stacktrace=2.04=pl526_0
- perl-dist-checkconflicts=0.11=pl526_2
- perl-encode=2.88=pl526_1
- perl-eval-closure=0.14=pl526h6bb024c_4
- perl-exception-class=1.44=pl526_0
- perl-exporter=5.72=pl526_1
- perl-exporter-tiny=1.002001=pl526_0
- perl-extutils-cbuilder=0.280230=pl526_1
- perl-extutils-makemaker=7.36=pl526_1
- perl-extutils-manifest=1.72=pl526_0
- perl-extutils-parsexs=3.35=pl526_0
- perl-file-homedir=1.004=pl526_2
- perl-file-path=2.16=pl526_0
- perl-file-spec=3.48_01=pl526_1
- perl-file-temp=0.2304=pl526_2
- perl-file-which=1.23=pl526_0
- perl-getopt-long=2.50=pl526_1
- perl-hash-merge=0.300=pl526_0
- perl-inline=0.80=pl526_2
- perl-io-compress=2.087=pl526he1b5a44_0
- perl-io-zlib=1.10=pl526_2
- perl-ipc-cmd=1.02=pl526_0
- perl-json=4.02=pl526_0
- perl-json-pp=4.04=pl526_0
- perl-json-xs=2.34=pl526h6bb024c_3
- perl-list-moreutils=0.428=pl526_1
- perl-list-moreutils-xs=0.428=pl526_0
- perl-list-util=1.38=pl526_1
- perl-locale-maketext-simple=0.21=pl526_2
- perl-logger-simple=2.0=pl526_0
- perl-math-utils=1.13=pl526_0
- perl-mce=1.837=pl526_0
- perl-mime-base64=3.15=pl526_1
- perl-module-build=0.4224=pl526_3
- perl-module-corelist=5.20190524=pl526_0
- perl-module-implementation=0.09=pl526_2
- perl-module-load=0.32=pl526_1
- perl-module-load-conditional=0.68=pl526_2
- perl-module-metadata=1.000036=pl526_0
- perl-module-runtime=0.016=pl526_1
- perl-module-runtime-conflicts=0.003=pl526_0
- perl-moo=2.003004=pl526_0
- perl-moose=2.2011=pl526hf484d3e_1
- perl-mro-compat=0.13=pl526_0
- perl-object-insideout=4.05=pl526_0
- perl-package-deprecationmanager=0.17=pl526_0
- perl-package-stash=0.38=pl526hf484d3e_1
- perl-package-stash-xs=0.28=pl526hf484d3e_1
- perl-parallel-forkmanager=2.02=pl526_0
- perl-params-check=0.38=pl526_1
- perl-params-util=1.07=pl526h6bb024c_4
- perl-parent=0.236=pl526_1
- perl-pathtools=3.75=pl526h14c3975_1
- perl-perl-ostype=1.010=pl526_1
- perl-posix=1.38_03=pl526_1
- perl-role-tiny=2.000008=pl526_0
- perl-scalar-list-utils=1.52=pl526h516909a_0
- perl-scalar-util-numeric=0.40=pl526_1
- perl-socket=2.027=pl526_1
- perl-storable=3.15=pl526h14c3975_0
- perl-sub-exporter=0.987=pl526_2
- perl-sub-exporter-progressive=0.001013=pl526_0
- perl-sub-identify=0.14=pl526h14c3975_0
- perl-sub-install=0.928=pl526_2
- perl-sub-name=0.21=pl526_1
- perl-sub-quote=2.006003=pl526_1
- perl-test-harness=3.42=pl526_0
- perl-test-pod=1.52=pl526_0
- perl-text-abbrev=1.02=pl526_0
- perl-text-parsewords=3.30=pl526_0
- perl-time-hires=1.9760=pl526h14c3975_1
- perl-try-tiny=0.30=pl526_1
- perl-types-serialiser=1.0=pl526_2
- perl-uri=1.76=pl526_0
- perl-version=0.9924=pl526_0
- perl-xml-libxml=2.0132=pl526h7ec2d77_1
- perl-xml-namespacesupport=1.12=pl526_0
- perl-xml-sax=1.02=pl526_0
- perl-xml-sax-base=1.09=pl526_0
- perl-xsloader=0.24=pl526_0
- perl-yaml=1.29=pl526_0
- perl-yaml-xs=0.74=pl526h14c3975_0
- pip=23.0.1=pyhd8ed1ab_0
- pixman=0.40.0=h36c2ea0_0
- pthread-stubs=0.4=h36c2ea0_1001
- python=3.10.2=h62f1059_0_cpython
- python_abi=3.10=3_cp310
- readline=8.1=h46c0cb4_0
- samtools=1.12=h9aed4be_1
- setuptools=67.4.0=pyhd8ed1ab_0
- spaln=2.4.7=pl5262h9a82719_0
- sqlite=3.37.0=h9cd32fc_0
- sra-tools=3.0.3=h87f3376_0
- stringtie=2.2.1=h3198e80_0
- suitesparse=5.10.1=h9e50725_1
- tar=1.34=hb2e2bae_1
- tbb=2021.7.0=h924138e_1
- tk=8.6.12=h27826a3_0
- tzdata=2022g=h191b570_0
- ucsc-bedtobigbed=377=ha8a8165_3
- ucsc-fatotwobit=377=ha8a8165_5
- ucsc-genepredcheck=377=ha8a8165_3
- ucsc-genepredtobed=377=ha8a8165_5
- ucsc-genepredtobiggenepred=377=ha8a8165_3
- ucsc-gtftogenepred=377=ha8a8165_5
- ucsc-hggcpercent=377=ha8a8165_3
- ucsc-ixixx=377=ha8a8165_3
- ucsc-twobitinfo=377=ha8a8165_3
- ucsc-wigtobigwig=377=ha8a8165_3
- wget=1.20.3=ha56f1ee_1
- wheel=0.38.4=pyhd8ed1ab_0
- xorg-fixesproto=5.0=h7f98852_1002
- xorg-inputproto=2.3.2=h7f98852_1002
- xorg-kbproto=1.0.7=h7f98852_1002
- xorg-libice=1.0.10=h7f98852_0
- xorg-libsm=1.2.3=hd9c2040_1000
- xorg-libx11=1.7.2=h7f98852_0
- xorg-libxau=1.0.9=h7f98852_0
- xorg-libxdmcp=1.1.3=h7f98852_0
- xorg-libxext=1.3.4=h0b41bf4_2
- xorg-libxfixes=5.0.3=h7f98852_1004
- xorg-libxi=1.7.10=h7f98852_0
- xorg-libxrender=0.9.10=h7f98852_1003
- xorg-libxtst=1.2.3=h7f98852_1002
- xorg-recordproto=1.14.2=h7f98852_1002
- xorg-renderproto=0.11.1=h7f98852_1002
- xorg-xextproto=7.3.0=h0b41bf4_1003
- xorg-xproto=7.0.31=h7f98852_1007
- xz=5.2.6=h166bdaf_0
- zlib=1.2.13=h166bdaf_4
- zstd=1.5.2=h3eb15da_6
NOTE: This conda environment was originally created on a separate system this way:
mamba create -n braker3-deps2 -c bioconda braker2 hisat2 stringtie bedtools sra-tools gffread
conda activate braker3-deps2
mamba install -c eumetsat perl-yaml-xs
mamba install -c conda-forge openjdk=8
And YML file obtained by:
conda env export > braker3-deps2.yml
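Because the environment was exported on one system and recreated on another, the pinned build strings (the third `=`-separated field of each dependency) may not always resolve elsewhere. A minimal sketch for dropping them from an existing YML file follows. `strip_conda_builds` is a hypothetical helper, and I am not claiming this is related to the error reported here. `conda env export --no-builds` produces a similar file directly.

```shell
#!/bin/sh
# Hypothetical helper: read a conda environment YML on stdin and turn
# "- name=version=build" entries into "- name=version".
# Lines without exactly two "=" fields (name:, channels, etc.) pass through unchanged.
strip_conda_builds() {
    sed -E 's/^([[:space:]]*-[[:space:]]+[^=[:space:]]+=[^=[:space:]]+)=[^=[:space:]]+[[:space:]]*$/\1/'
}

# Example call (output file name is a placeholder):
# strip_conda_builds < braker3-deps2.yml > braker3-deps2.nobuilds.yml
```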
I am currently debugging.
The problem seems to be in gmetp.pl at this line:
PrepareGenomeTraining($proc) if 1;
(I know this is technically not a Braker problem at this point, but I will keep you updated)
And specifically at this line in PrepareGenomeTraining, inside the if statement:
if ( CreateThis("hc_regions.gtf"))
{
    ## breaks with the following line
    system( "$bin/create_regions.pl --hcc $hcc_genes --hcp $hcp_genes --out hc_regions.gtf --margin $margin" )
        and die "error on create_regions.pl";
}
Hi @JohnUrban, first, thank you. I can now use conda instead of a Singularity container.
I have the same issue, but with a Singularity image from here: https://hub.docker.com/r/teambraker/braker3
This is the command with Singularity:
singularity exec -B ${PWD}:${PWD} ${BRAKER_SIF} braker.pl --genome=scaffolds_vf.EDTA_RM_masked.fa --prot_seq=Aves_taxid_8782_3044547_prot.fasta --bam=rna_sorted.bam --softmasking --workingdir=run2 \
--GENEMARK_PATH=${ETP} --PROTHINT_PATH=${ETP}/gmes/ProtHint/bin --threads 72
The error is the same as reported above:
WARNING: Detected | in fasta header of file /media/ben/Data2TB/test-p/annotation/BRAKER3/Aves_taxid_8782_3044547_prot.fasta. This may later on cause problems! The pipeline will create a new file without spaces or "|" characters and a genome_header.map file to look up the old and new headers. This message will be suppressed from now on!
#*********
ERROR in file /opt/BRAKER/scripts/braker.pl at line 5484
Failed to execute: /usr/bin/perl /media/ben/Data2TB/test-p/annotation/BRAKER3/GeneMark-ETP/bin/etp_release.pl --cfg /media/ben/Data2TB/test-p/annotation/BRAKER3/run2/GeneMark-ETP/etp_config.yaml --workdir /media/ben/Data2TB/test-p/annotation/BRAKER3/run2/GeneMark-ETP --bam /media/ben/Data2TB/test-p/annotation/BRAKER3/run2/GeneMark-ETP/etp_data/ --cores 72 --softmask 1>/media/ben/Data2TB/test-p/annotation/BRAKER3/run2/errors/GeneMark-ETP.stdout 2>/media/ben/Data2TB/test-p/annotation/BRAKER3/run2/errors/GeneMark-ETP.stderr
Failed to execute: /usr/bin/perl /media/ben/Data2TB/test-p/annotation/BRAKER3/GeneMark-ETP/bin/etp_release.pl --cfg /media/ben/Data2TB/test-p/annotation/BRAKER3/run2/GeneMark-ETP/etp_config.yaml --workdir /media/ben/Data2TB/test-p/annotation/BRAKER3/run2/GeneMark-ETP --bam /media/ben/Data2TB/test-p/annotation/BRAKER3/run2/GeneMark-ETP/etp_data/ --cores 72 --softmask 1>/media/ben/Data2TB/test-p/annotation/BRAKER3/run2/errors/GeneMark-ETP.stdout 2>/media/ben/Data2TB/test-p/annotation/BRAKER3/run2/errors/GeneMark-ETP.stderr
The most common problem is an expired or not present file ~/.gm_key!
It is clearly related to the GeneMark-ETP license, which is in fact not available for use.
You can download the license for GeneMark-EP from their webserver and place it in ~/.gm_key.
However, Lars is still working on updating the BRAKER code.
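Before rerunning, it can help to confirm that the key file is actually where BRAKER's error message expects it. A minimal sketch, assuming a POSIX shell: `gm_key_status` is a hypothetical helper, not part of BRAKER or GeneMark. It only checks that the file exists, is non-empty and is readable. It cannot tell whether the key has expired.

```shell
#!/bin/sh
# Hypothetical helper: report the state of the GeneMark key file.
# Takes an optional path; defaults to ~/.gm_key as named in the error message.
gm_key_status() {
    key=${1:-"$HOME/.gm_key"}
    if [ ! -e "$key" ]; then
        echo missing
    elif [ ! -s "$key" ]; then
        echo empty
    elif [ ! -r "$key" ]; then
        echo unreadable
    else
        echo present
    fi
}

# Example call:
# gm_key_status            # checks ~/.gm_key
# gm_key_status /path/to/other/key
```

When running through a container, the check should be made from inside the container, since the home directory seen there may differ from the host's.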
BenAawf @.***> schrieb am Mo. 27. Feb. 2023 um 10:24:
Hi, @JohnUrban https://github.com/JohnUrban first, thank you. I can now use conda instead of singularity container.
I have the same issue but using a singularity image from here: https://hub.docker.com/r/teambraker/braker3
And this is the command with singularity: singularity exec -B ${PWD}:${PWD} ${BRAKER_SIF} braker.pl --genome=scaffolds_vf.EDTA_RM_masked.fa --prot_seq=Aves_taxid_8782_3044547_prot.fasta --bam=rna_sorted.bam --softmasking --workingdir=run2 \ --GENEMARK_PATH=${ETP} --PROTHINT_PATH=${ETP}/gmes/ProtHint/bin --threads 72
The same issue as mentioned before :
WARNING: Detected | in fasta header of file /media/ben/Data2TB/test-p/annotation/BRAKER3/Aves_taxid_8782_3044547_prot.fasta. This may later on cause problems! The pipeline will create a new file without spaces or "|" characters and a genome_header.map file to look up the old and new headers. This message will be suppressed from now on! #********* ERROR in file /opt/BRAKER/scripts/braker.pl at line 5484 Failed to execute: /usr/bin/perl /media/ben/Data2TB/test-p/annotation/BRAKER3/GeneMark-ETP/bin/etp_release.pl --cfg /media/ben/Data2TB/test-p/annotation/BRAKER3/run2/GeneMark-ETP/etp_config.yaml --workdir /media/ben/Data2TB/test-p/annotation/BRAKER3/run2/GeneMark-ETP --bam /media/ben/Data2TB/test-p/annotation/BRAKER3/run2/GeneMark-ETP/etp_data/ --cores 72 --softmask 1>/media/ben/Data2TB/test-p/annotation/BRAKER3/run2/errors/GeneMark-ETP.stdout 2>/media/ben/Data2TB/test-p/annotation/BRAKER3/run2/errors/GeneMark-ETP.stderr Failed to execute: /usr/bin/perl /media/ben/Data2TB/test-p/annotation/BRAKER3/GeneMark-ETP/bin/etp_release.pl --cfg /media/ben/Data2TB/test-p/annotation/BRAKER3/run2/GeneMark-ETP/etp_config.yaml --workdir /media/ben/Data2TB/test-p/annotation/BRAKER3/run2/GeneMark-ETP --bam /media/ben/Data2TB/test-p/annotation/BRAKER3/run2/GeneMark-ETP/etp_data/ --cores 72 --softmask 1>/media/ben/Data2TB/test-p/annotation/BRAKER3/run2/errors/GeneMark-ETP.stdout 2>/media/ben/Data2TB/test-p/annotation/BRAKER3/run2/errors/GeneMark-ETP.stderr The most common problem is an expired or not present file ~/.gm_key!
It is clearly related to the GeneMark-ETP license, which is in fact not available for use.
I have a valid license for GeneMark-ES/ET/EP ver 4.71_lic. It works for both Braker1 and Braker2 runs. It does not work for Braker3 runs. Perhaps that is b/c that license doesn't also work for GeneMark-ETP, which was downloaded separately from: https://github.com/gatech-genemark/GeneMark-ETP .
The container has today been updated to contain GeneMark-ETP. You still have to install the license key file in your home directory (license key of GeneMark-ES/ET/EP works) as file ~/.gm_key. Otherwise, it should be very easy to run, now.
Is there a way to get GeneMark-ETP if one is not using the container?
Look at the Dockerfile… I think that will answer the question.
Uh oh - time to learn Docker.
No, a keyword search will do for this. There’s a link.
Got it! Thanks!
wget http://topaz.gatech.edu/GeneMark/etp.for_braker.tar.gz && \
tar -xzf etp.for_braker.tar.gz && \
mv etp.for_braker ETP && \
chmod a+x /opt/ETP/bin/*py /opt/ETP/bin/*pl /opt/ETP/tools/*
Well, I gave Braker3 a try with the GeneMark-ETP copy found here http://topaz.gatech.edu/GeneMark/etp.for_braker.tar.gz -- but gmetp.pl gives the same error that I opened up this thread with -- essentially complete.gtf and complete_uniq.gtf not made/found:
FASTA index file /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/GeneMark-ETP/data/genome.softmasked.fasta.fai created.
error, file not found: option --f1 complete.gtf
error on open file complete.id: No such file or directory
mv: cannot stat ‘complete_uniq.gtf’: No such file or directory
error on open file /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/GeneMark-ETP/rnaseq/hints/proteins.fa/complete.gtf: No such file or directory
error on create_regions.pl at /central/groups/carnegie_poc/jurban/software/braker2/braker3/deps/genemark-etp/alt/ETP/bin/gmetp.pl line 2162.
I am not using the container, so I know that makes this report extra annoying, and I do apologize for that! I personally couldn't get the whole singularity thing working, it kept complaining about root stuff (I'm on a remote cluster without root/sudo privileges). But if necessary, I will give that route another try in the future -- I'm sure there are local options to learn about.
As for my conda-supported approach detailed in this thread, and especially after getting the same error with the new GeneMark-ETP copy, I am skeptical that this is a ~/.gm_key problem unless I need an entirely new ~/.gm_key for GeneMark-ETP. It works fine for the other GeneMark software with Braker1/Braker2. GeneMark-ETP does run a little bit, but seems to fail to create/find/open those files.
The rootless warnings of singularity can be safely ignored.
Maybe I misunderstood. You very likely need root privileges to install Singularity. If it is not on your cluster, conda is the next best approach.
Did you update Braker? Did you switch to master branch?
Hey - is Braker3 on the master branch now? I was on git checkout braker3.
As for Singularity, I installed it using conda, so I guess I have a local copy of it. I followed the Singularity instructions on the main page (e.g. singularity build braker3.sif docker://teambraker/braker3:latest). It seemed to install, but when I tried the next part (singularity exec braker3.sif print_braker3_setup.py or singularity exec braker3.sif braker.pl), it threw an error. I erased the sif file after that, but at the moment I am re-doing the first step to try to reproduce that error (or get past it).
Ok. For the Singularity errors:
Command:
singularity exec braker3.sif print_braker3_setup.py
Error:
INFO: Converting SIF file to temporary sandbox...
FATAL: while extracting braker3.sif: root filesystem extraction failed: extract command failed: ERROR : Failed to create user namespace: user namespace disabled
: exit status 1
Other Command:
singularity exec braker3.sif braker.pl
Same error:
INFO: Converting SIF file to temporary sandbox...
FATAL: while extracting braker3.sif: root filesystem extraction failed: extract command failed: ERROR : Failed to create user namespace: user namespace disabled
: exit status 1
The container is not working because you didn't install Singularity as root. I think there are ways to get it working without root, but it depends on the kernel version (see https://docs.sylabs.io/guides/3.5/admin-guide/installation.html#install-nonsetuid). The best option would be to ask your cluster admin to install Singularity (if possible).
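For anyone hitting the same "user namespace disabled" message, one quick check is the kernel's user-namespace limit. The sketch below is only a rough diagnostic, assuming a Linux system that exposes /proc/sys/user/max_user_namespaces (a value of 0 usually means unprivileged user namespaces are switched off). The function names are made up for illustration, and other settings on a given cluster may matter as well.

```python
from pathlib import Path


def max_user_namespaces(proc_root="/proc"):
    """Return the kernel's user-namespace limit, or None if it cannot be read.

    proc_root is a parameter only so the function can be tried on a fake tree.
    """
    path = Path(proc_root) / "sys" / "user" / "max_user_namespaces"
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        # File missing, unreadable, or not an integer: report "unknown".
        return None


def userns_probably_enabled(proc_root="/proc"):
    """True only if the limit could be read and is greater than zero."""
    limit = max_user_namespaces(proc_root)
    return limit is not None and limit > 0
```

Calling userns_probably_enabled() on the login node gives a first hint; if it returns False, asking the cluster admin is likely still the quickest route.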
I will ask our HPC admin. Possibly fakeroot has to be enabled in Singularity, but I am not sure. I vaguely recall that we discussed that a long time ago.
Edit: Oh, yes, you need to be root to install Singularity.
Hey - is Braker3 on the master branch now? I was on git checkout braker3.
Yes, checkout master. We finally merged. And git pull to update to the latest code.
Ok - I should have mentioned this for the braker3 branch already, but I didn't want to bombard you with issues (more so than I have).
Line 2295 in braker.pl that tries to assess the java version causes an error. The approach to getting the java version changed and the fix is simply changing it back to the old way.
New way:
$cmdString = "java -version 2>&1 | grep \"openjdk version\" | awk -F[\"\.] -v OFS=. '{print \$2,\$3}'";
Old way:
$cmdString = "java -version 2>&1 | awk -F[\\\"\\\.] -v OFS=. 'NR==1{print \$2,\$3}'";
Full context:
####################### set_JAVA_PATH #######################################
# * set path to java
# * also checks whether java version 1.8 is present
################################################################################
sub set_JAVA_PATH {
    my @required_files = ('java');
    $JAVA_PATH = set_software_PATH($java_path, "JAVA_PATH",
        \@required_files, 'exit');
    #$cmdString = "java -version 2>&1 | grep \"openjdk version\" | awk -F[\"\.] -v OFS=. '{print \$2,\$3}'";
    $cmdString = "java -version 2>&1 | awk -F[\\\"\\\.] -v OFS=. 'NR==1{print \$2,\$3}'";
    my @javav = `$cmdString` or die("Failed to execute: $cmdString");
    if(not ($javav[0] =~ m/1\.8/ )){
        $prtStr = "\# " . (localtime) . " ERROR: in file " . __FILE__
            ." at line ". __LINE__ ."\n"
            . "You have installed java version $javav[0]. GUSHR requires version 1.8!\n"
            . "You can switch between java versions on your system with:\n"
            . "sudo update-alternatives --config java\n";
        $logString .= $prtStr;
        print STDERR $logString;
        exit(1);
    }
}
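To illustrate what the two command strings are trying to extract, here is a small Python sketch (not part of braker.pl) that pulls "major.minor" out of the first line of `java -version` output. One possible reason the grep-based line fails is that some Java builds print `java version "..."` rather than `openjdk version "..."` on that first line, so the grep matches nothing; that is an assumption, not something confirmed in this thread. The function name is hypothetical.

```python
import re


def java_major_minor(version_output):
    """Extract 'major.minor' (or just 'major') from `java -version` output.

    Only the first line is inspected, and the vendor prefix ('openjdk',
    'java', ...) is deliberately ignored.
    """
    lines = version_output.splitlines()
    first_line = lines[0] if lines else ""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', first_line)
    if match is None:
        return None
    major, minor = match.group(1), match.group(2)
    return f"{major}.{minor}" if minor is not None else major


# Example inputs of the kind `java -version` prints (it writes to stderr).
print(java_major_minor('openjdk version "1.8.0_352"'))  # 1.8
print(java_major_minor('java version "1.8.0_361"'))     # 1.8
```

With a real installation, the text could be taken from the stderr of the command, for example via subprocess.run(["java", "-version"], capture_output=True, text=True).stderr.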
Alright.
I ran Braker3 again with everything up-to-date -- noting that I did have to change the java version line back to the old approach as described above.
I still get the same error I opened up this issue with (from braker3/errors/GeneMark-ETP.stderr).
FASTA index file /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/masterbranch/braker3/GeneMark-ETP/data/genome.softmasked.fasta.fai created.
error, file not found: option --f1 complete.gtf
error on open file complete.id: No such file or directory
mv: cannot stat ‘complete_uniq.gtf’: No such file or directory
error on open file /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/masterbranch/braker3/GeneMark-ETP/rnaseq/hints/proteins.fa/complete.gtf: No such file or directory
error on create_regions.pl at /central/groups/carnegie_poc/jurban/software/braker2/braker3/deps/genemark-etp/alt/ETP/bin/gmetp.pl line 2162.
Here are the contents of the GeneMark-ETP subdirectory inside the Braker3 working directory:
> ls braker3/GeneMark-ETP/
arx data etp_config.yaml etp_data filter_gmst.log proteins.fa prothint_gmst.log rnaseq
> ls braker3/GeneMark-ETP/*/
braker3/GeneMark-ETP/arx/:
chr.names genome.fa
braker3/GeneMark-ETP/data/:
genome.fasta genome.softmasked.fasta genome.softmasked.fasta.fai proteins.fa
braker3/GeneMark-ETP/etp_data/:
forward.bam reverse.bam
braker3/GeneMark-ETP/proteins.fa/:
braker3/GeneMark-ETP/rnaseq/:
gmst hints hisat2 stringtie
> ls braker3/GeneMark-ETP/*/*/
braker3/GeneMark-ETP/rnaseq/gmst/:
GeneMark_hmm.mod genome_gmst_for_HC.gtf genome_gmst.gtf gms.log transcripts_merged.fasta.gff
braker3/GeneMark-ETP/rnaseq/hints/:
bam2hints_forward.gff bam2hints_merged.gff bam2hints_reverse.gff hintsfile_merged.gff proteins.fa
braker3/GeneMark-ETP/rnaseq/hisat2/:
mapping_forward.bam mapping_reverse.bam
braker3/GeneMark-ETP/rnaseq/stringtie/:
transcripts_forward.gff transcripts_merged.fasta transcripts_merged.gff transcripts_reverse.gff
> ls braker3/GeneMark-ETP/*/*/*
braker3/GeneMark-ETP/rnaseq/gmst/GeneMark_hmm.mod braker3/GeneMark-ETP/rnaseq/hints/bam2hints_forward.gff braker3/GeneMark-ETP/rnaseq/hisat2/mapping_reverse.bam
braker3/GeneMark-ETP/rnaseq/gmst/genome_gmst_for_HC.gtf braker3/GeneMark-ETP/rnaseq/hints/bam2hints_merged.gff braker3/GeneMark-ETP/rnaseq/stringtie/transcripts_forward.gff
braker3/GeneMark-ETP/rnaseq/gmst/genome_gmst.gtf braker3/GeneMark-ETP/rnaseq/hints/bam2hints_reverse.gff braker3/GeneMark-ETP/rnaseq/stringtie/transcripts_merged.fasta
braker3/GeneMark-ETP/rnaseq/gmst/gms.log braker3/GeneMark-ETP/rnaseq/hints/hintsfile_merged.gff braker3/GeneMark-ETP/rnaseq/stringtie/transcripts_merged.gff
braker3/GeneMark-ETP/rnaseq/gmst/transcripts_merged.fasta.gff braker3/GeneMark-ETP/rnaseq/hisat2/mapping_forward.bam braker3/GeneMark-ETP/rnaseq/stringtie/transcripts_reverse.gff
braker3/GeneMark-ETP/rnaseq/hints/proteins.fa:
log prothint tmp
...and so on
It cannot find a file located here:
braker3/GeneMark-ETP/rnaseq/hints/proteins.fa/complete.gtf
So I decided to look at what is there:
> ls braker3/GeneMark-ETP/rnaseq/hints/proteins.fa/
log prothint tmp
I wonder if the log file (braker3/GeneMark-ETP/rnaseq/hints/proteins.fa/log) holds the answer or a hint - contents:
02-Mar-23 11:36:56 - INFO: Starting the GMST filtering and classification.
02-Mar-23 11:36:56 - INFO: Running the following system call: /central/groups/carnegie_poc/jurban/software/braker2/braker3/deps/genemark-etp/alt/ETP/bin/GeneMarkSTFiltering/gms2hints.pl --tseq /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/masterbranch/braker3/GeneMark-ETP/rnaseq/stringtie/transcripts_merged.fasta --ggtf /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/masterbranch/braker3/GeneMark-ETP/rnaseq/stringtie/transcripts_merged.gff --tgff /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/masterbranch/braker3/GeneMark-ETP/rnaseq/gmst/transcripts_merged.fasta.gff --out /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/masterbranch/braker3/GeneMark-ETP/rnaseq/hints/proteins.fa/tmp/gmsttbny79f0.gtf
02-Mar-23 11:36:56 - INFO: Making diamond database from /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/masterbranch/braker3/GeneMark-ETP/data/proteins.fa
02-Mar-23 11:36:56 - INFO: Running the following system call: diamond makedb --in /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/masterbranch/braker3/GeneMark-ETP/data/proteins.fa -d /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/masterbranch/braker3/GeneMark-ETP/rnaseq/hints/proteins.fa/tmp/diamondDBbtbbubj0.dmnd
02-Mar-23 11:37:16 - ERROR: Program exited due to an error in command: diamond makedb --in /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/masterbranch/braker3/GeneMark-ETP/data/proteins.fa -d /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/masterbranch/braker3/GeneMark-ETP/rnaseq/hints/proteins.fa/tmp/diamondDBbtbbubj0.dmnd
02-Mar-23 11:37:16 - ERROR: Check stderr for more details.
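Since the real failure was buried in this nested log rather than in GeneMark-ETP.stderr, a small helper for pulling out only the ERROR lines can save some digging. This is a minimal sketch that assumes the "date time - LEVEL: message" layout seen in the log above; the function name is hypothetical.

```python
import re

# Matches lines such as: "02-Mar-23 11:37:16 - ERROR: Check stderr for more details."
LOG_LINE = re.compile(r"^(\S+ \S+) - (\w+): (.*)$")


def error_messages(log_text):
    """Return a list of (timestamp, message) for every ERROR line in log_text."""
    errors = []
    for line in log_text.splitlines():
        match = LOG_LINE.match(line)
        if match and match.group(2) == "ERROR":
            errors.append((match.group(1), match.group(3)))
    return errors
```

Reading the log file into a string and passing it to error_messages() would return just the two ERROR entries shown above.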
When I run the diamond makedb command to see the error:
> diamond makedb --in /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/masterbranch/braker3/GeneMark-ETP/data/proteins.fa -d /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/masterbranch/braker3/GeneMark-ETP/rnaseq/hints/proteins.fa/tmp/diamondDBbtbbubj0.dmnd
diamond v2.1.3.157 (C) Max Planck Society for the Advancement of Science
Documentation, support and updates available at http://www.diamondsearch.org
Please cite: http://dx.doi.org/10.1038/s41592-021-01101-x Nature Methods (2021)
#CPU threads: 32
Scoring parameters: (Matrix=BLOSUM62 Lambda=0.267 K=0.041 Penalties=11/1)
Database input file: /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/masterbranch/braker3/GeneMark-ETP/data/proteins.fa
Opening the database file... [0.143s]
Loading sequences... [3.726s]
Masking sequences... [2.614s]
Writing sequences... [0.562s]
Hashing sequences... [0.239s]
Loading sequences... Error: Error reading input stream at line 7477106: Invalid character (.) in sequence
And looking more into that, for some reason many of the OrthoDB protein sequences end with a period... e.g.:
>307491_0:000000
MLAYADNIVVMGETKDINSTSKLISSNNFKYLGVNINNKIGMHIEINERITNGNSCYFSIIKFLRS.
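A quick way to see how widespread this is would be to count the records whose sequence ends with a period. The sketch below is a rough check, assuming a plain FASTA file in which a record's sequence may span several lines; the function name is made up for illustration.

```python
def count_trailing_periods(fasta_lines):
    """Return (records_ending_with_period, total_records) for FASTA lines."""
    total = 0
    flagged = 0
    in_record = False
    last_seq_line = None  # last non-empty sequence line of the current record

    for raw in fasta_lines:
        line = raw.strip()
        if line.startswith(">"):
            # Close the previous record before starting a new one.
            if in_record and last_seq_line is not None and last_seq_line.endswith("."):
                flagged += 1
            total += 1
            in_record = True
            last_seq_line = None
        elif line:
            last_seq_line = line

    # Close the final record.
    if in_record and last_seq_line is not None and last_seq_line.endswith("."):
        flagged += 1
    return flagged, total
```

An open file handle can be passed directly, e.g. count_trailing_periods(open("metazoan-proteins-orthoDB.fasta")), so the file is streamed line by line rather than loaded into memory.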
I downloaded orthodb proteins like this:
wget --no-check-certificate https://v100.orthodb.org/download/odb10_metazoa_fasta.tar.gz
tar -xzf odb10_metazoa_fasta.tar.gz
cat metazoa/Rawdata/* > metazoan-proteins-orthoDB.fasta
rm -r metazoa
I will look into removing these periods... What strikes me as odd, though, is that this did not cause an error when running Braker2. Or maybe it caused an error that went silent/undetected...?
I just checked the sequence in orthodb 11 and there it seems to be okay. So maybe if you update to 11 the problem fixes itself.
>307491_0:000000 307491_0
MLAYADNIVVMGETKDINSTSKLISSNNFKYLGVNINNKIGMHIEINERITNGNSCYFSIIKFLRS
I re-downloaded the OrthoDB proteins to confirm it's not just my copy somehow, and indeed 17014 of the 8266016 metazoan proteins end with a period. But this was still v10 -- thanks to @Thamos for telling me about v11. I didn't realize there had been an update.
Nonetheless, I removed the periods at the end of ODBv10 seqs the following way:
awk '{sub(/\.$/,""); print}' proteins.fa > proteins.fixed.fa
That definitely solved the diamond makedb problem. And that in turn might solve my whole issue.... waiting to find out still.
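One caveat with the awk one-liner is that it is applied to every line, header lines included, so a header that happens to end with a period would be changed too (whether any OrthoDB headers do is not checked here). A header-aware variant could look like the sketch below; it is an illustration with a hypothetical function name, not what ProtHint or GeneMark-ETP do internally.

```python
def strip_trailing_periods(fasta_lines):
    """Yield FASTA lines with trailing periods removed from sequence lines only.

    Header lines (starting with '>') are passed through unchanged; line
    endings are dropped, so the caller adds them back when writing.
    """
    for raw in fasta_lines:
        line = raw.rstrip("\r\n")
        if line.startswith(">"):
            yield line
        else:
            yield line.rstrip(".")


# Usage on an in-memory example:
example = [">307491_0:000000\n", "MLAYADNIVVMGETKDINSTSKLISS.\n"]
for cleaned in strip_trailing_periods(example):
    print(cleaned)
```

For real files, the same generator can be fed an open handle for proteins.fa, writing each yielded line plus "\n" to proteins.fixed.fa.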
I am still scratching my head as to why Braker2 didn't fail b/c of these period-containing sequences though. I'm somewhat guessing that it might have "failed silently" and perhaps I should be skeptical of those results.
Do you have a (non empty) "prothint.gff" file in your braker2 directories? I think if "diamond makedb" didn't work there shouldn't be one, as prothint uses diamond. E.g. in my case with orthodb plants it's 54MB.
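A small sketch for that check, assuming only that the file is named prothint.gff and sits somewhere under the BRAKER2 working directory (its exact location is not assumed, hence the recursive search). The function names are hypothetical.

```python
from pathlib import Path


def find_prothint_outputs(workdir, filename="prothint.gff"):
    """Return a sorted list of (path, size_in_bytes) for each match under workdir."""
    return sorted(
        (str(p), p.stat().st_size)
        for p in Path(workdir).rglob(filename)
        if p.is_file()
    )


def has_nonempty_output(workdir, filename="prothint.gff"):
    """True if at least one matching file exists and is larger than zero bytes."""
    return any(size > 0 for _, size in find_prothint_outputs(workdir, filename))
```

If has_nonempty_output() returns False for an old run directory, that would be consistent with the protein hints step never having finished there.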
Hi @JohnUrban, no need to be worried about that. BRAKER2 calls DIAMOND only via ProtHint which sanitizes the protein input (https://github.com/gatech-genemark/ProtHint/commit/19ef04c93bfa691bd6583017189b6e340a4513df).
In BRAKER3, DIAMOND is also called "directly" on raw protein input. That's definitely something to fix, thanks for pointing out.
I'll open an issue about this in GeneMark-ETP.
@Thamos the previous runs with the "dirty" protein sequences did not have that file. Now that I have a "pre-sanitized" protein file, Diamond is happily working at the moment, and I suspect I will get the "prothint.gff" file when it finishes up.
@tomasbruna glad I could point out a real issue here. I will keep you posted on whether or not this allows Braker3 to finish.
@Thamos is there a link like https://v100.orthodb.org/download/odb10_metazoa_fasta.tar.gz for v11?
Else, I could download the whole v11 db here: https://data.orthodb.org/download/ ...do you have any tips on how to filter the entire DB to get just metazoan proteins?