Braker3/GeneMark-ETP: file not found: complete.gtf, complete.id, complete_uniq.gtf

Open JohnUrban opened this issue 2 years ago • 51 comments

Hello,

Thank you for all the great tools coming from this team.

I gave Braker3 a shot, but am running into an error at the moment. I will report below how I installed Braker3, and how I used it in case it helps reproduce the error.

I would be grateful for any guidance you can provide, and am eager to get Braker3 working at some point in the near future, but fully understand that you are busy. I am mainly reporting this issue in case it helps your development.



First, here was the command used.


braker.pl --genome=${ASM} --UTR=on --stranded=+,- --bam=${FWD},${REV} --prot_seq=${PROTEINS} --workingdir=braker3 --threads=16


Second, here are the errors as reported.


This was reported to stdout/stderr.

# Fri Feb 24 08:59:35 2023: Creating directory /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3.
# Fri Feb 24 08:59:35 2023:Both protein and RNA-Seq libraries in input detected. BRAKER will be executed in ETP mode.
#*********
# Fri Feb 24 08:59:38 2023: Log information is stored in file /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/braker.log
#*********
# WARNING: Detected whitespace in fasta header of file /central/groups/carnegie_poc/jurban/software/braker2/protein/gfas1-and-hexacorallia-and-metazoan-proteins-orthoDB.fasta. This may later on cause problems! The pipeline will create a new file without spaces or "|" characters and a genome_header.map file to look up the old and new headers. This message will be suppressed from now on!
#*********
ERROR in file /home/jurban/software/braker2/braker3/BRAKER/scripts/braker.pl at line 5486
Failed to execute: /central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/bin/perl /central/groups/carnegie_poc/jurban/software/braker2/braker3/deps/genemark-etp/GeneMark-ETP/bin/gmetp.pl --cfg /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/GeneMark-ETP/etp_config.yaml --workdir /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/GeneMark-ETP --bam /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/GeneMark-ETP/etp_data/ --cores 16 --softmask 1>/central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/errors/GeneMark-ETP.stdout 2>/central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/errors/GeneMark-ETP.stderr
Failed to execute: /central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/bin/perl /central/groups/carnegie_poc/jurban/software/braker2/braker3/deps/genemark-etp/GeneMark-ETP/bin/gmetp.pl --cfg /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/GeneMark-ETP/etp_config.yaml --workdir /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/GeneMark-ETP --bam /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/GeneMark-ETP/etp_data/ --cores 16 --softmask 1>/central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/errors/GeneMark-ETP.stdout 2>/central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/errors/GeneMark-ETP.stderr
The most common problem is an expired or not present file ~/.gm_key!

This is from braker.log

#**********************************************************************************
#                               BRAKER CONFIGURATION                               
#**********************************************************************************
# BRAKER CALL: /home/jurban/software/braker2/braker3/BRAKER/scripts/braker.pl --genome=/central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/data/toy/longest.fa.masked --UTR=on --stranded=+,- --bam=/central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/data/toy/forward.bam,/central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/data/toy/reverse.bam --prot_seq=/central/groups/carnegie_poc/jurban/software/braker2/protein/gfas1-and-hexacorallia-and-metazoan-proteins-orthoDB.fasta --workingdir=braker3 --threads=16
# Fri Feb 24 08:59:35 2023: braker.pl version 3.0.0
# Fri Feb 24 08:59:35 2023: Creating directory /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3.
# Fri Feb 24 08:59:35 2023: Creating directory /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3.
# Fri Feb 24 08:59:35 2023:Both protein and RNA-Seq libraries in input detected. BRAKER will be executed in ETP mode.
#*********
# Fri Feb 24 08:59:35 2023: Configuring of BRAKER for using external tools...
# Fri Feb 24 08:59:35 2023: Trying to set $AUGUSTUS_CONFIG_PATH...
# Fri Feb 24 08:59:35 2023: Found environment variable $AUGUSTUS_CONFIG_PATH.
# Fri Feb 24 08:59:35 2023: Checking /central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/config/ as potential path for $AUGUSTUS_CONFIG_PATH.
# Fri Feb 24 08:59:35 2023: Success! Setting $AUGUSTUS_CONFIG_PATH to /central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/config/!
# Fri Feb 24 08:59:35 2023: Trying to set $AUGUSTUS_BIN_PATH...
# Fri Feb 24 08:59:35 2023: Found environment variable $AUGUSTUS_BIN_PATH.
# Fri Feb 24 08:59:35 2023: Checking /central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/bin/ as potential path for $AUGUSTUS_BIN_PATH.
# Fri Feb 24 08:59:35 2023: Success! Setting $AUGUSTUS_BIN_PATH to /central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/bin/!
# Fri Feb 24 08:59:35 2023: Trying to set $AUGUSTUS_SCRIPTS_PATH...
# Fri Feb 24 08:59:35 2023: Found environment variable $AUGUSTUS_SCRIPTS_PATH.
# Fri Feb 24 08:59:35 2023: Checking /central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/bin/ as potential path for $AUGUSTUS_SCRIPTS_PATH.
# Fri Feb 24 08:59:35 2023: Success! Setting $AUGUSTUS_SCRIPTS_PATH to /central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/bin/!
# Fri Feb 24 08:59:35 2023: Trying to set $PYTHON3_PATH...
# Fri Feb 24 08:59:35 2023: Did not find environment variable $PYTHON3_PATH.
# Fri Feb 24 08:59:35 2023: Trying to guess PYTHON3_PATH from location of python3 executable that is available in your $PATH
# Fri Feb 24 08:59:35 2023: Checking /central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/bin as potential path for $PYTHON3_PATH.
# Fri Feb 24 08:59:35 2023: Success! Setting $PYTHON3_PATH to /central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/bin!
# Fri Feb 24 08:59:35 2023: Trying to set $JAVA_PATH...
# Fri Feb 24 08:59:35 2023: Did not find environment variable $JAVA_PATH.
# Fri Feb 24 08:59:35 2023: Trying to guess JAVA_PATH from location of java executable that is available in your $PATH
# Fri Feb 24 08:59:35 2023: Checking /central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/bin as potential path for $JAVA_PATH.
# Fri Feb 24 08:59:35 2023: Success! Setting $JAVA_PATH to /central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/bin!
# Fri Feb 24 08:59:36 2023: Trying to set $GUSHR_PATH...
# Fri Feb 24 08:59:36 2023: Did not find environment variable $GUSHR_PATH.
# Fri Feb 24 08:59:36 2023: Trying to guess GUSHR_PATH from location of gushr.py executable that is available in your $PATH
# Fri Feb 24 08:59:36 2023: Checking /central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/bin as potential path for $GUSHR_PATH.
# Fri Feb 24 08:59:36 2023: Success! Setting $GUSHR_PATH to /central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/bin!
# Fri Feb 24 08:59:36 2023: Trying to set $GENEMARK_PATH...
# Fri Feb 24 08:59:36 2023: Did not find environment variable $GENEMARK_PATH.
# Fri Feb 24 08:59:36 2023: Trying to guess GENEMARK_PATH from location of gmetp.pl executable that is available in your $PATH
# Fri Feb 24 08:59:36 2023: Checking /central/groups/carnegie_poc/jurban/software/braker2/braker3/deps/genemark-etp/GeneMark-ETP/bin as potential path for $GENEMARK_PATH.
# Fri Feb 24 08:59:36 2023: Success! Setting $GENEMARK_PATH to /central/groups/carnegie_poc/jurban/software/braker2/braker3/deps/genemark-etp/GeneMark-ETP/bin!
# Fri Feb 24 08:59:36 2023: Trying to set $BAMTOOLS_PATH...
# Fri Feb 24 08:59:36 2023: Did not find environment variable $BAMTOOLS_PATH.
# Fri Feb 24 08:59:36 2023: Trying to guess BAMTOOLS_PATH from location of bamtools executable that is available in your $PATH
# Fri Feb 24 08:59:36 2023: Checking /central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/bin as potential path for $BAMTOOLS_PATH.
# Fri Feb 24 08:59:36 2023: Success! Setting $BAMTOOLS_PATH to /central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/bin!
# Fri Feb 24 08:59:36 2023: Trying to set $SAMTOOLS_PATH...
# Fri Feb 24 08:59:36 2023: Did not find environment variable $SAMTOOLS_PATH.
# Fri Feb 24 08:59:36 2023: Trying to guess SAMTOOLS_PATH from location of samtools executable that is available in your $PATH
# Fri Feb 24 08:59:36 2023: Checking /central/groups/carnegie_poc/jurban/software/braker2/braker3/deps/genemark-etp/GeneMark-ETP/tools as potential path for $SAMTOOLS_PATH.
# Fri Feb 24 08:59:36 2023: Success! Setting $SAMTOOLS_PATH to /central/groups/carnegie_poc/jurban/software/braker2/braker3/deps/genemark-etp/GeneMark-ETP/tools!
# Fri Feb 24 08:59:36 2023: Trying to set $DIAMOND_PATH...
# Fri Feb 24 08:59:36 2023: Did not find environment variable $DIAMOND_PATH.
# Fri Feb 24 08:59:36 2023: Trying to guess DIAMOND_PATH from location of diamond executable that is available in your $PATH
# Fri Feb 24 08:59:36 2023: Checking /central/groups/carnegie_poc/jurban/software/braker2/braker3/deps/genemark-etp/GeneMark-ETP/tools as potential path for $DIAMOND_PATH.
# Fri Feb 24 08:59:36 2023: Success! Setting $DIAMOND_PATH to /central/groups/carnegie_poc/jurban/software/braker2/braker3/deps/genemark-etp/GeneMark-ETP/tools!
# Fri Feb 24 08:59:36 2023: Trying to set $PROTHINT_PATH...
# Fri Feb 24 08:59:36 2023: Did not find environment variable $PROTHINT_PATH.
# Fri Feb 24 08:59:36 2023: Trying to guess PROTHINT_PATH from location of prothint.py executable that is available in your $PATH
# Fri Feb 24 08:59:36 2023: Checking /central/groups/carnegie_poc/jurban/software/braker2/deps/prothint/ProtHint-2.6.0/bin as potential path for $PROTHINT_PATH.
# Fri Feb 24 08:59:36 2023: Success! Setting $PROTHINT_PATH to /central/groups/carnegie_poc/jurban/software/braker2/deps/prothint/ProtHint-2.6.0/bin!
# Fri Feb 24 08:59:36 2023: Trying to set $TSEBRA_PATH...
# Fri Feb 24 08:59:36 2023: Did not find environment variable $TSEBRA_PATH.
# Fri Feb 24 08:59:36 2023: Trying to guess TSEBRA_PATH from location of tsebra.py executable that is available in your $PATH
# Fri Feb 24 08:59:36 2023: Checking /central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/bin as potential path for $TSEBRA_PATH.
# Fri Feb 24 08:59:36 2023: Success! Setting $TSEBRA_PATH to /central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/bin!
# Fri Feb 24 08:59:36 2023: Trying to set $CDBTOOLS_PATH...
# Fri Feb 24 08:59:36 2023: Did not find environment variable $CDBTOOLS_PATH.
# Fri Feb 24 08:59:36 2023: Trying to guess CDBTOOLS_PATH from location of cdbfasta executable that is available in your $PATH
# Fri Feb 24 08:59:36 2023: Checking /central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/bin as potential path for $CDBTOOLS_PATH.
# Fri Feb 24 08:59:36 2023: Success! Setting $CDBTOOLS_PATH to /central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/bin!
#*********
# IMPORTANT INFORMATION: no species for identifying the AUGUSTUS  parameter set that will arise from this BRAKER run was set. BRAKER will create an AUGUSTUS parameter set with name Sp_1. This parameter set can be used for future BRAKER/AUGUSTUS prediction runs for the same species. It is usually not necessary to retrain AUGUSTUS with novel extrinsic data if a high quality parameter set already exists.
#*********
#**********************************************************************************
#                               CREATING DIRECTORY STRUCTURE                       
#**********************************************************************************
# Fri Feb 24 08:59:38 2023: creating file that contains citations for this BRAKER run at /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/what-to-cite.txt...
# Fri Feb 24 08:59:38 2023: create working directory /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/GeneMark-ETP.
mkdir /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/GeneMark-ETP
# Fri Feb 24 08:59:38 2023: create working directory /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/species
mkdir /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/species
# Fri Feb 24 08:59:38 2023: create working directory /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/errors
mkdir /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/errors
# Fri Feb 24 08:59:38 2023: changing into working directory /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3
cd /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3
# Fri Feb 24 08:59:38 2023: getting GC content of the genome
/central/groups/carnegie_poc/jurban/software/braker2/braker3/BRAKER/scripts/get_gc_content.py --sequences /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/data/toy/longest.fa.masked --print_sequence_length 1> /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/gc_content.out 2> /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/errors/gc_content.stderr
# Fri Feb 24 08:59:40 2023: Creating parameter template files for AUGUSTUS with new_species.pl
# Fri Feb 24 08:59:40 2023: new_species.pl will create parameter files for species Sp_1 in /central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/config//species/Sp_1
/central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/bin/perl /central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/bin/new_species.pl --species=Sp_1 --AUGUSTUS_CONFIG_PATH=/central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/config/ 1> /dev/null 2>/central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/errors/new_species.stderr
# Fri Feb 24 08:59:40 2023: check_fasta_headers(): Checking fasta headers of file /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/data/toy/longest.fa.masked
# Fri Feb 24 08:59:40 2023: check_fasta_headers(): Checking fasta headers of file /central/groups/carnegie_poc/jurban/software/braker2/protein/gfas1-and-hexacorallia-and-metazoan-proteins-orthoDB.fasta
# Fri Feb 24 08:59:40 2023: Assuming that this is not a DNA fasta file because other characters than A, T, G, C, N, a, t, g, c, n were contained. If this is supposed to be a DNA fasta file, check the content of your file! If this is supposed to be a protein fasta file, please ignore this message!
#*********
# WARNING: Detected whitespace in fasta header of file /central/groups/carnegie_poc/jurban/software/braker2/protein/gfas1-and-hexacorallia-and-metazoan-proteins-orthoDB.fasta. This may later on cause problems! The pipeline will create a new file without spaces or "|" characters and a genome_header.map file to look up the old and new headers. This message will be suppressed from now on!
#*********
# Fri Feb 24 08:59:44 2023: Assuming that this is not a protein fasta file because other characters than AaRrNnDdCcEeQqGgHhIiLlKkMmFfPpSsTtWwYyVvBbZzJjOoUuXx were contained. If this is supposed to be DNA fasta file, please ignore this message.
#**********************************************************************************
#                               PROCESSING HINTS                                   
#**********************************************************************************
#**********************************************************************************
#                              RUNNING GENEMARK-EX                                 
#**********************************************************************************
# Fri Feb 24 09:00:15 2023: Preparing genemark_evidence file hints from manual hints...
# Fri Feb 24 09:00:15 2023: Running GeneMark-ETP
# Fri Feb 24 09:00:15 2023: changing into GeneMark-ETP directory /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/GeneMark-ETP
cd /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/GeneMark-ETP
# Fri Feb 24 09:00:16 2023: sorting RNA-Seq BAM files
samtools sort /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/data/toy/forward.bam -o /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/GeneMark-ETP/etp_data/forward.bam 1> /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/errors/samtools.sort.forward.stdout 2> /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/errors/samtools.sort.forward.stderr
samtools sort /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/data/toy/reverse.bam -o /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/GeneMark-ETP/etp_data/reverse.bam 1> /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/errors/samtools.sort.reverse.stdout 2> /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/errors/samtools.sort.reverse.stderr
# Fri Feb 24 09:00:32 2023: Running gmetp.pl
/central/groups/carnegie_poc/jurban/software/conda/anaconda3/envs/braker3-deps2/bin/perl /central/groups/carnegie_poc/jurban/software/braker2/braker3/deps/genemark-etp/GeneMark-ETP/bin/gmetp.pl --cfg /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/GeneMark-ETP/etp_config.yaml --workdir /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/GeneMark-ETP --bam /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/GeneMark-ETP/etp_data/ --cores 16 --softmask 1>/central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/errors/GeneMark-ETP.stdout 2>/central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/errors/GeneMark-ETP.stderr


This is from GeneMark-ETP.stderr.

FASTA index file /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/GeneMark-ETP/data/genome.softmasked.fasta.fai created.
error, file not found: option --f1 complete.gtf
error on open file complete.id: No such file or directory
mv: cannot stat ‘complete_uniq.gtf’: No such file or directory
error on open file /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/GeneMark-ETP/rnaseq/hints/proteins.fa/complete.gtf: No such file or directory
error on create_regions.pl at /central/groups/carnegie_poc/jurban/software/braker2/braker3/deps/genemark-etp/GeneMark-ETP/bin/gmetp.pl line 2162.
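
To see which of these intermediate files are actually absent, one option is to test for them directly in the directory named in the error. This is only a sketch: `report_missing` is a hypothetical helper (not part of BRAKER or GeneMark-ETP), the three file names are copied from the messages above, and `ETP_DIR` is a placeholder that would need to point at the GeneMark-ETP working directory of the run.

```shell
#!/usr/bin/env bash
# report_missing DIR FILE...
# Prints one line for each FILE that does not exist in DIR or has size zero,
# and returns non-zero if at least one such file was found.
# Hypothetical triage helper; it does not explain why a file is missing.
report_missing() {
    local dir=$1; shift
    local rc=0 f
    for f in "$@"; do
        if [ ! -s "$dir/$f" ]; then
            echo "missing or empty: $dir/$f"
            rc=1
        fi
    done
    return $rc
}

# Example call (ETP_DIR is a placeholder, adjust to your run):
# ETP_DIR=/path/to/workingdir/GeneMark-ETP
# report_missing "$ETP_DIR/rnaseq/hints/proteins.fa" complete.gtf complete.id complete_uniq.gtf
```

If all three files are reported, the step that should have produced them probably failed earlier, so the first message in GeneMark-ETP.stderr and GeneMark-ETP.stdout is likely more informative than the last one.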


Third, here is how I installed it.


First, I installed dependencies with Mamba (conda) using a YML file.

mamba env create -f braker3-deps.yml 

I will copy/paste the braker3-deps.yml at the very bottom.

Second, I installed GeneMark-ETP via git clone.

git clone https://github.com/gatech-genemark/GeneMark-ETP.git

Third, I cloned BRAKER and checked out the braker3 branch.

git clone https://github.com/Gaius-Augustus/BRAKER.git
cd BRAKER
git checkout braker3

Fourth, the run environment is set by:

conda activate braker3-deps2
export PATH=${BRAKER3}:${GENEMARK_ETP_BIN}:${GENEMARK_ETP_TOOLS}:${PROTHINT2}:${PATH}
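
Because several directories are prepended to PATH here, it can help to confirm which copy of each tool is actually picked up (the braker.log above shows samtools and diamond resolving to the GeneMark-ETP tools directory rather than to the conda environment). A minimal sketch, assuming bash; `resolve_tools` is a hypothetical helper, and the tool names in the example call are the executables that braker.log says it looked for.

```shell
#!/usr/bin/env bash
# resolve_tools TOOL...
# Prints "TOOL<TAB>PATH" for each TOOL found on the current PATH,
# or "TOOL<TAB>NOT FOUND" otherwise. Returns non-zero if any tool is missing.
resolve_tools() {
    local rc=0 tool path
    for tool in "$@"; do
        if path=$(command -v "$tool"); then
            printf '%s\t%s\n' "$tool" "$path"
        else
            printf '%s\tNOT FOUND\n' "$tool"
            rc=1
        fi
    done
    return $rc
}

# Example call, run after "conda activate" and the PATH export above:
# resolve_tools braker.pl gmetp.pl prothint.py tsebra.py samtools diamond bamtools cdbfasta
```

This only reports what the shell would execute; it does not check versions or whether the tools work together.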



YML File

name: braker3-deps2
channels:
  - eumetsat
  - conda-forge
  - bioconda
  - defaults
dependencies:
  - _libgcc_mutex=0.1=conda_forge
  - _openmp_mutex=4.5=2_gnu
  - alsa-lib=1.2.7.2=h166bdaf_0
  - augustus=3.4.0=pl5262h5a9fe7b_2
  - bamtools=2.5.1=hd03093a_10
  - bedtools=2.30.0=h468198e_3
  - biopython=1.81=py310h1fa729e_0
  - blast=2.13.0=hf3cf87c_0
  - boost-cpp=1.74.0=h75c5d50_8
  - braker2=2.1.6=hdfd78af_5
  - bzip2=1.0.8=h7f98852_4
  - c-ares=1.18.1=h7f98852_0
  - ca-certificates=2022.12.7=ha878542_0
  - cairo=1.16.0=ha61ee94_1014
  - cdbtools=0.99=hd03093a_7
  - curl=7.87.0=h6312ad2_0
  - diamond=2.1.3=hb97b32f_0
  - entrez-direct=16.2=he881be0_1
  - exonerate=2.4.0=h09da616_5
  - expat=2.5.0=h27087fc_0
  - font-ttf-dejavu-sans-mono=2.37=hab24e00_0
  - font-ttf-inconsolata=3.000=h77eed37_0
  - font-ttf-source-code-pro=2.038=h77eed37_0
  - font-ttf-ubuntu=0.83=hab24e00_0
  - fontconfig=2.14.2=h14ed4e7_0
  - fonts-conda-ecosystem=1=0
  - fonts-conda-forge=1=0
  - freetype=2.12.1=hca18f0e_1
  - gawk=5.1.0=h7f98852_0
  - gemoma=1.6.4=hdfd78af_1
  - genomethreader=1.7.1=h87f3376_4
  - gettext=0.21.1=h27087fc_0
  - gffread=0.12.7=hd03093a_1
  - giflib=5.2.1=h36c2ea0_2
  - glib=2.74.1=h6239696_1
  - glib-tools=2.74.1=h6239696_1
  - gmp=6.2.1=h58526e2_0
  - graphite2=1.3.13=h58526e2_1001
  - gsl=2.6=he838d99_2
  - harfbuzz=5.3.0=h418a68e_0
  - hisat2=2.2.1=h87f3376_4
  - htslib=1.12=h9093b5e_1
  - icu=70.1=h27087fc_0
  - jbig=2.1=h7f98852_2003
  - jpeg=9e=h0b41bf4_3
  - keyutils=1.6.1=h166bdaf_0
  - krb5=1.20.1=hf9c8cef_0
  - lcms2=2.12=hddcbb42_0
  - ld_impl_linux-64=2.40=h41732ed_0
  - lerc=2.2.1=h9c3ff4c_0
  - libblas=3.9.0=16_linux64_openblas
  - libcblas=3.9.0=16_linux64_openblas
  - libcups=2.3.3=h36d4200_3
  - libcurl=7.87.0=h6312ad2_0
  - libdeflate=1.7=h7f98852_5
  - libedit=3.1.20191231=he28a2e2_2
  - libev=4.33=h516909a_1
  - libffi=3.4.2=h7f98852_5
  - libgcc-ng=12.2.0=h65d4601_19
  - libgfortran-ng=12.2.0=h69a702a_19
  - libgfortran5=12.2.0=h337968e_19
  - libglib=2.74.1=h606061b_1
  - libgomp=12.2.0=h65d4601_19
  - libhwloc=2.8.0=h32351e8_1
  - libiconv=1.17=h166bdaf_0
  - libidn2=2.3.4=h166bdaf_0
  - liblapack=3.9.0=16_linux64_openblas
  - libnghttp2=1.51.0=hdcd2b5c_0
  - libnsl=2.0.0=h7f98852_0
  - libopenblas=0.3.21=pthreads_h78a6416_3
  - libpng=1.6.39=h753d276_0
  - libssh2=1.10.0=haa6b8db_3
  - libstdcxx-ng=12.2.0=h46fd767_19
  - libtiff=4.3.0=hf544144_1
  - libunistring=0.9.10=h7f98852_0
  - libuuid=2.32.1=h7f98852_1000
  - libwebp-base=1.2.4=h166bdaf_0
  - libxcb=1.13=h7f98852_1004
  - libxml2=2.9.14=h22db469_4
  - libzlib=1.2.13=h166bdaf_4
  - lp_solve=5.5.2.5=h14c3975_1001
  - makehub=1.0.5=1
  - metis=5.1.0=h58526e2_1006
  - mmseqs2=13.45111=h95f258a_1
  - mpfr=4.1.0=h9202a9a_1
  - mysql-connector-c=6.1.11=h6eb9d5d_1007
  - ncbi-vdb=3.0.2=h87f3376_0
  - ncurses=6.2=h58526e2_4
  - numpy=1.24.2=py310h8deb116_0
  - openjdk=8.0.332=h166bdaf_0
  - openssl=1.1.1t=h0b41bf4_0
  - ossuuid=1.6.2=hf484d3e_1000
  - pcre=8.45=h9c3ff4c_0
  - pcre2=10.40=hc3806b6_0
  - perl=5.26.2=h36c2ea0_1008
  - perl-apache-test=1.40=pl526_1
  - perl-app-cpanminus=1.7044=pl526_1
  - perl-archive-tar=2.32=pl526_0
  - perl-base=2.23=pl526_1
  - perl-business-isbn=3.004=pl526_0
  - perl-business-isbn-data=20140910.003=pl526_0
  - perl-carp=1.38=pl526_3
  - perl-class-data-inheritable=0.08=pl526_1
  - perl-class-load=0.25=pl526_0
  - perl-class-load-xs=0.10=pl526h6bb024c_2
  - perl-class-method-modifiers=2.12=pl526_0
  - perl-clone-choose=0.010=pl526_0
  - perl-common-sense=3.74=pl526_2
  - perl-compress-raw-bzip2=2.087=pl526he1b5a44_0
  - perl-compress-raw-zlib=2.087=pl526hc9558a2_0
  - perl-constant=1.33=pl526_1
  - perl-cpan-meta=2.150010=pl526_0
  - perl-cpan-meta-requirements=2.140=pl526_0
  - perl-cpan-meta-yaml=0.018=pl526_0
  - perl-data-dumper=2.173=pl526_0
  - perl-data-optlist=0.110=pl526_2
  - perl-dbi=1.642=pl526_0
  - perl-devel-globaldestruction=0.14=pl526_0
  - perl-devel-overloadinfo=0.005=pl526_0
  - perl-devel-stacktrace=2.04=pl526_0
  - perl-dist-checkconflicts=0.11=pl526_2
  - perl-encode=2.88=pl526_1
  - perl-eval-closure=0.14=pl526h6bb024c_4
  - perl-exception-class=1.44=pl526_0
  - perl-exporter=5.72=pl526_1
  - perl-exporter-tiny=1.002001=pl526_0
  - perl-extutils-cbuilder=0.280230=pl526_1
  - perl-extutils-makemaker=7.36=pl526_1
  - perl-extutils-manifest=1.72=pl526_0
  - perl-extutils-parsexs=3.35=pl526_0
  - perl-file-homedir=1.004=pl526_2
  - perl-file-path=2.16=pl526_0
  - perl-file-spec=3.48_01=pl526_1
  - perl-file-temp=0.2304=pl526_2
  - perl-file-which=1.23=pl526_0
  - perl-getopt-long=2.50=pl526_1
  - perl-hash-merge=0.300=pl526_0
  - perl-inline=0.80=pl526_2
  - perl-io-compress=2.087=pl526he1b5a44_0
  - perl-io-zlib=1.10=pl526_2
  - perl-ipc-cmd=1.02=pl526_0
  - perl-json=4.02=pl526_0
  - perl-json-pp=4.04=pl526_0
  - perl-json-xs=2.34=pl526h6bb024c_3
  - perl-list-moreutils=0.428=pl526_1
  - perl-list-moreutils-xs=0.428=pl526_0
  - perl-list-util=1.38=pl526_1
  - perl-locale-maketext-simple=0.21=pl526_2
  - perl-logger-simple=2.0=pl526_0
  - perl-math-utils=1.13=pl526_0
  - perl-mce=1.837=pl526_0
  - perl-mime-base64=3.15=pl526_1
  - perl-module-build=0.4224=pl526_3
  - perl-module-corelist=5.20190524=pl526_0
  - perl-module-implementation=0.09=pl526_2
  - perl-module-load=0.32=pl526_1
  - perl-module-load-conditional=0.68=pl526_2
  - perl-module-metadata=1.000036=pl526_0
  - perl-module-runtime=0.016=pl526_1
  - perl-module-runtime-conflicts=0.003=pl526_0
  - perl-moo=2.003004=pl526_0
  - perl-moose=2.2011=pl526hf484d3e_1
  - perl-mro-compat=0.13=pl526_0
  - perl-object-insideout=4.05=pl526_0
  - perl-package-deprecationmanager=0.17=pl526_0
  - perl-package-stash=0.38=pl526hf484d3e_1
  - perl-package-stash-xs=0.28=pl526hf484d3e_1
  - perl-parallel-forkmanager=2.02=pl526_0
  - perl-params-check=0.38=pl526_1
  - perl-params-util=1.07=pl526h6bb024c_4
  - perl-parent=0.236=pl526_1
  - perl-pathtools=3.75=pl526h14c3975_1
  - perl-perl-ostype=1.010=pl526_1
  - perl-posix=1.38_03=pl526_1
  - perl-role-tiny=2.000008=pl526_0
  - perl-scalar-list-utils=1.52=pl526h516909a_0
  - perl-scalar-util-numeric=0.40=pl526_1
  - perl-socket=2.027=pl526_1
  - perl-storable=3.15=pl526h14c3975_0
  - perl-sub-exporter=0.987=pl526_2
  - perl-sub-exporter-progressive=0.001013=pl526_0
  - perl-sub-identify=0.14=pl526h14c3975_0
  - perl-sub-install=0.928=pl526_2
  - perl-sub-name=0.21=pl526_1
  - perl-sub-quote=2.006003=pl526_1
  - perl-test-harness=3.42=pl526_0
  - perl-test-pod=1.52=pl526_0
  - perl-text-abbrev=1.02=pl526_0
  - perl-text-parsewords=3.30=pl526_0
  - perl-time-hires=1.9760=pl526h14c3975_1
  - perl-try-tiny=0.30=pl526_1
  - perl-types-serialiser=1.0=pl526_2
  - perl-uri=1.76=pl526_0
  - perl-version=0.9924=pl526_0
  - perl-xml-libxml=2.0132=pl526h7ec2d77_1
  - perl-xml-namespacesupport=1.12=pl526_0
  - perl-xml-sax=1.02=pl526_0
  - perl-xml-sax-base=1.09=pl526_0
  - perl-xsloader=0.24=pl526_0
  - perl-yaml=1.29=pl526_0
  - perl-yaml-xs=0.74=pl526h14c3975_0
  - pip=23.0.1=pyhd8ed1ab_0
  - pixman=0.40.0=h36c2ea0_0
  - pthread-stubs=0.4=h36c2ea0_1001
  - python=3.10.2=h62f1059_0_cpython
  - python_abi=3.10=3_cp310
  - readline=8.1=h46c0cb4_0
  - samtools=1.12=h9aed4be_1
  - setuptools=67.4.0=pyhd8ed1ab_0
  - spaln=2.4.7=pl5262h9a82719_0
  - sqlite=3.37.0=h9cd32fc_0
  - sra-tools=3.0.3=h87f3376_0
  - stringtie=2.2.1=h3198e80_0
  - suitesparse=5.10.1=h9e50725_1
  - tar=1.34=hb2e2bae_1
  - tbb=2021.7.0=h924138e_1
  - tk=8.6.12=h27826a3_0
  - tzdata=2022g=h191b570_0
  - ucsc-bedtobigbed=377=ha8a8165_3
  - ucsc-fatotwobit=377=ha8a8165_5
  - ucsc-genepredcheck=377=ha8a8165_3
  - ucsc-genepredtobed=377=ha8a8165_5
  - ucsc-genepredtobiggenepred=377=ha8a8165_3
  - ucsc-gtftogenepred=377=ha8a8165_5
  - ucsc-hggcpercent=377=ha8a8165_3
  - ucsc-ixixx=377=ha8a8165_3
  - ucsc-twobitinfo=377=ha8a8165_3
  - ucsc-wigtobigwig=377=ha8a8165_3
  - wget=1.20.3=ha56f1ee_1
  - wheel=0.38.4=pyhd8ed1ab_0
  - xorg-fixesproto=5.0=h7f98852_1002
  - xorg-inputproto=2.3.2=h7f98852_1002
  - xorg-kbproto=1.0.7=h7f98852_1002
  - xorg-libice=1.0.10=h7f98852_0
  - xorg-libsm=1.2.3=hd9c2040_1000
  - xorg-libx11=1.7.2=h7f98852_0
  - xorg-libxau=1.0.9=h7f98852_0
  - xorg-libxdmcp=1.1.3=h7f98852_0
  - xorg-libxext=1.3.4=h0b41bf4_2
  - xorg-libxfixes=5.0.3=h7f98852_1004
  - xorg-libxi=1.7.10=h7f98852_0
  - xorg-libxrender=0.9.10=h7f98852_1003
  - xorg-libxtst=1.2.3=h7f98852_1002
  - xorg-recordproto=1.14.2=h7f98852_1002
  - xorg-renderproto=0.11.1=h7f98852_1002
  - xorg-xextproto=7.3.0=h0b41bf4_1003
  - xorg-xproto=7.0.31=h7f98852_1007
  - xz=5.2.6=h166bdaf_0
  - zlib=1.2.13=h166bdaf_4
  - zstd=1.5.2=h3eb15da_6

NOTE: This conda environment was originally created on a separate system this way:

mamba create -n braker3-deps2 -c bioconda braker2 hisat2 stringtie bedtools sra-tools gffread
conda activate braker3-deps2
mamba install -c eumetsat perl-yaml-xs
mamba install -c conda-forge openjdk=8

And the YML file was obtained by:

conda env export > braker3-deps2.yml

JohnUrban, Feb 24 '23 17:02

I am currently debugging.

The problem seems to be in gmetp.pl at this line:

PrepareGenomeTraining($proc) if 1;

(I know this is technically not a Braker problem at this point, but I will keep you updated)

JohnUrban, Feb 24 '23 19:02

And specifically at this line in PrepareGenomeTraining inside the if statement:

	if ( CreateThis("hc_regions.gtf"))
	{
		## breaks with the following line
		system( "$bin/create_regions.pl --hcc $hcc_genes --hcp $hcp_genes --out hc_regions.gtf --margin $margin" )
			and die "error on create_regions.pl";
	}

JohnUrban, Feb 24 '23 20:02

Hi @JohnUrban, first of all, thank you. I can now use conda instead of the Singularity container.

I have the same issue but using a singularity image from here: https://hub.docker.com/r/teambraker/braker3

And this is the command with singularity:

singularity exec -B ${PWD}:${PWD} ${BRAKER_SIF} braker.pl --genome=scaffolds_vf.EDTA_RM_masked.fa --prot_seq=Aves_taxid_8782_3044547_prot.fasta --bam=rna_sorted.bam --softmasking --workingdir=run2 \
    --GENEMARK_PATH=${ETP} --PROTHINT_PATH=${ETP}/gmes/ProtHint/bin --threads 72

The same issue as mentioned before:

 WARNING: Detected | in fasta header of file /media/ben/Data2TB/test-p/annotation/BRAKER3/Aves_taxid_8782_3044547_prot.fasta. This may later on cause problems! The pipeline will create a new file without spaces or "|" characters and a genome_header.map file to look up the old and new headers. This message will be suppressed from now on!
#*********
ERROR in file /opt/BRAKER/scripts/braker.pl at line 5484
Failed to execute: /usr/bin/perl /media/ben/Data2TB/test-p/annotation/BRAKER3/GeneMark-ETP/bin/etp_release.pl --cfg /media/ben/Data2TB/test-p/annotation/BRAKER3/run2/GeneMark-ETP/etp_config.yaml --workdir /media/ben/Data2TB/test-p/annotation/BRAKER3/run2/GeneMark-ETP --bam /media/ben/Data2TB/test-p/annotation/BRAKER3/run2/GeneMark-ETP/etp_data/ --cores 72 --softmask 1>/media/ben/Data2TB/test-p/annotation/BRAKER3/run2/errors/GeneMark-ETP.stdout 2>/media/ben/Data2TB/test-p/annotation/BRAKER3/run2/errors/GeneMark-ETP.stderr
Failed to execute: /usr/bin/perl /media/ben/Data2TB/test-p/annotation/BRAKER3/GeneMark-ETP/bin/etp_release.pl --cfg /media/ben/Data2TB/test-p/annotation/BRAKER3/run2/GeneMark-ETP/etp_config.yaml --workdir /media/ben/Data2TB/test-p/annotation/BRAKER3/run2/GeneMark-ETP --bam /media/ben/Data2TB/test-p/annotation/BRAKER3/run2/GeneMark-ETP/etp_data/ --cores 72 --softmask 1>/media/ben/Data2TB/test-p/annotation/BRAKER3/run2/errors/GeneMark-ETP.stdout 2>/media/ben/Data2TB/test-p/annotation/BRAKER3/run2/errors/GeneMark-ETP.stderr
The most common problem is an expired or not present file ~/.gm_key!

It is clearly related to the GeneMark-ETP license, which is in fact not available for use.

BenAawf avatar Feb 27 '23 09:02 BenAawf

You can download the license for GeneMark-EP from their web server and place it in ~/.gm_key.

However, Lars is still working on updating Braker code.

KatharinaHoff avatar Feb 27 '23 09:02 KatharinaHoff

I have a valid license for GeneMark-ES/ET/EP ver 4.71_lic. It works for both Braker1 and Braker2 runs. It does not work for Braker3 runs. Perhaps that is b/c that license doesn't also work for GeneMark-ETP, which was downloaded separately from: https://github.com/gatech-genemark/GeneMark-ETP .

JohnUrban avatar Feb 28 '23 18:02 JohnUrban

The container has today been updated to contain GeneMark-ETP. You still have to install the license key file in your home directory (license key of GeneMark-ES/ET/EP works) as file ~/.gm_key. Otherwise, it should be very easy to run, now.
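Since a missing key file is named as the most common cause of this failure, it can be worth ruling out before a long run. The following is a minimal sketch, not part of BRAKER: `check_gm_key` is a hypothetical helper that only confirms the key file exists and is non-empty. It cannot tell whether the key has expired.

```shell
# Hypothetical helper: confirm that a GeneMark key file exists and is non-empty.
# Defaults to ~/.gm_key, the location named in this thread.
check_gm_key() {
    key_file="${1:-$HOME/.gm_key}"
    if [ -s "$key_file" ]; then
        echo "found key file: $key_file"
    else
        echo "missing or empty key file: $key_file" >&2
        return 1
    fi
}
```

Calling `check_gm_key` with no argument checks `~/.gm_key`; passing a path checks that file instead.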

KatharinaHoff avatar Mar 02 '23 16:03 KatharinaHoff

Is there a way to get GeneMark-ETP if one is not using the container?

JohnUrban avatar Mar 02 '23 17:03 JohnUrban

Look at the Dockerfile… I think that will answer the question.

KatharinaHoff avatar Mar 02 '23 17:03 KatharinaHoff

Uh oh - time to learn Docker.

JohnUrban avatar Mar 02 '23 17:03 JohnUrban

No, a keyword search will do for this. There’s a link.

KatharinaHoff avatar Mar 02 '23 17:03 KatharinaHoff

Got it! Thanks!

wget  http://topaz.gatech.edu/GeneMark/etp.for_braker.tar.gz && \
    tar -xzf etp.for_braker.tar.gz && \
    mv etp.for_braker ETP && \
    chmod a+x /opt/ETP/bin/*py /opt/ETP/bin/*pl /opt/ETP/tools/*
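The `chmod` line above is copied from the Dockerfile and points at `/opt/ETP`, which is where the container keeps the unpacked archive. Outside the container the same step has to target whatever directory the archive was unpacked into. A minimal sketch, assuming the `bin/` and `tools/` layout implied by that line; `make_etp_executable` is a hypothetical helper name:

```shell
# Hypothetical helper: apply the Dockerfile's chmod step to an ETP copy
# unpacked in an arbitrary directory instead of /opt/ETP.
make_etp_executable() {
    etp_dir="$1"
    if [ ! -d "$etp_dir/bin" ]; then
        echo "no bin/ directory under $etp_dir" >&2
        return 1
    fi
    # Same globs as the Dockerfile line, relative to the chosen directory.
    for f in "$etp_dir"/bin/*py "$etp_dir"/bin/*pl "$etp_dir"/tools/*; do
        if [ -e "$f" ]; then
            chmod a+x "$f"
        fi
    done
    return 0
}
```

After the `wget`, `tar` and `mv` steps, this would be called as `make_etp_executable "$PWD/ETP"`.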

JohnUrban avatar Mar 02 '23 17:03 JohnUrban

Well, I gave Braker3 a try with the GeneMark-ETP copy found here http://topaz.gatech.edu/GeneMark/etp.for_braker.tar.gz -- but gmetp.pl gives the same error that I opened up this thread with -- essentially complete.gtf and complete_uniq.gtf not made/found:

FASTA index file /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/GeneMark-ETP/data/genome.softmasked.fasta.fai created.
error, file not found: option --f1 complete.gtf
error on open file complete.id: No such file or directory
mv: cannot stat ‘complete_uniq.gtf’: No such file or directory
error on open file /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/braker3/GeneMark-ETP/rnaseq/hints/proteins.fa/complete.gtf: No such file or directory
error on create_regions.pl at /central/groups/carnegie_poc/jurban/software/braker2/braker3/deps/genemark-etp/alt/ETP/bin/gmetp.pl line 2162.

I am not using the container, so I know that makes this report extra annoying, and I do apologize for that! I personally couldn't get the whole singularity thing working, it kept complaining about root stuff (I'm on a remote cluster without root/sudo privileges). But if necessary, I will give that route another try in the future -- I'm sure there are local options to learn about.

As for my conda-supported approach detailed in this thread, and especially after getting the same error with the new GeneMark-ETP copy, I am skeptical that this is a ~/.gm_key problem unless I need an entirely new ~/.gm_key for GeneMark-ETP. It works fine for the other GeneMark software with Braker1/Braker2. GeneMark-ETP does run a little bit, but seems to fail to create/find/open those files.

JohnUrban avatar Mar 02 '23 18:03 JohnUrban

The rootless warnings of singularity can be safely ignored.

KatharinaHoff avatar Mar 02 '23 18:03 KatharinaHoff

Maybe I misunderstood. You very likely need root privileges to install Singularity. If it is not on your cluster, conda is the next best approach.

Did you update Braker? Did you switch to master branch?

KatharinaHoff avatar Mar 02 '23 18:03 KatharinaHoff

Hey - is Braker3 on the master branch now? I was on git checkout braker3.

As for singularity, I installed it using conda, so I guess I have a local copy of it. I followed the singularity instructions on the main page (e.g. singularity build braker3.sif docker://teambraker/braker3:latest). It seemed to install, but when I tried the next part (singularity exec braker3.sif print_braker3_setup.py or singularity exec braker3.sif braker.pl), it threw an error. I erased the sif file after that, but at the moment I am re-doing the first step to try to reproduce that error (or get past it).

JohnUrban avatar Mar 02 '23 18:03 JohnUrban

Ok. For the Singularity errors:

Command:

singularity exec braker3.sif print_braker3_setup.py

Error:

INFO:    Converting SIF file to temporary sandbox...
FATAL:   while extracting braker3.sif: root filesystem extraction failed: extract command failed: ERROR  : Failed to create user namespace: user namespace disabled
: exit status 1

Other Command:

singularity exec braker3.sif braker.pl

Same error:

INFO:    Converting SIF file to temporary sandbox...
FATAL:   while extracting braker3.sif: root filesystem extraction failed: extract command failed: ERROR  : Failed to create user namespace: user namespace disabled
: exit status 1

JohnUrban avatar Mar 02 '23 18:03 JohnUrban

The container is not working because you didn't install singularity as root. There are possibilities to get it working without root I think, but it depends on the kernel version, see https://docs.sylabs.io/guides/3.5/admin-guide/installation.html#install-nonsetuid The best option would be to ask your cluster admin to install singularity (if possible).
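The "user namespace disabled" message in the error output suggests that the host kernel refuses to create unprivileged user namespaces. On many Linux kernels the per-user limit can be read from `/proc/sys/user/max_user_namespaces`, and a value of 0 means none can be created. The sketch below assumes that file exists on the cluster; some distributions control this through other settings, so a positive value here is not a guarantee. `userns_limit_ok` is a hypothetical helper name.

```shell
# Hypothetical helper: report whether the kernel's user-namespace limit is above zero.
# The limit file can be overridden for testing; the default is an assumption
# about the host and may not exist on every system.
userns_limit_ok() {
    limit_file="${1:-/proc/sys/user/max_user_namespaces}"
    if [ ! -r "$limit_file" ]; then
        echo "cannot read $limit_file" >&2
        return 2
    fi
    limit=$(cat "$limit_file")
    if [ "$limit" -gt 0 ]; then
        echo "user namespaces allowed (limit: $limit)"
    else
        echo "user namespaces disabled (limit: $limit)"
        return 1
    fi
}
```

If this reports a limit of 0, only an administrator can change it, which matches the advice to ask the cluster admin.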

Thamos avatar Mar 02 '23 19:03 Thamos

I will ask our HPC admin. Possibly fakeroot has to be enabled in Singularity, but I am not sure. I vaguely recall that we discussed that a long time ago.

Edit: Oh, yes, you need to be root to install Singularity.

KatharinaHoff avatar Mar 02 '23 19:03 KatharinaHoff

Hey - is Braker3 on the master branch now? I was on git checkout braker3.

Yes, checkout master. We finally merged. And git pull to update to the latest code.

KatharinaHoff avatar Mar 02 '23 19:03 KatharinaHoff

Ok - I should have mentioned this for the braker3 branch already, but I didn't want to bombard you with issues (more so than I have).

Line 2295 in braker.pl that tries to assess the java version causes an error. The approach to getting the java version changed and the fix is simply changing it back to the old way.

New way:

$cmdString = "java -version 2>&1 | grep \"openjdk version\" | awk -F[\"\.] -v OFS=. '{print \$2,\$3}'";

Old way:

$cmdString = "java -version 2>&1 | awk -F[\\\"\\\.] -v OFS=. 'NR==1{print \$2,\$3}'";

Full context:

####################### set_JAVA_PATH #######################################
# * set path to java
# * also checks whether java version 1.8 is present
################################################################################

sub set_JAVA_PATH {
    my @required_files = ('java');
    $JAVA_PATH = set_software_PATH($java_path, "JAVA_PATH",
                    \@required_files, 'exit');

    #$cmdString = "java -version 2>&1 | grep \"openjdk version\" | awk -F[\"\.] -v OFS=. '{print \$2,\$3}'";
    $cmdString = "java -version 2>&1 | awk -F[\\\"\\\.] -v OFS=. 'NR==1{print \$2,\$3}'";
    my @javav = `$cmdString` or die("Failed to execute: $cmdString");
    if(not ($javav[0] =~ m/1\.8/ )){
        $prtStr = "\# " . (localtime) . " ERROR: in file " . __FILE__
            ." at line ". __LINE__ ."\n"
            . "You have installed java version $javav[0]. GUSHR requires version 1.8!\n"
            . "You can switch between java versions on your system with:\n"
            . "sudo update-alternatives --config java\n";
        $logString .= $prtStr;
        print STDERR $logString;
        exit(1);
    }

}
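The thread does not say why the newer command string fails. One possible explanation (an assumption, not confirmed here) is a Java build whose version banner begins with `java version` rather than `openjdk version`: the `grep` then passes nothing on, the backticks return an empty list, and the `or die` branch fires. The sketch below feeds sample banner lines through both pipelines. The banner strings are illustrative and not taken from this thread.

```shell
# Old approach: parse the first banner line, whatever vendor string it carries.
java_major_minor() {
    awk -F'[".]' -v OFS=. 'NR==1{print $2,$3}'
}

# Newer approach: only lines containing "openjdk version" reach awk.
java_major_minor_openjdk_only() {
    grep "openjdk version" | awk -F'[".]' -v OFS=. '{print $2,$3}'
}

# Example calls with illustrative banners:
#   printf 'openjdk version "1.8.0_362"\n' | java_major_minor
#   printf 'java version "1.8.0_201"\n'    | java_major_minor_openjdk_only
# On a real system: java -version 2>&1 | java_major_minor
```

The first example call yields a `major.minor` string, while the second yields nothing because the banner never passes the `grep` filter.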

JohnUrban avatar Mar 02 '23 19:03 JohnUrban

Alright.

I ran Braker3 again with everything up-to-date -- noting that I did have to change the java version line back to the old approach as described above.

I still get the same error I opened up this issue with (from braker3/errors/GeneMark-ETP.stderr).

FASTA index file /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/masterbranch/braker3/GeneMark-ETP/data/genome.softmasked.fasta.fai created.
error, file not found: option --f1 complete.gtf
error on open file complete.id: No such file or directory
mv: cannot stat ‘complete_uniq.gtf’: No such file or directory
error on open file /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/masterbranch/braker3/GeneMark-ETP/rnaseq/hints/proteins.fa/complete.gtf: No such file or directory
error on create_regions.pl at /central/groups/carnegie_poc/jurban/software/braker2/braker3/deps/genemark-etp/alt/ETP/bin/gmetp.pl line 2162.

Here are the contents of the GeneMark-ETP subdirectory inside the Braker3 working directory:

> ls braker3/GeneMark-ETP/

arx  data  etp_config.yaml  etp_data  filter_gmst.log  proteins.fa  prothint_gmst.log  rnaseq


> ls braker3/GeneMark-ETP/*/

braker3/GeneMark-ETP/arx/:
chr.names  genome.fa

braker3/GeneMark-ETP/data/:
genome.fasta  genome.softmasked.fasta  genome.softmasked.fasta.fai  proteins.fa

braker3/GeneMark-ETP/etp_data/:
forward.bam  reverse.bam

braker3/GeneMark-ETP/proteins.fa/:

braker3/GeneMark-ETP/rnaseq/:
gmst  hints  hisat2  stringtie


> ls braker3/GeneMark-ETP/*/*/

braker3/GeneMark-ETP/rnaseq/gmst/:
GeneMark_hmm.mod  genome_gmst_for_HC.gtf  genome_gmst.gtf  gms.log  transcripts_merged.fasta.gff

braker3/GeneMark-ETP/rnaseq/hints/:
bam2hints_forward.gff  bam2hints_merged.gff  bam2hints_reverse.gff  hintsfile_merged.gff  proteins.fa

braker3/GeneMark-ETP/rnaseq/hisat2/:
mapping_forward.bam  mapping_reverse.bam

braker3/GeneMark-ETP/rnaseq/stringtie/:
transcripts_forward.gff  transcripts_merged.fasta  transcripts_merged.gff  transcripts_reverse.gff

> ls braker3/GeneMark-ETP/*/*/*

braker3/GeneMark-ETP/rnaseq/gmst/GeneMark_hmm.mod              braker3/GeneMark-ETP/rnaseq/hints/bam2hints_forward.gff  braker3/GeneMark-ETP/rnaseq/hisat2/mapping_reverse.bam
braker3/GeneMark-ETP/rnaseq/gmst/genome_gmst_for_HC.gtf        braker3/GeneMark-ETP/rnaseq/hints/bam2hints_merged.gff   braker3/GeneMark-ETP/rnaseq/stringtie/transcripts_forward.gff
braker3/GeneMark-ETP/rnaseq/gmst/genome_gmst.gtf               braker3/GeneMark-ETP/rnaseq/hints/bam2hints_reverse.gff  braker3/GeneMark-ETP/rnaseq/stringtie/transcripts_merged.fasta
braker3/GeneMark-ETP/rnaseq/gmst/gms.log                       braker3/GeneMark-ETP/rnaseq/hints/hintsfile_merged.gff   braker3/GeneMark-ETP/rnaseq/stringtie/transcripts_merged.gff
braker3/GeneMark-ETP/rnaseq/gmst/transcripts_merged.fasta.gff  braker3/GeneMark-ETP/rnaseq/hisat2/mapping_forward.bam   braker3/GeneMark-ETP/rnaseq/stringtie/transcripts_reverse.gff

braker3/GeneMark-ETP/rnaseq/hints/proteins.fa:
log  prothint  tmp


...and so on

JohnUrban avatar Mar 02 '23 19:03 JohnUrban

It cannot find a file located here: braker3/GeneMark-ETP/rnaseq/hints/proteins.fa/complete.gtf

So I decided to look at what is there:

> ls braker3/GeneMark-ETP/rnaseq/hints/proteins.fa/

log  prothint  tmp

I wonder if the log file (braker3/GeneMark-ETP/rnaseq/hints/proteins.fa/log) holds the answer or a hint - contents:

02-Mar-23 11:36:56 - INFO: Starting the GMST filtering and classification.
02-Mar-23 11:36:56 - INFO: Running the following system call: /central/groups/carnegie_poc/jurban/software/braker2/braker3/deps/genemark-etp/alt/ETP/bin/GeneMarkSTFiltering/gms2hints.pl --tseq /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/masterbranch/braker3/GeneMark-ETP/rnaseq/stringtie/transcripts_merged.fasta --ggtf /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/masterbranch/braker3/GeneMark-ETP/rnaseq/stringtie/transcripts_merged.gff                --tgff /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/masterbranch/braker3/GeneMark-ETP/rnaseq/gmst/transcripts_merged.fasta.gff --out /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/masterbranch/braker3/GeneMark-ETP/rnaseq/hints/proteins.fa/tmp/gmsttbny79f0.gtf                  
02-Mar-23 11:36:56 - INFO: Making diamond database from /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/masterbranch/braker3/GeneMark-ETP/data/proteins.fa
02-Mar-23 11:36:56 - INFO: Running the following system call: diamond makedb --in /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/masterbranch/braker3/GeneMark-ETP/data/proteins.fa -d /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/masterbranch/braker3/GeneMark-ETP/rnaseq/hints/proteins.fa/tmp/diamondDBbtbbubj0.dmnd
02-Mar-23 11:37:16 - ERROR: Program exited due to an error in command: diamond makedb --in /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/masterbranch/braker3/GeneMark-ETP/data/proteins.fa -d /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/masterbranch/braker3/GeneMark-ETP/rnaseq/hints/proteins.fa/tmp/diamondDBbtbbubj0.dmnd
02-Mar-23 11:37:16 - ERROR: Check stderr for more details.
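The real failure here was recorded several directories below the messages that reached stderr. A minimal sketch for surfacing such entries in one pass, assuming the log files are named `log` or end in `.log` and use the ` - ERROR: ` marker seen above; `find_logged_errors` is a hypothetical helper name:

```shell
# Hypothetical helper: list every " - ERROR: " entry in log files under a directory.
# /dev/null is passed as an extra file so grep always prefixes the file name.
find_logged_errors() {
    find "$1" -type f \( -name log -o -name '*.log' \) \
        -exec grep -n -- ' - ERROR: ' /dev/null {} +
}
```

For the run in this thread the call would be `find_logged_errors braker3/GeneMark-ETP`, and each hit is printed as `file:line:text`.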

When I run the diamond makedb command to see the error:

> diamond makedb --in /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/masterbranch/braker3/GeneMark-ETP/data/proteins.fa -d /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/masterbranch/braker3/GeneMark-ETP/rnaseq/hints/proteins.fa/tmp/diamondDBbtbbubj0.dmnd


diamond v2.1.3.157 (C) Max Planck Society for the Advancement of Science
Documentation, support and updates available at http://www.diamondsearch.org
Please cite: http://dx.doi.org/10.1038/s41592-021-01101-x Nature Methods (2021)

#CPU threads: 32
Scoring parameters: (Matrix=BLOSUM62 Lambda=0.267 K=0.041 Penalties=11/1)
Database input file: /central/groups/carnegie_poc/jurban/data/coral/scratch/toyanno/braker3/masterbranch/braker3/GeneMark-ETP/data/proteins.fa
Opening the database file...  [0.143s]
Loading sequences...  [3.726s]
Masking sequences...  [2.614s]
Writing sequences...  [0.562s]
Hashing sequences...  [0.239s]
Loading sequences... Error: Error reading input stream at line 7477106: Invalid character (.) in sequence
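DIAMOND reports only the first offending line, so it helps to list every affected sequence line before deciding how to clean the file. The sketch below treats anything other than letters and `*` on a sequence line as suspicious. That is a deliberately simple rule chosen for illustration, not DIAMOND's own validation, and `suspicious_sequence_lines` is a hypothetical helper name.

```shell
# Hypothetical helper: print "line-number: content" for every non-header line
# of a FASTA file that contains a character other than a letter or '*'.
suspicious_sequence_lines() {
    awk '!/^>/ && /[^A-Za-z*]/ { print NR ": " $0 }' "$1"
}
```

Running `suspicious_sequence_lines proteins.fa | head` shows the first few offending lines, and piping into `wc -l` gives a count.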

JohnUrban avatar Mar 02 '23 20:03 JohnUrban

And looking more into that, for some reason many of the OrthoDB protein sequences end with a period... e.g.:

>307491_0:000000
MLAYADNIVVMGETKDINSTSKLISSNNFKYLGVNINNKIGMHIEINERITNGNSCYFSIIKFLRS.

I downloaded orthodb proteins like this:

wget --no-check-certificate https://v100.orthodb.org/download/odb10_metazoa_fasta.tar.gz
tar -xzf odb10_metazoa_fasta.tar.gz 
cat metazoa/Rawdata/* > metazoan-proteins-orthoDB.fasta
rm -r metazoa

I will look into removing these periods.... what strikes me funny though is that this did not cause an error when running Braker2. Or maybe it caused an error that went silent/undetected...?

JohnUrban avatar Mar 02 '23 20:03 JohnUrban

I just checked the sequence in orthodb 11 and there it seems to be okay. So maybe if you update to 11 the problem fixes itself.

>307491_0:000000        307491_0
MLAYADNIVVMGETKDINSTSKLISSNNFKYLGVNINNKIGMHIEINERITNGNSCYFSIIKFLRS

Thamos avatar Mar 02 '23 20:03 Thamos

I re-downloaded the OrthoDB proteins to confirm it's not just my copy somehow, and indeed 17014 of the 8266016 metazoan proteins end with a period. But this was still v10 -- thanks to @Thamos for telling me about v11. I didn't realize there had been an update.

Nonetheless, I removed the periods at the end of ODBv10 seqs the following way:

 awk '{sub(/\.$/,""); print}'  proteins.fa > proteins.fixed.fa
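The one-liner above strips a trailing period from every line, header lines included. A slightly more conservative sketch leaves headers untouched and adds a count to confirm the cleanup worked. Both function names are hypothetical, and the approach assumes one sequence per line, as in the OrthoDB example shown earlier.

```shell
# Hypothetical helper: remove a trailing period from sequence lines only.
strip_trailing_periods() {
    awk '/^>/ { print; next } { sub(/\.$/, ""); print }' "$1"
}

# Hypothetical helper: count sequence lines that still end with a period.
count_trailing_periods() {
    awk '!/^>/ && /\.$/ { n++ } END { print n + 0 }' "$1"
}
```

A typical use would be `strip_trailing_periods proteins.fa > proteins.fixed.fa`, followed by `count_trailing_periods proteins.fixed.fa` to confirm that nothing was missed.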

That definitely solved the diamond makedb problem. And that in turn might solve my whole issue.... waiting to find out still.

I am still scratching my head as to why Braker2 didn't fail b/c of these period-containing sequences though. I'm somewhat guessing that it might have "failed silently" and perhaps I should be skeptical of those results.

JohnUrban avatar Mar 02 '23 20:03 JohnUrban

Do you have a (non empty) "prothint.gff" file in your braker2 directories? I think if "diamond makedb" didn't work there shouldn't be one, as prothint uses diamond. E.g. in my case with orthodb plants it's 54MB.
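A quick way to run that check across several working directories, as a sketch: `nonempty_prothint` is a hypothetical helper, and the directory passed in is whatever holds the earlier BRAKER2 runs.

```shell
# Hypothetical helper: list every non-empty prothint.gff below a directory.
# "-size +0" matches files larger than zero blocks, i.e. any non-empty file.
nonempty_prothint() {
    find "$1" -type f -name prothint.gff -size +0 -print
}
```

No output means that no run under that directory produced a non-empty `prothint.gff`.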

Thamos avatar Mar 02 '23 20:03 Thamos

Hi @JohnUrban, no need to be worried about that. BRAKER2 calls DIAMOND only via ProtHint which sanitizes the protein input (https://github.com/gatech-genemark/ProtHint/commit/19ef04c93bfa691bd6583017189b6e340a4513df).

In BRAKER3, DIAMOND is also called "directly" on raw protein input. That's definitely something to fix, thanks for pointing out.

tomasbruna avatar Mar 02 '23 20:03 tomasbruna

I'll open an issue about this in GeneMark-ETP.

tomasbruna avatar Mar 02 '23 20:03 tomasbruna

@Thamos the previous runs with the "dirty" protein sequences did not have that file. Now that I have a "pre-sanitized" protein file, Diamond is happily working at the moment, and I suspect I will get the "prothint.gff" file when it finishes up.

@tomasbruna glad I could point out a real issue here. I will keep you posted on whether or not this allows Braker3 to finish.

JohnUrban avatar Mar 02 '23 20:03 JohnUrban

@Thamos is there a link like https://v100.orthodb.org/download/odb10_metazoa_fasta.tar.gz for v11?

Else, I could download the whole v11 db here: https://data.orthodb.org/download/ ...do you have any tips on how to filter the entire DB to get just metazoan proteins?

JohnUrban avatar Mar 02 '23 20:03 JohnUrban