Bracken
Error: no reads found. Please check your Kraken report
Hello,
I am using Kraken2, Bracken and 16S_SILVA138_k2db to generate species and genus counts.
I am having trouble generating the Bracken reports: I keep getting this error: "Error: no reads found. Please check your Kraken report"
kraken2 --db 16S_SILVA138_k2db --report 813.fastq.16S.kreport --report-zero-counts --threads 16 --use-names 813.fastq
bracken -d kraken2/16S_SILVA138_k2db -i 813.fastq.16S.kreport -o 813.fastq.std.species.breport -r 50 -l S -t 10
Have you solved your problem? I hit the same error when using the SILVA and RDP databases.
I solved it by changing the parameters: bracken -l G instead of -l S. If your input is 16S results, Bracken has to be run at the genus level or a higher rank.
You are right, forcing it to genus level gives read counts. However, I am more interested in species counts using SILVA. Is that something I cannot do using Kraken?
This also happens when the database80mers.kmer_distrib, database.kraken and database80mers.kraken files are empty after the Bracken db build. Haven't found a cause yet.
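A quick way to test for this case is to check that the build outputs exist and are non-empty. The sketch below is only an illustration: the function name `check_bracken_build` is made up here, and the file names are assumed to follow the `database<READ_LEN>mers.*` pattern mentioned above.

```shell
# Report which Bracken build files in a database directory are missing or empty.
# ASSUMPTION: file names follow the pattern seen in this thread
# (database.kraken, database<READ_LEN>mers.kraken, database<READ_LEN>mers.kmer_distrib).
check_bracken_build() {
    db_dir=$1
    read_len=$2
    status=0
    for f in database.kraken "database${read_len}mers.kraken" "database${read_len}mers.kmer_distrib"; do
        # -s is true only if the file exists and has a size greater than zero
        if [ ! -s "$db_dir/$f" ]; then
            echo "missing or empty: $db_dir/$f"
            status=1
        fi
    done
    return "$status"
}
```

For example, `check_bracken_build /path/to/db 80` prints one line per problem file and returns non-zero if any of them is missing or empty.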
EDIT:
Found the problem.
Typo in the manual (hard version): missing space
kraken2 --db=${KRAKEN_DB} --threads=10 <( find -L library \(-name "*.fna" -o -name "*.fa" -o -name "*.fasta" \) -exec cat {} + ) > database.kraken
should be
kraken2 --db=${KRAKEN_DB} --threads=10 <( find -L library \( -name "*.fna" -o -name "*.fa" -o -name "*.fasta" \) -exec cat {} + ) > database.kraken
(space before the first -name).
The first time I used Bracken with the easy version, it got stuck at some point so I continued with the hard version and had no problems. 2nd time building a new db I started with the hard version and had the issue (the bracken-build shell script has the missing space). Going to try again the whole build now.
As a simplification to what @mihkelvaher has posted, you can also run the following command:
find -L library \( -name "*.fna" -o -name "*.fa" -o -name "*.fasta" \) -exec cat {} + > sequences_found.fasta
kraken2 --db . --threads 48 sequences_found.fasta > database.kraken
This splits the generation of database.kraken into two steps. If you work with the Slurm workload manager, some of the brackets can be escaped weirdly and give you a headache; splitting the step in two made my day.
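If you go the two-step route, it can help to fail early when the find step collects nothing, since an empty FASTA leads to an empty database.kraken. A minimal sketch, where `collect_library_fasta` is a hypothetical helper name and the find expression is the one from the posts above:

```shell
# Concatenate all library FASTA files into one file and print the number of
# sequences found; return non-zero if there are none.
collect_library_fasta() {
    library_dir=$1
    out_fasta=$2
    find -L "$library_dir" \( -name "*.fna" -o -name "*.fa" -o -name "*.fasta" \) -exec cat {} + > "$out_fasta"
    # grep -c exits non-zero when it counts 0 matches, hence the "|| true"
    n_seqs=$(grep -c '^>' "$out_fasta" || true)
    if [ "$n_seqs" -eq 0 ]; then
        echo "no sequences found under $library_dir" >&2
        return 1
    fi
    echo "$n_seqs"
}
```

Write the output file outside the library directory (otherwise find may pick it up as well), then run the kraken2 step on it as in the second command above.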
But my input data is metagenomic, and it still showed this error.
I am trying to run bracken on paired end seq data of microbiome, and I am getting the 'Error: no reads found. Please check your Kraken report' despite the fact that the report exists, it has reads and is not in mpa format. The database files aren't empty either. Have tried changing -l from default but didn't help. Running on cluster. Am at a loss here! Any input? Any other issues you've known to give this particular error?
kraken2 --paired /proj/sens2021578/nobackup/emmai/test_kraken/readsForAssembly_collated_R1.fastq /proj/sens2021578/nobackup/emmai/test_kraken/readsForAssembly_collated_R2.fastq -threads 12 --report /proj/sens2021578/nobackup/emmai/test_bracken/kraken_reports/NM177paired.kreport > /proj/sens2021578/nobackup/emmai/test_bracken/kraken_output/NM177paired.kraken
Loading database information... done.
13716 sequences (6.89 Mbp) processed in 1.351s (609.3 Kseq/m, 305.87 Mbp/m).
11054 sequences classified (80.59%)
2662 sequences unclassified (19.41%)
bracken -d /proj/sens2021578/nobackup/emmai/test_bracken/mydb -i /proj/sens2021578/nobackup/emmai/test_bracken/kraken_reports/NM177paired.kreport -o NM177paired.bracken -r 250
Error: no reads found. Please check your Kraken report
head NM177paired.kreport
19.41 2662 2662 U 0 unclassified
80.59 11054 50 R 1 root
46.89 6432 0 D 10239 Viruses
46.62 6394 0 D1 2731342 Monodnaviria
46.62 6394 0 K 2732092 Shotokuvirae
46.62 6394 0 P 2732415 Cossaviricota
46.62 6394 0 C 2732421 Papovaviricetes
46.62 6394 0 O 2732533 Zurhausenvirales
46.62 6394 44 F 151340 Papillomaviridae
I noticed that the error looks like this if I use a classification level that does not exist in the report. Check whether you have any taxa at the species (S) level in your report. Otherwise set the -l option to the lowest existing rank. Possible options are: 'D','P','C','O','F','G','S'
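To see which rank codes a report actually contains, something like the following can be used. It is a sketch under assumptions: `rank_summary` is a made-up name, and the report is taken to be the standard tab-separated Kraken layout with the rank code in column 4 (reports written with extra minimizer columns would need a different column index).

```shell
# For each rank code in a Kraken-style report, print:
#   rank code, number of taxa at that rank, largest clade read count at that rank
# ASSUMPTION: tab-separated columns are
#   percent, clade reads, taxon reads, rank code, taxid, name
rank_summary() {
    awk -F'\t' '
        { n[$4]++; if ($2 + 0 > max[$4]) max[$4] = $2 + 0 }
        END { for (r in n) printf "%s\t%d\t%d\n", r, n[r], max[r] }
    ' "$1" | sort
}
```

Running `rank_summary sample.kreport` and finding no line for S (or G, and so on) means that rank is absent from the report, so that value of -l has nothing to work with.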
@mziegler12 thanks for your input! Unfortunately this doesn't help; I have tried fiddling with the -l option without success. There are quite a lot of (S) and some (S1) entries, but none of S, S1, G or F work.
19.41 2662 2662 U 0 unclassified
80.59 11054 50 R 1 root
46.89 6432 0 D 10239 Viruses
46.62 6394 0 D1 2731342 Monodnaviria
46.62 6394 0 K 2732092 Shotokuvirae
46.62 6394 0 P 2732415 Cossaviricota
46.62 6394 0 C 2732421 Papovaviricetes
46.62 6394 0 O 2732533 Zurhausenvirales
46.62 6394 44 F 151340 Papillomaviridae
46.25 6344 11 F1 2169595 Firstpapillomavirinae
45.87 6291 5 G 333750 Alphapapillomavirus
32.18 4414 0 S 337044 Alphapapillomavirus 5
32.18 4414 4414 S1 333762 Human papillomavirus type 26
12.35 1694 1211 S 337041 Alphapapillomavirus 9
3.52 483 483 S1 333760 Human papillomavirus type 16
0.55 76 76 S 337042 Alphapapillomavirus 7
0.46 63 0 S 337049 Alphapapillomavirus 11
0.46 63 63 S1 333764 Human papillomavirus type 34
0.16 22 0 S 333766 Alphapapillomavirus 13
0.16 22 22 S1 1671798 Human papillomavirus type 54
0.12 17 2 S 10570 Alphapapillomavirus 12
0.11 15 15 S1 990303 Papio hamadryas papillomavirus 1
0.16 22 0 G 325455 Gammapapillomavirus
0.15 21 0 G1 735504 unclassified Gammapapillomavirus
0.15 21 21 S 2049444 Gammapapillomavirus sp.
0.01 1 0 S 1513260 Gammapapillomavirus 15
0.01 1 1 S1 1070408 Human papillomavirus 135
0.15 20 0 G 334202 Mupapillomavirus
0.15 20 0 S 1961783 Mupapillomavirus 3
0.15 20 20 S1 1650736 Human papillomavirus 204
0.04 6 0 F1 333774 unclassified Papillomaviridae
0.04 6 0 F2 173087 Human papillomavirus types
0.04 6 6 S 652810 Human papillomavirus type 85
0.27 37 0 F 687329 Anelloviridae
0.26 36 4 G 687331 Alphatorquevirus
0.09 12 12 S 687357 Torque teno virus 18
0.06 8 8 S 687361 Torque teno virus 22
0.03 4 4 S 687358 Torque teno virus 19
0.02 3 3 S 687356 Torque teno virus 17
0.01 2 2 S 687347 Torque teno virus 8
0.01 1 1 S 687344 Torque teno virus 5
0.01 1 1 S 687351 Torque teno virus 12
0.01 1 1 S 687352 Torque teno virus 13
0.01 1 1 G 687332 Betatorquevirus
0.01 1 0 D1 2731341 Duplodnaviria
0.01 1 0 K 2731360 Heunggongvirae
0.01 1 0 P 2731361 Peploviricota
0.01 1 0 C 2731363 Herviviricetes
0.01 1 0 O 548681 Herpesvirales
0.01 1 0 F 10292 Herpesviridae
0.01 1 0 F1 10374 Gammaherpesvirinae
0.01 1 0 G 10375 Lymphocryptovirus
0.01 1 1 S 10376 Human gammaherpesvirus 4
33.29 4566 7 R1 131567 cellular organisms
27.18 3728 5 D 2 Bacteria
26.53 3639 1 D1 1783272 Terrabacteria group
25.83 3543 0 P 1239 Firmicutes
25.79 3538 0 C 91061 Bacilli
25.78 3536 1 O 186826 Lactobacillales
25.61 3512 0 F 33958 Lactobacillaceae
25.54 3503 42 G 1578 Lactobacillus
25.23 3460 3460 S 147802 Lactobacillus iners
0.01 1 1 S 52242 Lactobacillus gallinarum
0.07 9 0 G 2742598 Limosilactobacillus
0.07 9 9 S 1633 Limosilactobacillus vaginalis
0.16 22 0 F 1300 Streptococcaceae
0.16 22 0 G 1301 Streptococcus
0.09 13 12 S 1308 Streptococcus thermophilus
0.01 1 1 S1 1436725 Streptococcus thermophilus TH1477
0.02 3 3 S 1825069 Streptococcus marmotae
0.01 2 2 S 1335 Streptococcus equinus
0.01 2 0 G1 671232 Streptococcus anginosus group
0.01 1 1 S 1328 Streptococcus anginosus
0.01 1 0 S 1338 Streptococcus intermedius
0.01 1 1 S1 862966 Streptococcus intermedius C270
Hello, having the same identical issue.
>> Running Bracken
>> python src/est_abundance.py -i /mnt/home/benucci/pipe-bac-pb/outputs/taxcalssabund_kraken2-brachen/pbio-2279.20869.bc1020_BAK8B_OA--bc1020_BAK8B_OA.ccs/kraken2_report.txt -o /mnt/home/benucci/pipe-bac-pb/outputs/taxcalssabund_kraken2-brachen/pbio-2279.20869.bc1020_BAK8B_OA--bc1020_BAK8B_OA.ccs/bracken_report.txt -k /mnt/research/bonito_lab/Benucci/databases/kraken2/minikraken2_v1_8GB/database100mers.kmer_distrib -l S -t 10
>> Checking report file: /mnt/home/benucci/pipe-bac-pb/outputs/taxcalssabund_kraken2-brachen/pbio-2279.20869.bc1020_BAK8B_OA--bc1020_BAK8B_OA.ccs/kraken2_report.txt
Error: no reads found. Please check your Kraken report
PROGRAM START TIME: 07-11-2022 17:16:43
I have up to level S2 in the Kraken report, so I used -l S. I am using the default minikraken2_v1_8GB with -r 100, and all the *.kmer_distrib files seem to have data in them.
How can I solve it? Has anyone (@mihkelvaher @MatteoSchiavinato @sneha-nishtala @mziegler12 @ifanlyn95) found a solution?
Thanks,
G.
Did you try the -l flag for the highest classification level 'D'? I know, I wrote this before, but maybe we can exclude that it has something to do with the classification level.
I tried that yes, didn't help unfortunately
Dear @mziegler12,
I just tried and it doesn't work even with -l D. Please see the output below. Thanks, G.
/mnt/home/benucci/anaconda2/envs/bracken/bin/bracken: illegal option -- -
version of Bracken: Usage: bracken -d MY_DB -i INPUT -o OUTPUT -w OUTREPORT -r READ_LEN -l LEVEL -t THRESHOLD
MY_DB location of Kraken database
INPUT Kraken REPORT file to use for abundance estimation
OUTPUT file name for Bracken default output
OUTREPORT New Kraken REPORT output file with Bracken read estimates
READ_LEN read length to get all classifications for (default: 100)
LEVEL level to estimate abundance at [options: D,P,C,O,F,G,S,S1,etc] (default: S)
THRESHOLD number of reads required PRIOR to abundance estimation to perform reestimation (default: 0)
>> Checking for Valid Options...
>> Running Bracken
>> python src/est_abundance.py -i /mnt/home/benucci/genome-pipe-bac-pb/outputs/12_taxcalssabund_kraken2-brachen/pbio-2432.22840.bc1019_BAK8B_OA--bc1019_BAK8B_OA.ccs/kraken2_report.txt -o /mnt/home/benucci/genome-pipe-bac-pb/outputs/12_taxcalssabund_kraken2-brachen/pbio-2432.22840.bc1019_BAK8B_OA--bc1019_BAK8B_OA.ccs/bracken_report.txt -k /mnt/research/bonito_lab/Benucci/databases/kraken2/minikraken2_v1_8GB/database100mers.kmer_distrib -l D -t 10
>> Checking report file: /mnt/home/benucci/genome-pipe-bac-pb/outputs/12_taxcalssabund_kraken2-brachen/pbio-2432.22840.bc1019_BAK8B_OA--bc1019_BAK8B_OA.ccs/kraken2_report.txt
Error: no reads found. Please check your Kraken report
PROGRAM START TIME: 07-15-2022 18:06:48
I ran into the same error, and it looked as if at least one taxon at the user-specified taxonomy level needs to have at least threshold-many reads. I don't know if this helps anyone else. According to the manual, the threshold is 10 by default (which differs from the conda-based post above).
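One way to test this hypothesis on your own report is to count how many taxa at the chosen level reach the threshold. The sketch below only mirrors the guess in this post, not Bracken's actual code; `taxa_at_level` is a hypothetical name, and it assumes a tab-separated report with clade reads in column 2 and the rank code in column 4.

```shell
# Count taxa at a given rank whose clade read count (column 2) is >= threshold.
# ASSUMPTION: comparing against clade reads reflects the hypothesis discussed
# here; it is not taken from Bracken's source.
taxa_at_level() {
    report=$1
    level=$2
    threshold=$3
    awk -F'\t' -v lvl="$level" -v t="$threshold" \
        '$4 == lvl && $2 + 0 >= t { n++ } END { print n + 0 }' "$report"
}
```

If `taxa_at_level kraken2_report.txt S 10` prints 0, then under this hypothesis the same report would be expected to fail with -l S -t 10, and lowering -t would be the thing to try.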
@tuncK, thanks, that may make sense. But if that is the case, then a more detailed error message would help users understand what is happening. Also, I thought Bracken was able to calculate taxonomic abundance even for a complete genome. I do not have 10 contigs in the failing example I posted above.
I think if the reason is indeed what I am suspecting, I want to say that the program should actually continue as is but give a column of zeros as output. When it throws such an error, it triggers a cascade of other errors in the pipelines whereas there actually is not much to be checked or fixed.
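Until the tool behaves that way, a pipeline can approximate it with a small wrapper. This is only a sketch: `bracken_or_empty` is a made-up name, the header line is an assumption about the Bracken output columns that should be checked against a real output file from your version, and the wrapper hides every Bracken failure, not just this one.

```shell
# Run Bracken; if it fails, write a header-only output file so that downstream
# steps see "no estimates" instead of a missing file.
# Usage: bracken_or_empty OUT_FILE [other bracken options, passed through unchanged]
# ASSUMPTION: the header below matches your Bracken version's output columns.
bracken_or_empty() {
    out_file=$1
    shift
    if bracken -o "$out_file" "$@"; then
        return 0
    fi
    echo "bracken failed; writing a header-only file to $out_file" >&2
    printf 'name\ttaxonomy_id\ttaxonomy_lvl\tkraken_assigned_reads\tadded_reads\tnew_est_reads\tfraction_total_reads\n' > "$out_file"
}
```

For example: `bracken_or_empty sample.bracken -d MY_DB -i sample.kreport -r 100 -l S -t 10`. Because real problems (a wrong database path, a malformed report) are masked too, keep the stderr message in your logs.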
@tuncK
I ran it on a genome with 39 contigs and it worked well at -l G or -l S. I then ran it on a genome with 8 contigs and it gave me the same error I got on the genome with 2 contigs.
I get it, but I still do not understand why that threshold of 10 matters for the tool, though...
G.