Error: no reads found. Please check your Kraken report

Open sneha-nishtala opened this issue 4 years ago • 17 comments

Hello,

I am using kraken2, bracken and 16S_SILVA138_k2db to generate species and genus counts.

I am having trouble generating the Bracken reports. I keep getting this error: Error: no reads found. Please check your Kraken report

kraken2 --db 16S_SILVA138_k2db --report 813.fastq.16S.kreport --report-zero-counts --threads 16 --use-names 813.fastq
bracken -d kraken2/16S_SILVA138_k2db -i 813.fastq.16S.kreport -o 813.fastq.std.species.breport -r 50 -l S -t 10

sneha-nishtala avatar Sep 10 '20 18:09 sneha-nishtala

Have you solved your problem? I hit the same error when I used the SILVA and RDP databases.

ifanlyn95 avatar Sep 24 '20 01:09 ifanlyn95

I solved it by changing the parameters: bracken -l G instead of -l S. If your input is 16S results, it should be forced to the genus level or a lower taxonomic level.

ifanlyn95 avatar Sep 24 '20 01:09 ifanlyn95

You are right, forcing to genus level gives read counts. However, I am more interested in species counts using SILVA. Is that something I cannot do using Kraken?

sneha-nishtala avatar Oct 19 '20 18:10 sneha-nishtala

This also happens when

database80mers.kmer_distrib
database.kraken
database80mers.kraken

files are empty after Bracken db build. Haven't found a cause yet.

EDIT: Found the problem. Typo in the manual (hard version): a missing space.

kraken2 --db=${KRAKEN_DB} --threads=10 <( find -L library \(-name "*.fna" -o -name "*.fa" -o -name "*.fasta" \) -exec cat {} + ) > database.kraken

should be

kraken2 --db=${KRAKEN_DB} --threads=10 <( find -L library \( -name "*.fna" -o -name "*.fa" -o -name "*.fasta" \) -exec cat {} + ) > database.kraken

(note the space before the first -name).

The first time I used Bracken with the easy version, it got stuck at some point so I continued with the hard version and had no problems. 2nd time building a new db I started with the hard version and had the issue (the bracken-build shell script has the missing space). Going to try again the whole build now.

mihkelvaher avatar Feb 19 '21 08:02 mihkelvaher
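
The empty-file symptom described in the comment above can be checked before running Bracken. The following is only a minimal sketch: the three file names are the ones listed in that comment (an 80-mer build), while the function name find_empty_or_missing and the constant EXPECTED_FILES are made up for this illustration and are not part of Bracken.

```python
from pathlib import Path

# File names as listed in the comment above (80-mer build).
# Adjust the read length in the names for other builds.
EXPECTED_FILES = [
    "database80mers.kmer_distrib",
    "database.kraken",
    "database80mers.kraken",
]


def find_empty_or_missing(db_dir, names):
    """Return the entries of `names` that are missing or zero bytes in `db_dir`."""
    problems = []
    for name in names:
        path = Path(db_dir) / name
        # A file that does not exist is reported the same way as an empty one.
        if not path.is_file() or path.stat().st_size == 0:
            problems.append(name)
    return problems


if __name__ == "__main__":
    # Hypothetical usage: run from inside the Kraken database directory.
    for name in find_empty_or_missing(".", EXPECTED_FILES):
        print("empty or missing:", name)
```

If any of the files is reported, the database build step is the place to look rather than the Kraken report.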

As a simplification to what @mihkelvaher has posted, you can also run the following command:

find -L library \( -name "*.fna" -o -name "*.fa" -o -name "*.fasta" \) -exec cat {} + > sequences_found.fasta 
kraken2 --db . --threads 48 sequences_found.fasta > database.kraken 

This splits the generation of database.kraken into two steps. If you work with the Slurm workload manager, some of the brackets can get escaped weirdly and give you a headache; splitting the step in two made my day.

MatteoSchiavinato avatar Jun 09 '21 14:06 MatteoSchiavinato

But my input data is metagenomic, and it still showed this error.

MAXINELSX avatar Apr 11 '22 05:04 MAXINELSX

I am trying to run bracken on paired end seq data of microbiome, and I am getting the 'Error: no reads found. Please check your Kraken report' despite the fact that the report exists, it has reads and is not in mpa format. The database files aren't empty either. Have tried changing -l from default but didn't help. Running on cluster. Am at a loss here! Any input? Any other issues you've known to give this particular error?

kraken2 --paired /proj/sens2021578/nobackup/emmai/test_kraken/readsForAssembly_collated_R1.fastq /proj/sens2021578/nobackup/emmai/test_kraken/readsForAssembly_collated_R2.fastq -threads 12 --report /proj/sens2021578/nobackup/emmai/test_bracken/kraken_reports/NM177paired.kreport > /proj/sens2021578/nobackup/emmai/test_bracken/kraken_output/NM177paired.kraken
Loading database information... done.
13716 sequences (6.89 Mbp) processed in 1.351s (609.3 Kseq/m, 305.87 Mbp/m).
11054 sequences classified (80.59%)
2662 sequences unclassified (19.41%)

bracken -d /proj/sens2021578/nobackup/emmai/test_bracken/mydb -i /proj/sens2021578/nobackup/emmai/test_bracken/kraken_reports/NM177paired.kreport -o NM177paired.bracken -r 250
Error: no reads found. Please check your Kraken report

head NM177paired.kreport
 19.41  2662    2662    U       0       unclassified
 80.59  11054   50      R       1       root
 46.89  6432    0       D       10239     Viruses
 46.62  6394    0       D1      2731342     Monodnaviria
 46.62  6394    0       K       2732092       Shotokuvirae
 46.62  6394    0       P       2732415         Cossaviricota
 46.62  6394    0       C       2732421           Papovaviricetes
 46.62  6394    0       O       2732533             Zurhausenvirales
 46.62  6394    44      F       151340                Papillomaviridae

emmaivansson avatar Jul 07 '22 12:07 emmaivansson

I noticed that if I use a classification level that does not exist in the report, the error looks like that. Check whether you have any taxa at species (S) level in your report. Otherwise set the -l option to the lowest rank that exists in the report. Possible options are: 'D','P','C','O','F','G','S'

mziegler12 avatar Jul 07 '22 12:07 mziegler12
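
The check suggested above (does the report contain any taxa at the requested level?) can be scripted. This is a minimal sketch, assuming the standard six-column Kraken report layout (percentage, clade reads, taxon reads, rank code, taxid, name); the function name reads_per_rank is made up for this illustration, and the sample rows are copied from the report posted in this thread.

```python
from collections import defaultdict

# A few rows copied from the Kraken report posted in this thread.
SAMPLE_REPORT = """\
 45.87  6291    5       G       333750                    Alphapapillomavirus
 32.18  4414    0       S       337044                      Alphapapillomavirus 5
 32.18  4414    4414    S1      333762                        Human papillomavirus type 26
 12.35  1694    1211    S       337041                      Alphapapillomavirus 9
"""


def reads_per_rank(report_lines):
    """Sum the clade-level read counts (column 2) for each rank code (column 4)."""
    totals = defaultdict(int)
    for line in report_lines:
        # Split into at most six fields so taxon names keep their spaces.
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # blank or malformed line
        try:
            clade_reads = int(fields[1])
        except ValueError:
            continue  # not a data row
        totals[fields[3]] += clade_reads
    return dict(totals)


if __name__ == "__main__":
    counts = reads_per_rank(SAMPLE_REPORT.splitlines())
    for rank, reads in sorted(counts.items()):
        print(rank, reads)
```

A rank code that is absent from the result, or present with a total of zero, would be a candidate explanation for the error at that -l level.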

@mziegler12 thanks for your input! Unfortunately this doesn't help; I have tried fiddling with the -l option without success. There are quite a lot of (S) and some (S1) entries, but none of S, S1, G or F work.

 19.41  2662    2662    U       0       unclassified
 80.59  11054   50      R       1       root
 46.89  6432    0       D       10239     Viruses
 46.62  6394    0       D1      2731342     Monodnaviria
 46.62  6394    0       K       2732092       Shotokuvirae
 46.62  6394    0       P       2732415         Cossaviricota
 46.62  6394    0       C       2732421           Papovaviricetes
 46.62  6394    0       O       2732533             Zurhausenvirales
 46.62  6394    44      F       151340                Papillomaviridae
 46.25  6344    11      F1      2169595                 Firstpapillomavirinae
 45.87  6291    5       G       333750                    Alphapapillomavirus
 32.18  4414    0       S       337044                      Alphapapillomavirus 5
 32.18  4414    4414    S1      333762                        Human papillomavirus type 26
 12.35  1694    1211    S       337041                      Alphapapillomavirus 9
  3.52  483     483     S1      333760                        Human papillomavirus type 16
  0.55  76      76      S       337042                      Alphapapillomavirus 7
  0.46  63      0       S       337049                      Alphapapillomavirus 11
  0.46  63      63      S1      333764                        Human papillomavirus type 34
  0.16  22      0       S       333766                      Alphapapillomavirus 13
  0.16  22      22      S1      1671798                       Human papillomavirus type 54
  0.12  17      2       S       10570                       Alphapapillomavirus 12
  0.11  15      15      S1      990303                        Papio hamadryas papillomavirus 1
  0.16  22      0       G       325455                    Gammapapillomavirus
  0.15  21      0       G1      735504                      unclassified Gammapapillomavirus
  0.15  21      21      S       2049444                       Gammapapillomavirus sp.
  0.01  1       0       S       1513260                     Gammapapillomavirus 15
  0.01  1       1       S1      1070408                       Human papillomavirus 135
  0.15  20      0       G       334202                    Mupapillomavirus
  0.15  20      0       S       1961783                     Mupapillomavirus 3
  0.15  20      20      S1      1650736                       Human papillomavirus 204
  0.04  6       0       F1      333774                  unclassified Papillomaviridae
  0.04  6       0       F2      173087                    Human papillomavirus types
  0.04  6       6       S       652810                      Human papillomavirus type 85
  0.27  37      0       F       687329      Anelloviridae
  0.26  36      4       G       687331        Alphatorquevirus
  0.09  12      12      S       687357          Torque teno virus 18
  0.06  8       8       S       687361          Torque teno virus 22
  0.03  4       4       S       687358          Torque teno virus 19
  0.02  3       3       S       687356          Torque teno virus 17
  0.01  2       2       S       687347          Torque teno virus 8
  0.01  1       1       S       687344          Torque teno virus 5
  0.01  1       1       S       687351          Torque teno virus 12
  0.01  1       1       S       687352          Torque teno virus 13
  0.01  1       1       G       687332        Betatorquevirus
  0.01  1       0       D1      2731341     Duplodnaviria
  0.01  1       0       K       2731360       Heunggongvirae
  0.01  1       0       P       2731361         Peploviricota
  0.01  1       0       C       2731363           Herviviricetes
  0.01  1       0       O       548681              Herpesvirales
  0.01  1       0       F       10292                 Herpesviridae
  0.01  1       0       F1      10374                   Gammaherpesvirinae
  0.01  1       0       G       10375                     Lymphocryptovirus
  0.01  1       1       S       10376                       Human gammaherpesvirus 4
 33.29  4566    7       R1      131567    cellular organisms
 27.18  3728    5       D       2           Bacteria
 26.53  3639    1       D1      1783272       Terrabacteria group
 25.83  3543    0       P       1239            Firmicutes
 25.79  3538    0       C       91061             Bacilli
 25.78  3536    1       O       186826              Lactobacillales
 25.61  3512    0       F       33958                 Lactobacillaceae
 25.54  3503    42      G       1578                    Lactobacillus
 25.23  3460    3460    S       147802                    Lactobacillus iners
  0.01  1       1       S       52242                     Lactobacillus gallinarum
  0.07  9       0       G       2742598                 Limosilactobacillus
  0.07  9       9       S       1633                      Limosilactobacillus vaginalis
  0.16  22      0       F       1300                  Streptococcaceae
  0.16  22      0       G       1301                    Streptococcus
  0.09  13      12      S       1308                      Streptococcus thermophilus
  0.01  1       1       S1      1436725                     Streptococcus thermophilus TH1477
  0.02  3       3       S       1825069                   Streptococcus marmotae
  0.01  2       2       S       1335                      Streptococcus equinus
  0.01  2       0       G1      671232                    Streptococcus anginosus group
  0.01  1       1       S       1328                        Streptococcus anginosus
  0.01  1       0       S       1338                        Streptococcus intermedius
  0.01  1       1       S1      862966                        Streptococcus intermedius C270

emmaivansson avatar Jul 07 '22 13:07 emmaivansson

Hello, having the same identical issue.

 >> Running Bracken 
      >> python src/est_abundance.py -i /mnt/home/benucci/pipe-bac-pb/outputs/taxcalssabund_kraken2-brachen/pbio-2279.20869.bc1020_BAK8B_OA--bc1020_BAK8B_OA.ccs/kraken2_report.txt -o /mnt/home/benucci/pipe-bac-pb/outputs/taxcalssabund_kraken2-brachen/pbio-2279.20869.bc1020_BAK8B_OA--bc1020_BAK8B_OA.ccs/bracken_report.txt -k /mnt/research/bonito_lab/Benucci/databases/kraken2/minikraken2_v1_8GB/database100mers.kmer_distrib -l S -t 10
>> Checking report file: /mnt/home/benucci/pipe-bac-pb/outputs/taxcalssabund_kraken2-brachen/pbio-2279.20869.bc1020_BAK8B_OA--bc1020_BAK8B_OA.ccs/kraken2_report.txt
Error: no reads found. Please check your Kraken report
PROGRAM START TIME: 07-11-2022 17:16:43

I have up to level S2 in the Kraken report, so I used -l S. I am using the default minikraken2_v1_8GB with -r 100, and all the *.kmer_distrib files seem to have data in them. How can I solve it? Has anyone (@mihkelvaher @MatteoSchiavinato @sneha-nishtala @mziegler12 @ifanlyn95) found a solution? Thanks, G.

Gian77 avatar Jul 11 '22 17:07 Gian77

Did you try the -l flag with the highest classification level, 'D'? I know I wrote this before, but maybe we can rule out that it has something to do with the classification level.

mziegler12 avatar Jul 14 '22 18:07 mziegler12

I tried that yes, didn't help unfortunately

emmaivansson avatar Jul 15 '22 13:07 emmaivansson

Dear @mziegler12,

I just tried and it doesn't work even with -l D. Please see the code below. Thanks, G.

/mnt/home/benucci/anaconda2/envs/bracken/bin/bracken: illegal option -- -
version of Bracken: Usage: bracken -d MY_DB -i INPUT -o OUTPUT -w OUTREPORT -r READ_LEN -l LEVEL -t THRESHOLD
  MY_DB          location of Kraken database
  INPUT          Kraken REPORT file to use for abundance estimation
  OUTPUT         file name for Bracken default output
  OUTREPORT      New Kraken REPORT output file with Bracken read estimates
  READ_LEN       read length to get all classifications for (default: 100)
  LEVEL          level to estimate abundance at [options: D,P,C,O,F,G,S,S1,etc] (default: S)
  THRESHOLD      number of reads required PRIOR to abundance estimation to perform reestimation (default: 0)
 >> Checking for Valid Options...
 >> Running Bracken 
      >> python src/est_abundance.py -i /mnt/home/benucci/genome-pipe-bac-pb/outputs/12_taxcalssabund_kraken2-brachen/pbio-2432.22840.bc1019_BAK8B_OA--bc1019_BAK8B_OA.ccs/kraken2_report.txt -o /mnt/home/benucci/genome-pipe-bac-pb/outputs/12_taxcalssabund_kraken2-brachen/pbio-2432.22840.bc1019_BAK8B_OA--bc1019_BAK8B_OA.ccs/bracken_report.txt -k /mnt/research/bonito_lab/Benucci/databases/kraken2/minikraken2_v1_8GB/database100mers.kmer_distrib -l D -t 10
>> Checking report file: /mnt/home/benucci/genome-pipe-bac-pb/outputs/12_taxcalssabund_kraken2-brachen/pbio-2432.22840.bc1019_BAK8B_OA--bc1019_BAK8B_OA.ccs/kraken2_report.txt
Error: no reads found. Please check your Kraken report
PROGRAM START TIME: 07-15-2022 18:06:48

Gian77 avatar Jul 15 '22 18:07 Gian77

I ran into the same error, and it looked as if there need to be at least threshold-many reads in at least one taxon at the user-specified taxonomy level. I don't know if this helps anyone else. According to the manual, the threshold is 10 by default (which differs from the conda-based post above).

tuncK avatar Jul 17 '22 16:07 tuncK
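
If the threshold hypothesis above is right, it can be tested against a report before running Bracken. This is only a sketch under assumptions: it takes the standard six-column Kraken report layout, compares the clade-level read count (column 2) with the -t value, and treats "at least threshold" as passing. Whether Bracken itself uses exactly this column and comparison is not confirmed in this thread. The function name taxa_passing_threshold is made up for this illustration, and the sample rows are copied from the report posted earlier.

```python
# Rows copied from the Kraken report posted earlier in this thread.
SAMPLE_REPORT = """\
  0.26  36      4       G       687331        Alphatorquevirus
  0.09  12      12      S       687357          Torque teno virus 18
  0.06  8       8       S       687361          Torque teno virus 22
  0.03  4       4       S       687358          Torque teno virus 19
"""


def taxa_passing_threshold(report_lines, level="S", threshold=10):
    """Return (name, clade_reads) for taxa at `level` with >= `threshold` reads.

    Assumption: the clade-level count (column 2) is what gets compared
    with the threshold.
    """
    passing = []
    for line in report_lines:
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # blank or malformed line
        try:
            clade_reads = int(fields[1])
        except ValueError:
            continue  # not a data row
        if fields[3] == level and clade_reads >= threshold:
            passing.append((fields[5].strip(), clade_reads))
    return passing


if __name__ == "__main__":
    lines = SAMPLE_REPORT.splitlines()
    print(taxa_passing_threshold(lines, level="S", threshold=10))
```

An empty list for the chosen level and threshold would be consistent with the hypothesis; in that case lowering -t is something to try.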

@tuncK, thanks, that may make sense. But if that is the case, then a more detailed error message would help users understand what's happening. Also, I thought Bracken was able to calculate taxonomic abundance even on a complete genome. I do not have 10 contigs in the error example I posted above.

Gian77 avatar Jul 18 '22 13:07 Gian77

I think if the reason is indeed what I am suspecting, I want to say that the program should actually continue as is but give a column of zeros as output. When it throws such an error, it triggers a cascade of other errors in the pipelines whereas there actually is not much to be checked or fixed.

tuncK avatar Jul 18 '22 13:07 tuncK

@tunck

I ran it on a genome with 39 contigs and it worked well at -l G or -l S. I then ran it on a genome with 8 contigs and it gave me the same error I got on the genome with 2 contigs. I get it, but I still do not understand why that threshold of 10 matters for the tool, though... G.

Gian77 avatar Aug 02 '22 21:08 Gian77