Classifying kmers error with bayesTyper 1.4.1 and 1.5

Open karenfeng opened this issue 2 years ago • 1 comment

I'm having problems during the kmer classification step in bayesTyper genotype with both versions 1.4.1 and 1.5:

[19/10/2022 21:21:47] You are using BayesTyper (v1.4.1)

[19/10/2022 21:21:47] Seeding pseudo-random number generator with 1666239707 ...
[19/10/2022 21:21:47] Setting the kmer size to 55 ...

[19/10/2022 21:21:47] Parsed information for 1 sample(s)

[19/10/2022 21:21:47] Parsing reference genome ...
[19/10/2022 21:21:55] Parsed 66 reference genome chromosomes(s) (3095248640 nucleotides)

[19/10/2022 21:21:55] Parsing decoy sequence(s) ...
[19/10/2022 21:21:55] Parsed 129 decoy sequence(s) (4673901 nucleotides)

[19/10/2022 21:22:03] Maximum resident set size: 3.30645 Gb

[19/10/2022 21:22:03] Parsing variant clusters ...
[19/10/2022 21:22:05] Parsed 38805 variant clusters (68140 variants)

[19/10/2022 21:22:07] Parsing parameter kmers ...
[19/10/2022 21:22:09] Parsed 1000000 kmers

[19/10/2022 21:22:09] Maximum resident set size: 5.97004 Gb

[19/10/2022 21:22:09] Counting kmers in variant cluster paths ...
[19/10/2022 21:22:54] Counting kmers in inter-cluster regions and decoy sequence(s) ...

[19/10/2022 22:03:33] Parsing KMC table containing 14107141124 kmers for sample <REDACTED> ...

[20/10/2022 01:48:25] Classifying kmers in variant cluster paths ...

bayesTyper: /isdata/kroghgrp/jasi/bayesTyper/code/releases/v1.4.1_static/BayesTyper-1.4.1/src/bayesTyper/KmerHash.cpp:269: std::vector<std::vector<std::vector<KmerStats> > > ObservedKmerCountsHash<sample_bin>::calculateKmerStats(const std::vector<Sample>&) [with unsigned char sample_bin = 3u]: Assertion `!(*hash_it).second.isParameter()' failed.

[19/10/2022 10:01:54] You are using BayesTyper (v1.5)

[19/10/2022 10:01:54] Seeding pseudo-random number generator with 1666198914 ...
[19/10/2022 10:01:54] Setting the kmer size to 55 ...

[19/10/2022 10:01:54] Parsed information for 1 sample(s)

[19/10/2022 10:01:54] Parsing reference genome ...
[19/10/2022 10:02:01] Parsed 66 reference genome chromosomes(s) (3095248640 nucleotides)

[19/10/2022 10:02:01] Parsing decoy sequence(s) ...
[19/10/2022 10:02:01] Parsed 129 decoy sequence(s) (4673901 nucleotides)

[19/10/2022 10:02:08] Maximum resident set size: 3.30645 Gb


[19/10/2022 10:02:08] Parsing variant clusters ...
[19/10/2022 10:02:10] Parsed 38805 variant clusters (68140 variants)

[19/10/2022 10:02:11] Parsing parameter kmers ...
[19/10/2022 10:02:12] Parsed 1000000 kmers

[19/10/2022 10:02:12] Maximum resident set size: 5.97024 Gb

[19/10/2022 10:02:12] Counting kmers in variant cluster paths ...
[19/10/2022 10:02:41] Counting kmers in inter-cluster regions and decoy sequence(s) ...

[19/10/2022 10:36:05] Parsing KMC table containing 14107141124 kmers for sample <REDACTED> ...

[19/10/2022 13:39:02] Classifying kmers in variant cluster paths ...

bayesTyper: /isdata/kroghgrp/jasi/bayesTyper/code/releases/v1.5_static/BayesTyper-1.5/src/bayesTyper/KmerHash.cpp:278: std::vector<std::vector<std::vector<KmerStats> > > ObservedKmerCountsHash<sample_bin>::calculateKmerStats(const std::vector<Sample>&) [with unsigned char sample_bin = 3u]: Assertion `!(*hash_it).second.isParameter()' failed.

This looks related to https://github.com/bioinformatics-centre/BayesTyper/issues/13, but that appeared to be resolved in 1.4.1. Do you have any advice?

karenfeng, Oct 20 '22 15:10

Hi, thanks for writing. Would you be able to share all the command lines you used, including the kmer counting and clustering steps?

jonassibbesen, Oct 31 '22 11:10