kallisto
No reads pseudoalign when reads are the same length as the transcripts in the index, or when reads are of length 3?
I have put a small example in [this gist](https://gist.github.com/winni2k/64efa2e354a70a72d8a70a5ac373cc49).
When I run `run.sh`, I get the following output:
```
0 reads pseudoalign
[build] loading fasta file transcripts.fa
[build] k-mer length: 3
[build] counting k-mers ... done.
[build] building target de Bruijn graph ... done
[build] creating equivalence classes ... done
[build] target de Bruijn graph has 3 contigs and contains 3 k-mers
[quant] fragment length distribution is truncated gaussian with mean = 4, sd = 0.1
[index] k-mer length: 3
[index] number of targets: 2
[index] number of k-mers: 3
[index] number of equivalence classes: 3
[quant] running in single-end mode
[quant] will process file 1: single.fq
[quant] finding pseudoalignments for the reads ... done
[quant] processed 8 reads, 0 reads pseudoaligned
[~warn] no reads pseudoaligned.
[ em] quantifying the abundances ... done
[ em] the Expectation-Maximization algorithm ran for 52 rounds
0 reads pseudoalign in this case as well
[quant] fragment length distribution is truncated gaussian with mean = 4, sd = 0.1
[index] k-mer length: 3
[index] number of targets: 2
[index] number of k-mers: 3
[index] number of equivalence classes: 3
[quant] running in single-end mode
[quant] will process file 1: single_v3.fq
[quant] finding pseudoalignments for the reads ... done
[quant] processed 8 reads, 0 reads pseudoaligned
[~warn] no reads pseudoaligned.
[ em] quantifying the abundances ... done
[ em] the Expectation-Maximization algorithm ran for 52 rounds
8 reads pseudoalign
[build] loading fasta file transcripts_v2.fa
[build] k-mer length: 3
[build] counting k-mers ... done.
[build] building target de Bruijn graph ... done
[build] creating equivalence classes ... done
[build] target de Bruijn graph has 3 contigs and contains 5 k-mers
[quant] fragment length distribution is truncated gaussian with mean = 4, sd = 0.1
[index] k-mer length: 3
[index] number of targets: 2
[index] number of k-mers: 5
[index] number of equivalence classes: 3
[quant] running in single-end mode
[quant] will process file 1: single.fq
[quant] finding pseudoalignments for the reads ... done
[quant] processed 8 reads, 8 reads pseudoaligned
[ em] quantifying the abundances ... done
[ em] the Expectation-Maximization algorithm ran for 52 rounds
```
Summary:
- In the first case I have reads of length 4 and transcripts of length 4
- In the second case I have reads of length 3 and transcripts of length 4
- In the third case I have reads of length 4 and transcripts of length 5
I don't understand why the reads don't pseudoalign in the first two cases. Is this a bug or a feature?
I don't know the internals well enough to say what causes this, but it has something to do with how the fragment length is used to constrain possible alignments. You can set the fragment length to 1 to remove the constraint. I checked, and with that setting all reads align in each of the cases you provided.
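To make the fragment-length idea concrete, here is a minimal sketch of one rule that would be consistent with the three runs reported above. It is an assumption for illustration only, not kallisto's actual code: the hypothetical `fragment_fits` function supposes that a read is kept only when the mean fragment length is strictly smaller than the transcript length.

```python
# Hypothetical rule, NOT taken from kallisto's source: assume a read can only
# pseudoalign to a transcript that is strictly longer than the mean fragment length.
def fragment_fits(transcript_len: int, fragment_len: float) -> bool:
    return fragment_len < transcript_len


# (read length, transcript length, reads pseudoaligned out of 8), from the report.
cases = [(4, 4, 0), (3, 4, 0), (4, 5, 8)]

for read_len, transcript_len, observed in cases:
    fits_mean_4 = fragment_fits(transcript_len, 4)  # mean fragment length from the logs
    fits_mean_1 = fragment_fits(transcript_len, 1)  # the suggested workaround
    print(read_len, transcript_len, observed, fits_mean_4, fits_mean_1)
```

Under this assumed rule, only the third case passes with a mean fragment length of 4, and all three pass with a fragment length of 1, which matches the behaviour described in this thread. Whether kallisto applies exactly this comparison would need to be checked against its source.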
I think the k-mer length must be smaller than the read length?
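On the k-mer length question, a quick way to check is to count how many k-mers a read of a given length contains. The sketch below uses made-up sequences (the real ones are in the gist) and a hypothetical helper `kmers`; it only illustrates the arithmetic that a sequence of length L contains L - k + 1 k-mers.

```python
def kmers(seq: str, k: int) -> list[str]:
    """Return every length-k substring of seq, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


k = 3
# Hypothetical sequences with the read lengths used in the report.
print(len(kmers("ACG", k)))   # read of length 3
print(len(kmers("ACGT", k)))  # read of length 4
print(len(kmers("AC", k)))    # read shorter than k
```

A read whose length equals k still contains one k-mer, and only a read shorter than k contains none. Since the first case (reads of length 4, k = 3) also fails while the third case with the same reads succeeds, read length relative to k does not seem to be the whole explanation.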