
Count matrix with and without lamanno workflow

Open njbernstein opened this issue 2 years ago • 3 comments

Hi there,

If I run the same data through kb with and without `--workflow lamanno` (and, of course, the necessary associated files), should I get back exactly the same count matrix? By count matrix I mean X in the generated .h5ad object.

Best, Nick

njbernstein avatar Aug 05 '21 23:08 njbernstein

No, you will not be getting the same exact count matrix. When you do pseudoalignment with introns included, your numbers will be different.

Yenaled avatar Aug 19 '21 07:08 Yenaled

@Yenaled why is that?

njbernstein avatar Sep 07 '21 21:09 njbernstein

Your index is different. Some reads that couldn't map anywhere with a cDNA-only index will now map somewhere when you include introns. Some reads that mapped to a transcript with a cDNA-only index may no longer map anywhere: some k-mers in the read that originally didn't map may now map to an intronic region of some other transcript, which makes the intersection of the k-compatibility classes associated with the read empty. Thus, your read counts will be different (not drastically so, but they will be different).
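To make the mechanism concrete, here is a toy sketch of it in Python. This is not kallisto's actual implementation: the sequences, the target names `T1` and `I2`, the value of `K`, and the helper functions are all made up for illustration, and strandedness is ignored. It only assumes what is described above, namely that k-mers absent from the index are skipped and the remaining k-mers' target sets are intersected.

```python
# Toy pseudoalignment sketch (hypothetical names and sequences, not kallisto code).

def kmers(seq, k):
    """Return all overlapping k-mers of a sequence."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

def build_index(targets, k):
    """Map each k-mer to the set of target names that contain it."""
    index = {}
    for name, seq in targets.items():
        for km in kmers(seq, k):
            index.setdefault(km, set()).add(name)
    return index

def pseudoalign(read, index, k):
    """Intersect the target sets of every read k-mer found in the index.

    K-mers missing from the index are skipped. An empty result means
    the read is treated as unmapped.
    """
    compatible = None
    for km in kmers(read, k):
        hits = index.get(km)
        if hits is None:
            continue  # k-mer not in the index: ignore it
        compatible = set(hits) if compatible is None else compatible & hits
    return compatible or set()

K = 4
cdna_targets = {"T1": "AAAACCCC"}                       # made-up transcript
intron_targets = {**cdna_targets, "I2": "TTACCGGTT"}    # plus a made-up intron

cdna_index = build_index(cdna_targets, K)
intron_index = build_index(intron_targets, K)

read_a = "AAAACCGG"   # starts like T1, ends with k-mers found only in I2
read_b = "TACCGGTT"   # lies entirely inside the intron I2

for name, read in [("read_a", read_a), ("read_b", read_b)]:
    print(name,
          "cDNA-only:", sorted(pseudoalign(read, cdna_index, K)),
          "with introns:", sorted(pseudoalign(read, intron_index, K)))
```

With the cDNA-only index, `read_a` is compatible with `T1` because its last two k-mers are simply not in the index. Once `I2` is added, those k-mers hit `I2` only, so the intersection with `T1` becomes empty and the read drops out. `read_b` goes the other way: unmapped with cDNA only, mapped to `I2` with introns included.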

Yenaled avatar Sep 07 '21 22:09 Yenaled