SAM output is erratic

Open fengchuiguo1994 opened this issue 10 months ago • 9 comments

When I run chromap with the --SAM option, some reads are output as single alignments rather than as aligned pairs.

14751591 + 0 in total (QC-passed reads + QC-failed reads)
14751591 + 0 primary
0 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
0 + 0 primary duplicates
14751591 + 0 mapped (100.00% : N/A)
14751591 + 0 primary mapped (100.00% : N/A)
14751591 + 0 paired in sequencing
7375803 + 0 read1
7375788 + 0 read2
14751591 + 0 properly paired (100.00% : N/A)
14751591 + 0 with itself and mate mapped
0 + 0 singletons (0.00% : N/A)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)
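The summary above lists 7375803 read1 records but only 7375788 read2 records, a difference of 15, even though every record is counted as properly paired. A minimal Python sketch for listing the read names that are missing a mate, assuming a standard tab-separated SAM file in which the FLAG bits 0x40 and 0x80 mark the first and last read of a pair. The function name `find_unmated_reads` is hypothetical and is not part of chromap or samtools.

```python
from collections import defaultdict

READ1 = 0x40  # SAM FLAG bit: first segment in the template
READ2 = 0x80  # SAM FLAG bit: last segment in the template


def find_unmated_reads(sam_lines):
    """Return {read_name: mate_present} for reads with only one mate in the output.

    mate_present is "read1" or "read2", naming the mate that DOES have a record.
    """
    seen = defaultdict(int)  # read name -> OR of the READ1/READ2 bits observed
    for line in sam_lines:
        if line.startswith("@"):  # header line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # blank or malformed line
            continue
        name, flag = fields[0], int(fields[1])
        seen[name] |= flag & (READ1 | READ2)

    unmated = {}
    for name, bits in seen.items():
        if bits == READ1:
            unmated[name] = "read1"
        elif bits == READ2:
            unmated[name] = "read2"
    return unmated
```

Passing an open handle to the SAM file, for example `find_unmated_reads(open("test.sam"))`, yields the names to extract from the FASTQ files. Running it on the output of two separate runs and comparing the resulting key sets would show whether the affected reads change between runs.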

However, when I extract these reads and realign them, all of them align as pairs.
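Pulling a handful of reads out of the original FASTQ files for a small rerun can be sketched as follows. This assumes plain four-line FASTQ records (no wrapped sequence lines) and that the read, mate, and barcode files share the same read name before the first space in the header. `extract_fastq_records` and `extract_from_gzip` are hypothetical helpers, not chromap utilities.

```python
import gzip


def extract_fastq_records(lines, wanted_names):
    """Yield each 4-line FASTQ record whose read name is in wanted_names."""
    wanted = set(wanted_names)
    record = []
    for line in lines:
        record.append(line.rstrip("\n"))
        if len(record) == 4:
            # Header looks like "@<name> <comment>"; keep only <name>.
            name = record[0][1:].split(" ", 1)[0]
            if name in wanted:
                yield "\n".join(record) + "\n"
            record = []


def extract_from_gzip(in_path, out_path, wanted_names):
    """Filter one gzipped FASTQ file into a new gzipped FASTQ file."""
    with gzip.open(in_path, "rt") as src, gzip.open(out_path, "wt") as dst:
        for rec in extract_fastq_records(src, wanted_names):
            dst.write(rec)
```

Calling `extract_from_gzip` once per input file with the same set of names keeps the three subset files in the same record order as the originals, so they can be passed to the same chromap command.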

I also found that the set of singly aligned reads differs from run to run.

My command:

chromap --preset atac --remove-pcr-duplicates-at-cell-level --Tn5-shift -t 10 -r genome.fa -x mm10chromapindex -1 8kmousecortex_S1_L001_R1_001.fastq.gz -2 8kmousecortex_S1_L001_R3_001.fastq.gz -b 8kmousecortex_S1_L001_R2_001.fastq.gz --barcode-whitelist barcodelist -o test.sam --SAM

Image

fengchuiguo1994, May 28 '25 02:05

Thank you for providing the example. This is related to #184, and I'm looking into it now. Which version of Chromap did you use? If you are using 0.3.0, could you please test the previous version, in case we introduced a bug? Could you also paste the read sequences here so that I can test them specifically? Thank you!

mourisl, May 28 '25 03:05

My chromap version is 0.2.4.

chromap -v
0.2.4-r467

Here are some of the reads. R1:

@A00836:1001:H5M52DSX3:3:1101:5285:3912 1:N:0:GACTTCCT
CCCCCACAGGAACTCATCTTGCCCTAGTTTTCGTTTATCTCTTCAAAAAG
+
FFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFF
@A00836:1001:H5M52DSX3:3:1103:27832:1877 1:N:0:GACTTCCT
GTAATACCAATAATAATTGGAGGCTTTGGAAACTGACTTGTCCCACTAAT
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
@A00836:1001:H5M52DSX3:3:1108:27932:8563 1:N:0:GACTTCCT
ACATTATCTTGTGCGAGCGAGCGTGCGCGTGTACACACACACACACACAC
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
@A00836:1001:H5M52DSX3:3:1110:17345:1501 1:N:0:GACTTCCT
GTGTGCATTTCTCATTTTTAAGTTTTTTAATGATTTCGTCATTTTTCAAG
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
@A00836:1001:H5M52DSX3:3:1111:32886:30796 1:N:0:GACTTCCT
AGCCTCTTGCATCATCATGCTCTGCATCTGTCCTGGCTGAGTGTCTTCAT
+
FFFF,FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
@A00836:1001:H5M52DSX3:3:1111:20392:32362 1:N:0:GACTTCCT
ACAGAAAGGTCTTCCTGAGGCTGGCAGCAGATTCTCCCAGTATTCTCGTC
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
@A00836:1001:H5M52DSX3:3:1113:24071:32690 1:N:0:GACTTCCT
CTGTACGACTTGGAATATGGCAAGAAAACTGAAAATCATGGAAAATGAGA
+
FFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
@A00836:1001:H5M52DSX3:3:1116:10673:6104 1:N:0:GACTTCCT
GTCCTTCAGTGTGCATTTCTCATTTTTCACGTTTTTTAGTGATTTCGTCA
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
@A00836:1001:H5M52DSX3:3:1119:3613:20650 1:N:0:GACTTCCT
TTTCTACACAGCATTCAACTGCGACCAATGACATGAAAAATCATCGTTGT
+
FFFFF:FFFFF,F:,FFFFF,FFFF:FF::FFF,FF::F,FFFFFF,,FF
@A00836:1001:H5M52DSX3:3:1121:9353:31313 1:N:0:GACTTCCT
GGTTTACCCTCTGACTGTTCCATATTCCATACCTCCTCCCGGGGGGCGCC
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFF:F:FFFFFF,,F,,:,,,,,,,

R2

@A00836:1001:H5M52DSX3:3:1101:5285:3912 2:N:0:GACTTCCT
ATTGAGAGAGTACGTT
+
:FFFFFFFFFFFFFFF
@A00836:1001:H5M52DSX3:3:1103:27832:1877 2:N:0:GACTTCCT
ATTCCAACTCTGCGGT
+
FFFFFFFFFFFFFFFF
@A00836:1001:H5M52DSX3:3:1108:27932:8563 2:N:0:GACTTCCT
TTGTTTGTGCACTCAA
+
,:FFFFFFFFFFFFFF
@A00836:1001:H5M52DSX3:3:1110:17345:1501 2:N:0:GACTTCCT
TTCTCTCTGAAATGCA
+
FFFFFFFFFFFFFFFF
@A00836:1001:H5M52DSX3:3:1111:32886:30796 2:N:0:GACTTCCT
GGAACTACTATCAGCT
+
FFF:FFFFFFFFFFFF
@A00836:1001:H5M52DSX3:3:1111:20392:32362 2:N:0:GACTTCCT
ATTTCGACTATCCTAG
+
FFFFFFFFFFFFFFFF
@A00836:1001:H5M52DSX3:3:1113:24071:32690 2:N:0:GACTTCCT
AATGGCTACCGCACTG
+
FFFFFFFFFFFFFFFF
@A00836:1001:H5M52DSX3:3:1116:10673:6104 2:N:0:GACTTCCT
TTTACCATGCAAGTAA
+
FFFFFFFFFFFFFFFF
@A00836:1001:H5M52DSX3:3:1119:3613:20650 2:N:0:GACTTCCT
AAAGCCTTGAAGACAT
+
::F,F::,,F,FFF:F
@A00836:1001:H5M52DSX3:3:1121:9353:31313 2:N:0:GACTTCCT
TCGTATACTACGTAAA
+
,FF:FF:FFFFFFFF:

R3

@A00836:1001:H5M52DSX3:3:1101:5285:3912 3:N:0:GACTTCCT
CACACACACACACACACACACACACACACACACACTTTATCCATTAACA
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF::FFFFFFFFF,:F
@A00836:1001:H5M52DSX3:3:1103:27832:1877 3:N:0:GACTTCCT
TGGTAGGAGTCAAAAACTTATATTATTTATTCGTGGGAATGCTATATCT
+
FFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
@A00836:1001:H5M52DSX3:3:1108:27932:8563 3:N:0:GACTTCCT
GTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTACACGCGCACGCTCG
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFF::F:FFFFFF,
@A00836:1001:H5M52DSX3:3:1110:17345:1501 3:N:0:GACTTCCT
CTGTAGGACGTGGAATATGGCAAGAAAAATTGAAAATCATGGAAAATGA
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
@A00836:1001:H5M52DSX3:3:1111:32886:30796 3:N:0:GACTTCCT
CTCACACACACACACACACACACACACACACACACACACACACACTAGA
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:,,F,F
@A00836:1001:H5M52DSX3:3:1111:20392:32362 3:N:0:GACTTCCT
GAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGGAGAGAGAGA
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFF
@A00836:1001:H5M52DSX3:3:1113:24071:32690 3:N:0:GACTTCCT
GTCCTTCAGTGGGCATTTCTCATTTTTCACATTTTTTAGTGATTTTGTC
+
FFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
@A00836:1001:H5M52DSX3:3:1116:10673:6104 3:N:0:GACTTCCT
CTGTAGGACGTGGAATATGGCAAGAAAACTGGGAATAATGGAAAATGAG
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
@A00836:1001:H5M52DSX3:3:1119:3613:20650 3:N:0:GACTTCCT
GTACTATTAGGCAGACTCCTAGAAGGGACCCAAAGTTTCATAATGATGA
+
F,FFFF,:,::FF,FFFFF,FF:FFF:F:FFFFF,F,:FFF,::F,F:F
@A00836:1001:H5M52DSX3:3:1121:9353:31313 3:N:0:GACTTCCT
GCTCTATGGTAAAGCTTGAGGTCAAGGATGGTGATTCCCTCAGCCATTC
+
,FFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFF

You can download the data here: https://www.10xgenomics.com/datasets/8k-adult-mouse-cortex-cells-atac-v2-chromium-x-2-standard

fengchuiguo1994, May 29 '25 14:05

Thank you! I can reproduce this error. I think it is related to the --low-mem mode (implicitly activated by the --preset atac option). It might take some time to fix, but I'm actively working on it.

mourisl, May 29 '25 17:05

I think I've found the issue and pushed a fix to the li_dev11 branch. Could you please give it a try? If it works, I'll merge it into the master branch and draft a new release. Thank you!

mourisl, Jun 01 '25 05:06

Sorry, the chromap run failed with the li_dev11 branch.

Image

Image

Image

fengchuiguo1994, Jun 03 '25 08:06

What was your running command?

mourisl, Jun 03 '25 14:06

The same as before:

chromap --preset atac --remove-pcr-duplicates-at-cell-level --Tn5-shift -t 10 -r genome.fa -x mm10chromapindex -1 8kmousecortex_S1_L001_R1_001.fastq.gz -2 8kmousecortex_S1_L001_R3_001.fastq.gz -b 8kmousecortex_S1_L001_R2_001.fastq.gz --barcode-whitelist barcodelist -o test.sam --SAM

fengchuiguo1994, Jun 06 '25 13:06

I think your command is somehow running the "chromap.1" file, which is the readme/manual file. Have you run "make clean; make" to recompile chromap?

mourisl, Jun 06 '25 14:06

Oh, I executed "chromap.1". OK, let me run "make" and rerun.

fengchuiguo1994, Jun 07 '25 02:06
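A mix-up like this can be caught by checking what kind of file is about to be executed. A minimal sketch, assuming a Linux build (so the compiled chromap is an ELF executable) and assuming the manual page is roff source, which usually begins with a `.` or `'` control line. Both function names are hypothetical.

```python
def classify_header(head):
    """Classify a file from its first bytes: "elf-binary", "man-page", or "unknown"."""
    if head[:4] == b"\x7fELF":  # ELF magic number (Linux executables)
        return "elf-binary"
    if head[:1] in (b".", b"'"):  # roff control line, e.g. ".TH" or a comment
        return "man-page"
    return "unknown"


def classify_file(path):
    """Read the first four bytes of path and classify them."""
    with open(path, "rb") as handle:
        return classify_header(handle.read(4))
```

For example, `classify_file("./chromap")` would be expected to return "elf-binary" after a successful "make" on Linux, while "chromap.1" would most likely be reported as "man-page".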