SAM output is erratic
When I run chromap with the --SAM option, some reads are output as single alignments rather than paired alignments (note that the read1 and read2 counts below differ):
14751591 + 0 in total (QC-passed reads + QC-failed reads)
14751591 + 0 primary
0 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
0 + 0 primary duplicates
14751591 + 0 mapped (100.00% : N/A)
14751591 + 0 primary mapped (100.00% : N/A)
14751591 + 0 paired in sequencing
7375803 + 0 read1
7375788 + 0 read2
14751591 + 0 properly paired (100.00% : N/A)
14751591 + 0 with itself and mate mapped
0 + 0 singletons (0.00% : N/A)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)
However, when I extract these reads and realign them, all of them align as pairs.
I also found that the singly aligned reads are different in each test run.
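One way to list the affected reads is to count, per read name, how many read1 and read2 records appear in the SAM file. The following is a minimal sketch rather than anything chromap provides: the function name unpaired_read_names is hypothetical, and it assumes the output uses the standard SAM FLAG bits (0x40 for read1, 0x80 for read2, 0x100 and 0x800 for secondary and supplementary records).

```python
from collections import defaultdict


def unpaired_read_names(sam_lines):
    """Return sorted read names that do not have exactly one read1 and one read2 record.

    Hypothetical helper: takes any iterable of SAM text lines (for example an open file).
    """
    counts = defaultdict(lambda: [0, 0])  # read name -> [read1 count, read2 count]
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        fields = line.rstrip("\n").split("\t")
        name, flag = fields[0], int(fields[1])
        if flag & 0x900:
            continue  # skip secondary (0x100) and supplementary (0x800) records
        per_read = counts[name]  # registers the name even if neither mate bit is set
        if flag & 0x40:
            per_read[0] += 1
        elif flag & 0x80:
            per_read[1] += 1
    return sorted(name for name, (r1, r2) in counts.items() if (r1, r2) != (1, 1))
```

Calling it as unpaired_read_names(open("test.sam")) on two separate runs and comparing the results as sets would show whether the affected reads really change from run to run.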
My command:
chromap --preset atac --remove-pcr-duplicates-at-cell-level --Tn5-shift -t 10 -r genome.fa -x mm10chromapindex -1 8kmousecortex_S1_L001_R1_001.fastq.gz -2 8kmousecortex_S1_L001_R3_001.fastq.gz -b 8kmousecortex_S1_L001_R2_001.fastq.gz --barcode-whitelist barcodelist -o test.sam --SAM
Thank you for providing the example. This is related to #184, and I'm looking into the issue now. Which version of Chromap did you use? If you are using 0.3.0, could you please test the previous version, in case we introduced a bug? Could you also paste the read sequences here so that I can test them specifically? Thank you!
My chromap version is 0.2.4.
chromap -v
0.2.4-r467
Here are some of the affected reads. R1:
@A00836:1001:H5M52DSX3:3:1101:5285:3912 1:N:0:GACTTCCT
CCCCCACAGGAACTCATCTTGCCCTAGTTTTCGTTTATCTCTTCAAAAAG
+
FFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFF
@A00836:1001:H5M52DSX3:3:1103:27832:1877 1:N:0:GACTTCCT
GTAATACCAATAATAATTGGAGGCTTTGGAAACTGACTTGTCCCACTAAT
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
@A00836:1001:H5M52DSX3:3:1108:27932:8563 1:N:0:GACTTCCT
ACATTATCTTGTGCGAGCGAGCGTGCGCGTGTACACACACACACACACAC
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
@A00836:1001:H5M52DSX3:3:1110:17345:1501 1:N:0:GACTTCCT
GTGTGCATTTCTCATTTTTAAGTTTTTTAATGATTTCGTCATTTTTCAAG
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
@A00836:1001:H5M52DSX3:3:1111:32886:30796 1:N:0:GACTTCCT
AGCCTCTTGCATCATCATGCTCTGCATCTGTCCTGGCTGAGTGTCTTCAT
+
FFFF,FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
@A00836:1001:H5M52DSX3:3:1111:20392:32362 1:N:0:GACTTCCT
ACAGAAAGGTCTTCCTGAGGCTGGCAGCAGATTCTCCCAGTATTCTCGTC
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
@A00836:1001:H5M52DSX3:3:1113:24071:32690 1:N:0:GACTTCCT
CTGTACGACTTGGAATATGGCAAGAAAACTGAAAATCATGGAAAATGAGA
+
FFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
@A00836:1001:H5M52DSX3:3:1116:10673:6104 1:N:0:GACTTCCT
GTCCTTCAGTGTGCATTTCTCATTTTTCACGTTTTTTAGTGATTTCGTCA
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
@A00836:1001:H5M52DSX3:3:1119:3613:20650 1:N:0:GACTTCCT
TTTCTACACAGCATTCAACTGCGACCAATGACATGAAAAATCATCGTTGT
+
FFFFF:FFFFF,F:,FFFFF,FFFF:FF::FFF,FF::F,FFFFFF,,FF
@A00836:1001:H5M52DSX3:3:1121:9353:31313 1:N:0:GACTTCCT
GGTTTACCCTCTGACTGTTCCATATTCCATACCTCCTCCCGGGGGGCGCC
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFF:F:FFFFFF,,F,,:,,,,,,,
R2
@A00836:1001:H5M52DSX3:3:1101:5285:3912 2:N:0:GACTTCCT
ATTGAGAGAGTACGTT
+
:FFFFFFFFFFFFFFF
@A00836:1001:H5M52DSX3:3:1103:27832:1877 2:N:0:GACTTCCT
ATTCCAACTCTGCGGT
+
FFFFFFFFFFFFFFFF
@A00836:1001:H5M52DSX3:3:1108:27932:8563 2:N:0:GACTTCCT
TTGTTTGTGCACTCAA
+
,:FFFFFFFFFFFFFF
@A00836:1001:H5M52DSX3:3:1110:17345:1501 2:N:0:GACTTCCT
TTCTCTCTGAAATGCA
+
FFFFFFFFFFFFFFFF
@A00836:1001:H5M52DSX3:3:1111:32886:30796 2:N:0:GACTTCCT
GGAACTACTATCAGCT
+
FFF:FFFFFFFFFFFF
@A00836:1001:H5M52DSX3:3:1111:20392:32362 2:N:0:GACTTCCT
ATTTCGACTATCCTAG
+
FFFFFFFFFFFFFFFF
@A00836:1001:H5M52DSX3:3:1113:24071:32690 2:N:0:GACTTCCT
AATGGCTACCGCACTG
+
FFFFFFFFFFFFFFFF
@A00836:1001:H5M52DSX3:3:1116:10673:6104 2:N:0:GACTTCCT
TTTACCATGCAAGTAA
+
FFFFFFFFFFFFFFFF
@A00836:1001:H5M52DSX3:3:1119:3613:20650 2:N:0:GACTTCCT
AAAGCCTTGAAGACAT
+
::F,F::,,F,FFF:F
@A00836:1001:H5M52DSX3:3:1121:9353:31313 2:N:0:GACTTCCT
TCGTATACTACGTAAA
+
,FF:FF:FFFFFFFF:
R3
@A00836:1001:H5M52DSX3:3:1101:5285:3912 3:N:0:GACTTCCT
CACACACACACACACACACACACACACACACACACTTTATCCATTAACA
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF::FFFFFFFFF,:F
@A00836:1001:H5M52DSX3:3:1103:27832:1877 3:N:0:GACTTCCT
TGGTAGGAGTCAAAAACTTATATTATTTATTCGTGGGAATGCTATATCT
+
FFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
@A00836:1001:H5M52DSX3:3:1108:27932:8563 3:N:0:GACTTCCT
GTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTACACGCGCACGCTCG
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFF::F:FFFFFF,
@A00836:1001:H5M52DSX3:3:1110:17345:1501 3:N:0:GACTTCCT
CTGTAGGACGTGGAATATGGCAAGAAAAATTGAAAATCATGGAAAATGA
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
@A00836:1001:H5M52DSX3:3:1111:32886:30796 3:N:0:GACTTCCT
CTCACACACACACACACACACACACACACACACACACACACACACTAGA
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:,,F,F
@A00836:1001:H5M52DSX3:3:1111:20392:32362 3:N:0:GACTTCCT
GAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGGAGAGAGAGA
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFF
@A00836:1001:H5M52DSX3:3:1113:24071:32690 3:N:0:GACTTCCT
GTCCTTCAGTGGGCATTTCTCATTTTTCACATTTTTTAGTGATTTTGTC
+
FFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
@A00836:1001:H5M52DSX3:3:1116:10673:6104 3:N:0:GACTTCCT
CTGTAGGACGTGGAATATGGCAAGAAAACTGGGAATAATGGAAAATGAG
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
@A00836:1001:H5M52DSX3:3:1119:3613:20650 3:N:0:GACTTCCT
GTACTATTAGGCAGACTCCTAGAAGGGACCCAAAGTTTCATAATGATGA
+
F,FFFF,:,::FF,FFFFF,FF:FFF:F:FFFFF,F,:FFF,::F,F:F
@A00836:1001:H5M52DSX3:3:1121:9353:31313 3:N:0:GACTTCCT
GCTCTATGGTAAAGCTTGAGGTCAAGGATGGTGATTCCCTCAGCCATTC
+
,FFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
You can download the data here: https://www.10xgenomics.com/datasets/8k-adult-mouse-cortex-cells-atac-v2-chromium-x-2-standard
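To test only a handful of reads from the full dataset, the matching records can be pulled out of each FASTQ file (R1, R2 and R3) by read name. This is a rough sketch under a few assumptions: the function names are made up for illustration, every record spans exactly four lines, and the read ID is the header text between "@" and the first whitespace.

```python
import gzip


def read_name(header_line):
    """Return the read ID: the text after '@' up to the first whitespace."""
    return header_line[1:].split()[0]


def extract_fastq_records(in_path, out_path, wanted_names):
    """Copy the 4-line FASTQ records whose read ID is in wanted_names.

    Hypothetical helper: reads plain or gzip-compressed FASTQ, writes plain FASTQ,
    and returns the number of records written.
    """
    opener = gzip.open if in_path.endswith(".gz") else open
    kept = 0
    with opener(in_path, "rt") as src, open(out_path, "w") as dst:
        while True:
            record = [src.readline() for _ in range(4)]
            if not record[0]:
                break  # end of file
            if read_name(record[0]) in wanted_names:
                dst.writelines(record)
                kept += 1
    return kept
```

Running it once per input file with the same set of names keeps the R1, R2 and R3 subsets in the same order, provided the original files are in matching order.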
Thank you! I can reproduce this error. I think it is related to the --low-mem mode, which is implicitly activated by the --preset atac option. It might take some time to fix, but I'm actively working on it.
I think I've found the issue and pushed the fix to the li_dev11 branch. Could you please give it a try? If it works, I'll merge it into the master branch and draft a new release. Thank you!
Sorry, my chromap run failed on the li_dev11 branch.
What command did you run?
The same as before:
chromap --preset atac --remove-pcr-duplicates-at-cell-level --Tn5-shift -t 10 -r genome.fa -x mm10chromapindex -1 8kmousecortex_S1_L001_R1_001.fastq.gz -2 8kmousecortex_S1_L001_R3_001.fastq.gz -b 8kmousecortex_S1_L001_R2_001.fastq.gz --barcode-whitelist barcodelist -o test.sam --SAM
I think your command is somehow running the "chromap.1" file, which is the readme/manual file. Have you run "make clean; make" to recompile chromap?
Oh, I executed "chromap.1" by mistake. OK, let me run "make" and rerun.
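A quick sanity check before rerunning is to confirm that the file being invoked is a compiled binary rather than the manual page. The sketch below assumes a Linux build, where compiled executables start with the ELF magic bytes; the function name looks_like_elf_binary is hypothetical.

```python
ELF_MAGIC = b"\x7fELF"  # first four bytes of an ELF file (Linux executables)


def looks_like_elf_binary(path):
    """Return True if the file at path starts with the ELF magic bytes."""
    with open(path, "rb") as handle:
        return handle.read(4) == ELF_MAGIC
```

Under that assumption, looks_like_elf_binary("chromap") should be true after a successful "make", while a text file such as "chromap.1" would return false.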