Adding option to output M instead of =/X CIGARs

Open danessel opened this issue 2 years ago • 4 comments

Hi, when I try to view the mapping in IGV, the strobealign BAM files raise an error. This is caused by the CIGAR strings using = instead of M. Other post-processing tools may have the same issue.

Apr 05 '22 05:04 danessel

Hi @danessel,

Thanks for reporting. I could certainly add a feature to output either X/= cigars or M.

In the meantime, I think samtools fixmate can do this conversion.

(This is based on the command run here, which results in strobealign's output having CIGAR strings with M. Is this correct, @TDDB-limagrain?)
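To illustrate what such a conversion involves, here is a minimal Python sketch. It is not part of strobealign or samtools, and the function name `collapse_match_ops` is hypothetical. It assumes the CIGAR string has already been extracted from a SAM record, rewrites `=` and `X` as `M`, and merges adjacent runs so that the result stays compact.

```python
import re

# One CIGAR operation: a run length followed by an operator character.
_CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def collapse_match_ops(cigar: str) -> str:
    """Rewrite '=' and 'X' as 'M' and merge adjacent runs of the same operator.

    Hypothetical helper for illustration only.
    """
    if cigar == "*":  # SAM uses '*' when no CIGAR is available
        return cigar

    merged = []  # list of [length, operator] pairs
    pos = 0
    for match in _CIGAR_OP.finditer(cigar):
        if match.start() != pos:
            raise ValueError(f"malformed CIGAR string: {cigar!r}")
        pos = match.end()
        length, op = int(match.group(1)), match.group(2)
        if op in "=X":
            op = "M"  # M covers both sequence match and mismatch
        if merged and merged[-1][1] == op:
            merged[-1][0] += length  # extend the previous run
        else:
            merged.append([length, op])
    if pos != len(cigar):
        raise ValueError(f"malformed CIGAR string: {cigar!r}")

    return "".join(f"{length}{op}" for length, op in merged)
```

With a made-up input such as `2S89=1X59=`, the three aligned runs collapse into a single `149M`, giving `2S149M`. The sketch only touches a single CIGAR string; tags that embed a CIGAR, such as `MC`, would need the same treatment separately.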

Apr 05 '22 06:04 ksahlin

Hi all, using strobealign v0.7 and samtools v1.12, I ran the following lines:

$alignbin -t 4 $reffasta $forward $reverse | \
        $samtoolsbin fixmate -u -m - - | \
        $samtoolsbin addreplacerg -u -r "@RG\tID:${line}\tSM:${line}\tLB:Solution\tPL:illumina\tPU:none" - - | \
        $samtoolsbin sort -u -T bam/$line - | \
        $samtoolsbin markdup -@2 --reference $reffasta --write-index --output-fmt cram,version=3.0 - bam/$line.sorted.cram

The resulting CRAM file looked like this:

A00604:215:HNL2GDSXY:2:1142:7310:5290   163     Chr01   7       60      2S149M  =       197     341     ATTTTAATGTCTTATAACTTCCGAATTCAAACTCAAATATTTGAATGTCTTATAACTTCCTTATTCAAACTTAAGGTAAAGAAATTCTTACCATTATATAAAAGTTTACCATCTTTATCTTGTTTAGGATTAGTGATATATTGACACCACT      FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFF,:FFFFFF:FF::FFFFFFF AS:i:288    MQ:i:60  MC:Z:151=       ms:i:5490       MD:Z:89T59      NM:i:1  RG:Z:CB10

and it loaded perfectly in IGV 2.12.

Apr 05 '22 08:04 TDDB-limagrain

Note that current IGV versions do support CIGAR X and = operators. It is not necessary to run samtools fixmate. Possibly @danessel was using an old IGV version.

This feature request is of course still valid as there may be people stuck on old IGV versions, but perhaps this reduces its importance.

Jun 02 '22 12:06 marcelm

Perhaps there are some SNP/indel callers that require M CIGARs? But I agree that this reduces its importance.

Jun 03 '22 09:06 ksahlin