
Fixing incomplete read side detection after the API refactoring

Open agalitsyna opened this issue 1 year ago • 0 comments

What was wrong?

  • Detection of the read side stopped working with some aligners after the API refactoring in https://github.com/open2c/pairtools/commit/9d99660ce8ffda66486c2b0cc296a6bb814576bf

Technical description of the issue:

  • Detection of the read side was moved from push_pysam to group_alignments_by_side, which relied on the sam.is_read1 attribute of the pysam entry. That attribute does not always work as intended in pysam, so the read side should be detected from sam.flag instead, as was done in https://github.com/open2c/pairtools/blob/6303de6d9e992e426285840bfd10e7d5dbbc1c84/pairtools/lib/parse.py#L220-L227
  • Incorrect detection of the read side probably resulted in the wrong read side being reported in some cases, potentially depending on the aligner. It did not work for the single-end reads reported in issue https://github.com/open2c/pairtools/issues/247.
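
For illustration, flag-based side detection can be sketched roughly as follows. This is a minimal sketch, not the actual pairtools code: the helper name group_flags_by_side is hypothetical, it works on plain integer flags rather than pysam entries, and sending every alignment without the "first in pair" bit to side 2 is an assumption about the intended behavior. The only fact taken from the SAM format itself is that bit 0x40 marks the first segment of a template.

```python
# SAM flag bit that marks the first segment (read 1) of a template.
FIRST_IN_PAIR = 0x40


def group_flags_by_side(flags):
    """Split SAM flags into side 1 and side 2 (hypothetical helper).

    Assumption: anything without the 0x40 bit goes to side 2, so
    single-end reads that carry no pairing bits are never dropped.
    """
    side1, side2 = [], []
    for flag in flags:
        if flag & FIRST_IN_PAIR:
            side1.append(flag)  # first read in a pair
        else:
            side2.append(flag)  # mate, or a read with no pairing bits
    return side1, side2


# 65 = paired + first in pair, 129 = paired + second in pair,
# 0 and 16 = reads with no pairing bits set (forward / reverse strand).
side1, side2 = group_flags_by_side([65, 129, 0, 16])
```

With real pysam entries the same check would read the integer from sam.flag; the two-way split means every alignment lands on exactly one side even when the aligner sets no pairing bits.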

Solution:

  • group_alignments_by_side was refactored and fully merged with the old push_pysam function.

I did not design any specific test, as the existing tests already cover read side detection in parse and parse2; they just cannot cover every aligner that may report the read side differently.

agalitsyna, Sep 27 '24 15:09