pairtools parsing engine abstraction

Open agalitsyna opened this issue 3 years ago • 1 comments

Premise:

  • pysam is fast but introduces several logistical complications. We may want to replace it with a manual BAM parser (Anton's style). pysam:
    • is complicated to link in isolated environments
    • leads to pairtools compilation errors without installing cython before pairtools
    • is not supported on osx
    • is not supported for python 3.11
  • New alignment formats are emerging that pairtools could parse with minor modifications, e.g. PAF (pafpy)

Proposal:

Introduce an abstraction for the parsing engine, so that pairtools can read input through the APIs of different tools

Possible technical solutions:

  1. Make a universal wrapper class for an "alignment" and create an I/O library that builds it from data read by different engines

  2. ?
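
A minimal sketch of option 1, assuming plain-text SAM and PAF input so that no third-party package is needed. The names `Alignment`, `parse_sam_lines`, `parse_paf_lines`, `PARSERS` and `parse_alignments` are hypothetical and are not part of pairtools; the field set is only an illustration of what a shared record might carry.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Iterator


@dataclass(frozen=True)
class Alignment:
    """Engine-independent alignment record (hypothetical, not a pairtools class)."""
    query_name: str
    ref_name: str
    ref_start: int  # 0-based start on the reference
    strand: str     # "+" or "-"
    mapq: int


def parse_sam_lines(lines: Iterable[str]) -> Iterator[Alignment]:
    """Parse tab-separated SAM text, skipping header lines and unmapped reads."""
    for line in lines:
        if line.startswith("@"):  # SAM header line
            continue
        f = line.rstrip("\n").split("\t")
        flag = int(f[1])
        if flag & 0x4:  # read is unmapped
            continue
        yield Alignment(
            query_name=f[0],
            ref_name=f[2],
            ref_start=int(f[3]) - 1,  # SAM POS is 1-based
            strand="-" if flag & 0x10 else "+",
            mapq=int(f[4]),
        )


def parse_paf_lines(lines: Iterable[str]) -> Iterator[Alignment]:
    """Parse tab-separated PAF text (target start is already 0-based)."""
    for line in lines:
        f = line.rstrip("\n").split("\t")
        yield Alignment(
            query_name=f[0],
            ref_name=f[5],
            ref_start=int(f[7]),
            strand=f[4],
            mapq=int(f[11]),
        )


# Registry mapping an engine name to its parser (names are assumptions).
PARSERS: Dict[str, Callable[[Iterable[str]], Iterator[Alignment]]] = {
    "sam": parse_sam_lines,
    "paf": parse_paf_lines,
}


def parse_alignments(lines: Iterable[str], engine: str) -> Iterator[Alignment]:
    """Dispatch to the chosen engine; downstream code only sees Alignment."""
    if engine not in PARSERS:
        raise ValueError(f"Unknown engine {engine!r}; choose from {sorted(PARSERS)}")
    return PARSERS[engine](lines)
```

With this shape, a pysam-backed or pafpy-backed engine would be one more entry in the registry that yields the same `Alignment` records.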

Complications:

  1. Each new engine introduces a new dependency that we want to make optional
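
One possible way to keep each engine's dependency optional is to import it only when that engine is requested. The sketch below uses the standard-library `importlib`; the `ENGINE_MODULES` table and the `load_engine` and `available_engines` helpers are hypothetical names, not existing pairtools functions.

```python
import importlib
import importlib.util
from types import ModuleType
from typing import Dict, List

# Hypothetical table: engine name -> importable module it depends on.
ENGINE_MODULES: Dict[str, str] = {
    "pysam": "pysam",
    "pafpy": "pafpy",
}


def load_engine(name: str) -> ModuleType:
    """Import the backend for `name` lazily, with a readable error if missing."""
    if name not in ENGINE_MODULES:
        raise ValueError(
            f"Unknown engine {name!r}; choose from {sorted(ENGINE_MODULES)}"
        )
    module_name = ENGINE_MODULES[name]
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"Engine {name!r} needs the optional package {module_name!r}, "
            "which is not installed."
        ) from exc


def available_engines() -> List[str]:
    """List engines whose backing module can be found, without importing it."""
    return [
        name
        for name, module_name in ENGINE_MODULES.items()
        if importlib.util.find_spec(module_name) is not None
    ]
```

Under this assumption, importing pairtools itself would never fail because of a missing engine; the error would surface only when a user selects an engine that is not installed.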

agalitsyna, Nov 09 '22 21:11

What if the API were based on dataframes, instead of abstract alignment records?

nvictus, Nov 09 '22 23:11