Request: Foreign function interface

Open jakobnissen opened this issue 4 years ago • 4 comments

Thanks for your work on this tool. I'm sure it will see a lot of usage in future pipelines and/or tools.

However, as far as I can see, the interface and documentation exist only for the command-line interface. There is no foreign function interface. This means third-party packages cannot use CoverM internally, but have to separately install CoverM and call it from the shell.

It would be much easier to use CoverM if some high-level functions were exposed and documented, even if the edges are rough.

jakobnissen avatar Jan 06 '21 11:01 jakobnissen

Hi,

Thanks for your interest. This sounds like a good idea, though I would probably like to concentrate my efforts on some other things first. Without asking you to delve into the code, which particular functions would you like to see exposed?

Do you mean so that it can be used by a python program specifically? Or Julia, which I see you are into? If Python, then you might be interested in https://github.com/apcamargo/pycoverm

ben

wwood avatar Jan 07 '21 02:01 wwood

I don't really know which functions I need, because I haven't gotten into the codebase. The functionality I'm looking for is: given a set of BAM files, get a float32 matrix with the depth of each contig within the BAM files (and the names of the contigs, in order). If sorted BAMs are necessary for this to work, it would also be nice to have a function that checks whether a BAM file is sorted.

Ideally, functionality equivalent to coverm -b [BAMFILES ... ] -m trimmed_mean name --trim-min 0.1 --trim-max 0.9. Actually, I almost certainly know less about accurately estimating contig coverage than you do - I'm looking for a function to estimate contig depths for metagenomic binning.
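For readers unfamiliar with the trimmed mean requested above, here is a minimal Python sketch of the general idea: sort the per-position depths of a contig, drop the lowest and highest tails, and average what is left. The function name trimmed_mean and the interpretation of trim_min and trim_max as fractions of the sorted positions are assumptions made for illustration; this is not CoverM's implementation and may differ from it in rounding and edge cases.

```python
from typing import Sequence


def trimmed_mean(depths: Sequence[float],
                 trim_min: float = 0.1,
                 trim_max: float = 0.9) -> float:
    """Mean depth after discarding the lowest and highest tails.

    Assumed semantics (illustrative only): positions are sorted by depth,
    and only those ranked in [trim_min, trim_max) of the sorted order are kept.
    """
    if not 0.0 <= trim_min < trim_max <= 1.0:
        raise ValueError("need 0 <= trim_min < trim_max <= 1")
    ordered = sorted(depths)
    n = len(ordered)
    lo = int(n * trim_min)  # index of the first position kept
    hi = int(n * trim_max)  # index one past the last position kept
    kept = ordered[lo:hi]
    if not kept:
        # Nothing survives trimming (e.g. an empty contig): report zero depth.
        return 0.0
    return sum(kept) / len(kept)
```

With ten positions and the default bounds, the single lowest and single highest depths are dropped, so one spurious pile-up of reads does not inflate the estimate.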

Yep, I'm planning to use it in a Julia package. For that I need a #[no_mangle] pub extern "C" fn function in a lib.rs. I think it might be easier for me to create a small Rust library that exposes just what I need, like pycoverm does. But I'm not sure what CoverM functions to call!

A working alternative (which I'm using now in testing) is to install CoverM and call it from the shell in my program. But involving the shell is brittle, requires intermediary files to be generated and re-parsed, and complicates the installation process.
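To make the re-parsing step concrete, here is a rough Python sketch of what a caller has to do with the intermediary output. It assumes the table is tab-separated, with a header row whose first column labels the contig names and whose remaining columns are one per BAM file; that layout, and the name parse_depth_table, are assumptions for illustration and are not taken from CoverM's documentation.

```python
import csv
import io
from typing import List, Tuple


def parse_depth_table(tsv_text: str) -> Tuple[List[str], List[str], List[List[float]]]:
    """Split a tab-separated depth table into samples, contig names and a matrix.

    Assumed layout (hypothetical): header row first, contig name in column 0,
    one depth column per sample after that.
    """
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    header = next(reader, None)
    if header is None:
        raise ValueError("empty table")
    samples = header[1:]
    names: List[str] = []
    matrix: List[List[float]] = []
    for row in reader:
        if not row:
            continue  # tolerate blank lines
        if len(row) != len(header):
            raise ValueError(f"row for {row[0]!r} has {len(row)} fields, expected {len(header)}")
        names.append(row[0])
        matrix.append([float(value) for value in row[1:]])
    return samples, names, matrix
```

Every assumption in this parser (delimiter, header, column order) is something a direct function call returning the matrix and names would make unnecessary.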

jakobnissen avatar Jan 07 '21 10:01 jakobnissen

Needless to say, that's something I'm also really interested in. But I can see that it would take a considerable amount of work.

Expanding on what Jakob already mentioned, I think that it would be best if the exposed interface was modular (something that I explain better here: https://github.com/apcamargo/pycoverm/issues/2). In my opinion, modularity is one of the great aspects of CoverM, and it would be cool if it was accessible via the FFI. From what I remember and understand of CoverM's codebase (not a lot), that's entirely possible but would require some major modifications.

Regarding the functionality to check if the BAM file is sorted, CoverM does so using an "empirical" approach (that is, it checks the alignments themselves and doesn't look at the header of the file). Because this feature was coded within a big function, and not in a function of its own, I ended up writing a very simple function that reads the beginning of the file and checks if SO:coordinate is in the header.
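A header-based check of this kind can be sketched in Python using only the standard library. This is an illustration of the idea, not the actual pycoverm function: the name bam_header_says_sorted is hypothetical, and the code relies on the BAM layout from the SAM/BAM specification (BGZF blocks are gzip members; the decompressed stream starts with the magic bytes BAM\1, then a little-endian int32 header length, then the SAM header text).

```python
import gzip
import struct
from typing import BinaryIO, Union


def bam_header_says_sorted(source: Union[str, BinaryIO]) -> bool:
    """Return True if the BAM header's @HD line declares SO:coordinate.

    Only the header is inspected; the alignments themselves are not verified.
    `source` is a path or an open binary file object.
    """
    with gzip.open(source, "rb") as handle:
        if handle.read(4) != b"BAM\x01":
            raise ValueError("not a BAM file (bad magic bytes)")
        raw_length = handle.read(4)
        if len(raw_length) != 4:
            raise ValueError("truncated BAM header")
        (l_text,) = struct.unpack("<i", raw_length)
        text = handle.read(l_text).decode("utf-8", errors="replace")
    for line in text.splitlines():
        if line.startswith("@HD"):
            # @HD fields are tab-separated TAG:VALUE pairs.
            return "SO:coordinate" in line.split("\t")
    return False  # no @HD line, so no declared sort order
```

The trade-off is the one described above: the header check is cheap, but it trusts whatever tool wrote the file, whereas checking the alignments catches a file whose header is wrong or missing.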

apcamargo avatar Jan 07 '21 10:01 apcamargo

This is just to say that this need for me is somewhat obsolete now. For Python, there is pycoverm to serve the need. For Julia, there is CoverM_jll. I'm happy to close the issue, but maybe you also want an FFI.

jakobnissen avatar Mar 20 '23 13:03 jakobnissen