bioframe

bin-table operations

Open: agalitsyna opened this issue 2 years ago • 1 comment

  • [ ] definitions
  • [ ] functions
    • [ ] regions_to_bins: assign bin spans to a set of bp spans
    • [ ] align_tables #37
    • [ ] adjust_view_to_bintable: synchronize view start/end with bin start/end. Viewframes are often defined at bp resolution while bins are much larger, so it is unclear which region a bin that straddles a view boundary should be assigned to. This tool would provide a standard for making that decision.
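
To make the proposal concrete, here is a minimal pandas-only sketch of what two of these helpers might look like. It assumes fixed-size bins that start at 0 on every chromosome. The names `snap_view_to_bins` and `bp_spans_to_bin_spans`, the `mode` argument, and the `bin_lo`/`bin_hi` columns are hypothetical and not part of bioframe; outward versus inward snapping is exactly the decision this issue asks to standardize.

```python
import pandas as pd


def snap_view_to_bins(view_df, binsize, mode="outer"):
    """Hypothetical helper: move view start/end onto bin boundaries.

    mode="outer" grows each region to cover every bin it touches;
    mode="inner" shrinks it to the bins it fully contains.
    Assumes fixed-size bins starting at 0; the result is not clipped
    to chromosome lengths.
    """
    out = view_df.copy()
    floor_start = (view_df["start"] // binsize) * binsize
    ceil_start = -(-view_df["start"] // binsize) * binsize  # ceiling division
    floor_end = (view_df["end"] // binsize) * binsize
    ceil_end = -(-view_df["end"] // binsize) * binsize
    if mode == "outer":
        out["start"], out["end"] = floor_start, ceil_end
    elif mode == "inner":
        out["start"], out["end"] = ceil_start, floor_end
    else:
        raise ValueError("mode must be 'outer' or 'inner'")
    return out


def bp_spans_to_bin_spans(df, binsize):
    """Hypothetical helper: half-open range of per-chromosome bin offsets
    [bin_lo, bin_hi) touched by each bp span."""
    out = df.copy()
    out["bin_lo"] = df["start"] // binsize
    out["bin_hi"] = -(-df["end"] // binsize)
    return out


view = pd.DataFrame(
    {
        "chrom": ["chr1", "chr1"],
        "start": [1_200, 52_000],
        "end": [48_500, 99_000],
        "name": ["a", "b"],
    }
)

snapped_outer = snap_view_to_bins(view, 10_000)
snapped_inner = snap_view_to_bins(view, 10_000, mode="inner")
bin_spans = bp_spans_to_bin_spans(view, 10_000)
```

With outward snapping, two regions separated by less than one bin can end up sharing a boundary or overlapping, so a real implementation would still need a rule for the shared bin.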

agalitsyna · Jun 15 '22 18:06

Here is what we already have in bioframe that might be useful for the discussion.

  • Filtering a bin table by a viewframe has at least three bioframe-ish solutions:
  1. bioframe.select with pd.concat: outputs a bin once for every region it overlaps, so bins that overlap multiple regions are duplicated.
  2. bioframe.overlap with DataFrame.dropna: does not verify that the viewframe is sorted and non-overlapping.
  3. bioframe.assign_regions with DataFrame.dropna: the most reasonable way. A bin that overlaps multiple regions is assigned to the one with the largest overlap.
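
The difference between options 1 and 3 can be illustrated without calling bioframe. The sketch below uses plain pandas to mimic the two behaviors described above: collecting every bin/region overlap (which duplicates a bin that straddles a boundary) and then keeping only the region with the largest overlap per bin. The helper `overlapping_pairs` and the toy coordinates are made up for illustration; this is not how bioframe implements these functions.

```python
import pandas as pd

bins = pd.DataFrame(
    {
        "chrom": ["chr1"] * 4,
        "start": [0, 10_000, 20_000, 30_000],
        "end": [10_000, 20_000, 30_000, 40_000],
    }
)
view = pd.DataFrame(
    {
        "chrom": ["chr1", "chr1"],
        "start": [0, 23_000],
        "end": [23_000, 40_000],
        "name": ["left", "right"],
    }
)


def overlapping_pairs(bins, view):
    """Illustrative helper: every (bin, region) pair with a positive overlap."""
    b = bins.reset_index().rename(columns={"index": "bin_id"})
    pairs = b.merge(view, on="chrom", suffixes=("", "_view"))
    pairs["overlap"] = (
        pairs[["end", "end_view"]].min(axis=1)
        - pairs[["start", "start_view"]].max(axis=1)
    )
    return pairs[pairs["overlap"] > 0]


# Like selecting per region and concatenating: the bin 20000-30000
# overlaps both regions and therefore appears twice.
pairs = overlapping_pairs(bins, view)

# Like assigning by largest overlap: keep one region per bin.
best = (
    pairs.sort_values("overlap", ascending=False)
    .drop_duplicates("bin_id")
    .sort_values("bin_id")
)
assigned = best.set_index("bin_id")["name"]
```

Here the straddling bin overlaps `left` by 3 kb and `right` by 7 kb, so the largest-overlap rule assigns it to `right`. Ties would need an explicit tie-breaking rule, which this sketch does not define.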

agalitsyna · Jun 15 '22 23:06