modkit

Multi-base entropy

Open Ge0rges opened this issue 1 year ago • 3 comments

Hi Arthur,

I was wondering if multi-base entropy (i.e. --base A --base C) is something that makes sense or is planned?

Ge0rges avatar Jun 11 '24 22:06 Ge0rges

Hey @Ge0rges,

It's planned, I just need to do it. Thanks for showing interest in entropy, I'll move it up the list.

ArtRand avatar Jun 11 '24 22:06 ArtRand

Hey Art, I was wondering if you had a timeline for this?

Ge0rges avatar Oct 05 '24 09:10 Ge0rges

@Ge0rges working on it. Next version of entropy will have some more functionality (including multi-base). Thanks for the poke.

ArtRand avatar Oct 10 '24 00:10 ArtRand

Hello @Ge0rges, here is a build that allows you to use multiple bases/motifs at once. For example:

modkit entropy \
  -s ./reads.bam \
  -o ./output_dir \
  --ref ${ref_fasta} \
  --header \
  --regions ./regions.bed \
  --log modkit_entropy_regions.log \
  --motif CG 0 \
  --base A

One thing to note is that I made a slight change to the entropy calculation. Basically, I noticed that windows containing positions with a lot of failing base modification calls (ones falling below the pass threshold) would have higher entropy than windows without them. Now the opposite is true: for the same underlying methylation entropy in the reads, a window with more confident calls will have higher entropy than one where some of the calls are filtered. Please let me know if you encounter any problems.

modkit_u16_x86_64.tar.gz

ArtRand avatar Dec 02 '24 20:12 ArtRand
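
For readers following the note about failing calls, here is a toy sketch of why they can inflate a window's entropy. It is not modkit's actual calculation, which the thread does not spell out. The window, the reads, and the `pattern_entropy` helper are all made up for illustration, and treating a failing call as its own symbol is only an assumption about how the inflation might arise.

```python
from collections import Counter
from math import log2


def pattern_entropy(patterns):
    """Shannon entropy (in bits) of the distribution of per-read patterns."""
    counts = Counter(patterns)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * log2(c / total) for c in counts.values())


# Hypothetical window: 3 positions, 4 reads.
# 'm' = modified, '-' = canonical, 'x' = call below the pass threshold.
reads = ["mm-", "mm-", "mx-", "xm-"]

# Assumption: failing calls are counted as a third symbol, so reads that
# differ only by a failing call look like distinct patterns.
with_failing = pattern_entropy(reads)

# Same window, keeping only reads where every call passed.
confident_only = pattern_entropy([r for r in reads if "x" not in r])

print(f"failing calls as a symbol: {with_failing:.2f} bits")
print(f"confident reads only:      {confident_only:.2f} bits")
```

In this made-up window the two reads with a failing call each form their own pattern, so the entropy is 1.5 bits, while the confident reads alone are identical and give 0 bits. That matches the direction of the old behaviour described above. How the new build weights filtered calls is not shown here.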