
Question about making a per-modification bedMethyl score

Open cmdcolin opened this issue 6 months ago • 4 comments

I want to plot a 'reasonable' value showing what percentage of reads at a position carry a particular type of modification, relative to total coverage. The fraction modified is a good value in most cases, but sometimes a single base in a single read is modified in the middle of nowhere, and fraction_modified/score becomes 100 for, e.g., a hydroxymethylation, which doesn't seem worth notifying the user about.

example screenshot

Image

cmdcolin avatar May 09 '25 22:05 cmdcolin

Hello @cmdcolin,

Sorry for the terribly long response time (new Modkit features are coming!).

You could filter the bedMethyl to only positions that match the reference and/or remove records with a $N_{\text{valid}}$ below a certain level (e.g. the median coverage or something like that). There are probably some fancier techniques you could borrow from the DMR code, but I don't have something easy for you. Does JBrowse handle bedMethyl files natively? I should give it a try!
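A minimal sketch of the $N_{\text{valid}}$ filter suggested above. It assumes the column layout described in the modkit pileup documentation, where the 10th column is N_valid_cov, and it splits on any whitespace in case the count columns are space-separated. The function name and the median default are illustrative, not part of modkit.

```python
import statistics

# Assumption: 0-based index 9 holds N_valid_cov in modkit's bedMethyl output.
N_VALID_COL = 9


def filter_by_valid_coverage(lines, min_valid=None):
    """Keep bedMethyl records whose N_valid_cov is at least min_valid.

    If min_valid is None, the median N_valid_cov of the input is used.
    """
    records = [
        line.split()
        for line in lines
        if line.strip() and not line.startswith("#")
    ]
    if not records:
        return []
    if min_valid is None:
        min_valid = statistics.median(int(r[N_VALID_COL]) for r in records)
    return [r for r in records if int(r[N_VALID_COL]) >= min_valid]
```

Using the median as the cutoff drops roughly half of the records, so a fixed threshold (say, a few reads) may be a gentler choice for display purposes.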

ArtRand avatar May 19 '25 13:05 ArtRand

> Does JBrowse handle bedMethyl files natively? I should give it a try!

Ya! We have tried to do a couple of things to support it better. I am trying to update the docs, but there is a bit of a trick you can use to make bedMethyl tracks display in the quantitative format: when opening your track via the GUI, you can specify that it be loaded as a "MultiQuantitativeTrack". It will then display like a nice bigwig-style plot, with the different modifications as different subtracks.

Same idea, but by manually editing the track into a config.json:

{
  "type": "MultiQuantitativeTrack",
  "trackId": "colo829_tumor.ht_modkit.bed.gz",
  "name": "COLO829_tumor.ht_modkit bedMethyl",
  "assemblyNames": ["hg38"],
  "adapter": {
    "type": "BedTabixAdapter",
    "uri": "https://jbrowse.org/genomes/GRCh38/COLO829/COLO829_tumor.ht_modkit.bed.gz"
  }
}

Here is a share link showing a bedMethyl and a CRAM file with the 'modifications' rendering turned on. We ended up reproducing a lot of what IGV does, because they do things quite well :)

https://jbrowse.org/code/jb2/v3.4.0/?config=test_data%2Fconfig_demo.json&session=share-3RseWNrC7N&password=Z61y7

cmdcolin avatar May 20 '25 22:05 cmdcolin

Whoops, updated screenshot:

Image

cmdcolin avatar May 20 '25 22:05 cmdcolin

Regarding the original question:

> You could filter the bedMethyl to only positions that match the reference and/or remove records with a $N_{\text{valid}}$ below a certain level (e.g. the median coverage or something like that). There are probably some fancier techniques you could borrow from the DMR code, but I don't have something easy for you. Does JBrowse handle bedMethyl files natively? I should give it a try!

I'll check into these...

It would be good not to have to cross-reference the original BAM/CRAM to get, e.g., the median coverage or the 'matches reference' values...not sure if that is attainable though. Also, there is no "total sequencing coverage at this position" column (?), so it is hard to recompute the fraction modified relative to total sequencing coverage at a given position, as far as I can tell.
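One possible workaround, sketched under assumptions: if the bedMethyl follows the column layout described in the modkit pileup documentation (N_valid_cov, percent modified, N_mod, N_canonical, N_other_mod, N_delete, N_fail, N_diff, N_nocall in columns 10 to 18), the per-row counts might be summed into a rough proxy for total depth without going back to the BAM/CRAM. The function name is hypothetical, and whether deletions should count toward depth is a judgment call.

```python
def percent_of_total_depth(fields):
    """Percent of reads carrying this modification, relative to an
    approximate total depth rebuilt from the bedMethyl count columns.

    `fields` is one bedMethyl row already split on whitespace.
    Assumption: 0-based indices 9, 11, 14, 15, 16, 17 hold N_valid_cov,
    N_mod, N_delete, N_fail, N_diff and N_nocall respectively.
    """
    n_valid = int(fields[9])
    n_mod = int(fields[11])
    n_delete, n_fail, n_diff, n_nocall = (int(fields[i]) for i in (14, 15, 16, 17))
    # Proxy for total coverage: valid calls plus every category of read
    # that overlapped the position without contributing a valid call.
    depth = n_valid + n_delete + n_fail + n_diff + n_nocall
    return 100.0 * n_mod / depth if depth else 0.0
```

For a row with a single valid call that is modified (percent modified of 100) plus 19 reads counted under N_diff, this returns 5.0, which is closer to the 'reasonable' value described in the original question.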

cmdcolin avatar May 20 '25 22:05 cmdcolin