
Setting up input for epigenetic modification detection

Open GianlucaMattei opened this issue 2 years ago • 1 comments

Hello, I would like to run 5hmC and 5mC detection.

In this case, should I use dna_r9.4.1_450bps_modbases_5mc_hac.cfg or dna_r9.4.1_450bps_modbases_dam-dcm-cpg_hac.cfg to detect all methylated Cs (not only those in the CpG context)?

What is the difference between the two configs, and is there a description of them anywhere?

About --ref-mods-all-motifs: is "N 0" correct (to also retrieve other types of modifications), or should I just use "C 0"? (I was looking at https://github.com/jkbonfield/hts-specs/blob/methylation/SAMtags.tex#L477)

GianlucaMattei, Oct 03 '22 11:10

The dna_r9.4.1_450bps_modbases_dam-dcm-cpg_hac.cfg model was an initial proof-of-concept model and is not recommended. The dna_r9.4.1_450bps_modbases_5mc_hac.cfg model is an improvement over it and is the best model for R9 data. That said, the best results for all-context modified base detection can be achieved with kit14 and the Rerio all-context Remora model. We will continue to develop updates for kit14 models, but are unlikely to update R9 models at this time.

marcus1487, Oct 06 '22 23:10