remora
Question about data preparation for training model
Hi,
I understand that to train a model with remora, you first have to basecall fully unmethylated (PCR) and fully methylated (SssI) reads, then merge both results into a training dataset using taiyaki/misc/merge_mappedsignalfiles.py. However, in my case I need to use only specific genomic positions that I know from a BS-seq reference to be always methylated or always unmethylated. Is this something I can do with remora before merging the basecalls? Or with taiyaki?
Thanks,
Paul
This functionality is quite difficult to achieve with the current megalodon/taiyaki framework. We are working on a major rewrite of the data prep, which will be out in the next release. I will post here once the update is available; it should make this use case much more feasible.
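
Independently of remora and taiyaki, the list of trusted positions can be prepared ahead of time from the BS-seq data. Below is a minimal sketch, assuming a hypothetical four-column tab-separated table (chromosome, position, percent methylation, coverage). The function names, column layout, and thresholds are illustrative assumptions and are not part of remora, megalodon, or taiyaki.

```python
import csv
import io


def read_sites(handle):
    """Yield (chrom, pos, pct_meth, coverage) from a tab-separated table.

    The four-column layout is an assumption made for this sketch.
    """
    for chrom, pos, pct, cov in csv.reader(handle, delimiter="\t"):
        yield chrom, int(pos), float(pct), int(cov)


def select_confident_sites(rows, min_cov=10, low=0.0, high=100.0):
    """Split sites into always-unmethylated and always-methylated sets.

    Sites below min_cov, or with intermediate methylation, are dropped.
    """
    unmeth, meth = set(), set()
    for chrom, pos, pct, cov in rows:
        if cov < min_cov:
            continue  # too few BS-seq reads to trust this site
        if pct <= low:
            unmeth.add((chrom, pos))
        elif pct >= high:
            meth.add((chrom, pos))
    return unmeth, meth


# Small in-memory example standing in for a real BS-seq file.
example = (
    "chr1\t100\t100.0\t25\n"
    "chr1\t200\t0.0\t30\n"
    "chr1\t300\t55.0\t40\n"
    "chr1\t400\t100.0\t3\n"
)
unmeth_sites, meth_sites = select_confident_sites(read_sites(io.StringIO(example)))
print(sorted(meth_sites))
print(sorted(unmeth_sites))
```

The two resulting sets could then be used to restrict which positions contribute to the training data once the tooling supports per-site selection; how they are passed to the data prep step depends on the release mentioned above.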