Data preparation for training an “all-context” modification model with Remora, and questions about the --refine-kmer-level-table input

Open · sparkcyf opened this issue 8 months ago · 1 comment

I am training a model to detect DNA modifications in all contexts (at all positions) using Remora. During the data preparation step, the command failed because the --motif argument is required:

remora dataset prepare: error: the following arguments are required: --motif

I want to detect modifications at all positions, but I am unsure how to specify the --motif parameter to achieve this. In previous issues, such as https://github.com/nanoporetech/remora/issues/62, the training scripts did not appear to require this parameter.

Here is the script I am using for data preparation:

remora \
  dataset prepare \
  converted.pod5 \
  basecalls.bam \
  --output-path mod_chunks \
  --refine-kmer-level-table tombo_model_5hmc.tsv \
  --refine-rough-rescale \
  --focus-reference-positions 5hmc_sites.bed \
  --mod-base m 5hmC

Could you please advise how to set the --motif parameter so that modifications are detected at all positions?

Thanks in advance!

sparkcyf · Jun 07 '24 08:06