
helper to convert ame sequences to GRanges

Open snystrom opened this issue 4 years ago • 0 comments

Need to think about this some more & get feedback. Without shuffled input, it becomes difficult to label the regions as input or control sequences.

Possible solution: modify get_sequence to add an optional ID label, which users would have to set in order to convert sequences easily? Seems too complicated.
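A rough sketch of what that labeling idea could look like, in base R only. The helper name `add_id_label` and its arguments are hypothetical and not part of memes, and named character vectors stand in here for the sequence sets that get_sequence returns.

```r
# Hypothetical helper (not part of memes): tag each sequence name with a
# group label so input and control sequences stay distinguishable later.
add_id_label <- function(seqs, label, sep = "_") {
  stopifnot(!is.null(names(seqs)), length(label) == 1)
  names(seqs) <- paste(names(seqs), label, sep = sep)
  seqs
}

# Named character vectors stand in for real sequence sets (assumption)
input   <- c("chr2L:100-300" = "ACGT", "chr3R:500-700" = "GGCA")
control <- c("chr2L:900-1100" = "TTGA")

# After labeling, the two groups can be pooled without losing their origin
combined <- c(add_id_label(input, "input"), add_id_label(control, "control"))
names(combined)
```

The cost is the one already noted: users would have to remember to label every set before running AME, which is why this may be more trouble than it is worth.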

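# Attempt at writing a sequence converter for AME results
# Assumes `peaks` (GRanges) and `dm.genome` (BSgenome) are already defined
library(magrittr)       # %>%
library(GenomicRanges)  # resize(), GRanges(), mcols()
library(memes)          # get_sequence(), runAme()

ame_analysis_seq <- peaks %>%
  resize(200, "center") %>%
  get_sequence(dm.genome) %>%
  runAme(evalue_report_threshold = 30, sequences = TRUE)

ame_analysis_seq$sequences[[1]] %>%
  # fill = "right" gives type = NA (without a warning) for IDs with no "_" suffix
  tidyr::separate(seq_id, c("pos", "type"), sep = "_", fill = "right") %>%
  # What about partitioning or background/control?
  # When using partitioning or a control FASTA, no ID is appended after the
  # sequence info, so there is no easy way to label them. Needs more thought.
  dplyr::mutate(type = dplyr::case_when(is.na(type) ~ "input",
                                        type == "shuf" ~ "shuffle")) %>%
  {
    dat <- .
    ranges <- GRanges(dat$pos)
    mcols(ranges) <- dplyr::select(dat, -pos)
    # The last expression is the block's value; avoids return() inside a pipe
    ranges
  }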
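The conversion step above could be wrapped into a reusable function along these lines. This is only a sketch: `ame_seqs_to_granges` and its arguments are hypothetical names, it assumes each `seq_id` looks like "chr:start-end" with an optional "_label" suffix, and it keeps the raw suffix (e.g. "shuf") rather than renaming it.

```r
library(GenomicRanges)

# Hypothetical helper: turn a data.frame with a `seq_id` column into GRanges,
# storing the optional suffix after `sep` in a `type` metadata column.
ame_seqs_to_granges <- function(seqs, sep = "_", input_label = "input") {
  ids <- as.character(seqs$seq_id)
  cut <- as.integer(regexpr(sep, ids, fixed = TRUE))  # -1 when no label
  has_label <- cut > 0
  pos  <- ifelse(has_label, substr(ids, 1, cut - 1), ids)
  type <- ifelse(has_label, substr(ids, cut + nchar(sep), nchar(ids)),
                 input_label)
  ranges <- GRanges(pos)
  # Carry every other column along as metadata
  meta <- as.data.frame(seqs)[setdiff(names(seqs), "seq_id")]
  meta$type <- type
  mcols(ranges) <- S4Vectors::DataFrame(meta)
  ranges
}

# Toy input standing in for an AME sequences table (assumption)
seqs <- data.frame(seq_id = c("chr2L:100-300", "chr2L:100-300_shuf"),
                   pvalue = c(0.01, 0.5))
gr <- ame_seqs_to_granges(seqs)
```

Unlabeled IDs fall back to `input_label`, so the partitioning and control-FASTA cases would still all come out as "input" unless the sequences were labeled beforehand.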

snystrom, Apr 29 '20 18:04