Function for obtaining / printing the text for a pair of duplicated blocks

Open russHyde opened this issue 5 years ago • 6 comments

For example, print_dup(dup_df[1, ])

Or, if we change dupree to return a list of class Dups, in which each entry is of class Dup, then print(dups[[1]]) might be better syntax.
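As a rough sketch of that second option: the constructor `new_dup()`, the field names, and the example paths below are all hypothetical and are not dupree's current API. The only point is to show how `print(dups[[1]])` would dispatch on an S3 class.

```r
# Hypothetical S3 class describing one pair of duplicated blocks.
new_dup <- function(file_a, line_a, file_b, line_b, score) {
  structure(
    list(file_a = file_a, line_a = line_a,
         file_b = file_b, line_b = line_b, score = score),
    class = "Dup"
  )
}

# print() dispatches to this method for objects of class "Dup".
print.Dup <- function(x, ...) {
  cat(sprintf("Duplicate pair (score %.2f)\n", x$score))
  cat(sprintf("  a: %s, line %d\n", x$file_a, x$line_a))
  cat(sprintf("  b: %s, line %d\n", x$file_b, x$line_b))
  invisible(x)
}

# A "Dups" object is then just a classed list of "Dup" entries
# (made-up paths and score, for illustration only).
dups <- structure(
  list(new_dup("R/foo.R", 10L, "R/bar.R", 42L, 0.8)),
  class = "Dups"
)
print(dups[[1]])
```

With this layout, `dups[[1]]` falls back to ordinary list extraction, so no `[[` method is needed for the sketch to work.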

russHyde, Nov 06 '19 09:11

Note that the LCS algorithm in {stringdist} only computes the length of the LCS; it doesn't return the longest common subsequence itself. I can't find a good LCS implementation on CRAN (and I don't want to depend on Bioconductor packages, since dupree is on CRAN now).

russHyde, Jan 10 '20 11:01

Maybe include an LCS implementation with dupree? (We can still use {stringdist} for computing the distances, but use a local LCS for computing the duplicated strings.)
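A local implementation along those lines could be a plain dynamic-programming routine in base R. The sketch below is only an illustration: `lcs_vec()` is a hypothetical name, it is quadratic in time and memory, and it returns just one longest common subsequence when several exist.

```r
# Longest common subsequence of two vectors (e.g. token codes or characters).
# `lcs_vec` is a hypothetical helper, not part of dupree or {stringdist}.
lcs_vec <- function(a, b) {
  n <- length(a)
  m <- length(b)
  # len[i + 1, j + 1] holds the LCS length of a[1:i] and b[1:j]
  len <- matrix(0L, nrow = n + 1, ncol = m + 1)
  for (i in seq_len(n)) {
    for (j in seq_len(m)) {
      len[i + 1, j + 1] <- if (a[i] == b[j]) {
        len[i, j] + 1L
      } else {
        max(len[i, j + 1], len[i + 1, j])
      }
    }
  }
  # Walk back from the bottom-right corner, collecting matched positions in `a`
  keep <- integer(0)
  i <- n
  j <- m
  while (i > 0 && j > 0) {
    if (a[i] == b[j]) {
      keep <- c(i, keep)
      i <- i - 1
      j <- j - 1
    } else if (len[i, j + 1] >= len[i + 1, j]) {
      i <- i - 1
    } else {
      j <- j - 1
    }
  }
  a[keep]
}

lcs_vec(c(1L, 2L, 3L, 4L, 5L), c(2L, 9L, 3L, 5L))
```

Because the result is built by indexing into `a`, it keeps the type of the input, so the same function works on integer vectors and on character vectors.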

russHyde, Jan 10 '20 11:01

Could call {textreuse} with the original code strings, rather than the integer vectors, but that would require r-textreuse to be pushed to conda-forge for me to use it locally.

russHyde, Jan 10 '20 11:01

For now, just print the contents of the two (or more) blocks. Finding the actual LCS can be implemented at a later stage.
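A minimal sketch of that simpler approach, assuming one row of the duplicates table carries `file_a`, `line_a`, `file_b` and `line_b` columns. The helper name `print_dup()`, its `n_lines` argument and the demo file are hypothetical.

```r
# Hypothetical helper: print up to `n_lines` lines of each block in one row
# of the duplicates table, prefixed with their line numbers.
print_dup <- function(dup_row, n_lines = 10) {
  show_block <- function(file, start) {
    lines <- readLines(file)
    idx <- start + seq_len(n_lines) - 1
    idx <- idx[idx <= length(lines)]  # do not run past the end of the file
    cat(sprintf("--- %s (from line %d)\n", file, start))
    writeLines(sprintf("%4d  %s", idx, lines[idx]))
  }
  show_block(dup_row$file_a, dup_row$line_a)
  show_block(dup_row$file_b, dup_row$line_b)
  invisible(dup_row)
}

# Demo on a throwaway file containing two similar three-line blocks.
tmp <- tempfile(fileext = ".R")
writeLines(
  c("x <- 1", "y <- 2", "z <- x + y", "x <- 1", "y <- 2", "z <- x * y"),
  tmp
)
row <- data.frame(
  file_a = tmp, line_a = 1L, file_b = tmp, line_b = 4L,
  stringsAsFactors = FALSE
)
print_dup(row, n_lines = 3)
```

The sketch prints a fixed number of lines per block because it has no information about where each block ends.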

russHyde, Jan 23 '20 13:01

Hi all, I made a little function to view the diff between each pair of duplicated code blocks. I hope this helps somebody. The diffr package is needed.

# Requires the diffr package. Rows are filtered with base R, so dplyr is not needed.
dup_diff <- function(dupree_res, min_score = 0.45, nlines = 10) {
  dups <- dupree_res$dups_df
  dups <- dups[dups$score > min_score, , drop = FALSE]

  # Copy `nlines + 1` lines of `file`, starting at line `start`, into `out`;
  # indices past the end of the file are dropped rather than written as NA.
  write_block <- function(file, start, out) {
    lines <- readLines(file)
    idx <- start + 0:nlines
    writeLines(lines[idx[idx <= length(lines)]], out)
  }

  res <- list()
  for (i in seq_len(nrow(dups))) {
    # One temporary directory per duplicated pair
    out_dir <- file.path(tempdir(), i)
    dir.create(out_dir, showWarnings = FALSE)
    path_a <- file.path(out_dir, "file_a")
    path_b <- file.path(out_dir, "file_b")
    write_block(dups$file_a[i], dups$line_a[i], path_a)
    write_block(dups$file_b[i], dups$line_b[i], path_b)
    res[[i]] <- diffr::diffr(path_a, path_b)
  }
  res
}

For example,

library(dupree)

example_file <- system.file("extdata", "duplicated.R", package = "dupree")
dup <- dupree(example_file, min_block_size = 10)
dup
dif <- dup_diff(dup)

adrientaudiere, Feb 13 '24 05:02

Neat. Thanks

russHyde, Feb 14 '24 14:02