
Compare individual strings

Open hadley opened this issue 4 years ago • 4 comments

This would make it easier to spot, for example, a non-breaking space vs. a regular space.
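To illustrate the problem, here is a minimal base R sketch. The strings are made up for illustration, and it assumes both strings have the same number of characters. The two values print identically but are not equal, and the only way to see why is to look at the code points.

```r
x <- "10\u00a0kg"  # contains a non-breaking space (U+00A0)
y <- "10 kg"       # contains an ordinary space (U+0020)

# The strings look the same when printed, but they are not equal
x == y

# Split into characters and find the positions where they disagree
x_char <- strsplit(x, "")[[1]]
y_char <- strsplit(y, "")[[1]]
pos <- which(x_char != y_char)

# Report the code point on each side at the differing positions
data.frame(
  pos = pos,
  x = sprintf("U+%04X", vapply(x_char[pos], utf8ToInt, integer(1), USE.NAMES = FALSE)),
  y = sprintf("U+%04X", vapply(y_char[pos], utf8ToInt, integer(1), USE.NAMES = FALSE))
)
```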

hadley avatar Nov 29 '20 22:11 hadley

rvest is a common source of such confusion, e.g. https://github.com/tidyverse/rvest/issues/284

hadley avatar Dec 14 '20 14:12 hadley

Need to think about x <- c("\u00e1", "a\u0301")

And

cyrillic_a <- "А"
latin_a <- "A"
cyrillic_a == latin_a
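Both cases can be made visible by listing code points. The sketch below uses only base R; `code_points()` is a hypothetical helper, not part of waldo, and the Cyrillic letter is written as an escape so the difference is explicit in the source.

```r
# Hypothetical helper: list the Unicode code points that make up a string
code_points <- function(x) sprintf("U+%04X", utf8ToInt(x))

precomposed <- "\u00e1"   # a with acute accent as a single code point
decomposed  <- "a\u0301"  # "a" followed by a combining acute accent

# R compares strings without Unicode normalisation, so these differ
precomposed == decomposed
code_points(precomposed)
code_points(decomposed)

cyrillic_a <- "\u0410"  # CYRILLIC CAPITAL LETTER A
latin_a    <- "A"       # LATIN CAPITAL LETTER A

# Visually identical in most fonts, but different code points
cyrillic_a == latin_a
code_points(cyrillic_a)
code_points(latin_a)
```

The first pair would become equal after Unicode normalisation; the second pair would not, so the two cases probably need different treatment.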

hadley avatar Apr 28 '21 12:04 hadley

Something like this:

compare_string <- function(x, y) {
  x_char <- strsplit(x, "")[[1]]
  y_char <- strsplit(y, "")[[1]]

  ses1 <- ses(x_char, y_char)
  if (nrow(ses1) == 0 || nrow(ses1) > length(x_char)) {
    return(list())
  }

  # If diff equivalent to every letter changing, don't highlight differences
  ses2 <- ses_elementwise(x_char, y_char)
  if (nrow(ses2) <= nrow(ses1)) {
    return(c(x, y))
  }

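  diff <- ses1
  # Treat the whole string as one chunk. Bounds are in characters, so use
  # the split vectors: length(x) is always 1 for a single string.
  diff$x_start <- 1
  diff$x_end <- length(x_char)
  diff$y_start <- 1
  diff$y_end <- length(y_char)
  diff <- diff_complete(diff)

  aligned <- diff_align(diff, x_char, y_char)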
  aligned$x[is.na(aligned$x)] <- " "
  aligned$y[is.na(aligned$y)] <- " "
  c(paste(aligned$x, collapse = ""), paste(aligned$y, collapse = ""))
}
  • Need to refactor diff_align() to make it easier to apply different formatting: don't style unchanged characters, and highlight individual changes with background colours
  • Need an option to apply this in format_diff_matrix()? It can only be done after the diff data is aligned.
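
The function above depends on waldo internals (ses(), diff_complete(), diff_align()). For experimenting with the alignment step in isolation, a rough stand-in can be built on the edit script that base R's utils::adist() reports. This swaps in Levenshtein alignment rather than waldo's own diff, and `align_chars()` is a hypothetical name, so treat it as a sketch of the idea only.

```r
# Hypothetical helper, not part of waldo: align two strings character by
# character using the edit script reported by utils::adist().
align_chars <- function(x, y, pad = " ") {
  x_char <- strsplit(x, "")[[1]]
  y_char <- strsplit(y, "")[[1]]

  # "trafos" is a string of M (match), S (substitution), I (insertion)
  # and D (deletion) describing how to turn x into y
  d <- utils::adist(x, y, counts = TRUE)
  ops <- strsplit(attr(d, "trafos")[1, 1], "")[[1]]

  # M, S and D consume a character of x; M, S and I consume a character of y
  in_x <- ops != "I"
  in_y <- ops != "D"

  # Start with padding everywhere, then fill in the real characters
  x_out <- rep(pad, length(ops))
  y_out <- rep(pad, length(ops))
  x_out[in_x] <- x_char
  y_out[in_y] <- y_char

  c(x = paste(x_out, collapse = ""), y = paste(y_out, collapse = ""))
}

align_chars("abc", "abcd")
```

The two returned strings always have the same number of characters, so they can be printed one above the other. Positions where `ops` is not "M" are the ones that would get a background colour.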

hadley avatar Jul 18 '21 19:07 hadley

https://fosstodon.org/@gaborcsardi/110961573755872008

hadley avatar Aug 28 '23 05:08 hadley