waldo
Compare individual strings
This would make it easier to spot, for example, a non-breaking space vs. a regular space.
rvest is a common source of such confusion, e.g. https://github.com/tidyverse/rvest/issues/284
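To illustrate why this case is hard to see, here is a minimal base R sketch. `show_code_points()` is a hypothetical helper written for this example, not part of waldo or rvest; it lists each character next to its Unicode code point.

```r
# Hypothetical helper (not part of waldo): list each character of a string
# alongside its Unicode code point.
show_code_points <- function(x) {
  code_points <- utf8ToInt(x)
  data.frame(
    char = intToUtf8(code_points, multiple = TRUE),
    code_point = sprintf("U+%04X", code_points),
    stringsAsFactors = FALSE
  )
}

regular <- "a b"
nbsp <- "a\u00a0b"

# The two strings look the same when printed, but they are not equal
regular == nbsp

# The second character is U+0020 in one string and U+00A0 in the other
show_code_points(regular)
show_code_points(nbsp)
```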
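Need to think about strings that render the same but are built from different code points, e.g. x <- c("\u00e1", "a\u0301")
And characters from different scripts that look alike: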
cyrillic_a <- "А"
latin_a <- "A"
cyrillic_a == latin_a
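Both cases can be checked in base R by looking at the underlying code points. This is only a sketch of the problem, not a proposal for how waldo should display it. The Cyrillic letter is written as an escape here so that the difference is visible in the source.

```r
# Precomposed "a with acute" vs. "a" followed by a combining acute accent
x <- c("\u00e1", "a\u0301")
x[1] == x[2]          # FALSE: no Unicode normalization is applied
nchar(x)              # one code point vs. two
lapply(x, utf8ToInt)  # 225 vs. 97 and 769

# Cyrillic capital A (U+0410) vs. Latin capital A (U+0041)
cyrillic_a <- "\u0410"
latin_a <- "A"
cyrillic_a == latin_a  # FALSE
utf8ToInt(cyrillic_a)  # 1040
utf8ToInt(latin_a)     # 65
```

The first pair would compare equal after Unicode normalization (for example with stringi::stri_trans_nfc()), whereas the second pair differs under any normalization form, so the two cases may need different treatment.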
Something like this:
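compare_string <- function(x, y) {
  # Split each string into individual characters
  x_char <- strsplit(x, "")[[1]]
  y_char <- strsplit(y, "")[[1]]

  # Shortest edit script between the two character vectors
  ses1 <- ses(x_char, y_char)
  if (nrow(ses1) == 0 || nrow(ses1) > length(x_char)) {
    return(list())
  }

  # If diff equivalent to every letter changing, don't highlight differences
  ses2 <- ses_elementwise(x_char, y_char)
  if (nrow(ses2) <= nrow(ses1)) {
    return(c(x, y))
  }

  # Treat the whole string as a single chunk. The bounds and the alignment
  # use the character vectors, since x and y themselves have length 1.
  diff <- ses1
  diff$x_start <- 1
  diff$x_end <- length(x_char)
  diff$y_start <- 1
  diff$y_end <- length(y_char)
  diff <- diff_complete(diff)

  # Pad insertions and deletions with spaces so both strings line up
  aligned <- diff_align(diff, x_char, y_char)
  aligned$x[is.na(aligned$x)] <- " "
  aligned$y[is.na(aligned$y)] <- " "
  c(paste(aligned$x, collapse = ""), paste(aligned$y, collapse = ""))
}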
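For readers without waldo's internals to hand, the alignment step can be approximated in base R. The sketch below swaps in utils::adist() for ses() and diff_align(), so it shows the padding idea only and is not the proposed implementation. `align_chars()` is a hypothetical name.

```r
# Hypothetical stand-in for the alignment step, using utils::adist()
# instead of waldo's ses()/diff_align().
align_chars <- function(x, y) {
  d <- utils::adist(x, y, counts = TRUE)
  # "trafos" encodes the edit script: M = match, S = substitution,
  # I = insertion (character only in y), D = deletion (character only in x)
  ops <- strsplit(attr(d, "trafos")[1, 1], "")[[1]]

  x_char <- strsplit(x, "")[[1]]
  y_char <- strsplit(y, "")[[1]]
  x_out <- character(length(ops))
  y_out <- character(length(ops))

  i <- 0
  j <- 0
  for (k in seq_along(ops)) {
    op <- ops[[k]]
    if (op == "D") {
      i <- i + 1
      x_out[k] <- x_char[i]
      y_out[k] <- " "
    } else if (op == "I") {
      j <- j + 1
      x_out[k] <- " "
      y_out[k] <- y_char[j]
    } else {
      # Match or substitution: consume one character from each side
      i <- i + 1
      j <- j + 1
      x_out[k] <- x_char[i]
      y_out[k] <- y_char[j]
    }
  }

  c(paste(x_out, collapse = ""), paste(y_out, collapse = ""))
}

out <- align_chars("kitten", "sitting")
# Both elements of out have the same number of characters
cat(out, sep = "\n")
```

Padding with a plain space is ambiguous when the inputs themselves contain spaces, which is one reason per-character styling may be a better fit than padding alone.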
- Need to refactor diff_align() to make it easier to apply different formatting: don't style unchanged characters, and highlight individual changes with background colours.
- Need an option to apply this in format_diff_matrix()? It can only be done after the diff data is aligned.
https://fosstodon.org/@gaborcsardi/110961573755872008