fuzzyjoin
Suggestion: help in constructing custom match_fun functions.
It would be nice to have a simple way to get the match_fun that stringdist_join() uses internally, so that a custom match_fun could be built on top of it. For example, the asker of this SO post https://stackoverflow.com/q/60336083/2554330 wants fuzzy matching, but only when the first letter is an exact match. It would be nice to be able to write a solution like this:
fuzzy_match <- stringdist_match(max_dist = 2, ...)  # with args from stringdist_join that are
                                                    # used in its match_fun

first_letter_match <- function(col1, col2)
  sub("(^.).*", "\\1", col1) == sub("(^.).*", "\\1", col2)

custom_match <- function(col1, col2)
  first_letter_match(col1, col2) & fuzzy_match(col1, col2)

fuzzy_inner_join(df1, df2, by = "name", match_fun = custom_match)
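For reference, something close to this can probably be put together today by wrapping stringdist::stringdist() in a closure. This is only a rough sketch: make_stringdist_match() is a hypothetical name, not an existing fuzzyjoin function, and it does not try to reproduce everything stringdist_join() does internally (for example its ignore_case handling). The toy data frames df1 and df2 are made up for illustration.

```r
library(stringdist)  # provides stringdist()
library(fuzzyjoin)   # provides fuzzy_inner_join()

# Hypothetical helper (NOT part of fuzzyjoin): returns a vectorised
# predicate that is TRUE where the string distance is at most max_dist.
# Extra arguments in ... are passed on to stringdist::stringdist().
make_stringdist_match <- function(max_dist = 2, method = "osa", ...) {
  function(col1, col2) {
    d <- stringdist::stringdist(col1, col2, method = method, ...)
    !is.na(d) & d <= max_dist
  }
}

fuzzy_match <- make_stringdist_match(max_dist = 2)

# TRUE where both strings start with the same character
first_letter_match <- function(col1, col2) {
  substr(col1, 1, 1) == substr(col2, 1, 1)
}

# Both conditions must hold for a pair to count as a match
custom_match <- function(col1, col2) {
  first_letter_match(col1, col2) & fuzzy_match(col1, col2)
}

# Toy data, made up for this example
df1 <- data.frame(name = c("apple", "banana"), stringsAsFactors = FALSE)
df2 <- data.frame(name = c("appel", "bpple", "banane"), stringsAsFactors = FALSE)

result <- fuzzy_inner_join(df1, df2, by = "name", match_fun = custom_match)
```

With these inputs, "bpple" is within the distance threshold of "apple" but is rejected by the first-letter check. A built-in way to obtain the match function would still be preferable, since it would stay in sync with whatever stringdist_join() actually does.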