
Suggestion: help in constructing custom match_fun's.

Open dmurdoch opened this issue 5 years ago • 0 comments

It would be nice to have a simple way to get the match_fun that stringdist_join() uses internally, so that a custom match_fun could be built on top of it. For example, this Stack Overflow question https://stackoverflow.com/q/60336083/2554330 asks for fuzzy matching, but only when the first letter is an exact match. It would be nice to be able to write a solution like this:

# with args from stringdist_join() that are used in its match_fun
fuzzy_match <- stringdist_match(max_dist = 2, ...)

first_letter_match <- function(col1, col2)
  sub("(^.).*", "\\1", col1) == sub("(^.).*", "\\1", col2)

custom_match <- function(col1, col2)
  first_letter_match(col1, col2) & fuzzy_match(col1, col2)

fuzzy_inner_join(df1, df2, by = "name", match_fun = custom_match)
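
Until something like stringdist_match() exists, a rough approximation can be built by calling stringdist::stringdist() directly. The sketch below is only an illustration: make_stringdist_match() is a hypothetical helper name (not part of fuzzyjoin), the df1 and df2 data are made up, and it does not reproduce stringdist_join() options such as ignore_case or distance_col. It also uses substr() for the first-letter check instead of the sub() call above.

```r
library(fuzzyjoin)

# Hypothetical helper, NOT part of fuzzyjoin: builds a vectorized predicate
# that is TRUE where the string distance is at most max_dist.
# Extra arguments are passed on to stringdist::stringdist().
make_stringdist_match <- function(max_dist = 2, method = "osa", ...) {
  force(max_dist)
  force(method)
  function(col1, col2) {
    stringdist::stringdist(col1, col2, method = method, ...) <= max_dist
  }
}

fuzzy_match <- make_stringdist_match(max_dist = 2)

# TRUE where both strings start with the same character
first_letter_match <- function(col1, col2) {
  substr(col1, 1, 1) == substr(col2, 1, 1)
}

# Both conditions must hold for a pair of values to be joined
custom_match <- function(col1, col2) {
  first_letter_match(col1, col2) & fuzzy_match(col1, col2)
}

# Made-up example data
df1 <- data.frame(name = c("apple", "banana"), stringsAsFactors = FALSE)
df2 <- data.frame(name = c("appel", "bpple", "bananas"), stringsAsFactors = FALSE)

result <- fuzzy_inner_join(df1, df2, by = "name", match_fun = custom_match)
print(result)
```

Here "bpple" is within distance 2 of "apple" but is rejected because the first letters differ, which is the behaviour the Stack Overflow question asks for.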

dmurdoch, Feb 21 '20 13:02