fuzzyjoin
fuzzy_join example for the help page or a vignette: match on key == key and on date between startDate and endDate
More examples: I've used this package in other powerful ways, but on proprietary data. I'm interested in ideas for use cases that can be provided as vignettes.
library(fuzzyjoin)

fuzzy_left_join(
  A, B,
  by = c(
    "key" = "key",
    "date" = "startDate",
    "date" = "endDate"
  ),
  # one comparison per pair in `by`, passed as functions (backticked operators)
  match_fun = list(`==`, `>=`, `<=`)
)
FYI: on larger datasets (e.g. 6000 rows on each side), it is substantially faster (0.11s vs 1662.77s for the fuzzy join) to do this in two steps: an equality join on key, followed by a filter on the date range.
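library(dplyr)

system.time({
  # step 1: plain equality join on key
  a <- inner_join(A, B, by = c("key" = "key"))
  # step 2: keep only rows whose date falls inside the interval
  a <- filter(a, date >= startDate & date <= endDate)
})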
I ended up doing it this way because I kept running out of memory...
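For anyone who wants to try the two-step approach on something small, here is a minimal sketch on made-up data. The contents of A and B are hypothetical and only reuse the column names from the thread. One thing to keep in mind: inner_join followed by filter drops rows of A that have no matching interval, whereas a left join would keep them, so the last step joins the matches back onto A to get left-join-like output. This is one possible way to do that, not necessarily what fuzzyjoin does internally.

```r
library(dplyr)

# Hypothetical toy data; only the column names come from the thread
A <- tibble(
  key  = c("a", "b", "c"),
  date = as.Date(c("2020-01-15", "2020-02-10", "2020-01-01"))
)
B <- tibble(
  key       = c("a", "b", "b"),
  startDate = as.Date(c("2020-01-01", "2020-01-01", "2020-02-01")),
  endDate   = as.Date(c("2020-01-31", "2020-01-31", "2020-02-28"))
)

# Step 1: equality join on key (every A row paired with every B interval for that key)
candidates <- inner_join(A, B, by = "key")

# Step 2: keep only pairs where date falls inside [startDate, endDate]
matched <- filter(candidates, date >= startDate & date <= endDate)

# Optional: recover left-join behaviour by joining the matches back onto A,
# so that A rows without any matching interval are kept with NA dates
with_unmatched <- left_join(A, matched, by = c("key", "date"))

print(matched)
print(with_unmatched)
```

The intermediate `candidates` table can still get large when a key has many rows on both sides, so this helps most when the equality on key already narrows things down a lot.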