fuzzy_join example for help or vignette: match key == key and date between startDate and endDate

Open IndigoJay opened this issue 5 years ago • 2 comments

More examples: I've used this package in other powerful ways, but on proprietary data. I'm interested in ideas for use cases that can be provided as vignettes.

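library(fuzzyjoin)

fuzzy_left_join(
  A, B,
  by = c(
    "key" = "key",         # A$key == B$key
    "date" = "startDate",  # A$date >= B$startDate
    "date" = "endDate"     # A$date <= B$endDate
  ),
  # Operators are passed as functions (backtick-quoted), one per `by` pair
  match_fun = list(`==`, `>=`, `<=`)
)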
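
For a vignette, a self-contained toy version of this call might look like the sketch below. The contents of `A` and `B` are made up for illustration; only the column names `key`, `date`, `startDate`, and `endDate` come from the call above.

```r
library(fuzzyjoin)

# Hypothetical toy data: dated records to be matched against validity intervals.
A <- data.frame(
  key  = c("a", "a", "b"),
  date = as.Date(c("2019-01-15", "2019-03-01", "2019-02-10"))
)
B <- data.frame(
  key       = c("a", "b"),
  startDate = as.Date(c("2019-01-01", "2019-02-01")),
  endDate   = as.Date(c("2019-01-31", "2019-02-28"))
)

# Match on equal key and startDate <= date <= endDate.
matched <- fuzzy_left_join(
  A, B,
  by = c("key" = "key", "date" = "startDate", "date" = "endDate"),
  match_fun = list(`==`, `>=`, `<=`)
)

# A left join keeps every row of A; rows with no matching interval
# get NA in the columns that come from B.
nrow(matched)
```

Because this is a left join, the second row of `A` (a date outside every interval for its key) is still returned, with `NA` for `startDate` and `endDate`.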

IndigoJay, Oct 02 '19 19:10

FYI: on larger datasets (e.g. 6,000 rows on each side), it is substantially faster (0.11 s vs 1662.77 s) to do this in two steps: an equality join on key, followed by a filter on the date range.

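library(dplyr)

system.time({
  # Step 1: plain equality join on key
  a <- inner_join(A, B, by = c("key" = "key"))
  # Step 2: keep only rows whose date falls within [startDate, endDate]
  a <- filter(a, date >= startDate & date <= endDate)
})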
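
A minimal sketch of the two-step approach on toy data follows. The data frames here are hypothetical and only mirror the column names used in this thread. Note one difference from `fuzzy_left_join`: an inner join followed by a filter drops rows of `A` that have no matching interval instead of keeping them with `NA`s.

```r
library(dplyr)

# Hypothetical toy data with the column names used in this thread.
A <- data.frame(
  key  = c("a", "a", "b"),
  date = as.Date(c("2019-01-15", "2019-03-01", "2019-02-10"))
)
B <- data.frame(
  key       = c("a", "b"),
  startDate = as.Date(c("2019-01-01", "2019-02-01")),
  endDate   = as.Date(c("2019-01-31", "2019-02-28"))
)

# Step 1: equality join on key, so only rows sharing a key are paired.
# Step 2: keep the pairs whose date lies inside [startDate, endDate].
two_step <- inner_join(A, B, by = "key") %>%
  filter(date >= startDate, date <= endDate)

nrow(two_step)
```

The equality join only pairs rows that share a key, so the date comparison runs on far fewer candidate pairs than a comparison across all row combinations would.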

IndigoJay, Oct 03 '19 16:10

I ended up doing it that way too, because I kept running out of memory...

espinielli, Apr 14 '21 15:04