dplyr icon indicating copy to clipboard operation
dplyr copied to clipboard

Add `merge` indicator after a join

Open nmarantz opened this issue 3 years ago • 3 comments

A prior iteration of this request was closed and locked as old, but the asserted rationale does not address the request. It would be very useful if the dplyr mutating join functions allowed for an option to create a variable indicating which observations were joined (or not) from each data frame. This is a feature available in Stata's merge command and in Pandas (indicator = True). It has been implemented in R through the statar package, but it would be great to have it included in the dplyr mutating join functions.

The purported rationale for closing the earlier iteration of this feature request was that tidylog has solved the problem. The tidylog package is great, but it doesn't add any variables to the merged data frame, which was the initial feature request.

nmarantz avatar Aug 30 '22 00:08 nmarantz

Relevant example from linked issue:

import pandas as pd

df1 = pd.DataFrame({'key1' : ['a','b','c'], 'v1' : [1,2,3]})
df2 = pd.DataFrame({'key1' : ['a','b','d'], 'v2' : [4,5,6]})

df1.merge(df2, how = 'left', indicator = True)
#>  key1  v1   v2     _merge
#> 0    a   1  4.0       both
#> 1    b   2  5.0       both
#> 2    c   3  NaN  left_only

If we implemented this, we'd presumably instead require the name of the column to create and store either x, y, or both.

hadley avatar Aug 30 '22 19:08 hadley

This doesn't seem too bad

library(dplyr)

df1 <- tibble(key1 = c("a", "b", "c"), v1 = 1:3)
df2 <- tibble(key1 = c("a", "b", "d"), v1 = 4:6)

df1 <- df1 %>%
  mutate(id1 = TRUE)

df2 <- df2 %>%
  mutate(id2 = TRUE)

left_join(df1, df2, by = "key1") %>%
  mutate(
    indicator = case_when(
      id1 & id2 ~ "both",
      id1 ~ "left",
      id2 ~ "right"
    ),
    .keep = "unused"
  )
#> # A tibble: 3 × 4
#>   key1   v1.x  v1.y indicator
#>   <chr> <int> <int> <chr>    
#> 1 a         1     4 both     
#> 2 b         2     5 both     
#> 3 c         3    NA left

DavisVaughan avatar Aug 30 '22 19:08 DavisVaughan