widyr
Any way to keep additional columns when using the pairwise_ functions?
I've benefited from widyr::pairwise_count for years, and it is really fast. Recently, however, I needed all the pairwise combinations within each group, and this time I want to keep the group ID in the result. My usual workaround is to mutate a new id column (usually named "id2"), group by this new column, and then call pairwise_count. But that is really slow!
Let me give an example:
> library(dplyr)
> library(widyr)
> dat <- tibble(group = rep(1:5, each = 2),
+               letter = c("a", "b",
+                          "a", "c",
+                          "a", "c",
+                          "b", "e",
+                          "b", "f"))
>
> # count the number of times two letters appear together
> pairwise_count(dat, letter, group)
# A tibble: 8 x 3
item1 item2 n
<chr> <chr> <dbl>
1 b a 1
2 c a 2
3 a b 1
4 e b 1
5 f b 1
6 a c 2
7 b e 1
8 b f 1
Is there any way to also get the group number, as in the output below?
library(dplyr)
library(widyr)
dat <- tibble(group = rep(1:5, each = 2),
              letter = c("a", "b",
                         "a", "c",
                         "a", "c",
                         "b", "e",
                         "b", "f"))
dat %>%
  mutate(group2 = group) %>%
  group_by(group) %>%
  pairwise_count(letter, group2) %>%
  ungroup()
# A tibble: 10 x 4
group item1 item2 n
<int> <chr> <chr> <dbl>
1 1 b a 1
2 1 a b 1
3 2 c a 1
4 2 a c 1
5 3 c a 1
6 3 a c 1
7 4 e b 1
8 4 b e 1
9 5 f b 1
10 5 b f 1
But this is rather slow when there are many groups. Is there any way to make it faster? Thanks.
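For comparison, one possible way to avoid calling pairwise_count once per group is a plain self-join on group. This is only a sketch and not a widyr feature: it assumes dplyr >= 1.1.0 (for the relationship argument), pairs_by_group is a made-up name, and n is simply set to 1 because each pair is counted once within its own group. The row order may differ from the output above, and I have not benchmarked it.

```r
library(dplyr)

dat <- tibble(group = rep(1:5, each = 2),
              letter = c("a", "b",
                         "a", "c",
                         "a", "c",
                         "b", "e",
                         "b", "f"))

# One row per (group, letter), so repeated letters in a group are not double counted
items <- distinct(dat, group, letter)

pairs_by_group <- items %>%
  # Self-join on group: every letter is paired with every letter of the same group
  inner_join(items, by = "group", suffix = c("1", "2"),
             relationship = "many-to-many") %>%
  # Drop self-pairs such as (a, a)
  filter(letter1 != letter2) %>%
  rename(item1 = letter1, item2 = letter2) %>%
  # Assumption: within a single group each pair occurs exactly once
  mutate(n = 1)

pairs_by_group
```

Both orderings of each pair are kept (a/b and b/a), which matches what pairwise_count returns by default.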