ggbeeswarm
geom_quasirandom connect grouped points with geom_line
Hello,
I would like to connect related points that are separated within groups using geom_quasirandom, but am unsure how to do this. With geom_jitter this is done with the position_dodge argument (see here: https://stackoverflow.com/questions/39533456/how-to-jitter-both-geom-line-and-geom-point-by-the-same-magnitude); however, I am unable to figure this out for this package.
Below is a reproducible example. I feel like this is close; however, the points and their lines are not actually connected as they should be.
Any help or advice would be most appreciated. Thank you!!
library(ggplot2)
library(ggbeeswarm)
library(dplyr)
iris %>%
  dplyr::mutate(flower = rep(1:nrow(iris), each = 3, len = nrow(iris))) %>% # make a variable to connect the dots
  dplyr::mutate(timepoint = rep(c(1, 2, 3), each = 1, len = nrow(iris))) %>%
  ggplot2::ggplot(aes(x = interaction(Species, timepoint), y = Petal.Width)) +
  ggplot2::geom_boxplot(outlier.shape = NA) +
  ggplot2::geom_line(aes(group = flower), color = "grey") +
  ggbeeswarm::geom_quasirandom(
    groupOnX = TRUE,
    size = 4,
    pch = 1,
    aes(fill = Species)
  ) +
  theme_classic() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
Hi @wipperman, thanks for suggesting this. I'm filing it as an enhancement that I'll try to incorporate in the next (very overdue) round of bug fixes.
Adding position=ggbeeswarm::position_quasirandom() to geom_line works well for me; I don't think any changes apart from adding an example are needed. An example:
library(ggplot2)
mpg$after2000 = mpg$year > 2000
data = aggregate(
  hwy ~ manufacturer + model + after2000 + drv,
  mpg,
  mean
)
(
  ggplot(data, aes(x=after2000, y=hwy))
  + geom_violin()
  + geom_line(
    aes(group=interaction(manufacturer, model)),
    color='grey',
    position=ggbeeswarm::position_quasirandom()
  )
  + ggbeeswarm::geom_quasirandom(aes(color=drv))
  + scale_color_discrete(
    labels=c(
      'f'='front-wheel drive',
      'r'='rear-wheel drive',
      '4'='four-wheel drive'
    ),
    name='the type of drive train'
  )
  + ylab('highway miles per gallon')
  + theme_bw()
)
Hi, thank you for the tip! It does not seem to work, though, when adding a non-null dodge.width argument to geom_quasirandom(). For instance, pseudocode like:
geom_quasirandom(
  dodge.width = 0.5,
  size = 2,
  alpha = 0.25,
  show.legend = FALSE
) +
geom_line(
  data = . %>% filter(mode != "Control"),
  aes(group = interaction(participant, syllable)),
  size = 0.5,
  alpha = 0.25,
  show.legend = FALSE,
  position = ggbeeswarm::position_quasirandom(dodge.width = 0.5)
) +
Results in a plot like:
Any idea on how to "connect the dots"?
Thanks!
Ladislas
Hi @lnalborczyk, I looked into this and I believe that the issue is that defining a group aesthetic changes how the density calculation (and therefore the position) is performed. So in your example, the calculated 'group' being used to distribute the points is the combined x+color aesthetics, whereas the calculated 'group' for the line is (participant+syllable+x+color). Consequently, the position calculation for the lines is altered.
The reason the previous example works and yours doesn't is that specifying dodge.width triggers geom_quasirandom to adjust position by the calculated 'group' (not necessarily the group aesthetic). If dodge.width is NULL, it instead just groups by the x aesthetic. This is essentially equivalent to adding position_dodge(width=...).
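To make the point about calculated groups concrete, here is a minimal sketch that uses only ggplot2 (no ggbeeswarm) and inspects the `group` column each layer ends up with. The toy data frame and its column names (`id`, `time`, `cond`) are hypothetical and not from this thread; `layer_data()` is ggplot2's helper for looking at a layer's built data.

```r
library(ggplot2)

# Hypothetical toy data: 6 subjects, each measured at two time points,
# split across two conditions.
df <- data.frame(
  id   = rep(1:6, each = 2),
  time = factor(rep(c("t1", "t2"), times = 6)),
  cond = rep(c("a", "b"), each = 6),
  y    = c(1.0, 1.4, 1.2, 1.1, 0.9, 1.6, 2.0, 2.3, 2.1, 1.8, 2.4, 2.2)
)

p <- ggplot(df, aes(x = time, y = y, color = cond)) +
  geom_point() +               # no group aesthetic: group is derived from x and color
  geom_line(aes(group = id))   # explicit group aesthetic: one group per subject

points_data <- layer_data(p, 1)
lines_data  <- layer_data(p, 2)

# The two layers are partitioned differently, so any position adjustment
# that works per group sees different subsets of the data in each layer.
n_point_groups <- length(unique(points_data$group))
n_line_groups  <- length(unique(lines_data$group))
print(c(points = n_point_groups, lines = n_line_groups))
```

If a position adjustment computes offsets within each group, the point layer and the line layer are handed different subsets here, which would be consistent with the lines drifting away from the points.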
On that note, I looked to see if I could get one of the base ggplot2 position functions like jitter/dodge/jitterdodge to do what you're looking for and I couldn't find a workable solution (though I might have overlooked it). The suggestion was to consider a faceting approach, which I think would work but certainly wouldn't look as clean as what you're envisioning.
Hello! I recently discovered this package, and I'm very appreciative of its functionality. I'm trying to connect points with a geom_line, and I'm having trouble getting the line to follow the quasirandom position of the points. For example, when I try the example provided above, the lines are all positioned in the center.
I believe that I'm running version 0.7.2.
library(ggplot2)
mpg$after2000 = mpg$year > 2000
data = aggregate(
  hwy ~ manufacturer + model + after2000 + drv,
  mpg,
  mean
)
(
  ggplot(data, aes(x=after2000, y=hwy))
  + geom_violin()
  + geom_line(
    aes(group=interaction(manufacturer, model)),
    color='grey',
    position=ggbeeswarm::position_quasirandom()
  )
  + ggbeeswarm::geom_quasirandom(aes(color=drv))
  + scale_color_discrete(
    labels=c(
      'f'='front-wheel drive',
      'r'='rear-wheel drive',
      '4'='four-wheel drive'
    ),
    name='the type of drive train'
  )
  + ylab('highway miles per gallon')
  + theme_bw()
)
@gibson-amandag I can confirm that this is a regression. Your code works well in 0.6.0 but fails in 0.7.1 and 0.7.2.
[Screenshots: output in 0.6.0 | output in 0.7.2]
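A workaround would be using geom_segment, but it only works partially. It is kind of giving up on the second half: the start of each segment follows the points, while the end (xend, yend) does not appear to be offset: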
library(ggplot2)
mpg$after2000 = mpg$year > 2000
data = aggregate(
  hwy ~ manufacturer + model + after2000 + drv,
  mpg,
  mean
)
(
  ggplot(data, aes(x=after2000, y=hwy))
  + geom_violin()
  + geom_segment(
    data=unstack(data, hwy ~ after2000),
    aes(x=FALSE, xend=TRUE, y=FALSE., yend=TRUE.),
    color='grey',
    position=ggbeeswarm::position_quasirandom()
  )
  + ggbeeswarm::geom_quasirandom(aes(color=drv))
  + scale_color_discrete(
    labels=c(
      'f'='front-wheel drive',
      'r'='rear-wheel drive',
      '4'='four-wheel drive'
    ),
    name='the type of drive train'
  )
  + ylab('highway miles per gallon')
  + theme_bw()
)
Here is a monkeypatch which fixes the geom_segment approach in version 0.7.2:
# Replace the internal offset_quasirandom() inside the ggbeeswarm namespace
ggbeeswarm <- getNamespace("ggbeeswarm")
unlockBinding("offset_quasirandom", ggbeeswarm)
ggbeeswarm$offset_quasirandom <- function(
  data,
  width = 0.4,
  vary.width = FALSE,
  max.length = NULL,
  ...
) {
  # Offset the start points (x, y), computed per x position
  x.offset <- vipor::aveWithArgs(
    data$y, data$x,
    FUN = vipor::offsetSingleGroup,
    maxLength = if (vary.width) {max.length} else {NULL},
    ...
  )
  x.offset <- x.offset * width
  data$x <- data$x + x.offset
  # Offset the segment end points (xend, yend) the same way, when present
  if ('xend' %in% colnames(data) && 'yend' %in% colnames(data)) {
    x.offset <- vipor::aveWithArgs(
      data$yend, data$xend,
      FUN = vipor::offsetSingleGroup,
      maxLength = if (vary.width) {max.length} else {NULL},
      ...
    )
    x.offset <- x.offset * width
    data$xend <- data$xend + x.offset
  }
  data
}
lockBinding("offset_quasirandom", ggbeeswarm)
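A different route, not proposed in this thread, is to avoid position adjustments altogether: compute the quasirandom x offsets once with `vipor::offsetX()` and give the same precomputed coordinates to both the point and line layers. The sketch below reuses the mpg example. The helper columns `x_base` and `x_pos` are hypothetical names, and it assumes the default `offsetX()` settings are close enough to what geom_quasirandom draws, which may not match exactly.

```r
library(ggplot2)

mpg$after2000 <- mpg$year > 2000
data <- aggregate(
  hwy ~ manufacturer + model + after2000 + drv,
  mpg,
  mean
)

# Numeric base position for each category: FALSE -> 0, TRUE -> 1
data$x_base <- as.numeric(data$after2000)

# Compute the quasirandom offsets once, per x category, and store the
# final x coordinate so every layer uses identical positions.
data$x_pos <- data$x_base + vipor::offsetX(data$hwy, data$x_base, width = 0.4)

p <- ggplot(data, aes(y = hwy)) +
  geom_violin(aes(x = x_base, group = x_base)) +
  geom_line(
    aes(x = x_pos, group = interaction(manufacturer, model)),
    color = "grey"
  ) +
  geom_point(aes(x = x_pos, color = drv)) +
  scale_x_continuous(
    breaks = c(0, 1),
    labels = c("FALSE", "TRUE"),
    name = "after2000"
  ) +
  theme_bw()

print(p)
```

Because the offsets live in the data rather than in a position object, the lines and points cannot drift apart, at the cost of managing the x axis labels by hand.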
Hi @gibson-amandag, thanks for the bug report and reprex, and apologies for the delay in responding. @krassowski, thanks for figuring out a monkeypatch. I'll look into incorporating it into the package.
PR is ready :) But if you find it easier to rework it feel free to close - no hard feelings.
Hi,
Thank you for this great function, that I have been using.
I have the same issue of lines being disconnected from points (I am using geom_line since there are various points along my lines). I would love to see it working in a future version (although I understand that this may not be your priority). Thanks.