naniar
geom_miss_point appears to be jittering unrandomly
When using geom_miss_point(), if there are missing values for both variables, then these are displayed along a diagonal line. This isn't obvious in the vignette examples, but it becomes clear when the number of missing values is high. See the bottom left of the plot in the reprex below.
Is it possible for these to instead be unequal / uncorrelated?
I think this is something to do with how the random uniform values are selected for the jitter: they seem to come from a pre-set list of random numbers rather than being truly random each time. To see this, try running just ggplot(df) + naniar::geom_miss_point(aes(x, y)) over and over again in R, and you'll notice the jittered points don't change. (When you do something similar with geom_jitter, the plot differs slightly each time.) So I think it's selecting from the same set of "random" uniform values. Calling set.seed() just before the ggplot call makes no difference. It's not obvious to me from the source code why this might be happening.
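One mechanism that would produce exactly these symptoms is a jitter helper that resets the RNG to a fixed seed before every draw. The sketch below is base R only and is not taken from naniar's source; `jitter_fixed_seed()` and its `seed` argument are hypothetical names used to illustrate the guess.

```r
# Hypothetical helper, NOT naniar's code: it resets the RNG to a fixed
# seed before drawing, so every call returns the same "random" offsets.
jitter_fixed_seed <- function(n, seed = 1) {
  set.seed(seed)
  runif(n) - 0.5
}

n <- 200
x_offset <- jitter_fixed_seed(n)
y_offset <- jitter_fixed_seed(n)

# Both variables receive identical offsets, so points missing in both
# x and y would fall on a diagonal line.
identical(x_offset, y_offset)
all.equal(cor(x_offset, y_offset), 1)

# A user-level set.seed() has no effect, because the helper overrides it.
set.seed(42)
x_again <- jitter_fixed_seed(n)
identical(x_offset, x_again)
```

If something like this is happening internally, it would explain both the diagonal and why the plot never changes between runs.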
Thanks very much for a brilliant package :+1:
Reprex:
library('tidyverse')
library('naniar')

df <- tibble(
  x = rnorm(1000, 2, 1),
  y = rnorm(1000, 5, 1)
) %>%
  mutate(
    x = ifelse(runif(1000) > 0.2, x, NA),
    y = ifelse(runif(1000) > 0.2, y, NA)
  )

ggplot(df) + naniar::geom_miss_point(aes(x, y))
Great point!
I'll see what I can do to rectify this - it would be good if these points were randomly jittered in a square shape.
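For illustration, a square-shaped jitter only needs the x and y offsets to be drawn independently. This is a minimal base R sketch and not a proposed patch to naniar; `jitter_independent()` and its `width` argument are hypothetical names.

```r
# Hypothetical helper, NOT part of naniar: draws separate uniform offsets
# for x and y without touching the seed, so the two are uncorrelated.
jitter_independent <- function(n, width = 1) {
  data.frame(
    x_offset = (runif(n) - 0.5) * width,
    y_offset = (runif(n) - 0.5) * width
  )
}

offsets <- jitter_independent(1000)

# The offsets fill a square of side `width` centred on zero,
# and their correlation should be close to zero.
range(offsets$x_offset)
range(offsets$y_offset)
cor(offsets$x_offset, offsets$y_offset)
```

Because the helper never calls set.seed(), a user can still call set.seed() beforehand to get a reproducible plot, and repeated calls without it give different offsets each time, as with geom_jitter.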