ggrepel
Example of algorithm not giving good results
Summary
I have an example where the labels overlap even under a large number of iterations. It should be easy to find positions that do not overlap. I'm not quite sure why this isn't working. I would also suggest adding a warning if the algorithm cannot find label positions without overlaps.
Minimal code example
Here is the minimum amount of code needed to demonstrate the issue:
library(data.table)
library(ggplot2)
library(ggrepel)
library(scales)  # provides comma

set.seed(1)
# Build a random uppercase label of length n
rnd_label <- function(n) {
  paste0(sample(LETTERS, n, replace = TRUE), collapse = "")
}
dt <- data.table(x = c(261156.9, 272704.9, 277045.2, 296642.4, 321354.4),
                 y = c(11.57584, 12.72191, 12.92108, 13.33607, 13.39414),
                 label_length = c(12, 23, 32, 35, 39))
# vapply() gives a character column; lapply() would create a list column
dt[, label := vapply(label_length, rnd_label, character(1))]
ylim <- c(10, 14.0)
xlim <- c(2.5e5, 4e5)
g <- ggplot(dt, aes(x, y, label = label)) +
  geom_line() +
  geom_point() +
  scale_y_continuous(breaks = scales::breaks_pretty(n = 9), limits = ylim, name = "y") +
  scale_x_continuous(labels = comma, breaks = scales::breaks_pretty(n = 9), limits = xlim, name = "x") +
  geom_label_repel(seed = 1, size = 3, max.iter = 200000, segment.alpha = 0.25,
                   min.segment.length = 0.25, box.padding = 1.2)
ggsave("test_repel.png", plot = g, width = 6.5, height = 4.017, units = "in")
Here is an image of the output produced by the code:
Version information
Here is the output from sessionInfo() in my R session:
> sessionInfo()
R version 3.6.3 (2020-02-29)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Catalina 10.15.5
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] dampack_0.2.0 devtools_2.3.0 usethis_1.6.1 stringr_1.4.0 ggrepel_0.8.2 scales_1.1.0 ggplot2_3.3.2 data.table_1.12.8 here_0.1
loaded via a namespace (and not attached):
[1] tidyselect_1.0.0 xfun_0.13 remotes_2.1.1 purrr_0.3.4 reshape2_1.4.4 splines_3.6.3 lattice_0.20-41 colorspace_1.4-1 vctrs_0.2.4 testthat_2.3.2 yaml_2.2.1 mgcv_1.8-31 rlang_0.4.5
[14] pkgbuild_1.0.7 pillar_1.4.3 glue_1.4.0 withr_2.2.0 sessioninfo_1.1.1 lifecycle_0.2.0 plyr_1.8.6 munsell_0.5.0 gtable_0.3.0 memoise_1.1.0 knitr_1.28 callr_3.4.3 ps_1.3.2
[27] fansi_0.4.1 triangle_0.12 Rcpp_1.0.4 backports_1.1.6 desc_1.2.0 pkgload_1.0.2 truncnorm_1.0-8 farver_2.0.3 fs_1.4.1 ellipse_0.4.1 packrat_0.5.0 digest_0.6.25 stringi_1.4.6
[40] processx_3.4.2 dplyr_0.8.5 grid_3.6.3 rprojroot_1.3-2 cli_2.0.2 tools_3.6.3 magrittr_1.5 tibble_3.0.1 crayon_1.3.4 pkgconfig_2.0.3 Matrix_1.2-18 ellipsis_0.3.0 prettyunits_1.1.1
[53] assertthat_0.2.1 rstudioapi_0.11 R6_2.4.1 nlme_3.1-147 compiler_3.6.3
I am (sometimes) getting similar results. What works for me is adjusting the force parameters (including force_pull in the development version) and the padding parameters (point.padding, box.padding). Would it make sense to have some auto-adjustment based on the number of overlaps as a stop criterion? This could have an interface like geom_label(force="auto")?
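To make the "number of overlaps" idea concrete, here is a minimal sketch of how overlapping label boxes could be counted, either as a stop criterion or to trigger a warning. It is only an illustration: count_overlaps(), check_overlaps(), and the xmin/xmax/ymin/ymax columns are hypothetical names, not part of ggrepel, and the box coordinates are made up.

```r
# Hypothetical helper (not a ggrepel function): count pairs of overlapping
# label boxes. `boxes` has columns xmin, xmax, ymin, ymax, one row per label.
count_overlaps <- function(boxes) {
  n <- nrow(boxes)
  if (n < 2) return(0L)
  pairs <- utils::combn(n, 2)           # every unordered pair of labels
  a <- boxes[pairs[1, ], , drop = FALSE]
  b <- boxes[pairs[2, ], , drop = FALSE]
  # Two axis-aligned boxes overlap when their extents intersect on both axes.
  overlaps <- a$xmin < b$xmax & b$xmin < a$xmax &
    a$ymin < b$ymax & b$ymin < a$ymax
  sum(overlaps)
}

# Hypothetical wrapper: warn when any overlaps remain after placement.
check_overlaps <- function(boxes) {
  n_overlaps <- count_overlaps(boxes)
  if (n_overlaps > 0) {
    warning(sprintf("%d pair(s) of labels still overlap", n_overlaps))
  }
  invisible(n_overlaps)
}

# Made-up boxes: the first two overlap, the third is far away.
boxes <- data.frame(
  xmin = c(0, 1, 5),
  xmax = c(2, 3, 6),
  ymin = c(0, 0, 0),
  ymax = c(1, 1, 1)
)

n_overlaps <- count_overlaps(boxes)
```

A force="auto" mode could, in principle, keep increasing the repulsion until this count reaches zero or an iteration limit is hit, and call something like check_overlaps() at the end.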
@MFairley Thanks for pointing this out! I think the algorithm can certainly be improved to work for this particular case. Since the two labels at the top are at exactly the same y-axis position, they don't exert a force along the y-axis on each other.
One simple way to work around this would be to use "jittered" text label coordinates instead of exact coordinates when calculating forces inside ggrepel. That would eliminate the possibility for labels to be at exactly the same y-axis position.
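A toy sketch of why jitter would help, assuming a simple inverse-distance repulsion between label centers. The repel_force() function, the coordinates, and the jitter size are all assumptions made for illustration; this is not ggrepel's actual force code.

```r
# Hypothetical repulsion between two label centers (NOT ggrepel's force code).
repel_force <- function(a, b, force = 1) {
  d <- a - b                      # vector pointing from b to a
  dist2 <- max(sum(d^2), 1e-8)    # squared distance, guarded against zero
  force * d / dist2               # push a away from b, weaker when far apart
}

a <- c(x = 0.40, y = 0.90)
b <- c(x = 0.55, y = 0.90)        # exactly the same y as a

# With exact coordinates the y component of the force is zero,
# so the labels can only slide apart horizontally.
f_exact <- repel_force(a, b)

# With a tiny jitter (assumed magnitude) the y positions differ slightly,
# which gives the force a non-zero y component.
set.seed(1)
jitter_sd <- 1e-3
a_jit <- a + rnorm(2, sd = jitter_sd)
b_jit <- b + rnorm(2, sd = jitter_sd)
f_jit <- repel_force(a_jit, b_jit)
```

The jitter would only be used when computing forces, so the plotted label positions would not need to change visibly.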
Another problem is when a text label is surrounded from all sides by data points. The label can get stuck in the middle of the points instead of escaping to the outside of the point cloud.
@krassowski I'm open to any suggestions or ideas. By the way, I apologize for being slow with PRs and issues.