scales
Feature request: `transform_squish` or `label_squish`
When passing scales::squish() to the oob argument of a scale (say scale_fill_viridis_c()), I often want the breaks and labels to reflect the squish as well:
- If there are oob values on the upper end, always have a break for the upper limit labeled as "≥{upper limit}"
- If there are oob values on the lower end, always have a break for the lower limit labeled as "≤{lower limit}"
I'm finding this difficult to do programmatically without setting the breaks and limits manually since the breaks function operates on the values before the squish happens (I think). It would be cool if there were labeling and breaks functions that could help automate this.
library(tidyverse)
set.seed(123)
df <- expand_grid(x = 1:10, y = 1:10) |>
  mutate(z = runif(n(), min = 10, max = 105))
df |> filter(z > 100)
#> # A tibble: 5 × 3
#>       x     y     z
#>   <int> <int> <dbl>
#> 1     2     1  101.
#> 2     2    10  101.
#> 3     3     4  104.
#> 4     4     1  101.
#> 5     9     7  104.
p <- ggplot(df, aes(x = x, y = y, fill = z)) +
  geom_raster()
# Desired output
p + scale_fill_continuous(
  limits = c(10, 98),
  oob = scales::squish,
  breaks = c(25, 50, 75, 98),
  labels = c("25", "50", "75", "≥98")
)

Created on 2024-07-01 with reprex v2.1.0
Session info
sessionInfo()
#> R version 4.3.3 (2024-02-29)
#> Platform: x86_64-apple-darwin20 (64-bit)
#> Running under: macOS Sonoma 14.5
#>
#> Matrix products: default
#> BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
#> LAPACK: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0
#>
#> locale:
#> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#>
#> time zone: America/Phoenix
#> tzcode source: internal
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] lubridate_1.9.3 forcats_1.0.0 stringr_1.5.1 dplyr_1.1.4
#> [5] purrr_1.0.2 readr_2.1.5 tidyr_1.3.1 tibble_3.2.1
#> [9] ggplot2_3.5.1 tidyverse_2.0.0
#>
#> loaded via a namespace (and not attached):
#> [1] gtable_0.3.5 highr_0.11 compiler_4.3.3 reprex_2.1.0
#> [5] tidyselect_1.2.1 xml2_1.3.6 scales_1.3.0 yaml_2.3.8
#> [9] fastmap_1.2.0 R6_2.5.1 labeling_0.4.3 generics_0.1.3
#> [13] curl_5.2.1 knitr_1.47 munsell_0.5.1 pillar_1.9.0
#> [17] tzdb_0.4.0 rlang_1.1.4 utf8_1.2.4 stringi_1.8.4
#> [21] xfun_0.44 fs_1.6.4 timechange_0.3.0 cli_3.6.2
#> [25] withr_3.0.0 magrittr_2.0.3 digest_0.6.35 grid_4.3.3
#> [29] rstudioapi_0.16.0 hms_1.1.3 lifecycle_1.0.4 vctrs_0.6.5
#> [33] evaluate_0.23 glue_1.7.0 farver_2.1.2 fansi_1.0.6
#> [37] colorspace_2.1-0 rmarkdown_2.26 tools_4.3.3 pkgconfig_2.0.3
#> [41] htmltools_0.5.8.1
Hmm... this seems fairly impossible to do in a way that fits in with how other breaks_*() and label_*() functions work. The breaks argument to ggplot2::continuous_scale() only knows about the limits, and the labels argument takes the breaks as an input, so there's no way for either of them to tell if the actual data extend beyond the limits in either direction. Maybe this is possible with a transformation?
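A quick way to check what each function receives is to call the scale's methods directly and record the arguments. This is only a sketch: it assumes the ggproto methods get_breaks() and get_labels() behave as they do in ggplot2 3.5.x, and the `seen` environment is a hypothetical recorder used purely for illustration.

```r
library(ggplot2)

# Hypothetical recorder for the arguments passed to each function
seen <- new.env()

sc <- scale_fill_continuous(
  limits = c(10, 98),
  oob = scales::squish,
  breaks = function(lims) {
    seen$breaks_input <- lims   # the scale limits, not the data range
    scales::extended_breaks()(lims)
  },
  labels = function(brks) {
    seen$labels_input <- brks   # the breaks the scale settled on
    as.character(brks)
  }
)

# get_labels() calls get_breaks() by default, so both functions run
labs <- sc$get_labels()
seen$breaks_input   # only the limits; nothing about values beyond 98
seen$labels_input   # only the breaks; nothing about the data either
```

Neither function is handed the data, which is why an automatic "≥" label would need the information to come from somewhere else.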
While not identical, https://github.com/r-lib/scales/issues/368 also had a similar goal in mind.
Yeah, that's part of the way there, but there's no guarantee that the limits are included in the labels. The examples in #368 just happen to work because of the chosen limits. For example:
library(ggplot2)
library(scales)
ggplot(faithful, aes(waiting, eruptions)) +
  geom_point() +
  scale_y_continuous(
    limits = c(2, 4.2),
    labels = \(...) {
      l <- scales::label_number()(...)
      # prefix the first and last labels, whatever breaks they belong to
      l[c(1, length(l))] <- paste0(c("≤", "≥"), l[c(1, length(l))])
      l
    },
    oob = oob_squish
  )

# no label for ≥4.2
Created on 2024-10-04 with reprex v2.1.1
It'd be nice to be able to enforce that the limits appear as labels in a way that looks nice-ish (not overlapping with other labels), but maybe this is a separate issue.
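A labeller can at least avoid putting "≥" on a break that is not actually the limit. The sketch below is a hypothetical helper (label_at_limits() is not part of scales): it takes the limits explicitly and only prefixes breaks that coincide with them. It does not solve the problem of getting a break placed at the limit in the first place.

```r
# Hypothetical helper, not part of scales: prefix only the breaks that
# coincide with the squish limits.
label_at_limits <- function(limits, fmt = as.character, tol = 1e-8) {
  force(limits)
  function(breaks) {
    labels <- fmt(breaks)
    # which() drops NA comparisons, so NA breaks or NA limits are left alone
    at_lower <- which(abs(breaks - limits[1]) < tol)
    at_upper <- which(abs(breaks - limits[2]) < tol)
    labels[at_lower] <- paste0("≤", labels[at_lower])
    labels[at_upper] <- paste0("≥", labels[at_upper])
    labels
  }
}

lab <- label_at_limits(c(2, 4.2))
lab(c(2, 3, 4))        # "≤2" "3" "4": 4 is not the upper limit, so no prefix
lab(c(2, 3, 4, 4.2))   # "≤2" "3" "4" "≥4.2"
```

Passing a formatter such as scales::label_number() as `fmt` should work in the same way, since it also maps a vector of breaks to a character vector of the same length.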
This is as far as I got, but it doesn't quite work for position scales because, as the documentation for the breaks argument of scale_y_continuous() puts it, "Note that for position scales, limits are provided after scale expansion." Adding expand = c(0,0) makes the y-axis breaks correct, but then the points at the limits get cut in half, or go outside the plot area with clip = "off".
library(ggplot2)
library(scales)
breaks_limits <- function(n = 5, tol = 0.1, min = TRUE, max = TRUE, ...) {
  n_default <- n
  # force the arguments now; base R equivalent of the internal scales:::force_all()
  force(tol)
  force(min)
  force(max)
  list(...)
  function(x, n = n_default) {
    breaks <- pretty(x, n, ...)
    # force limits to be included and remove breaks outside of limits
    if (isTRUE(min)) {
      breaks <- c(x[1], breaks)
    }
    if (isTRUE(max)) {
      breaks <- c(x[2], breaks)
    }
    breaks <- unique(sort(breaks))
    breaks <- breaks[breaks >= x[1] & breaks <= x[2]]
    # remove breaks so close to a limit that their labels are likely to overlap
    scl_br <- (breaks - min(breaks)) / diff(range(breaks)) # or diff(x)
    if (isTRUE(min) && abs(scl_br[1] - scl_br[2]) < tol) {
      breaks <- breaks[-2]
    }
    if (isTRUE(max) && abs(scl_br[length(scl_br)] - scl_br[length(scl_br) - 1]) < tol) {
      breaks <- breaks[-(length(breaks) - 1)]
    }
    # carry the labels along as names so the labels function can pick them up
    labels <- as.character(breaks)
    if (isTRUE(min)) {
      labels[1] <- paste0("≤ ", labels[1])
    }
    if (isTRUE(max)) {
      labels[length(labels)] <- paste0("≥ ", labels[length(labels)])
    }
    names(breaks) <- labels
    breaks
  }
}
ggplot(faithful, aes(waiting, eruptions, color = eruptions)) +
  geom_point() +
  scale_y_continuous(
    limits = c(NA, 4.2),
    breaks = breaks_limits(min = FALSE),
    labels = \(x) names(x),
    oob = oob_squish,
    expand = c(0, 0)
  ) +
  scale_color_continuous(
    limits = c(NA, 4.2),
    breaks = breaks_limits(min = FALSE),
    labels = \(x) names(x),
    oob = oob_squish
  ) +
  coord_cartesian(clip = "off")

Created on 2024-10-04 with reprex v2.1.1
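One possible way around the expansion problem for position scales is to stop reading the limits from the function's input and pass them in explicitly, so it no longer matters whether the range has been expanded. The sketch below is an untested idea rather than a fix: breaks_fixed_limits() is a hypothetical helper, it has not been checked against a rendered ggplot2 axis here, and it leaves out the overlap tolerance from breaks_limits() to stay short.

```r
# Hypothetical helper, not part of scales or ggplot2
breaks_fixed_limits <- function(lower = NA, upper = NA, n = 5) {
  force(lower); force(upper); force(n)
  function(x) {
    # x may be the expanded range on a position scale, so it is only used
    # on a side where no explicit limit was supplied
    lo <- if (is.na(lower)) x[1] else lower
    hi <- if (is.na(upper)) x[2] else upper
    inner <- pretty(c(lo, hi), n)
    inner <- inner[inner >= lo & inner <= hi]
    explicit <- c(lower, upper)
    breaks <- sort(unique(c(inner, explicit[!is.na(explicit)])))
    # label only the explicitly supplied limits as squished ends
    labels <- as.character(breaks)
    if (!is.na(lower)) labels[breaks == lower] <- paste0("≤ ", lower)
    if (!is.na(upper)) labels[breaks == upper] <- paste0("≥ ", upper)
    names(breaks) <- labels
    breaks
  }
}

# A made-up expanded range, as a position scale with limits = c(NA, 4.2)
# might pass it
b <- breaks_fixed_limits(upper = 4.2)(c(1.47, 4.33))
names(b)   # the last label is "≥ 4.2" even though x[2] is 4.33
```

If this holds up, it could be passed as breaks = breaks_fixed_limits(upper = 4.2) together with labels = \(x) names(x), as in the example above, without needing expand = c(0, 0).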