Feature request: `transform_squish` or `label_squish`

Open Aariq opened this issue 1 year ago • 4 comments

When passing `scales::squish()` to the `oob` argument of a scale (say `scale_fill_viridis_c()`), I often find it would be nice to also transform the breaks and labels as follows:

  • If there are oob values on the upper end, always have a break for the upper limit labeled as "≥{upper limit}"
  • If there are oob values on the lower end, always have a break for the lower limit labeled as "≤{lower limit}"

I'm finding this difficult to do programmatically without setting the breaks and limits manually since the breaks function operates on the values before the squish happens (I think). It would be cool if there were labeling and breaks functions that could help automate this.

library(tidyverse)
set.seed(123)
df <- expand_grid(x = 1:10, y = 1:10) |> 
  mutate(z = runif(n(), min = 10, max = 105))
df |> filter(z > 100)
#> # A tibble: 5 × 3
#>       x     y     z
#>   <int> <int> <dbl>
#> 1     2     1  101.
#> 2     2    10  101.
#> 3     3     4  104.
#> 4     4     1  101.
#> 5     9     7  104.

p <- ggplot(df, aes(x = x, y = y, fill = z)) +
  geom_raster()
  
# Desired output
p + scale_fill_continuous(
    limits = c(10, 98),
    oob = scales::squish,
    breaks = c(25, 50, 75, 98),
    labels = c("25", "50", "75", "≥98")
  )

Created on 2024-07-01 with reprex v2.1.0
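
For reference, the manual step in the desired output above could be wrapped in a small helper when the data are available outside the scale. This is only a sketch under that assumption: `squish_breaks_labels()` is a hypothetical name, not a function in scales or ggplot2, and it uses base R's `pretty()` rather than the breaks algorithm the scale would normally use.

```r
# Hypothetical helper (not part of scales or ggplot2): given the data values
# and the desired limits, build breaks and labels that flag squished ends.
squish_breaks_labels <- function(values, limits, n = 5) {
  rng <- range(values, na.rm = TRUE)
  # interior breaks only; the limits themselves are added below when needed
  breaks <- pretty(limits, n)
  breaks <- breaks[breaks > limits[1] & breaks < limits[2]]
  labels <- as.character(breaks)
  # data below the lower limit would be squished up to it
  if (rng[1] < limits[1]) {
    breaks <- c(limits[1], breaks)
    labels <- c(paste0("≤", limits[1]), labels)
  }
  # data above the upper limit would be squished down to it
  if (rng[2] > limits[2]) {
    breaks <- c(breaks, limits[2])
    labels <- c(labels, paste0("≥", limits[2]))
  }
  list(breaks = breaks, labels = labels)
}

z <- c(10.5, 42, 77, 104.9)   # made-up values; only the upper end is out of bounds
res <- squish_breaks_labels(z, limits = c(10, 98))
res$breaks
res$labels

# Possible use with the plot above (requires ggplot2 and scales):
# p + scale_fill_continuous(limits = c(10, 98), oob = scales::squish,
#                           breaks = res$breaks, labels = res$labels)
```

This still requires passing the data twice (once to the plot, once to the helper) and does not check whether the limit label overlaps its neighbour, which is the part the feature request would ideally automate.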

Session info
sessionInfo()
#> R version 4.3.3 (2024-02-29)
#> Platform: x86_64-apple-darwin20 (64-bit)
#> Running under: macOS Sonoma 14.5
#> 
#> Matrix products: default
#> BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib 
#> LAPACK: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib;  LAPACK version 3.11.0
#> 
#> locale:
#> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#> 
#> time zone: America/Phoenix
#> tzcode source: internal
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#>  [1] lubridate_1.9.3 forcats_1.0.0   stringr_1.5.1   dplyr_1.1.4    
#>  [5] purrr_1.0.2     readr_2.1.5     tidyr_1.3.1     tibble_3.2.1   
#>  [9] ggplot2_3.5.1   tidyverse_2.0.0
#> 
#> loaded via a namespace (and not attached):
#>  [1] gtable_0.3.5      highr_0.11        compiler_4.3.3    reprex_2.1.0     
#>  [5] tidyselect_1.2.1  xml2_1.3.6        scales_1.3.0      yaml_2.3.8       
#>  [9] fastmap_1.2.0     R6_2.5.1          labeling_0.4.3    generics_0.1.3   
#> [13] curl_5.2.1        knitr_1.47        munsell_0.5.1     pillar_1.9.0     
#> [17] tzdb_0.4.0        rlang_1.1.4       utf8_1.2.4        stringi_1.8.4    
#> [21] xfun_0.44         fs_1.6.4          timechange_0.3.0  cli_3.6.2        
#> [25] withr_3.0.0       magrittr_2.0.3    digest_0.6.35     grid_4.3.3       
#> [29] rstudioapi_0.16.0 hms_1.1.3         lifecycle_1.0.4   vctrs_0.6.5      
#> [33] evaluate_0.23     glue_1.7.0        farver_2.1.2      fansi_1.0.6      
#> [37] colorspace_2.1-0  rmarkdown_2.26    tools_4.3.3       pkgconfig_2.0.3  
#> [41] htmltools_0.5.8.1

Aariq, Jul 01 '24 18:07

Hmm, this seems close to impossible to do in a way that fits in with how the other `breaks_*()` and `label_*()` functions work. The `breaks` argument to `ggplot2::continuous_scale()` only knows about the limits, and the `labels` argument takes the breaks as input, so neither can tell whether the actual data extend beyond the limits in either direction. Maybe this is possible with a transformation?
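
A rough mock of that call order, in base R only, may make the problem concrete. This is not ggplot2's actual internals; `breaks_fun` and `labels_fun` are stand-ins for whatever functions are passed to `breaks` and `labels`.

```r
# Mock of the call order only; not ggplot2's real implementation.
breaks_fun <- function(limits) pretty(limits, n = 5)  # receives the limits only
labels_fun <- function(breaks) format(breaks)         # receives the breaks only

data   <- c(12, 40, 77, 104)  # made-up values; 104 lies beyond the upper limit
limits <- c(10, 98)

breaks <- breaks_fun(limits)
# breaks outside the limits are dropped before labelling
breaks <- breaks[breaks >= limits[1] & breaks <= limits[2]]
labels <- labels_fun(breaks)

# `data` was never passed to either function, so neither can know that 104
# gets squished to 98 and that a "≥98" label would be appropriate.
breaks
labels
```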

Aariq, Sep 24 '24 23:09

While not identical, https://github.com/r-lib/scales/issues/368 also had a similar goal in mind.

teunbrand, Oct 04 '24 11:10

Yeah, that's part of the way there, but there's no guarantee that the limits are included in the labels. The examples in #368 just happen to work because of the chosen limits. For example:

library(ggplot2)
library(scales)
ggplot(faithful, aes(waiting, eruptions)) +
  geom_point() +
  scale_y_continuous(
    limits = c(2, 4.2),
    labels = \(...) {
      l <- scales::label_number()(...)
      ends <- c(1, length(l))
      #prefix the first and last labels, whichever breaks they belong to
      l[ends] <- paste0(c("≤", "≥"), l[ends])
      l
    },
    oob = oob_squish
  )

#no label for ≥4.2

Created on 2024-10-04 with reprex v2.1.1

It'd be nice to be able to enforce that the limits appear as labels in a way that looks nice-ish (not overlapping with other labels), but maybe this is a separate issue.

Aariq, Oct 04 '24 17:10

This is as far as I got, but it doesn't quite work for position scales because "Note that for position scales, limits are provided after scale expansion." (from the breaks arg of scale_y_continuous()). Adding expand = c(0,0) makes the y-axis breaks correct, but then the points at the limits get cut in half or go outside the plot area with clip = "off".

library(ggplot2)
library(scales)
breaks_limits <- function (n = 5, tol = 0.1, min = TRUE, max = TRUE, ...) 
{
  n_default <- n
  scales:::force_all(n, tol, min, max, ...)
  function(x, n = n_default) {
    breaks <- pretty(x, n, ...)
    
    #force limits to be included and remove breaks outside of limits
    if (isTRUE(min)) {
      breaks <- c(x[1], breaks)
    }
    if (isTRUE(max)) {
      breaks <- c(x[2], breaks)
    }
    breaks <- unique(sort(breaks))
    breaks <- breaks[breaks>=x[1] & breaks<=x[2]]
    
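    #drop the break next to a forced limit when it sits within `tol` of it
    #(as a fraction of the break range), since the labels would likely overlap
    scl_br <- (breaks - min(breaks)) / diff(range(breaks)) #or diff(x)
    if (isTRUE(min) && length(breaks) > 2 && abs(scl_br[1] - scl_br[2]) < tol) {
      breaks <- breaks[-2]
      scl_br <- scl_br[-2] #keep the scaled positions aligned with `breaks`
    }
    n_br <- length(breaks)
    if (isTRUE(max) && n_br > 2 && abs(scl_br[n_br] - scl_br[n_br - 1]) < tol) {
      breaks <- breaks[-(n_br - 1)]
    }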
    labels <- as.character(breaks)
    if (isTRUE(min)) {
      labels[1] <- paste0("≤ ", labels[1])
    }
    if (isTRUE(max)) {
      labels[length(labels)] <- paste0("≥ ", labels[length(labels)])
    }
    names(breaks) <- labels
    breaks
  }
}

ggplot(faithful, aes(waiting, eruptions, color = eruptions)) +
  geom_point() +
  scale_y_continuous(
    limits = c(NA, 4.2),
    breaks = breaks_limits(min = FALSE),
    labels = \(x) names(x),
    oob = oob_squish,
    expand = c(0,0)
  ) +
  scale_color_continuous(
    limits = c(NA, 4.2),
    breaks = breaks_limits(min = FALSE),
    labels = \(x) names(x),
    oob = oob_squish
  ) +
  coord_cartesian(clip = "off")

Created on 2024-10-04 with reprex v2.1.1
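
The expansion problem mentioned above can be illustrated with plain arithmetic. This is a sketch that assumes the default continuous expansion of 5% of the range on each side; `expand_sketch()` is a hypothetical helper written for this illustration, not a ggplot2 or scales function.

```r
# Hypothetical helper mimicking a multiplicative expansion of the limits.
# Assumption: 5% of the range is added on each side, as with
# ggplot2::expansion(mult = 0.05).
expand_sketch <- function(limits, mult = 0.05) {
  pad <- diff(limits) * mult
  c(limits[1] - pad, limits[2] + pad)
}

limits   <- c(2, 4.2)
expanded <- expand_sketch(limits)
expanded

# A breaks function on a position scale receives `expanded`, so forcing a
# break at x[2] places it beyond the requested upper limit.
expanded[2] > limits[2]
```

With `expand = c(0, 0)` the padding is zero and the forced break coincides with the requested limit, which matches the behaviour described above.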

Aariq, Oct 04 '24 18:10