
`scale_*_*` `labels` argument often doesn't work as expected with a function

Open davidhodge931 opened this issue 2 months ago • 5 comments

The help says that the `labels` argument of `scale_*_*()` can take a function that takes the breaks as input and returns the labels as output.

I am creating some functions that manipulate the labels based on the position of each element, and this does not always work as the help indicates it should.


library(tidyverse)
library(palmerpenguins)

# keep only the label in the 3rd position and blank out the rest
hold_3rd <- function(x) {
  c("", "", as.character(x[3]), rep("", times = length(x) - 3))
}

#sometimes works as expected
penguins |> 
  ggplot() +
  geom_point(
    aes(x = flipper_length_mm,
        y = body_mass_g),
  ) +
  scale_x_continuous(labels = \(x) hold_3rd(scales::comma(x)))
#> Warning: Removed 2 rows containing missing values or values outside the scale range
#> (`geom_point()`).


#sometimes does not
penguins |> 
  ggplot() +
  geom_point(
    aes(x = bill_length_mm,
        y = body_mass_g),
  ) +
  scale_x_continuous(labels = \(x) hold_3rd(scales::comma(x)))
#> Warning: Removed 2 rows containing missing values or values outside the scale range
#> (`geom_point()`).

Created on 2024-05-03 with reprex v2.1.0

davidhodge931 avatar May 03 '24 03:05 davidhodge931

This is because the labelling function is applied to the breaks before the out-of-bounds breaks are censored. In your second example, you need to discard the out-of-bounds breaks yourself, and then it works as intended.

library(palmerpenguins)
library(ggplot2)
library(scales)

hold_3rd <- function(x) {
  x[-3] <- ""
  x
}

penguins |> 
  ggplot() +
  geom_point(
    aes(x = bill_length_mm,
        y = body_mass_g),
  ) +
  scale_x_continuous(
    labels = \(x) hold_3rd(comma(x)),
    breaks = \(x) oob_discard(extended_breaks()(x), x)
  )
#> Warning: Removed 2 rows containing missing values or values outside the scale range
#> (`geom_point()`).

Created on 2024-05-05 with reprex v2.1.0
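
To check what a labelling function actually receives, it can help to wrap it in a small logger. This is only a minimal sketch: `peek_breaks()` is a hypothetical helper name, not part of ggplot2 or scales, and the input vector is made up for illustration.

```r
# Hypothetical helper (not part of ggplot2 or scales): reports the breaks it
# is handed, then returns them as plain character labels.
peek_breaks <- function(x) {
  message("breaks received: ", paste(x, collapse = ", "))
  as.character(x)
}

# Called directly on a made-up vector of breaks.
peek_breaks(c(NA, 40, 50, 60))
```

Passing it as `labels = peek_breaks` to a scale should show which breaks reach the labelling function for a given plot.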

teunbrand avatar May 05 '24 15:05 teunbrand

Thanks @teunbrand. It'd be great to be able to do this with oob functions other than `oob_discard()`. Could the labelling function be applied to the breaks after the out-of-bounds breaks are censored?

davidhodge931 avatar May 05 '24 20:05 davidhodge931

I imagine this would get hairy. If we discard oob breaks before labelling, labels given as atomic vectors will become out of sync. In addition, minor breaks might be miscalculated without oob breaks. I'll keep this issue open as a prompt to explore this more fully, but the answer for now is 'probably not'.
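
The synchronisation concern can be sketched in base R. The breaks, labels and limits below are hypothetical values chosen for illustration, and the code does not reproduce ggplot2 internals: it only contrasts censoring (replace with `NA`, keep positions) with discarding (drop elements).

```r
# Hypothetical breaks, fixed labels and limits, for illustration only.
breaks <- c(30, 40, 50, 60)
labels <- c("thirty", "forty", "fifty", "sixty")
limits <- c(32, 60)

in_bounds <- breaks >= limits[1] & breaks <= limits[2]

# Censoring: out-of-bounds breaks become NA, so every label keeps its position.
censored <- ifelse(in_bounds, breaks, NA)
data.frame(breaks = censored, labels = labels)

# Discarding: the break vector shrinks, so a fixed-length label vector
# no longer lines up with it.
discarded <- breaks[in_bounds]
length(discarded) == length(labels)
```

With discarding, three breaks would have to be matched against four labels, which is the mismatch described above.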

teunbrand avatar May 05 '24 21:05 teunbrand

It would be useful, but don't want to break everything! Feel free to close whenever

davidhodge931 avatar May 05 '24 21:05 davidhodge931

The main use-case for this would be a labelling function that labels every second break and leaves the others as "". It would work much more intuitively if it always started from the first break within bounds.

davidhodge931 avatar May 06 '24 21:05 davidhodge931

I have thought about this some more, and while we could implement this in ggplot2 without problems for ggplot2 itself, it might unnecessarily break other people's packages. Back during reverse dependency checks for 3.5.0, I came across a bunch of code in packages that made unorthodox* use of label functions that would break again if this were to be changed. For that reason, I don't want to change the way this works.

* = using lookup tables or returning fixed-length atomic vectors

Now, for your use case, I had forgotten that breaks arrive at the labelling function in a pre-censored state (i.e. oob breaks are NA). You can exploit this as follows. Similar to the reprex:

library(ggplot2)

only_show_nth <- function(n) {
  force(n)
  function(x) {
    i <- which(is.finite(x))
    x[-i[n]] <- ""
    x
  }
}

ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  scale_x_continuous(
    labels = only_show_nth(2)
  )

Similar to the use-case you describe:

show_every_nth <- function(n) {
  force(n)
  function(x) {
    i <- which(is.finite(x))
    i <- i[seq_along(i) %% n == 0]
    x[-i] <- ""
    x
  }
}

ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  scale_x_continuous(
    labels = show_every_nth(2)
  )

Created on 2024-05-08 with reprex v2.1.0
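
One edge case worth noting with negative indexing: if `i` ends up empty (for example when `n` exceeds the number of in-bounds breaks), `x[-i]` selects nothing, so no label is blanked. A possible variant that avoids this by using a logical mask is sketched below. The name `show_every_nth_safe()` and its `offset` argument are hypothetical, and unlike the version above it starts counting from the first in-bounds break.

```r
# Hypothetical variant: keeps every nth finite break, counting from the first
# in-bounds break (offset = 0), and blanks everything else.
show_every_nth_safe <- function(n = 2, offset = 0) {
  force(n)
  force(offset)
  function(x) {
    keep <- rep(FALSE, length(x))
    i <- which(is.finite(x))
    # Zero-based position among the finite breaks, modulo n.
    keep[i[(seq_along(i) - 1) %% n == offset]] <- TRUE
    out <- as.character(x)
    # A logical mask blanks all labels when nothing is kept.
    out[!keep] <- ""
    out
  }
}

# Made-up breaks with out-of-bounds positions already censored to NA.
show_every_nth_safe(2)(c(NA, 2, 3, 4, 5, NA))
```

It can be passed to a scale in the same way, e.g. `labels = show_every_nth_safe(2)`.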

teunbrand avatar May 08 '24 12:05 teunbrand

That's awesome, thanks @teunbrand.

Works as expected for positional scales, but not for colour scales?

Also, I assume you're not interested in putting an argument in the scales::label_* functions to support this?

library(tidyverse)

show_every_nth <- function(n = 2, offset = 0) {
  force(n)
  function(x) {
    i <- which(is.finite(x))
    # zero-based position modulo n, so any offset in 0..(n - 1) selects labels
    i <- i[(seq_along(i) - 1) %% n == offset]
    x[-i] <- ""
    x
  }
}

ggplot(mpg, aes(displ, hwy, colour = displ)) +
  geom_point() +
  scale_x_continuous(labels = show_every_nth(2)) +
  scale_y_continuous(labels = show_every_nth(2)) +
  scale_colour_gradientn(colors = viridis::viridis(9), labels = show_every_nth())


ggplot(mpg, aes(displ, hwy, colour = hwy)) +
  geom_point() +
  scale_x_continuous(labels = show_every_nth(2)) +
  scale_y_continuous(labels = show_every_nth(2)) +
  scale_colour_gradientn(colors = viridis::viridis(9), labels = show_every_nth())

Created on 2024-05-09 with reprex v2.1.0

davidhodge931 avatar May 08 '24 19:05 davidhodge931

Instead of an argument in a scales::label_* function, it might work better as a function.

Let me know if you'd like to implement something like this in {scales}. Otherwise, I'll chuck it in {ggblanket}

label_every_nth <- function(n = 2, offset = 0, ...) {
  function(x) {
    i <- which(is.finite(x) | is.character(x) | is.factor(x) | is.logical(x))
    # zero-based position modulo n, so any offset in 0..(n - 1) selects labels
    i <- i[(seq_along(i) - 1) %% n == offset]

    if (is.numeric(x)) x <- scales::comma(x, ...)
    else x <- format(x, ...)

    x[-i] <- ""
    x
  }
}

davidhodge931 avatar May 08 '24 20:05 davidhodge931