
tickmark/break calculations with exp_trans, probability_trans fail often

Open fabian-s opened this issue 10 years ago • 10 comments

Some ranges seem to break the calculation of the breaks/tickmarks for these transformed axes. It works for, e.g., y <- 1:3 and y <- (-10):-7, but I haven't found an example with more than 4 values that worked ... ?!?

library(ggplot2)
df <- data.frame(x = 1:10)

ggplot(df, aes(x, 1)) + 
  geom_blank() +
  scale_x_continuous(trans = scales::exp_trans())
#> Warning in self$trans$inverse(limits): NaNs produced
#> Error in if (zero_range(as.numeric(limits))) {: missing value where TRUE/FALSE needed

ggplot(df, aes(x, 1)) + 
  geom_blank() +
  scale_x_continuous(trans = scales::logit_trans())
#> Warning in qfun(x, ...): NaNs produced
#> Warning: Transformation introduced infinite values in continuous x-axis

Created on 2019-10-31 by the reprex package (v0.3.0)

fabian-s avatar Jun 12 '15 16:06 fabian-s

I can reproduce the first error; the second two examples only throw a warning and create an empty plot.

karawoo avatar Jun 14 '17 16:06 karawoo

Is this the same issue?

> mydf <- ...
> ggplot(mydf, aes(x=x,y=y))+geom_point()+scale_y_continuous(trans='exp')+expand_limits(y=1.5)
# works - some lower y values cause problems

> ggplot(mydf, aes(x=x,y=y))+geom_point()+scale_y_continuous(trans='exp')+expand_limits(y=1.1)
Error in if (zero_range(as.numeric(limits))) { : 
  missing value where TRUE/FALSE needed
In addition: Warning message:
In self$trans$inverse(limits) : NaNs produced
> traceback()
17: f(..., self = self)
16: self$get_breaks(range)
15: f(..., self = self)
14: ggproto_parent(ScaleContinuous, self)$break_info(range)
13: f(..., self = self)
12: scale_details$break_info(range)
11: train_cartesian(scale_details$y, self$limits$y, "y")
10: f(..., self = self)
9: coord$train(list(x = self$panel_scales$x[[ix]], y = self$panel_scales$y[[iy]]))
8: (function (ix, iy) 
   {
       coord$train(list(x = self$panel_scales$x[[ix]], y = self$panel_scales$y[[iy]]))
   })(dots[[1L]][[1L]], dots[[2L]][[1L]])
7: mapply(FUN = f, ..., SIMPLIFY = FALSE)
6: Map(compute_range, self$panel_layout$SCALE_X, self$panel_layout$SCALE_Y)
5: f(..., self = self)
4: layout$train_ranges(plot$coordinates)
3: ggplot_build(x)
2: print.ggplot(x)
1: function (x, ...) 
   UseMethod("print")(x)

> ggplot(mydf, aes(x=x,y=y))+geom_point()+scale_y_continuous(trans='identity')+expand_limits(y=1.1)
# works - error involves trans='exp'

braunb avatar Jun 14 '17 22:06 braunb

I found the error I was getting with exp_trans came from negative values getting passed to the inverse function in exp_trans (which is log(x)). Somehow this is due to the process used to create the padding around the graph, which acts, in a way, to extend the range of the axis. I don't know how the padding is calculated, but a low positive value for the lower limit generates a negative value for the lower limit with the padding.

Adding expand=c(0,0) to scale_y_continuous eliminates the error (and the padding):

> ggplot(mydf, aes(x=x,y=y))+geom_point()+scale_y_continuous(trans='exp')+expand_limits(y=1.1)
# error

> ggplot(mydf, aes(x=x,y=y))+geom_point()+scale_y_continuous(trans='exp',  expand=c(0,0))+expand_limits(y=1.1)
# works

It seems that too large a value in either the 1st or 2nd element of the expand= argument can lead to this error.
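
The mechanism described above can be sketched in base R. This is only an illustration: it assumes the padding is ggplot2's default multiplicative expansion of 5% of the range on each side, applied on the transformed (here, exp) scale, and `expanded_lower()` is a hypothetical helper, not a ggplot2 or scales function.

```r
# Hypothetical helper: lower limit after padding on the transformed scale.
# Assumes padding of `mult` times the transformed range on each side.
expanded_lower <- function(x, mult = 0.05) {
  lim <- exp(range(x))         # limits on the exp-transformed scale
  lim[1] - mult * diff(lim)    # padded lower limit
}

expanded_lower(1:3)      # positive, so log() can map it back
expanded_lower(-10:-7)   # positive
expanded_lower(1:10)     # negative

# log() of a negative limit is NaN, consistent with the warning
# "NaNs produced" in self$trans$inverse(limits).
suppressWarnings(log(expanded_lower(1:10)))

# With no padding the lower limit stays at exp(min(x)), which is positive.
expanded_lower(1:10, mult = 0)
```

Under this assumption the padded lower limit stays positive only while exp(max(x) - min(x)) < 1 + 1/mult = 21, that is, while max(x) - min(x) < log(21), roughly 3.04. That would match the reports in this thread: 1:3 and (-10):-7 work, while wider ranges fail.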

braunb avatar Jun 15 '17 03:06 braunb

I get problems using atan transformations. What I believe happens is:

  • data is transformed
  • axes are extended an additional 10% (on the transformed scale)
    • this results in values outside the range of the function
  • calling the inverse function to transform back then gives errors, warnings, or non-monotone values.

When creating a transformation you can specify the domain of the function. What is really needed is another argument to specify the range of the function, and the 10% expansion should respect it.
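
A minimal sketch of what respecting the range could look like, in base R. The helper `expand_within()` and its `range` argument are hypothetical and not part of scales or ggplot2; the sketch also assumes the padding is 5% of the transformed range on each side.

```r
# Hypothetical: pad limits on the transformed scale, then clamp the result
# to the open range of the transformation before applying the inverse.
expand_within <- function(limits, transform, inverse, range, mult = 0.05) {
  t_lim <- transform(limits)
  padded <- t_lim + c(-1, 1) * mult * diff(t_lim)
  eps <- sqrt(.Machine$double.eps)   # stay strictly inside the range
  clamped <- pmin(pmax(padded, range[1] + eps), range[2] - eps)
  inverse(clamped)
}

# atan maps onto (-pi/2, pi/2); without clamping, tan() would be evaluated
# on a different branch and the limits would come back in the wrong order.
lims <- expand_within(c(-20, 20), atan, tan, range = c(-pi / 2, pi / 2))
lims[1] < lims[2]   # limits stay ordered
```

Clamping this close to the boundary gives very wide limits, so a real implementation would probably want a more careful rule; the sketch only shows that the inverse stays monotone once the range is respected.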

library(ggplot2)
library(scales)

atan_trans <- trans_new(
  name = "atan",
  transform = function(x) atan(x),
  inverse = function(x) tan(x),
  breaks = function(x) {
    print(x)
    breaks <- extended_breaks()(x)
    print(breaks)
    breaks
  }
)

plot(atan_trans, xlim = c(-10, 10))


data <- data.frame(
  x1 = seq(-2, 2, length = 10),
  x2 = seq(-5, 5, length = 10),
  x3 = seq(-20, 20, length = 10)
)

ggplot(data = data) +
  geom_point(aes(x1, 1)) +
  scale_x_continuous(trans = atan_trans)

#> [1] -2.714768  2.714768
#> [1] -3 -2 -1  0  1  2  3

ggplot(data = data) +
  geom_point(aes(x2, 1)) +
  scale_x_continuous(trans = atan_trans)

#> [1] -16.63125  16.63125
#> [1] -20 -10   0  10  20

ggplot(data = data) +
  geom_point(aes(x3, 1)) +
  scale_x_continuous(trans = atan_trans)

#> [1]  9.757818 -9.757818
#> [1] -10  -5   0   5  10

atan(16.63125) / atan(5)
#> [1] 1.1
atan(2.714768) / atan(2)
#> [1] 1.1
# Both are 1.1 - axes are expanded 10% on the transformed scale.

atan(20) * 1.1 # exceeds pi/2
#> [1] 1.672922
tan(atan(20) * 1.1)
#> [1] -9.757818
# -9.757818 - but this is computed from the wrong branch of the tan function.

# I suspect that the data is transformed, the axes are extended an extra
# 10% in each direction on the transformed scale, causing values to exceed
# the range of the atan function (-pi/2, pi/2).
# Hence some of the values are outside the domain of the inverse function
# (in this case tan).
# 
# For some functions that would then give undefined values - like taking
# the sqrt of a negative number.
# For others, like atan, it uses a different branch, so the inverse
# transformation is not monotone.

Created on 2019-10-31 by the reprex package (v0.3.0)

Original code

More details:

library(ggplot2)
library(scales)

atan_trans <-
  trans_new(name = "atan",
            transform = function(x) atan(x),
            inverse = function(x) tan(x),
            breaks = function(x) {
              print(x)
              breaks <- pretty(c(pmax(-5, min(x)), pmin(5, max(x))))
              print(breaks)
              names(breaks) <- breaks
              return(breaks)
              })
#             domain = c(-pi/2, pi/2), # adding domain has no effect
# It shouldn't. Really need an argument 'range'.


data <- data.frame(x1 = seq(-5, 5, length=6),
                   x2 = seq(-20, 20, length=6),
                   x3 = seq(-2, 2, length=6))

ggplot(data = data) +
  geom_density(aes(x = x3)) +
  scale_x_continuous(trans = atan_trans)
# [1] -2.714768  2.714768  # x
# [1] -3 -2 -1  0  1  2  3 # breaks
# x axis labels are shown

ggplot(data = data) +
  geom_density(aes(x = x1)) +
  scale_x_continuous(trans = atan_trans)
# [1] -16.63125  16.63125  # x
# [1] -6 -4 -2  0  2  4  6 # breaks
# x axis labels are shown

ggplot(data = data) +
  geom_density(aes(x = x2)) +
  scale_x_continuous(trans = atan_trans)
# [1] -9.757818  9.757818  # x
# [1] -6 -4 -2  0  2  4  6 # breaks
# x axis labels are not shown

atan(16.63125) / atan(5)
atan(2.714768) / atan(2)
# Both are 1.1 - axes are expanded 10% on the transformed scale.

atan(20) * 1.1 # exceeds pi/2
tan(atan(20) * 1.1)
# -9.757818 - but this is computed from the wrong branch of the tan function.

# I suspect that the data is transformed, the axes are extended an extra
# 10% in each direction on the transformed scale, causing values to exceed
# the range of the atan function (-pi/2, pi/2).
# Hence some of the values are outside the domain of the inverse function
# (in this case tan).
# 
# For some functions that would then give undefined values - like taking
# the sqrt of a negative number.
# For others, like atan, it uses a different branch, so the inverse
# transformation is not monotone.

TimHesterberg avatar Oct 16 '17 00:10 TimHesterberg

I wrote that axes are extended an additional 10% (on the transformed scale). That is measured from the middle, so it is only 5% of the range of the values on each side.
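
The arithmetic can be checked directly: for limits symmetric about zero, padding each side by 5% of the total range is the same as multiplying each limit by 1.1. A short base R check, using the limits c(-5, 5) from the example above:

```r
t_lim <- atan(c(-5, 5))                          # limits on the transformed scale
padded <- t_lim + c(-1, 1) * 0.05 * diff(t_lim)  # 5% of the range per side
padded / t_lim                                   # both ratios are 1.1
tan(padded)                                      # about -16.63125 and 16.63125
```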

TimHesterberg avatar Oct 16 '17 01:10 TimHesterberg

Thanks @TimHesterberg — I think your analysis of the problem is correct.

I think the right way to resolve this is to add a range parameter to extended_breaks() so that the limits are truncated to the range of the scale before breaks are computed.

hadley avatar Oct 31 '19 17:10 hadley

No, that doesn't make sense because the breaks function gets the untransformed data; the problem really is the expansion that ggplot2 is doing. Maybe the transformers should get an expand argument, which would be a function that ggplot2 would call to expand the axes. Or, as you say, I could add a range parameter, and ggplot2 could respect that.

...

The problem with providing range is that it's not necessarily fixed for all transformations in a family, so it would be more general to provide an expand parameter: a function that takes the limits and returns the expanded values.
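
As a rough illustration of the second idea, a transformation-specific expand function for the exp case might look like the base R sketch below. Everything here is hypothetical: neither scales nor ggplot2 defines `expand_exp()`, and padding on the original (data) scale is just one possible choice.

```r
# Hypothetical expand function for an exp transformation. It receives limits
# on the transformed (exp) scale and returns padded limits that remain
# strictly positive, so log() can always map them back.
expand_exp <- function(limits, mult = 0.05) {
  orig <- log(limits)                             # back to the data scale
  padded <- orig + c(-1, 1) * mult * diff(orig)   # pad where it is safe
  exp(padded)                                     # return on the exp scale
}

lims <- expand_exp(exp(c(1, 10)))
log(lims)   # finite limits on the data scale
```

Because such a function is supplied per transformation, it can depend on that transformation's parameters, which a single fixed range could not.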

hadley avatar Oct 31 '19 17:10 hadley

I'm running into this problem frequently with the "exp" transformation. No matter what the range of the data, setting trans = "exp" always yields this error:

library(ggplot2)
dat <- data.frame(x = 5:100)

ggplot(dat) + 
  aes(x = x, y = x) + 
  scale_y_continuous(trans = "exp")
#> Warning in self$trans$inverse(limits): NaNs produced
#> Error in if (zero_range(as.numeric(limits))) {: missing value where TRUE/FALSE needed

Created on 2021-10-09 by the reprex package (v2.0.1)

Is there any possibility this transformation might be fixed?

bwiernik avatar Oct 09 '21 19:10 bwiernik

Seconding this. Is there any way to circumvent this problem?

Agasax avatar Apr 15 '22 15:04 Agasax

Might be partially resolved by https://github.com/tidyverse/ggplot2/pull/4775. It gets rid of the warnings/errors, but perhaps the transformation could do with some refinement in choosing breaks.

# remotes::install_github("tidyverse/ggplot2#4775")

library(ggplot2)
dat <- data.frame(x = 5:100)

ggplot(dat) + 
  aes(x = x, y = x) + 
  geom_point() +
  scale_y_continuous(trans = "exp")

Created on 2022-04-15 by the reprex package (v2.0.1)

teunbrand avatar Apr 15 '22 16:04 teunbrand

Closed in favour of #405

thomasp85 avatar Nov 02 '23 09:11 thomasp85