`get/find_transformation` with linear transformations

Open mattansb opened this issue 3 years ago • 10 comments

  1. get/find_transformation should not return "identity" if an unsupported transformation is present.
  2. Should linear transformations be supported?

m <- lm(I(2 * mpg + 3) ~ hp, mtcars)
insight::find_transformation(m)
#> [1] "identity"

Created on 2022-06-27 by the reprex package (v2.0.1)

mattansb avatar Jun 27 '22 13:06 mattansb

Is there a context where these concerns arise other than I()?

Generally, we should extract the contents of I() and evaluate that as a function with numerical derivatives
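
A rough sketch of that idea in base R, to make the suggestion concrete. `lhs_as_function()` and `is_linear()` are hypothetical helper names (neither exists in insight), and both the evaluation grid and the tolerance are arbitrary assumptions. The first helper pulls the expression out of `I()` and wraps it as a function; the second uses central-difference derivatives to check whether the slope is constant, which is what distinguishes a linear transformation from, say, `log()`.

```r
# Hypothetical helpers for illustration; neither exists in insight.

# Turn the expression inside I() on the left-hand side into a function.
lhs_as_function <- function(formula) {
  if (length(formula) != 3) return(NULL)      # one-sided formula: no response
  lhs <- formula[[2]]
  if (!is.call(lhs) || !identical(lhs[[1]], as.name("I"))) {
    return(NULL)                              # this sketch only handles I(...)
  }
  inner <- lhs[[2]]                           # e.g. 2 * mpg + 3
  vars <- all.vars(inner)
  if (length(vars) != 1) return(NULL)         # exactly one variable expected
  function(x) {
    eval(inner, stats::setNames(list(x), vars), environment(formula))
  }
}

# Treat a function as linear if its central-difference slope is the same
# at every point of an (arbitrarily chosen, positive) grid.
is_linear <- function(f, at = c(0.5, 1, 3, 10, 25), h = 1e-4, tol = 1e-6) {
  slope <- (f(at + h) - f(at - h)) / (2 * h)
  all(is.finite(slope)) && diff(range(slope)) < tol
}

f <- lhs_as_function(I(2 * mpg + 3) ~ hp)
is_linear(f)
#> [1] TRUE

g <- lhs_as_function(I(log(mpg)) ~ hp)
is_linear(g)
#> [1] FALSE
```

For a transformation that passes the check, the slope and `f(0)` would give the coefficients needed to build the inverse.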

bwiernik avatar Jun 27 '22 18:06 bwiernik

For (1), a user can use some other unsupported function e.g. foo(y) ~ 1, or datawizard::ranktransform(y) ~ ..

And (2) should also work for functions that perform a linear transformation:

  • scale() and datawizard::standardise/standardize() / datawizard::center/centre() (all 3 of which can be inverted with datawizard::unstandardise/unstandardize(), https://github.com/easystats/datawizard/pull/191)
  • datawizard::normalize() and datawizard::change_scale/data_rescale(), which can be inverted with datawizard::unnormalize() (https://github.com/easystats/datawizard/pull/191)

mattansb avatar Jul 05 '22 08:07 mattansb

The problem is how to detect foo()? cbind(x - y) should return "identity", foo() should return "unknown". Are there any other exceptions?

strengejacke avatar Jul 05 '22 08:07 strengejacke

> The problem is how to detect foo()? cbind(x - y) should return "identity", foo() should return "unknown". Are there any other exceptions?

Perhaps we should just return "identity" if no function or manipulation is detected? All others can be NULL?


Here is some working code that builds the transformation and inverse-transformation functions for the linear transformation functions above:

Define functions
as_linear_transform <- function(x, ...) {
  UseMethod("as_linear_transform")
}

as_linear_inverse <- function(x, ...) {
  UseMethod("as_linear_inverse")
}


as_linear_transform.numeric <- function(x, ...) {
  coefs <- .get_ab(x)
  function(x) {
    (x - coefs["a"]) / coefs["b"]
  }
}


as_linear_inverse.numeric <- function(x, ...) {
  coefs <- .get_ab(x)
  function(x) {
    x * coefs["b"] + coefs["a"]
  }
}



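.get_ab <- function(x) {
  # Read the centering/scaling attributes stored by the transforming function
  attr <- attributes(x)
  attr_names <- names(attr)

  if (all(c("center", "scale") %in% attr_names)) {
    # datawizard::standardize() / center()
    a <- attr[["center"]]
    b <- attr[["scale"]]
  } else if (all(c("scaled:center", "scaled:scale") %in% attr_names)) {
    # base::scale()
    a <- attr[["scaled:center"]]
    b <- attr[["scaled:scale"]]
  } else if (all(c("min_value", "range_difference") %in% attr_names)) {
    # datawizard::normalize() / change_scale()
    a <- attr[["min_value"]]
    b <- attr[["range_difference"]]

    if ("to_range" %in% attr_names) {
      # change_scale(): fold the target range into a and b
      to_range <- attr[["to_range"]]

      b <- b / diff(to_range)
      a <- a - b * to_range[1]
    }
  } else {
    # Without this branch, `a` and `b` would be undefined below
    stop("No supported linear-transformation attributes found.", call. = FALSE)
  }

  c(a = a, b = b)
}

library(datawizard)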
x <- rnorm(4, 40, 13)

Build transformation functions from linear transformation functions in datawizard

foo <- as_linear_transform(standardize(x))
foo(x)
#> [1] -1.39325099  0.09865699  0.32238805  0.97220595
standardize(x)
#> [1] -1.39325099  0.09865699  0.32238805  0.97220595
#> attr(,"center")
#> [1] 40.5878
#> attr(,"scale")
#> [1] 11.42871
#> attr(,"robust")
#> [1] FALSE


foo <- as_linear_transform(scale(x))
foo(x)
#> [1] -1.39325099  0.09865699  0.32238805  0.97220595
scale(x)
#>             [,1]
#> [1,] -1.39325099
#> [2,]  0.09865699
#> [3,]  0.32238805
#> [4,]  0.97220595
#> attr(,"scaled:center")
#> [1] 40.5878
#> attr(,"scaled:scale")
#> [1] 11.42871


foo <- as_linear_transform(change_scale(x, to = c(3, 14.5), range = c(-30, 200)))
foo(x)
#> [1] 5.733237 6.585766 6.713614 7.084943
change_scale(x, to = c(3, 14.5), range = c(-30, 200))
#> [1] 5.733237 6.585766 6.713614 7.084943
#> attr(,"min_value")
#> [1] -30
#> attr(,"range_difference")
#> [1] 230
#> attr(,"to_range")
#> [1]  3.0 14.5

Build inverse-transformation functions from linear transformation functions in datawizard

goo <- as_linear_inverse(center(x))
x
#> [1] 24.66474 41.71532 44.27228 51.69886
goo(center(x))
#> [1] 24.66474 41.71532 44.27228 51.69886
#> attr(,"center")
#> [1] 40.5878
#> attr(,"scale")
#> [1] 1
#> attr(,"robust")
#> [1] FALSE


goo <- as_linear_inverse(normalize(x))
x
#> [1] 24.66474 41.71532 44.27228 51.69886
goo(normalize(x))
#> [1] 24.66474 41.71532 44.27228 51.69886
#> attr(,"include_bounds")
#> [1] TRUE
#> attr(,"min_value")
#> [1] 24.66474
#> attr(,"range_difference")
#> [1] 27.03411


goo <- as_linear_inverse(scale(x))
x
#> [1] 24.66474 41.71532 44.27228 51.69886
goo(scale(x))
#>          [,1]
#> [1,] 24.66474
#> [2,] 41.71532
#> [3,] 44.27228
#> [4,] 51.69886
#> attr(,"scaled:center")
#> [1] 40.5878
#> attr(,"scaled:scale")
#> [1] 11.42871

Created on 2022-07-05 by the reprex package (v2.0.1)

mattansb avatar Jul 05 '22 08:07 mattansb

> Perhaps we should just return "identity" if no function or manipulation is detected? All others can be NULL?

But cbind() is a function and should not return "unknown".
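
One way to reconcile the two positions, sketched in base R: treat a bare variable as "identity", keep a small allow-list of wrapper functions (here only `cbind`) that also count as "identity", and report everything else as "unknown". `classify_response()` and its `passthrough` argument are hypothetical names, not insight's implementation; the transformations that are already recognised (for example log) would be matched before this fallback, and that step is left out here.

```r
# Hypothetical classifier for illustration; not insight's implementation.
classify_response <- function(formula, passthrough = "cbind") {
  lhs <- formula[[2]]
  if (is.name(lhs)) return("identity")         # bare variable, e.g. y ~ x
  fun <- lhs[[1]]
  # Namespaced calls such as pkg::fun() have a call, not a name, as head
  fun_name <- if (is.name(fun)) as.character(fun) else deparse(fun)
  if (fun_name %in% passthrough) return("identity")
  "unknown"                                     # any other function or operator
}

classify_response(y ~ x)
#> [1] "identity"
classify_response(cbind(successes, failures) ~ x)
#> [1] "identity"
classify_response(foo(y) ~ 1)
#> [1] "unknown"
```

The open question from the thread remains which other functions belong on the allow-list.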

strengejacke avatar Jul 05 '22 09:07 strengejacke

I'm not following either of your last comments @mattansb

bwiernik avatar Jul 05 '22 20:07 bwiernik

@bwiernik I gave examples of functions that perform simple linear transformations (scale, center, standardize, normalize and change_scale) that could potentially be used in a formula (e.g., scale(y) ~ x) and how to obtain the transformation functions and their inverse (which is what get_transformation() returns, potentially).

mattansb avatar Jul 06 '22 12:07 mattansb

I thought that when we talk about "transformation" in the sense of this function, we're talking about a different scale, like normal -> log, or normal -> exp, not standardizing/centering. So you suggest including those as well?

strengejacke avatar Jul 06 '22 12:07 strengejacke

Hmmm I think it might be useful; having scale(y) ~ x give a transformation of "identity" might be a little misleading, perhaps?
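
A small base R illustration of why "identity" can mislead here, using only `scale()` and `lm()` on `mtcars`. The variable names are arbitrary. Fitted values from `scale(mpg) ~ hp` live on the standardized scale, and getting back to the original units needs exactly the center and scale that `scale()` stores as attributes.

```r
# Centering and scaling constants that scale() attaches to its result
y_std  <- scale(mtcars$mpg)
center <- attr(y_std, "scaled:center")
spread <- attr(y_std, "scaled:scale")

m_std <- lm(scale(mpg) ~ hp, data = mtcars)   # response standardized in the formula
m_raw <- lm(mpg ~ hp, data = mtcars)          # response in original units

# Treating the response as "identity" would compare these directly,
# but the standardized fit is on a different scale:
isTRUE(all.equal(as.numeric(fitted(m_std)), as.numeric(fitted(m_raw))))
#> [1] FALSE

# Applying the inverse linear transformation recovers the original scale
back <- as.numeric(fitted(m_std)) * spread + center
isTRUE(all.equal(back, as.numeric(fitted(m_raw))))
#> [1] TRUE
```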

But if this would be too much work / break some stuff, we can save this issue for a later date (:

mattansb avatar Jul 06 '22 12:07 mattansb

Perhaps we could add a custom output first and then refine?

DominiqueMakowski avatar Aug 31 '22 03:08 DominiqueMakowski