
add informativeness and markedness

Open topepo opened this issue 6 years ago • 1 comment

See this reference by Powers.

topepo avatar Aug 10 '18 00:08 topepo

First thoughts, using the binary case: Informedness = Recall + Inverse Recall - 1. I think it generalizes to multiclass, but I didn't implement that.

The binary case is also called Youden's J statistic: https://en.wikipedia.org/wiki/Youden%27s_J_statistic
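For reference, here are the binary definitions written out in confusion-matrix counts. Recall is the true positive rate and inverse recall is the true negative rate; markedness (the other half of the issue title) is the same construction applied to the predictive values. The TP/FP/FN/TN notation is my own shorthand, not something from yardstick.

```latex
\[
\text{Informedness} = \text{TPR} + \text{TNR} - 1
  = \frac{TP}{TP + FN} + \frac{TN}{TN + FP} - 1
\]
\[
\text{Markedness} = \text{PPV} + \text{NPV} - 1
  = \frac{TP}{TP + FP} + \frac{TN}{TN + FN} - 1
\]
```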

library(yardstick)
#> Warning: package 'yardstick' was built under R version 3.4.4
#> Loading required package: broom
#> Warning: package 'broom' was built under R version 3.4.4
library(rlang)
#> Warning: package 'rlang' was built under R version 3.4.4

informedness <- function(data, ...) {
  UseMethod("informedness")
}

informedness.table <- 
  function(data, ...) {
  
    recall_value <- recall(data, ...)
    
    # If we got this far, it's 2x2; otherwise recall() would've errored
    invert <- c(2,1)
    inverse_recall_value <- recall(data[invert, invert], ...)
    
    # # Alternative, but a quick microbenchmark shows it to be slower on a 2x2 table
    # event_first <- getOption("yardstick.event_first", TRUE)
    # options(yardstick.event_first = !event_first)
    # on.exit({ options(yardstick.event_first = event_first) })
    # inverse_recall_value <- recall(data)
    
    recall_value + inverse_recall_value - 1
  }

informedness.data.frame <- 
  function(data, truth, estimate, na.rm = TRUE, ...) {
    vars <-
      yardstick:::factor_select(
        data = data,
        truth = !!enquo(truth),
        estimate = !!enquo(estimate),
        ...
      )
    
    xtab <- yardstick:::vec2table(
      truth = data[[vars$truth]],
      estimate = data[[vars$estimate]],
      na.rm = na.rm,
      two_class = TRUE,
      dnn = c("Prediction", "Truth"),
      ...
    )
    
    informedness.table(xtab)
  }

data("two_class_example")
informedness(two_class_example, truth, predicted)
#> [1] 0.6732334

# But also note (see the wiki page)
sens(two_class_example, truth, predicted) + spec(two_class_example, truth, predicted) - 1 
#> [1] 0.6732334

# Because sens = recall and spec = inverse recall

Created on 2018-08-15 by the reprex package (v0.2.0).
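Markedness isn't covered above, so here is a rough base-R sketch of both statistics computed directly from a 2x2 table. It assumes predictions in rows, truth in columns, and the event level first, matching the `dnn = c("Prediction", "Truth")` layout used above. `informedness_markedness()` is a hypothetical helper, not a yardstick function, and the counts are made up for illustration.

```r
# Hypothetical helper (not part of yardstick): binary informedness and
# markedness from a 2x2 table with predictions in rows, truth in columns,
# and the event level in the first row/column.
informedness_markedness <- function(xtab) {
  stopifnot(all(dim(xtab) == c(2L, 2L)))

  tp <- xtab[1, 1]  # predicted event,     truly event
  fp <- xtab[1, 2]  # predicted event,     truly non-event
  fn <- xtab[2, 1]  # predicted non-event, truly event
  tn <- xtab[2, 2]  # predicted non-event, truly non-event

  sens <- tp / (tp + fn)  # recall
  spec <- tn / (tn + fp)  # inverse recall
  ppv  <- tp / (tp + fp)  # precision
  npv  <- tn / (tn + fn)  # inverse precision

  c(informedness = sens + spec - 1, markedness = ppv + npv - 1)
}

# Made-up counts, filled column-wise: TP = 40, FN = 10, FP = 5, TN = 45
xtab <- matrix(
  c(40, 10, 5, 45),
  nrow = 2,
  dimnames = list(Prediction = c("yes", "no"), Truth = c("yes", "no"))
)

res <- informedness_markedness(xtab)
print(res)
```

Transposing the table swaps the two values, since markedness is just informedness with the roles of truth and prediction exchanged.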

DavisVaughan avatar Aug 15 '18 18:08 DavisVaughan