TTR icon indicating copy to clipboard operation
TTR copied to clipboard

rolling trimmed mean

Open ethanbsmith opened this issue 1 year ago • 7 comments
trafficstars

Description

new functionality supporting rolling trimmed means

when trim * n is not an integer, there is special handling needed for the boundaries: https://stats.stackexchange.com/questions/4252/how-to-calculate-the-truncated-or-trimmed-mean

very hokey rough implementation.

runTrimmedMean <- function(x, n, cumulative = FALSE, trim) {
  if ((trim > 0.5) || (trim < 0)) stop("trim must be between 0,.5")
  x <- try.xts(x, error = as.matrix)
  if (ncol(x) > 1) stop("only supports univariate x")
  if (nrow(x) < n) {
    is.na(x) <- T
    return(x)
  }

  v <- as.vector(x)
  if (cumulative) {
    r <- sapply(seq_len(nrow(x)), FUN = \(n) {mean(v[1:n], na.rm = T, trim = trim)})
  } else {
    #data.table::frollapply slightly faster than zoo::rollapply
    r <- data.table::frollapply(v, n = n, FUN = \(vals) {mean(vals, na.rm = T, trim = trim)})
  }
  return(reclass(r, x))
}

ethanbsmith avatar Mar 01 '24 21:03 ethanbsmith

also, not sure if this should be a standalone function in R, or just add a trim parameter to runMean, like how base R does it

I think that one way to think about this is that there are just variants of a single function, with runMean being an optimization for runTrimmedMean(trim = 0) and runMedian an optimization for runTrimmedMean(trim = .5)

ethanbsmith avatar Mar 01 '24 22:03 ethanbsmith

also, not sure if this should be a standalone function in R, or just add a trim parameter to runMean, like how base R does it

If I may join the discussion; I think having it as a trim-parameter is a good idea - then it could applied on all rolling means there is in TTR.

serkor1 avatar Mar 02 '24 06:03 serkor1

it seems to me that the place to put this would be in runMean, since it already exists, not as a new function

trim in mean.default is defined as:

  if (trim > 0 && n) {
    if (is.complex(x)) 
      stop("trimmed means are not defined for complex data")
    if (anyNA(x)) 
      return(NA_real_)
    if (trim >= 0.5) 
      return(stats::median(x, na.rm = FALSE))
    lo <- floor(n * trim) + 1
    hi <- n + 1 - lo
    x <- sort.int(x, partial = unique(c(lo, hi)))[lo:hi]
  }

so your prototype functionality would deliver the median of the data for any value of trim>0.05

braverock avatar Mar 02 '24 14:03 braverock

agree this should ideally just extend runMean util i get time to work on a proper C pr, here is a usable R implementation:

runTrimmedMean <- function(x, n, cumulative = FALSE, trim = 0) {
  if (trim <= 0) return(runMean(x, n = n, cumulative = cumulative))
  if (trim >= 0.5) return(runMedian(x, n = n, cumulative = cumulative))
  x <- try.xts(x, error = as.matrix)
  if (NCOL(x) > 1) stop("ncol(x) > 1. runSum only supports univariate 'x'")

  len.x <- nrow(x)
  r <- rep(NA_real_, len.x)
  v <- as.vector(x)
  start <- max(last(which(is.na(v))) + 1, 1)

  if (len.x >= (start + n)) {
    trim.start <- floor(trim * n) + 1
    trimmed.range <- trim.start:(n + 1 - trim.start)
    i <- start + n - 1
    buff <- sort(v[start:i])
    r[i] <- sum(buff[trimmed.range]) / length(trimmed.range)
    while(i < len.x) {
      i <- i+1
      if (cumulative) {
        #expand buffer
        n <- n + 1
        trim.start <- floor(trim * n) + 1
        trimmed.range <- trim.start:(n + 1 - trim.start)
      } else {
        #remove old value from buffer. will always get found by match, so no need for error handling
        buff <- buff[-match(v[i-n], buff)]
      }
      #add new value to buffer. match +  insertion is faster than resorting whole buffer
      after <- match(TRUE, v[i] <= buff, nomatch = length(buff) + 1) - 1
      buff <- append(buff, v[i], after = after) # < 1 will pre-pend
      r[i] <- sum(buff[trimmed.range]) / length(trimmed.range) #na's already handled, so bypass checks in mean
    }
  }
  return(reclass(r, x))
}

ethanbsmith avatar Mar 27 '24 22:03 ethanbsmith

Great start! Would it make sense to have unit tests of this function versus something like rollapply that uses mean with a trim value?

joshuaulrich avatar Mar 28 '24 16:03 joshuaulrich

yes. thats exactly how i tested this version. ill formalize them, as i need to that anyways

ethanbsmith avatar Mar 28 '24 17:03 ethanbsmith

Awesome. You can use 'tinytest' (it's a near drop-in replacement for 'testthat') instead of 'RUnit'. I intend to convert all the TTR tests to 'tinytest' at some point.

joshuaulrich avatar Mar 28 '24 17:03 joshuaulrich