
TheilSen percentage change

Open MohoWu opened this issue 4 years ago • 1 comments

Hi David,

I have been using the TheilSen function to calculate some trends and found an odd case where slope.percent falls outside the interval defined by lower.percent and upper.percent.

library(openair)

test <- structure(list(
  date = structure(c(1009843200, 1041379200, 1072915200, 
                     1104537600, 1136073600, 1167609600, 1199145600), tzone = "UTC", class = c("POSIXct", 
                                                                                               "POSIXt")), 
  value = c(1.172296, 1.013507, 1.382664, 4.498945, 
            0.773616, 0.581274, 0.550437)), row.names = c(NA, -7L), 
  class = "data.frame")

trend <- TheilSen(test, pol = "value", avg.time = "year")

[image: TheilSen trend plot]

trend$data$res2

default p.stars       date     conc       a             b   upper.a
1 default         2004-12-31 1.424677 4.79562 -0.0002958474 -24.48354
      upper.b  lower.a      lower.b         p      slope intercept
1 0.002023326 12.03392 -0.000833171 0.1769616 -0.1079843   4.79562
  intercept.lower intercept.upper      lower     upper slope.percent
1        12.03392       -24.48354 -0.3041074 0.7385141     -8.072048
  lower.percent upper.percent
1     -13.24614     -88.45522

I thought the percentage was calculated as follows: (y-intercept for the end date - y-intercept for the start date)/(end date - start date). If that's the case, upper.percent should be positive, judging from the graph. I'd appreciate any thoughts on this.

Cheers, Hao

MohoWu, Jul 15 '20 16:07

The problem here is that the percent change is defined as 100 * (Cend/Cstart - 1). For two of the lines this makes sense, because the start and end concentrations are both positive. The problem is the dashed line with the positive slope: it has a start concentration that is negative (-0.85), which obviously does not make sense. The thing to do, therefore, is not to express the changes as percentages (I'm not a fan of this anyway, because there are lots of ways it could be defined). It is best to stick to the raw trends, which do make sense. There's no other way around this unless a different definition were used…

davidcarslaw, Aug 12 '20 11:08
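
The definition 100 * (Cend/Cstart - 1) from the reply above can be checked by hand against the printed trend$data$res2. The sketch below is not openair's code. It assumes that a and b are the intercept and the slope per day since 1970-01-01, that the start and end dates are the first and last annual dates in the test data (2002-01-01 and 2008-01-01), and that the change is divided by the number of 365-day years. These assumptions are inferred from the numbers in the issue, and the name fits is made up for illustration.

```r
# Coefficients copied from the printed trend$data$res2 in the issue.
# Assumption: a = intercept, b = slope per day since 1970-01-01.
fits <- data.frame(
  line = c("slope", "lower", "upper"),
  a    = c(4.79562, 12.03392, -24.48354),
  b    = c(-0.0002958474, -0.000833171, 0.002023326)
)

# First and last dates of the annual series, as days since 1970-01-01.
start_day <- as.numeric(as.Date("2002-01-01"))
end_day   <- as.numeric(as.Date("2008-01-01"))

# Assumption: a year is taken as 365 days (consistent with slope = 365 * b).
n_years <- (end_day - start_day) / 365

# Fitted concentration at the start and end of each line.
fits$c_start <- fits$a + fits$b * start_day
fits$c_end   <- fits$a + fits$b * end_day

# Percent change per year: 100 * (Cend / Cstart - 1), spread over the period.
fits$percent_per_year <- 100 * (fits$c_end / fits$c_start - 1) / n_years

print(fits)
```

Under these assumptions the three values in percent_per_year agree, to rounding, with the printed slope.percent, lower.percent and upper.percent. The "upper" line has a negative start value (about -0.83 with these rounded coefficients), so dividing by it flips the sign of the ratio and turns a positive slope into a large negative percentage.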