CVXR icon indicating copy to clipboard operation
CVXR copied to clipboard

How to fit a constrained three-part linear spline to time series data?

Open nikosGeography opened this issue 7 months ago • 0 comments

Hi, thank you for creating the package, keep up the good work. This is not a bug nor a feature request, it's more like an issue I have when I try to utilize the package for a problem I have, so feel free to close the issue if you think this question is not acceptable here.

I'm working with time series data representing nighttime lights (NTL) across multiple cities, aiming to model the response to a known disruption with a fixed start and end date.

I want to fit a three-part linear spline to each NTL time series:

  • fa: Pre-disruption (before disruption start)

  • fb: During disruption (between disruption start and end)

  • fc: Post-disruption (after disruption end)

The spline must be continuous (i.e., join at the disruption start and end). The slope of fa should always be 0 (flat pre-disruption trend). I aim to fit this spline to each time series (I have data for many cities) while enforcing constraints on the slopes of fb and fc to match the conceptual recovery pattern:

Chronic Vulnerability:

  • fa: flat (0)

  • fb: negative

  • fc: negative

I want to fit this pattern to observed data and calculate the R². What's the best way to implement this, ensuring continuity and enforcing these slope constraints? Just to be clear, the observed (actual) data have the pattern shown in the attached image.

Image

What I am looking for is an automatic way (i.e., no fixed values) to fit a 3-part linear-splines model (one model per period) with the constraints I mentioned above, that connect to known knots (i.e., disruption dates, red dotted lines in the plot).

I tried to simulate such 3-part linear splines with constraints on slope direction (i.e., set the monotonicity of the slope to be negative between and after the knots) using CVXR because from what I read online looks the right package for the job, but unfortunately after MANY tries I couldn't make it work. Is it possible to achieve my goals using CVXR? If needed, I can share the last try I made with the code.

Many thanks.

The dataset:

> dput(df)
structure(list(date = c("01-01-18", "01-02-18", "01-03-18", "01-04-18", 
"01-05-18", "01-06-18", "01-07-18", "01-08-18", "01-09-18", "01-10-18", 
"01-11-18", "01-12-18", "01-01-19", "01-02-19", "01-03-19", "01-04-19", 
"01-05-19", "01-06-19", "01-07-19", "01-08-19", "01-09-19", "01-10-19", 
"01-11-19", "01-12-19", "01-01-20", "01-02-20", "01-03-20", "01-04-20", 
"01-05-20", "01-06-20", "01-07-20", "01-08-20", "01-09-20", "01-10-20", 
"01-11-20", "01-12-20", "01-01-21", "01-02-21", "01-03-21", "01-04-21", 
"01-05-21", "01-06-21", "01-07-21", "01-08-21", "01-09-21", "01-10-21", 
"01-11-21", "01-12-21", "01-01-22", "01-02-22", "01-03-22", "01-04-22", 
"01-05-22", "01-06-22", "01-07-22", "01-08-22", "01-09-22", "01-10-22", 
"01-11-22", "01-12-22", "01-01-23", "01-02-23", "01-03-23", "01-04-23", 
"01-05-23", "01-06-23", "01-07-23", "01-08-23", "01-09-23", "01-10-23", 
"01-11-23", "01-12-23"), ba = c(5.631965012, 5.652943903, 5.673922795, 
5.698648054, 5.723373314, 5.749232037, 5.77509076, 5.80020167, 
5.82531258, 5.870469864, 5.915627148, 5.973485875, 6.031344603, 
6.069760262, 6.10817592, 6.130933313, 6.153690706, 6.157266393, 
6.16084208, 6.125815676, 6.090789273, 6.02944691, 5.968104547, 
5.905129394, 5.842154242, 5.782085265, 5.722016287, 5.666351167, 
5.610686047, 5.571689415, 5.532692782, 5.516260933, 5.499829083, 
5.503563375, 5.507297667, 5.531697846, 5.556098024, 5.583567118, 
5.611036212, 5.636610944, 5.662185675, 5.715111139, 5.768036603, 
5.862347902, 5.956659202, 6.071535763, 6.186412324, 6.30989678, 
6.433381236, 6.575014889, 6.716648541, 6.860849606, 7.00505067, 
7.099267331, 7.193483993, 7.213179035, 7.232874077, 7.203921341, 
7.174968606, 7.12081735, 7.066666093, 6.994413881, 6.922161669, 
6.841271288, 6.760380907, 6.673688099, 6.586995291, 6.502777891, 
6.418560491, 6.338127583, 6.257694675, 6.179117301)), class = "data.frame", row.names = c(NA, 
-72L))

Disruption dates

lockdown_dates_retail <- list(
  ba = as.Date(c("2020-03-01", "2021-05-01"))
)

Session info

R version 4.5.0 (2025-04-11 ucrt)
Platform: x86_64-w64-mingw32/x64
Running under: Windows 11 x64 (build 26100)
Matrix products: default
  LAPACK version 3.12.1
locale:
[1] LC_COLLATE=English_United States.utf8  LC_CTYPE=English_United States.utf8    LC_MONETARY=English_United States.utf8
[4] LC_NUMERIC=C                           LC_TIME=English_United States.utf8    
tzcode source: internal
attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     
other attached packages:
[1] dplyr_1.1.4

nikosGeography avatar Jun 05 '25 15:06 nikosGeography