
[R] Error: "Problem while computing ds = set_date(ds)" in predict method

Open · laresbernardo opened this issue 2 years ago · 1 comment

Hi meta-mates! I've run into an issue I can't seem to solve and would appreciate your help, since I'm not sure whether it's a bug in prophet or in the way I'm using it. As you might already know, we use prophet to decompose the data in Robyn, our open-source Marketing Science MMM solution. When the input data is daily (not weekly), we get this error:

RDS file to replicate: temp.RDS.zip

temp <- readRDS("~/temp.RDS")
mod <- fit.prophet(temp$modelRecurrence, temp$dt_regressors)
forecastRecurrence <- predict(mod, temp$dt_regressors)
Error in `dplyr::mutate()`:
! Problem while computing `ds = set_date(ds)`.
✖ `ds` must be size 450 or 1, not 452.
Run `rlang::last_error()` to see where the error occurred.
> rlang::last_trace()
<error/dplyr:::mutate_error>
Error in `dplyr::mutate()`:
! Problem while computing `ds = set_date(ds)`.
✖ `ds` must be size 450 or 1, not 452.
---
Backtrace:
     ▆
  1. ├─stats::predict(mod, dt_regressors)
  2. ├─prophet:::predict.prophet(mod, dt_regressors)
  3. │ └─prophet:::predict_seasonal_components(object, df)
  4. │   └─prophet:::make_all_seasonality_features(m, df)
  5. │     └─prophet:::make_holiday_features(m, df$ds, holidays)
  6. │       └─... %>% tidyr::spread(holiday, x, fill = 0)
  7. ├─tidyr::spread(., holiday, x, fill = 0)
  8. ├─dplyr::mutate(., x = 1)
  9. ├─dplyr::do(...)
 10. ├─dplyr::filter(., dplyr::row_number() == 1)
 11. ├─dplyr::group_by(., holiday, ds)
 12. ├─dplyr::mutate(., ds = set_date(ds))
 13. ├─dplyr:::mutate.data.frame(., ds = set_date(ds))
 14. │ └─dplyr:::mutate_cols(.data, dplyr_quosures(...), caller_env = caller_env())
 15. │   ├─base::withCallingHandlers(...)
 16. │   └─mask$eval_all_mutate(quo)
 17. ├─dplyr:::dplyr_internal_error(...)
 18. │ └─rlang::abort(class = c(class, "dplyr:::internal_error"), dplyr_error_data = data)
 19. │   └─rlang:::signal_abort(cnd, .file)
 20. │     └─base::signalCondition(cnd)
 21. └─dplyr `<fn>`(`<dpl:::__>`)
 22.   └─rlang::abort(...)

I noticed that the temp$modelRecurrence$holidays element contains 450 rows, so I'm not sure why the error reports a ds of size 452 in the pipeline that ends in tidyr::spread (see the backtrace). Maybe this is a hint? CC: @bletham

laresbernardo · Aug 17 '22 14:08

I believe I found a fix for this inside the construct_holiday_dataframe function.

In this code block:

  if (!is.null(m$train.holiday.names)) {
    row.to.keep <- which(all.holidays$holiday %in% m$train.holiday.names)
    all.holidays <- all.holidays[row.to.keep, ]
    holidays.to.add <- data.frame(holiday = setdiff(m$train.holiday.names, 
      all.holidays$holiday))
    all.holidays <- suppressWarnings(dplyr::bind_rows(all.holidays, 
      holidays.to.add))
  }

When holidays.to.add is an empty data frame, the bind_rows call adds an NA to the ds column of all.holidays.

If we change all.holidays <- suppressWarnings(dplyr::bind_rows(all.holidays, holidays.to.add)) to

all.holidays <- if (nrow(holidays.to.add) > 0) suppressWarnings(dplyr::bind_rows(all.holidays, holidays.to.add)) else all.holidays

then I believe it should work, based on the testing I did.
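One thing to watch with a guard like this: base R's ifelse() is vectorised over its test and builds its result element by element, so it does not hand back a whole data frame, whereas a plain if / else returns the selected object unchanged. The snippet below is a minimal base-R sketch of that difference. The toy all.holidays and holidays.to.add frames are made-up stand-ins, not the data from temp.RDS, and the "yes" branch is just a placeholder for the bind_rows() result.

```r
# Hypothetical stand-ins for the objects inside construct_holiday_dataframe
all.holidays <- data.frame(
  ds = as.Date(c("2022-01-01", "2022-12-25")),
  holiday = c("New Year", "Christmas"),
  stringsAsFactors = FALSE
)
# No training-only holidays are missing, so there is nothing to add
holidays.to.add <- data.frame(holiday = character(0), stringsAsFactors = FALSE)

# ifelse() fills a length-1 result from the chosen branch,
# so the data frame structure is lost
via.ifelse <- ifelse(nrow(holidays.to.add) > 0, holidays.to.add, all.holidays)

# if / else evaluates only the chosen branch and returns it as-is
via.if <- if (nrow(holidays.to.add) > 0) holidays.to.add else all.holidays

print(is.data.frame(via.ifelse))
print(is.data.frame(via.if))
print(identical(via.if, all.holidays))
```

If the guard is only meant to skip the bind when there is nothing to add, wrapping the original assignment in if (nrow(holidays.to.add) > 0) { ... } is an equivalent way to write it.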

kyletgoldberg · Aug 24 '22 17:08