
Validation procedure of Robyn

ghltk opened this issue 1 year ago • 8 comments

Hi team, we have finished building our Robyn model and want to apply the Budget Allocation results to next month's budget plan. Before we apply the model, we need a way to measure its effectiveness. However, the Robyn guide doesn't cover how to do incremental verification of media channels, so I'm asking here.

In the Budget Allocation one-pager, we can see the Total Response of each media channel. At first, we simply set Total Response as a target KPI and tried to see how close we got to that number after a simulation period (e.g., 4 weeks). However, we found that the total response obtained from the refreshed model was not comparable with the total response obtained from the initial model. Since the two models used different data, they attribute the media and non-media channels' contributions differently, so Total Response cannot be used as a validation value across models.

[Questions]

  • Can I set Total Response as the target KPI for each media channel in the Budget Allocation result of the initial model?
  • If so, how can I get comparable numbers for verification after the simulation period (4 weeks)?
  • Is there a way to check whether the expected response has been reached for each channel in the same way?

I'd appreciate it if you could explain Robyn's validation procedure. Even if it is not a 100% scientific verification method, it would be helpful if you could describe the currently available validation methods. (We are not currently in a position to run Calibration or Geo Lift experiments, so please exclude those methods.)

Thank you!😊

ghltk avatar Jul 06 '23 09:07 ghltk

Hi, sorry for the late reply. On your question about the budget allocator and the initial model not matching, please check this answer from me on another similar issue.

Regarding validation, as explained above, you'll get the same spend share as the initial model by doing the following. I just picked a random model built on the simulated data:

library(Robyn)
library(dplyr)

AllocatorCollect1 <- robyn_allocator(
  InputCollect = InputCollect,
  OutputCollect = OutputCollect,
  select_model = select_model,
  date_range = "all", # default is last month; "all" matches the initial model's full window
  channel_constr_low = 0.7,
  channel_constr_up = c(1.2, 1.5, 1.5, 1.5, 1.5),
  scenario = "max_response",
  export = create_files # create_files (TRUE/FALSE) as defined in the demo script
)
## initial model: historical spend & effect share per channel
OutputCollect$xDecompAgg %>%
  filter(solID == select_model & !is.na(spend_share)) %>%
  select(rn, spend_share, effect_share) %>%
  arrange(rn)

## allocator: initial spend & response share per channel
AllocatorCollect1$dt_optimOut %>% select(channels, initSpendShare, initResponseUnitShare)

As you can see in the result, the spend shares are the same after setting date_range to "all", because the initial model one-pager considers all dates, while the allocator covers only the last 4 weeks by default.

And yes, the effect share is different, which is also explained in the linked comment above. For the initial model, the effect share is just the channel's % of the total weekly avg. effect, i.e. the historical share. For the allocator, I need to use the weekly avg. spend to simulate the weekly avg. carryover and then simulate the weekly avg. response. It's a simulation process, NOT the historical share anymore.
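To make that simulation idea concrete, here is a minimal sketch of the same logic using robyn_response(). It is hedged: it assumes the standard demo objects (InputCollect, OutputCollect, select_model) exist and uses the simulated-data channel name facebook_S; swap in your own channel.

## Minimal sketch (assumptions: demo objects exist; "facebook_S" is one of
## your paid media spends). Feeding the weekly avg spend into robyn_response()
## mimics the allocator's simulated weekly avg response for one channel.
avg_spend <- mean(InputCollect$dt_input$facebook_S)
resp_avg <- robyn_response(
  InputCollect = InputCollect,
  OutputCollect = OutputCollect,
  select_model = select_model,
  metric_name = "facebook_S",
  metric_value = avg_spend, # constant weekly avg spend instead of actual spends
  date_range = "all"
)
mean(resp_avg$response_total) # comparable in spirit to the allocator's initResponseUnit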

gufengzhou avatar Jul 24 '23 03:07 gufengzhou

Just found a bug in the response function and pushed a fix. Now you can validate between the initial model, robyn_response, and robyn_allocator as follows:

## comparing responses
last_period <- 1
media_sorted <- sort(InputCollect$paid_media_spends)

## get last period response from initial model
val_response_a <- OutputCollect$xDecompVecCollect %>%
  filter(solID == select_model) %>%
  select(ds, all_of(media_sorted)) %>% # all_of() avoids dplyr's ambiguity warning
  tail(last_period)

## get last period response from robyn_response
val_response_b <- list()
for (i in seq_along(media_sorted)) {
  Response <- robyn_response(
    InputCollect = InputCollect,
    OutputCollect = OutputCollect,
    select_model = select_model,
    metric_name = media_sorted[i],
    date_range = paste0("last_", last_period)
  )
  val_response_b[["ds"]] <- Response$date
  val_response_b[[media_sorted[i]]] <- Response$response_total
}
val_response_b <- bind_cols(val_response_b)

## get last period response from robyn_allocator
AllocatorCollect1 <- robyn_allocator(
  InputCollect = InputCollect,
  OutputCollect = OutputCollect,
  select_model = select_model,
  date_range = paste0("last_", last_period), # Default last month as initial period
  # total_budget = NULL, # When NULL, default is total spend in date_range
  channel_constr_low = 0.7,
  channel_constr_up = c(1.2, 1.5, 1.5, 1.5, 1.5),
  # channel_constr_multiplier = 3,
  scenario = "max_response",
  export = create_files
)
val_response_c <- AllocatorCollect1$dt_optimOut %>% select(date_min, date_max, initResponseUnit)

val_response_a
val_response_b
val_response_c

When doing last_period <- 1, you can see they all align.

When doing last_period <- 3, the initial model and the response function align and output the historical response for every period, but the allocator runs a simulation behind the scenes and thus uses the avg. carryover of the last 3 periods to determine the result.

gufengzhou avatar Jul 24 '23 08:07 gufengzhou

@gufengzhou, so given the above example and looking at the results for facebook_s, the expected total response for the allotted period would be:

                  initResponseUnit  Periods  Total
last_period = 1           106625.5        1  106625.5
last_period = 3           99103.13        3  297309.4
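That arithmetic can be checked directly from the allocator output. A hedged check, assuming AllocatorCollect1 was produced with date_range = "last_3" and that facebook_S is the channel name in dt_optimOut:

## Hedged check: initResponseUnit is a per-period figure, so the expected
## total for the window is unit * number of periods (channel name assumed).
AllocatorCollect1$dt_optimOut %>%
  filter(channels == "facebook_S") %>%
  mutate(expected_total = initResponseUnit * 3) %>%
  select(channels, initResponseUnit, expected_total)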

tgtod002 avatar Aug 14 '23 20:08 tgtod002

I ran the validation using my data... I'm trying to understand the results.

val_response_a:

   ds          CSI_SPEND     WRAP_SPEND
48 11/27/2022  $ 161,919.87  $ 2,361.20
49 12/4/2022   $ 127,123.26  $    58.60
50 12/11/2022  $ 269,868.51  $     0.90
51 12/18/2022  $ 240,025.21  $     0.01
52 12/25/2022  $ 242,438.92  $     0.00

Val_response_b"

48 11/27/2022 $    161,919.87 $         2,361.20
49 12/4/2022 $    127,123.26 $              58.60
50 12/11/2022 $    269,868.51 $                0.90
51 12/18/2022 $    240,025.21 $                0.01
52 12/25/2022 $    242,438.92 $                0.00

val_response_c:

            date_min  date_max    initResponseUnit
CSI_SPEND   1/2/2022  12/25/2022       4241.296224
WRAP_SPEND  1/2/2022  12/25/2022        402.289325

Why is val_response_c so low? Does it divide by 52? (I ran the MMM for 52 weeks.)

Thanks again for all your help on this.

tgtod002 avatar Aug 14 '23 23:08 tgtod002

The response level is calculated from a given spend combined with a given historical carryover/adstock. The given spend for response_c is the avg. spend of the last 52 weeks, and the given carryover is the avg. carryover of those 52 weeks.

Without looking into your data, it's possible that the low response_c comes from a relatively high avg. carryover and/or a relatively low avg. spend, compared to the actual levels in each period.

In your case, I suggest you experiment with different date_range values to find the appropriate levels. We recommend using more recent periods rather than reaching too far back, so the allocation reflects your recent adstocking and saturation behaviour.
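For example, a small loop like the sketch below can compare allocator outputs across several recent windows (hedged: the window labels are illustrative, and it reuses the objects defined earlier in this thread):

## Hedged sketch: rerun the allocator over different recent windows to see
## how the avg spend/carryover assumptions shift the simulated response.
for (dr in c("last_4", "last_8", "last_13", "last_26")) {
  ac <- robyn_allocator(
    InputCollect = InputCollect,
    OutputCollect = OutputCollect,
    select_model = select_model,
    date_range = dr,
    scenario = "max_response",
    export = FALSE # skip writing files while experimenting
  )
  cat("\n== date_range:", dr, "==\n")
  print(select(ac$dt_optimOut, channels, initResponseUnit))
}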

gufengzhou avatar Aug 15 '23 15:08 gufengzhou

@gufengzhou, shouldn't response C's initResponseUnit = immediate response + carryover response?

My client creates budgets on a 52-week window. Any suggestions for running allocations on such a window would be welcome.

tgtod002 avatar Aug 29 '23 01:08 tgtod002
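On the 52-week budget window asked about above, one possible approach is sketched below, using only allocator parameters already shown in this thread (the last_n date_range form and total_budget). The budget figure is a hypothetical placeholder, and note that initResponseUnit is per period, so a 52-week total would be the unit times 52.

## Hedged sketch: allocate a full-year budget using the last 52 weeks as the
## reference window. Assumes at least 52 modeled periods; total_budget is a
## hypothetical placeholder value.
AllocatorCollect52 <- robyn_allocator(
  InputCollect = InputCollect,
  OutputCollect = OutputCollect,
  select_model = select_model,
  date_range = "last_52",
  total_budget = 5e6, # hypothetical: replace with the real annual budget
  scenario = "max_response",
  export = FALSE
)
AllocatorCollect52$dt_optimOut %>%
  select(channels, date_min, date_max, initResponseUnit)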

@gufengzhou, shouldn't response C's initResponseUnit = immediate response + carryover response?

My client creates budgets on a 52-week window. Any suggestions for running allocations on such a window would be welcome.

Did you get a better understanding of the budget allocator? Can you share your findings? I have spent so much time on this and I still don't understand what is going on here.

AdimDrewnik avatar Aug 14 '24 15:08 AdimDrewnik