Validation procedure of Robyn
Hi team, we have finished building our Robyn model and want to apply the Budget Allocation results to next month's budget plan. If we apply the model, we need to measure its effectiveness. However, the Robyn guide doesn't explain how to do incremental verification of media channels, so I'm asking here.
In the Budget Allocation one-pager, we can see the Total Response of each media channel. At first, we simply set Total Response as a target KPI and tried to see how close we got to that number after a simulation period (e.g., 4 weeks). However, we found that the total response obtained from the refreshed model was not comparable with the total response from the initial model. Since the two models use different data, they attribute the contributions of media and non-media channels differently, so it cannot be used as a validation value.
[Questions]
- Can I set Total Response as the target KPI for each media channel in the Budget Allocation result of the initial model?
- If so, how can I get the numbers to compare for verification after the simulation period (4 weeks)?
- Is there a way to check, per channel, whether the expected response has been reached in the same way?
I'd appreciate it if you could explain the validation procedure for Robyn. Even if it is not a 100% scientific verification method, it would be helpful if you could describe the currently available validation methods. (We are currently not in a position to run Calibration or Geo Lift experiments, so please leave those methods aside.)
Thank you!😊
Hi, sorry for the late reply. Regarding your question about the budget allocator and the initial model not matching, please check my answer on another, similar issue.
Regarding validation, as explained above, you'll get the same spend share as the initial model by doing this. I just picked a random model built on the simulated data:
library(dplyr)

AllocatorCollect1 <- robyn_allocator(
  InputCollect = InputCollect,
  OutputCollect = OutputCollect,
  select_model = select_model,
  date_range = "all", # use all dates instead of the default (last month)
  channel_constr_low = 0.7,
  channel_constr_up = c(1.2, 1.5, 1.5, 1.5, 1.5),
  scenario = "max_response",
  export = create_files
)

## spend & effect share from the initial model
OutputCollect$xDecompAgg %>%
  filter(solID == select_model & !is.na(spend_share)) %>%
  select(rn, spend_share, effect_share) %>%
  arrange(rn)

## spend & response share from the allocator
AllocatorCollect1$dt_optimOut %>%
  select(channels, initSpendShare, initResponseUnitShare)
As you can see in the result, the spend shares are the same after setting date_range to "all", because the initial model one-pager considers all dates, while the allocator uses the last 4 weeks by default.
And yes, the effect share is different, which is also explained in the linked comment above. For the initial model, the effect share is just each channel's % of the weekly avg. effect, i.e. simply the historical share. For the allocator, I need to use the weekly avg. spend to simulate the weekly avg. carryover and then the weekly avg. response. It's a simulation process, NOT the historical share anymore.
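To compare the two share tables side by side, here's a minimal sketch that joins them by channel name. It assumes the objects from the snippet above are still in the session; all column names come straight from that snippet.

## initial-model shares vs. allocator shares, one row per channel
shares_model <- OutputCollect$xDecompAgg %>%
  filter(solID == select_model & !is.na(spend_share)) %>%
  select(rn, spend_share, effect_share)

shares_alloc <- AllocatorCollect1$dt_optimOut %>%
  select(channels, initSpendShare, initResponseUnitShare)

left_join(shares_model, shares_alloc, by = c("rn" = "channels")) %>%
  arrange(rn)

The spend_share and initSpendShare columns should match after setting date_range = "all", while effect_share and initResponseUnitShare generally won't, for the reason described above.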
I just found a bug and pushed a fix to the response function. Now you can validate between the initial model, robyn_response() and robyn_allocator() as follows:
## comparing responses
last_period <- 1
media_sorted <- sort(InputCollect$paid_media_spends)

## get last period response from initial model
val_response_a <- OutputCollect$xDecompVecCollect %>%
  filter(solID == select_model) %>%
  select(ds, all_of(media_sorted)) %>%
  tail(last_period)

## get last period response from robyn_response
val_response_b <- list()
for (i in seq_along(media_sorted)) {
  Response <- robyn_response(
    InputCollect = InputCollect,
    OutputCollect = OutputCollect,
    select_model = select_model,
    metric_name = media_sorted[i],
    date_range = paste0("last_", last_period)
  )
  val_response_b[["ds"]] <- Response$date
  val_response_b[[media_sorted[i]]] <- Response$response_total
}
val_response_b <- bind_cols(val_response_b)
## get last period response from robyn_allocator
AllocatorCollect1 <- robyn_allocator(
  InputCollect = InputCollect,
  OutputCollect = OutputCollect,
  select_model = select_model,
  date_range = paste0("last_", last_period), # default is the last month
  # total_budget = NULL, # when NULL, default is total spend in date_range
  channel_constr_low = 0.7,
  channel_constr_up = c(1.2, 1.5, 1.5, 1.5, 1.5),
  # channel_constr_multiplier = 3,
  scenario = "max_response",
  export = create_files
)

val_response_c <- AllocatorCollect1$dt_optimOut %>%
  select(date_min, date_max, initResponseUnit)
val_response_a
val_response_b
val_response_c
When doing last_period <- 1, you can see they all align.
When doing last_period <- 3, the initial model and the response function align and output the historical response for every period, but the allocator runs a simulation behind the scenes and thus uses the avg. carryover of the last 3 periods to determine its result.
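As a quick numeric check of that alignment, here's a sketch that totals the per-channel response from the first two objects (it assumes val_response_a, val_response_b and media_sorted from the snippet above are still in the session):

## total response per channel over the chosen window:
## initial model decomposition vs. robyn_response; these should match
sapply(media_sorted, function(ch) {
  c(initial_model  = sum(val_response_a[[ch]]),
    robyn_response = sum(val_response_b[[ch]]))
})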
@gufengzhou, so given the above example and looking at the results for facebook_s, the expected total response for the allotted period would be:
| | initResponseUnit | Periods | Total |
|---|---|---|---|
| when last_period = 1 | 106625.5 | 1 | 106625.5 |
| when last_period = 3 | 99103.13 | 3 | 297309.4 |
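As a sketch, this is the same arithmetic applied to every channel in the allocator output (initResponseUnit times the number of periods is the poster's working assumption here, not something confirmed elsewhere in the thread):

## expected total response over the allotted window, per channel
AllocatorCollect1$dt_optimOut %>%
  select(channels, initResponseUnit) %>%
  mutate(expected_total = initResponseUnit * last_period)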
I ran the validation using my data and am trying to understand the results.
val_response_a:

| | ds | CSI_SPEND | WRAP_SPEND |
|---|---|---|---|
| 48 | 11/27/2022 | $161,919.87 | $2,361.20 |
| 49 | 12/4/2022 | $127,123.26 | $58.60 |
| 50 | 12/11/2022 | $269,868.51 | $0.90 |
| 51 | 12/18/2022 | $240,025.21 | $0.01 |
| 52 | 12/25/2022 | $242,438.92 | $0.00 |
val_response_b:
| | ds | CSI_SPEND | WRAP_SPEND |
|---|---|---|---|
| 48 | 11/27/2022 | $161,919.87 | $2,361.20 |
| 49 | 12/4/2022 | $127,123.26 | $58.60 |
| 50 | 12/11/2022 | $269,868.51 | $0.90 |
| 51 | 12/18/2022 | $240,025.21 | $0.01 |
| 52 | 12/25/2022 | $242,438.92 | $0.00 |
val_response_c:
| channels | date_min | date_max | initResponseUnit |
|---|---|---|---|
| CSI_SPEND | 1/2/2022 | 12/25/2022 | 4241.296224 |
| WRAP_SPEND | 1/2/2022 | 12/25/2022 | 402.289325 |
Why is val_response_c so low? Does it divide by 52? (I ran the MMM over 52 weeks.)
Thanks again for all your help on this.
The response level is calculated from a given spend applied to a given historical carryover/adstock. For response_c, the given spend is the avg. spend of the last 52 weeks, and the given carryover is the avg. carryover of those 52 weeks.
Without looking into your data, it's possible that the low response_c comes from a relatively high avg. carryover and/or a relatively low avg. spend compared to the actual weekly levels.
In your case, I suggest you experiment with different date_range values to find the appropriate levels. We recommend using rather recent periods instead of going too far back, to reflect your recent adstocking and saturation behaviour.
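For instance, a minimal sketch of such an experiment, assuming InputCollect, OutputCollect and select_model from the model fit above (the date_range values and constraint levels are arbitrary examples, not recommendations):

## run the allocator over several lookback windows and compare the
## per-period response level it derives for each channel
ranges <- c("last_4", "last_13", "last_26", "last_52")
resp_by_range <- lapply(ranges, function(dr) {
  out <- robyn_allocator(
    InputCollect = InputCollect,
    OutputCollect = OutputCollect,
    select_model = select_model,
    date_range = dr,
    channel_constr_low = 0.7,
    channel_constr_up = 1.5,
    scenario = "max_response",
    export = FALSE
  )
  out$dt_optimOut %>%
    select(channels, initResponseUnit) %>%
    mutate(date_range = dr)
})
bind_rows(resp_by_range)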
@gufengzhou, shouldn't response C's initResponseUnit equal the immediate response plus the carryover response?
My client creates budgets on a 52-week window. Any suggestions for running allocations over such a window would be welcome.
Did you get a better understanding of the budget allocator? Can you share your findings? I have spent so much time on this and I still don't understand what is going on.