pymc-marketing
MMM Example Notebook: Time Varying Coefficients
To demonstrate the flexibility and potential of the framework (namely, Bayesian statistics and PyMC), we would like to have an example notebook illustrating how to extend the base model (based on simulated data) introduced in https://github.com/pymc-labs/pymmmc/pull/41 by allowing time-varying coefficients via Gaussian processes.
Are there any concrete plans on if/when this is going to be implemented? I would be quite(!) interested in this.
I think we definitely want to bring this in at some point (no ETA). In the meantime, here is a simulated example using a Gaussian random walk. Still, I think Gaussian processes should work (and scale) a bit better; see for example this blog post by PyMC Labs.
Thanks for the quick response. I was actually hoping you would point to the hierarchical Gaussian process post :+1: I'll reread your post about the Gaussian random walk, as it seems simpler.
I've ported an uber/orbit-like model for time-varying coefficients to PyMC. Here are MVP snippets:
# From: https://github.com/uber/orbit/blob/dev/orbit/utils/kernels.py
import numpy as np

def gaussian_kernel(t, t_refs, rho: float = 0.1, alpha: float = 1, point_to_flatten: float = 1):
    # Squared-exponential weights between time points t and knot locations t_refs;
    # beyond point_to_flatten the time axis is held constant (flat extrapolation).
    ker = np.where(
        (t <= point_to_flatten).reshape(-1, 1),
        (alpha**2) * np.exp(-np.power(t.reshape(-1, 1) - t_refs.reshape(1, -1), 2) / (2 * rho**2)),
        (alpha**2) * np.exp(-np.power(np.full(t.shape[0], point_to_flatten).reshape(-1, 1) - t_refs.reshape(1, -1), 2) / (2 * rho**2)),
    )
    # Normalize each row so the knot weights sum to one
    return ker / np.sum(ker, axis=1, keepdims=True)
def triangle_kernel(t, t_refs):
    # Piecewise-linear ("hat") weights: each time point splits its weight between
    # its two neighbouring knots, so ker @ values linearly interpolates the knots.
    ker = np.zeros((t.shape[0], t_refs.shape[0]))
    ker[t < t_refs[0], 0] = 1
    for idx in range(t_refs.shape[0] - 1):
        valid_idxes = (t >= t_refs[idx]) & (t < t_refs[idx + 1])
        norm = t_refs[idx + 1] - t_refs[idx]
        ker[valid_idxes, idx] = np.abs(t[valid_idxes] - t_refs[idx + 1]) / norm
        ker[valid_idxes, idx + 1] = np.abs(t[valid_idxes] - t_refs[idx]) / norm
    ker[t >= t_refs[-1], -1] = 1
    return ker / np.sum(ker, axis=1, keepdims=True)
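A quick way to see what `triangle_kernel` computes: its columns are "hat" basis functions, so multiplying the kernel by a vector of knot values is exactly linear interpolation. This self-contained check builds the same weight matrix with `np.interp` (interpolating each indicator vector over the knots) rather than calling the function above:

```python
import numpy as np

t = np.linspace(0.0, 1.0, 50)
t_refs = np.linspace(0.0, 1.0, 6)  # 6 knots -> 5 linear segments

# Column j is the linear interpolation of the j-th indicator vector over the
# knots: a "hat" function peaking at knot j, identical to the triangle kernel.
ker = np.column_stack(
    [np.interp(t, t_refs, np.eye(len(t_refs))[j]) for j in range(len(t_refs))]
)

values = np.array([0.0, 2.0, 1.0, 3.0, 2.0, 4.0])  # arbitrary knot values
interpolated = ker @ values  # matches np.interp(t, t_refs, values)
```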
And
# Hyperparameters
import numpy as np
import pandas as pd

n_knots = 10
n_order = 3

date = data_df.date.to_numpy()
t = np.array(data_df.date.index)
t = (t - t.min()) / (t.max() - t.min())  # scale time to [0, 1]
y = data_df["y"].to_numpy().reshape(-1)
x = data_df[["x1", "x2", "x3"]].to_numpy()
t_refs = np.linspace(t.min(), t.max(), n_knots + 1)  # evenly spaced knots
periods = np.array(data_df.date.dt.day_of_year) / 365.25

trend_ker = triangle_kernel(t, t_refs)
reg_ker = gaussian_kernel(t, t_refs)

fourier_features = pd.DataFrame(
    {
        f"{func}_order_{order}": getattr(np, func)(2 * np.pi * periods * order)
        for order in range(1, n_order + 1)
        for func in ("sin", "cos")
    }
)
And
import pymc as pm
import pytensor.tensor as pt

with pm.Model(coords=coords) as model:
    # --- trend priors ---
    b_trend_sigma = pm.HalfNormal("b_trend_sigma", sigma=1)
    # Cumulative sum of Laplace increments -> sparse random walk over the knots
    b_trend = pt.cumsum(pm.Laplace("b_trend", mu=0, b=b_trend_sigma, shape=(n_knots + 1,)))
    # --- Fourier priors ---
    b_fourier_sigma = pm.HalfNormal("b_fourier_sigma", sigma=1)
    b_fourier = pt.cumsum(
        pm.Laplace("b_fourier", mu=0, b=b_fourier_sigma, shape=(n_knots + 1, 2 * n_order)),
        axis=0,
    )
    # --- regression priors (hierarchical across regressors) ---
    pool_mu = pm.Normal("pool_mu", mu=0, sigma=1)
    pool_sigma = pm.HalfNormal("pool_sigma", sigma=1)
    reg_mu = pm.Normal("reg_mu", mu=pool_mu, sigma=pool_sigma, shape=(x.shape[1],))
    reg_sigma = pm.HalfNormal("reg_sigma", sigma=0.25, shape=(x.shape[1],))
    if is_only_positive:
        b_regs = pt.abs(
            pm.Normal(
                "b_regs",
                mu=pt.abs(reg_mu.reshape((x.shape[1], 1))),
                sigma=reg_sigma.reshape((x.shape[1], 1)),
                shape=(x.shape[1], len(t_refs)),
            ).T
        )
    else:
        b_regs = pm.Normal(
            "b_regs",
            mu=reg_mu.reshape((x.shape[1], 1)),
            sigma=reg_sigma.reshape((x.shape[1], 1)),
            shape=(x.shape[1], len(t_refs)),
        ).T
    # --- model components: the kernels map knot coefficients to every date ---
    l_trend = pm.Deterministic("l_trend", var=pm.math.dot(trend_ker, b_trend), dims="date")
    s_fourier = pm.Deterministic(
        "s_fourier",
        var=(fourier_features.to_numpy() * pm.math.dot(trend_ker, b_fourier)).sum(axis=1),
        dims="date",
    )
    r_feat = pm.Deterministic("r_feat", var=(x * pm.math.dot(reg_ker, b_regs)).sum(axis=1), dims="date")
    nu = pm.Gamma(name="nu", alpha=10, beta=1)
    sigma = pm.HalfNormal(name="sigma", sigma=0.5)
    mu = pm.Deterministic("mu", var=l_trend + s_fourier + r_feat, dims="date")
    # --- likelihood ---
    pm.StudentT(name="likelihood", nu=nu, mu=mu, sigma=sigma, observed=y, dims="date")
Using synthetic data from https://orbit-ml.readthedocs.io/en/latest/tutorials/ktr2.html
Wow! Amazing! Do you wanna open a PR?
I'm not sure where to put it exactly, to be honest. Do I have to create a model like DelayedSaturatedMMM, or should it be an auxiliary script entirely (not an MMM model, but usable as a template)?
If you want, you can create a PR with a little example. We don't have to merge it; instead we open one (together) where we start designing the API, for example. We will ask the PyMC-Marketing team for feedback :)
Okay, I've done it https://github.com/pymc-labs/pymc-marketing/pull/250
Update: I posted a question on the PyMC Discourse about the HGP implementation, and I've got it working (as a concept) based on bwengals's gist. So it should be possible to add HGP as time-varying coefficients too?
Oh wow! You are being super fast! 😄 (I'm not responding quickly as I'm on parental leave.) Feel free to open another PR with the HGP implementation! 🤜🤛