RuntimeError: Incorrect output dtype for return value #0: Expected: int64, Actual: int32

Open kb-open opened this issue 7 months ago • 16 comments

mmm.fit() fails with the following error. I tried pymc-marketing 0.11.0, 0.12.0, and 0.13.0.

`XlaRuntimeError: FAILED_PRECONDITION: Buffer Definition Event: Error dispatching computation: %sCpuCallback error: Traceback (most recent call last):
  File "C:\Users\kbope\anaconda3\envs\mmm\Lib\site-packages\jax\_src\interpreters\mlir.py", line 2973, in _wrapped_callback
RuntimeError: Incorrect output dtype for return value #0: Expected: int64, Actual: int32`
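One thing worth checking when an int64/int32 mismatch like this appears is the default integer width NumPy uses in the environment. The sketch below is a diagnostic only, and the thread does not confirm that this is the cause; it assumes nothing beyond NumPy, and the variable names are illustrative.

```python
import numpy as np

# Default integer dtype: int32 on Windows with NumPy < 2,
# typically int64 on 64-bit Linux/macOS and with NumPy >= 2.
default_int = np.dtype(int)

idx = np.arange(3)                        # inherits the platform default
idx64 = np.asarray(idx, dtype=np.int64)   # explicit width, independent of platform

print(default_int, idx.dtype, idx64.dtype)
```

If the first two values print as int32, integer arrays created without an explicit dtype will be 32-bit in this environment.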

The only workaround is to set pytensor.config.floatX = 'float32'. However, that works only for 0.11.0. For other versions, this option gives the following error with fit:

TypeError: Cannot convert Type Scalar(float32, shape=()) (of Variable media_temporal_latent_multiplier_f_mean) into Type Scalar(float64, shape=()). You can try to manually convert media_temporal_latent_multiplier_f_mean into a Scalar(float64, shape=()).

But even with 0.11.0 and the pytensor.config.floatX = 'float32' option, mmm.optimize_budget() doesn't work (even though fit does). All in all, I have found no combination with which optimize_budget runs successfully. The full error message with 0.11.0 and pytensor.config.floatX = 'float32' is pasted below:

C:\Users\kbope\anaconda3\envs\mmm\Lib\site-packages\pymc_marketing\mmm\budget_optimizer.py:202: UserWarning: Using default equality constraint
  self.set_constraints(
C:\Users\kbope\anaconda3\envs\mmm\Lib\site-packages\pymc_marketing\mmm\mmm.py:2361: UserWarning: No budget bounds provided. Using default bounds (0, total_budget) for each channel.
  return allocator.allocate_budget(
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[82], line 1
----> 1 allocation_strategy, optimization_result = model_bayesian.optimize_budget(budget=budget_per_time_unit,
      2                                                                           num_periods=campaign_period,
      3                                                                           budget_bounds=None)

File ~\anaconda3\envs\mmm\Lib\site-packages\pymc_marketing\mmm\mmm.py:2361, in MMM.optimize_budget(self, budget, num_periods, budget_bounds, response_variable, utility_function, constraints, default_constraints, **minimize_kwargs)
   2350 from pymc_marketing.mmm.budget_optimizer import BudgetOptimizer
   2352 allocator = BudgetOptimizer(
   2353     num_periods=num_periods,
   2354     utility_function=utility_function,
   (...)
   2358     model=self,
   2359 )
-> 2361 return allocator.allocate_budget(
   2362     total_budget=budget,
   2363     budget_bounds=budget_bounds,
   2364     **minimize_kwargs,
   2365 )

File ~\anaconda3\envs\mmm\Lib\site-packages\pymc_marketing\mmm\budget_optimizer.py:466, in BudgetOptimizer.allocate_budget(self, total_budget, budget_bounds, minimize_kwargs)
    463     minimize_kwargs = {**self.DEFAULT_MINIMIZE_KWARGS, **minimize_kwargs}
    465 # 6. Run the SciPy optimizer
--> 466 result = minimize(
    467     fun=self._compiled_functions[self.utility_function]["objective"],
    468     jac=self._compiled_functions[self.utility_function]["gradient"],
    469     x0=initial_guess,
    470     bounds=bounds,
    471     constraints=self._compiled_constraints,
    472     **minimize_kwargs,
    473 )
    475 # 7. Process results
    476 if result.success:

File ~\anaconda3\envs\mmm\Lib\site-packages\scipy\optimize\_minimize.py:722, in minimize(fun, x0, args, method, jac, hess, hessp, bounds, constraints, tol, callback, options)
    719     res = _minimize_cobyla(fun, x0, args, constraints, callback=callback,
    720                            bounds=bounds, **options)
    721 elif meth == 'slsqp':
--> 722     res = _minimize_slsqp(fun, x0, args, jac, bounds,
    723                           constraints, callback=callback, **options)
    724 elif meth == 'trust-constr':
    725     res = _minimize_trustregion_constr(fun, x0, args, jac, hess, hessp,
    726                                        bounds, constraints,
    727                                        callback=callback, **options)

File ~\anaconda3\envs\mmm\Lib\site-packages\scipy\optimize\_slsqp_py.py:336, in _minimize_slsqp(func, x0, args, jac, bounds, constraints, maxiter, ftol, iprint, disp, eps, callback, finite_diff_rel_step, **unknown_options)
    322 exit_modes = {-1: "Gradient evaluation required (g & a)",
    323                0: "Optimization terminated successfully",
    324                1: "Function evaluation required (f & c)",
   (...)
    331                8: "Positive directional derivative for linesearch",
    332                9: "Iteration limit reached"}
    334 # Set the parameters that SLSQP will need
    335 # meq, mieq: number of equality and inequality constraints
--> 336 meq = sum(map(len, [atleast_1d(c['fun'](x, *c['args']))
    337           for c in cons['eq']]))
    338 mieq = sum(map(len, [atleast_1d(c['fun'](x, *c['args']))
    339            for c in cons['ineq']]))
    340 # m = The total number of constraints

File ~\anaconda3\envs\mmm\Lib\site-packages\scipy\optimize\_slsqp_py.py:336, in <listcomp>(.0)
    322 exit_modes = {-1: "Gradient evaluation required (g & a)",
    323                0: "Optimization terminated successfully",
    324                1: "Function evaluation required (f & c)",
   (...)
    331                8: "Positive directional derivative for linesearch",
    332                9: "Iteration limit reached"}
    334 # Set the parameters that SLSQP will need
    335 # meq, mieq: number of equality and inequality constraints
--> 336 meq = sum(map(len, [atleast_1d(c['fun'](x, *c['args']))
    337           for c in cons['eq']]))
    338 mieq = sum(map(len, [atleast_1d(c['fun'](x, *c['args']))
    339            for c in cons['ineq']]))
    340 # m = The total number of constraints

File ~\anaconda3\envs\mmm\Lib\site-packages\pytensor\compile\function\types.py:944, in Function.__call__(self, output_subset, *args, **kwargs)
    942 else:
    943     try:
--> 944         arg_container.storage[0] = arg_container.type.filter(
    945             arg,
    946             strict=arg_container.strict,
    947             allow_downcast=arg_container.allow_downcast,
    948         )
    950     except Exception as e:
    951         i = input_storage.index(arg_container)

File ~\anaconda3\envs\mmm\Lib\site-packages\pytensor\tensor\type.py:216, in TensorType.filter(self, data, strict, allow_downcast)
    207     if up_dtype != self.dtype:
    208         err_msg = (
    209             f"{self} cannot store a value of dtype {data.dtype} without "
    210             "risking loss of precision. If you do not mind "
   (...)
    214             f'"function". Value: "{data!r}"'
    215         )
--> 216         raise TypeError(err_msg)
    217 elif (
    218     allow_downcast is None
    219     and isinstance(data, float | np.floating)
   (...)
    222     # Special case where we allow downcasting of Python float
    223     # literals to floatX, even when floatX=='float32'
    224     data = np.asarray(data, self.dtype)

TypeError: Bad input argument to pytensor function with name "C:\Users\kbope\anaconda3\envs\mmm\Lib\site-packages\pymc_marketing\mmm\constraints.py:93" at index 0 (0-based).  
Backtrace when that variable is created:

  File "C:\Users\kbope\anaconda3\envs\mmm\Lib\site-packages\IPython\core\interactiveshell.py", line 3061, in _run_cell
    result = runner(coro)
  File "C:\Users\kbope\anaconda3\envs\mmm\Lib\site-packages\IPython\core\async_helpers.py", line 129, in _pseudo_sync_runner
    coro.send(None)
  File "C:\Users\kbope\anaconda3\envs\mmm\Lib\site-packages\IPython\core\interactiveshell.py", line 3266, in run_cell_async
    has_raised = await self.run_ast_nodes(code_ast.body, cell_name,
  File "C:\Users\kbope\anaconda3\envs\mmm\Lib\site-packages\IPython\core\interactiveshell.py", line 3445, in run_ast_nodes
    if await self.run_code(code, result, async_=asy):
  File "C:\Users\kbope\anaconda3\envs\mmm\Lib\site-packages\IPython\core\interactiveshell.py", line 3505, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "C:\Users\kbope\AppData\Local\Temp\ipykernel_5644\1074298961.py", line 1, in <module>
    allocation_strategy, optimization_result = model_bayesian.optimize_budget(budget=budget_per_time_unit,
  File "C:\Users\kbope\anaconda3\envs\mmm\Lib\site-packages\pymc_marketing\mmm\mmm.py", line 2352, in optimize_budget
    allocator = BudgetOptimizer(
  File "C:\Users\kbope\anaconda3\envs\mmm\Lib\site-packages\pymc_marketing\mmm\budget_optimizer.py", line 180, in __init__
    self._budgets_flat = pt.tensor("budgets_flat", shape=(size_budgets,))
Vector(float32, shape=(7,)) cannot store a value of dtype float64 without risking loss of precision. If you do not mind this loss, you can: 1) explicitly cast your data to float32, or 2) set "allow_input_downcast=True" when calling "function". Value: "array([0.71428573, 0.71428573, 0.71428573, 0.71428573, 0.71428573,
       0.71428573, 0.71428573])"
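The check that raises here can be illustrated with NumPy alone. The sketch below is a simplified stand-in for the `up_dtype != self.dtype` comparison visible in the `TensorType.filter` frame above, not PyTensor's actual code; `can_store` is a hypothetical helper name and the array values are only illustrative.

```python
import numpy as np

def can_store(slot_dtype, data):
    """Return True if `data` fits in `slot_dtype` without widening.

    Hypothetical helper: mimics the promotion check from the traceback
    using NumPy's type promotion rules.
    """
    up_dtype = np.promote_types(np.dtype(slot_dtype), data.dtype)
    return up_dtype == np.dtype(slot_dtype)

x0 = np.full(7, 5 / 7)  # float64 by default

# float64 data offered to a float32 input: promotion widens, so it is rejected
print(can_store("float32", x0))                      # False

# an explicit cast, as the error message suggests, passes the check
print(can_store("float32", x0.astype(np.float32)))   # True
```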

kb-open avatar Apr 20 '25 16:04 kb-open

It would be great if a workaround could be provided for 0.11.0 with pytensor.config.floatX = 'float32', so that optimize_budget at least works. A proper fix for the root cause of the original issue #1427 can be provided later. @williambdean

Thanks in advance!

kb-open avatar Apr 21 '25 17:04 kb-open

Hi @kb-open

Thanks for raising an issue! Can you provide some details about the environment you are using and a minimal code block to reproduce the error?

Something based on this would be helpful:

https://github.com/pymc-labs/pymc-marketing/blob/2a4dd0362804577b971482ea192bf5e817a8eb43/README.md?plain=1#L81-L103

williambdean avatar Apr 21 '25 17:04 williambdean

Environment:

conda create -n mmm python=3.11.4 anaconda
conda activate mmm
conda install m2w64-toolchain
pip install pymc-marketing==0.11.0 arviz numpyro CausalPy
conda install jupyter numpy pandas matplotlib seaborn statsmodels scikit-learn cromp cvxpy tensorflow-probability

Minimal Code Block:

import pytensor  
pytensor.config.floatX = 'float32' # Workaround for a bug in pymc_marketing

import numpy as np
import pandas as pd

from pymc_marketing.mmm import GeometricAdstock, LogisticSaturation, MMM
from pymc_marketing.mmm.budget_optimizer import optimizer_xarray_builder

data_url = "https://raw.githubusercontent.com/pymc-labs/pymc-marketing/main/data/mmm_example.csv"
data = pd.read_csv(data_url, parse_dates=["date_week"])

channel_columns=["x1", "x2"]

mmm = MMM(
    adstock=GeometricAdstock(l_max=8),
    saturation=LogisticSaturation(),
    date_column="date_week",
    channel_columns=channel_columns,
    control_columns=[
        "event_1",
        "event_2",
        "t",
    ],
    yearly_seasonality=2,
)

X = data.drop("y", axis=1)
y = data["y"]

sampler_config = {'chains': 4, 'draws': 2_000, 'tune': 1_500, 'progressbar': True,
                  'cores': 4, 
                  'nuts_sampler': 'numpyro',
                  'target_accept': 0.85, 
                  'random_seed':1}
mmm.fit(X, y, **sampler_config)

budget_per_time_unit = 5
campaign_period = 8

# The initial split per channel
budget_per_channel = budget_per_time_unit / len(channel_columns)

# Initial budget allocation strategy for each channel
initial_budget = optimizer_xarray_builder(np.array([budget_per_channel]*len(channel_columns)),
                                          channel=channel_columns)

allocation_strategy, optimization_result = mmm.optimize_budget(budget=budget_per_time_unit,
                                                               num_periods=campaign_period,
                                                               budget_bounds=None)

Error:

TypeError                                 Traceback (most recent call last)
Cell In[7], line 8
      4 # Initial budget allocation strategy for each channel
      5 initial_budget = optimizer_xarray_builder(np.array([budget_per_channel]*len(channel_columns)),
      6                                           channel=channel_columns)
----> 8 allocation_strategy, optimization_result = mmm.optimize_budget(budget=budget_per_time_unit,
      9                                                                num_periods=campaign_period,
     10                                                                budget_bounds=None)

File ~\anaconda3\envs\mmm\Lib\site-packages\pymc_marketing\mmm\mmm.py:2361, in MMM.optimize_budget(self, budget, num_periods, budget_bounds, response_variable, utility_function, constraints, default_constraints, **minimize_kwargs)
   2350 from pymc_marketing.mmm.budget_optimizer import BudgetOptimizer
   2352 allocator = BudgetOptimizer(
   2353     num_periods=num_periods,
   2354     utility_function=utility_function,
   (...)
   2358     model=self,
   2359 )
-> 2361 return allocator.allocate_budget(
   2362     total_budget=budget,
   2363     budget_bounds=budget_bounds,
   2364     **minimize_kwargs,
   2365 )

File ~\anaconda3\envs\mmm\Lib\site-packages\pymc_marketing\mmm\budget_optimizer.py:466, in BudgetOptimizer.allocate_budget(self, total_budget, budget_bounds, minimize_kwargs)
    463     minimize_kwargs = {**self.DEFAULT_MINIMIZE_KWARGS, **minimize_kwargs}
    465 # 6. Run the SciPy optimizer
--> 466 result = minimize(
    467     fun=self._compiled_functions[self.utility_function]["objective"],
    468     jac=self._compiled_functions[self.utility_function]["gradient"],
    469     x0=initial_guess,
    470     bounds=bounds,
    471     constraints=self._compiled_constraints,
    472     **minimize_kwargs,
    473 )
    475 # 7. Process results
    476 if result.success:

File ~\anaconda3\envs\mmm\Lib\site-packages\scipy\optimize\_minimize.py:722, in minimize(fun, x0, args, method, jac, hess, hessp, bounds, constraints, tol, callback, options)
    719     res = _minimize_cobyla(fun, x0, args, constraints, callback=callback,
    720                            bounds=bounds, **options)
    721 elif meth == 'slsqp':
--> 722     res = _minimize_slsqp(fun, x0, args, jac, bounds,
    723                           constraints, callback=callback, **options)
    724 elif meth == 'trust-constr':
    725     res = _minimize_trustregion_constr(fun, x0, args, jac, hess, hessp,
    726                                        bounds, constraints,
    727                                        callback=callback, **options)

File ~\anaconda3\envs\mmm\Lib\site-packages\scipy\optimize\_slsqp_py.py:336, in _minimize_slsqp(func, x0, args, jac, bounds, constraints, maxiter, ftol, iprint, disp, eps, callback, finite_diff_rel_step, **unknown_options)
    322 exit_modes = {-1: "Gradient evaluation required (g & a)",
    323                0: "Optimization terminated successfully",
    324                1: "Function evaluation required (f & c)",
   (...)
    331                8: "Positive directional derivative for linesearch",
    332                9: "Iteration limit reached"}
    334 # Set the parameters that SLSQP will need
    335 # meq, mieq: number of equality and inequality constraints
--> 336 meq = sum(map(len, [atleast_1d(c['fun'](x, *c['args']))
    337           for c in cons['eq']]))
    338 mieq = sum(map(len, [atleast_1d(c['fun'](x, *c['args']))
    339            for c in cons['ineq']]))
    340 # m = The total number of constraints

File ~\anaconda3\envs\mmm\Lib\site-packages\scipy\optimize\_slsqp_py.py:336, in <listcomp>(.0)
    322 exit_modes = {-1: "Gradient evaluation required (g & a)",
    323                0: "Optimization terminated successfully",
    324                1: "Function evaluation required (f & c)",
   (...)
    331                8: "Positive directional derivative for linesearch",
    332                9: "Iteration limit reached"}
    334 # Set the parameters that SLSQP will need
    335 # meq, mieq: number of equality and inequality constraints
--> 336 meq = sum(map(len, [atleast_1d(c['fun'](x, *c['args']))
    337           for c in cons['eq']]))
    338 mieq = sum(map(len, [atleast_1d(c['fun'](x, *c['args']))
    339            for c in cons['ineq']]))
    340 # m = The total number of constraints

File ~\anaconda3\envs\mmm\Lib\site-packages\pytensor\compile\function\types.py:944, in Function.__call__(self, output_subset, *args, **kwargs)
    942 else:
    943     try:
--> 944         arg_container.storage[0] = arg_container.type.filter(
    945             arg,
    946             strict=arg_container.strict,
    947             allow_downcast=arg_container.allow_downcast,
    948         )
    950     except Exception as e:
    951         i = input_storage.index(arg_container)

File ~\anaconda3\envs\mmm\Lib\site-packages\pytensor\tensor\type.py:216, in TensorType.filter(self, data, strict, allow_downcast)
    207     if up_dtype != self.dtype:
    208         err_msg = (
    209             f"{self} cannot store a value of dtype {data.dtype} without "
    210             "risking loss of precision. If you do not mind "
   (...)
    214             f'"function". Value: "{data!r}"'
    215         )
--> 216         raise TypeError(err_msg)
    217 elif (
    218     allow_downcast is None
    219     and isinstance(data, float | np.floating)
   (...)
    222     # Special case where we allow downcasting of Python float
    223     # literals to floatX, even when floatX=='float32'
    224     data = np.asarray(data, self.dtype)

TypeError: Bad input argument to pytensor function with name "C:\Users\kbope\anaconda3\envs\mmm\Lib\site-packages\pymc_marketing\mmm\constraints.py:93" at index 0 (0-based).  
Backtrace when that variable is created:

  File "C:\Users\kbope\anaconda3\envs\mmm\Lib\site-packages\IPython\core\interactiveshell.py", line 3061, in _run_cell
    result = runner(coro)
  File "C:\Users\kbope\anaconda3\envs\mmm\Lib\site-packages\IPython\core\async_helpers.py", line 129, in _pseudo_sync_runner
    coro.send(None)
  File "C:\Users\kbope\anaconda3\envs\mmm\Lib\site-packages\IPython\core\interactiveshell.py", line 3266, in run_cell_async
    has_raised = await self.run_ast_nodes(code_ast.body, cell_name,
  File "C:\Users\kbope\anaconda3\envs\mmm\Lib\site-packages\IPython\core\interactiveshell.py", line 3445, in run_ast_nodes
    if await self.run_code(code, result, async_=asy):
  File "C:\Users\kbope\anaconda3\envs\mmm\Lib\site-packages\IPython\core\interactiveshell.py", line 3505, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "C:\Users\kbope\AppData\Local\Temp\ipykernel_26940\2695351581.py", line 8, in <module>
    allocation_strategy, optimization_result = mmm.optimize_budget(budget=budget_per_time_unit,
  File "C:\Users\kbope\anaconda3\envs\mmm\Lib\site-packages\pymc_marketing\mmm\mmm.py", line 2352, in optimize_budget
    allocator = BudgetOptimizer(
  File "C:\Users\kbope\anaconda3\envs\mmm\Lib\site-packages\pymc_marketing\mmm\budget_optimizer.py", line 180, in __init__
    self._budgets_flat = pt.tensor("budgets_flat", shape=(size_budgets,))
Vector(float32, shape=(2,)) cannot store a value of dtype float64 without risking loss of precision. If you do not mind this loss, you can: 1) explicitly cast your data to float32, or 2) set "allow_input_downcast=True" when calling "function". Value: "array([2.5, 2.5])"

kb-open avatar Apr 21 '25 18:04 kb-open

@williambdean Hope the above info helps.

kb-open avatar Apr 21 '25 18:04 kb-open

Additional info: I've tried adding allow_input_downcast=True in pymc_marketing\mmm\constraints.py:93. It then fails elsewhere in the code with a similar dtype mismatch.
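Because patching a single call site only moves the failure, another idea is to cast at the boundary where float64 arrays enter float32-only functions. The sketch below shows that idea with NumPy only. It has not been tested against pymc-marketing and is not a confirmed fix; `strict_float32_sum` is a stand-in for a compiled function, and `cast_inputs` is a hypothetical helper.

```python
import numpy as np

FLOATX = np.float32  # assumed to match pytensor.config.floatX in this report

def strict_float32_sum(x):
    """Stand-in for a compiled function that only accepts float32 input."""
    if x.dtype != FLOATX:
        raise TypeError(f"expected {np.dtype(FLOATX)}, got {x.dtype}")
    return x.sum()

def cast_inputs(fn, dtype=FLOATX):
    """Wrap `fn` so its first array argument is cast to `dtype` before the call."""
    def wrapped(x, *args):
        return fn(np.asarray(x, dtype=dtype), *args)
    return wrapped

x = np.array([2.5, 2.5])                  # float64, like the array in the traceback
safe_sum = cast_inputs(strict_float32_sum)
print(safe_sum(x))                        # 5.0
```

Calling `strict_float32_sum(x)` directly raises a TypeError, which mirrors the failure in the traceback; the wrapped version accepts the same array.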

kb-open avatar Apr 21 '25 18:04 kb-open

@williambdean I hope you've been able to reproduce the issue with the minimal code block and environment I shared. Please let me know if you need anything else.

kb-open avatar Apr 22 '25 17:04 kb-open

I was not able to reproduce this exactly. Your workaround causes the failure for me; without the workaround, it works.

williambdean avatar Apr 22 '25 18:04 williambdean

But without the workaround, the fit method doesn't work, as issue #1427 mentions. Have you tried with 'nuts_sampler': 'numpyro' as mentioned in my minimal code block? Not sure what's missing.

kb-open avatar Apr 22 '25 18:04 kb-open

Yes, I did. I'm putting in a fix with https://github.com/pymc-labs/pymc-marketing/pull/1640

Can you see if that change works for you?

williambdean avatar Apr 22 '25 18:04 williambdean

This code is already present in pytensor/tensor/type.py at line 807 in the anaconda environment I'm using. So no, unfortunately that doesn't help.

if dtype is None:
    dtype = config.floatX

kb-open avatar Apr 22 '25 18:04 kb-open

Yeah, I saw that too. Then more digging is needed. What is wrong with your workaround? It doesn't seem like an unreasonable thing to do.

williambdean avatar Apr 22 '25 19:04 williambdean

Well, the workaround fixes the fit issue (#1427), but optimize_budget then fails, as you could also reproduce.

kb-open avatar Apr 22 '25 19:04 kb-open

I tried forcing allow_downcast=True in pytensor/compile/function/types.py at line 947. It gets further but fails elsewhere with the error TypeError: expected type_num 11 (NPY_FLOAT32) got 12.

try:
    arg_container.storage[0] = arg_container.type.filter(
        arg,
        strict=arg_container.strict,
        allow_downcast=True,  # was: arg_container.allow_downcast
    )
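The numbers in that message are NumPy's C-level type numbers, which can be read back from any dtype. A short check, assuming only NumPy, shows that 12 corresponds to float64, so a float64 array is still reaching code that expects float32:

```python
import numpy as np

# C-level type numbers for the two float widths involved
print(np.dtype(np.float32).num)  # 11, the NPY_FLOAT32 named in the error
print(np.dtype(np.float64).num)  # 12, the type that was actually received
```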

kb-open avatar Apr 23 '25 11:04 kb-open

Did https://github.com/pymc-labs/pymc-marketing/pull/1636 affect this?

williambdean avatar Apr 23 '25 13:04 williambdean

> Did #1636 affect this?

Well, #1636 was merged just 2 days ago, and I'm using 0.11.0, which is older than that. So it is probably not the root cause.

kb-open avatar Apr 23 '25 14:04 kb-open