
mcmc_samples > 0 causes ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0

Open rajanant opened this issue 2 years ago • 2 comments

I get an error when I set the mcmc_samples parameter to a non-zero value (prophet==1.1.1).

import pandas as pd
from prophet import Prophet

data = [
    {"ds": "2022-08-30", "y": 3.0},
    {"ds": "2022-08-16", "y": 3.0},
    {"ds": "2022-08-02", "y": 3.0},
    {"ds": "2022-07-19", "y": 3.0},
    {"ds": "2022-07-05", "y": 2.0},
    {"ds": "2022-06-21", "y": 2.0},
    {"ds": "2022-06-07", "y": 2.0},
    {"ds": "2022-05-23", "y": 2.0},
    {"ds": "2022-05-05", "y": 2.0},
    {"ds": "2022-04-21", "y": 2.0},
    {"ds": "2022-04-07", "y": 2.0},
    {"ds": "2022-03-24", "y": 2.0},
    {"ds": "2022-03-10", "y": 2.0},
    {"ds": "2022-02-24", "y": 2.0},
    {"ds": "2022-02-10", "y": 2.0},
    {"ds": "2022-01-26", "y": 2.0},
    {"ds": "2022-01-12", "y": 7.0},
    {"ds": "2021-12-28", "y": 7.0},
    {"ds": "2021-12-14", "y": 7.0},
    {"ds": "2021-11-29", "y": 7.0},
    {"ds": "2021-11-12", "y": 7.0},
    {"ds": "2021-10-29", "y": 6.0},
    {"ds": "2021-10-13", "y": 5.0},
    {"ds": "2021-09-29", "y": 5.0}
]

df = pd.DataFrame(data)
df['ds'] = pd.to_datetime(df['ds'])

print(df)

target = float(df.loc[df.index[0]]['y'])
df = df.loc[df.index[1:]]

m = Prophet(mcmc_samples=1000)  # with mcmc_samples=0 works fine
m.fit(df)

future = m.make_future_dataframe(periods=14)
print('future', future, sep='\n')

forecast = m.predict(future)

print(forecast[['ds', 'yhat', 'yhat_lower', 'yhat_upper']])

Here is the full output

INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
INFO:prophet:Disabling weekly seasonality. Run prophet with weekly_seasonality=True to override this.
INFO:prophet:Disabling daily seasonality. Run prophet with daily_seasonality=True to override this.
INFO:prophet:n_changepoints greater than number of observations. Using 17.
DEBUG:cmdstanpy:input tempfile: /tmp/tmpswmsreqz/857g2udc.json
DEBUG:cmdstanpy:input tempfile: /tmp/tmpswmsreqz/wwkwqfc0.json
DEBUG:cmdstanpy:cmd: /usr/local/lib/python3.7/dist-packages/prophet/stan_model/prophet_model.bin info
cwd: None
DEBUG:cmdstanpy:Command ['/usr/local/lib/python3.7/dist-packages/prophet/stan_model/prophet_model.bin', 'info']
	error during processing Machine is not on the network
08:31:41 - cmdstanpy - INFO - CmdStan installation /usr/local/lib/python3.7/dist-packages/prophet/stan_model/cmdstan-2.26.1 missing makefile, cannot get version.
INFO:cmdstanpy:CmdStan installation /usr/local/lib/python3.7/dist-packages/prophet/stan_model/cmdstan-2.26.1 missing makefile, cannot get version.
08:31:41 - cmdstanpy - INFO - Cannot determine whether version is before 2.28.
INFO:cmdstanpy:Cannot determine whether version is before 2.28.
08:31:41 - cmdstanpy - INFO - CmdStan start processing
INFO:cmdstanpy:CmdStan start processing
           ds    y
0  2022-08-30  3.0
1  2022-08-16  3.0
2  2022-08-02  3.0
3  2022-07-19  3.0
4  2022-07-05  2.0
5  2022-06-21  2.0
6  2022-06-07  2.0
7  2022-05-23  2.0
8  2022-05-05  2.0
9  2022-04-21  2.0
10 2022-04-07  2.0
11 2022-03-24  2.0
12 2022-03-10  2.0
13 2022-02-24  2.0
14 2022-02-10  2.0
15 2022-01-26  2.0
16 2022-01-12  7.0
17 2021-12-28  7.0
18 2021-12-14  7.0
19 2021-11-29  7.0
20 2021-11-12  7.0
21 2021-10-29  6.0
22 2021-10-13  5.0
23 2021-09-29  5.0
chain 1
00:02 Sampling completed
chain 2
00:02 Sampling completed
chain 3
00:01 Sampling completed
chain 4
00:01 Sampling completed
DEBUG:cmdstanpy:idx 0
DEBUG:cmdstanpy:running CmdStan, num_threads: 1
DEBUG:cmdstanpy:CmdStan args: ['/usr/local/lib/python3.7/dist-packages/prophet/stan_model/prophet_model.bin', 'id=1', 'random', 'seed=11862', 'data', 'file=/tmp/tmpswmsreqz/857g2udc.json', 'init=/tmp/tmpswmsreqz/wwkwqfc0.json', 'output', 'file=/tmp/tmpswmsreqz/prophet_model_jnqzlee/prophet_model-20221012083141_1.csv', 'method=sample', 'num_samples=500', 'num_warmup=500', 'algorithm=hmc', 'adapt', 'engaged=1']
DEBUG:cmdstanpy:idx 1
DEBUG:cmdstanpy:running CmdStan, num_threads: 1
DEBUG:cmdstanpy:CmdStan args: ['/usr/local/lib/python3.7/dist-packages/prophet/stan_model/prophet_model.bin', 'id=2', 'random', 'seed=11862', 'data', 'file=/tmp/tmpswmsreqz/857g2udc.json', 'init=/tmp/tmpswmsreqz/wwkwqfc0.json', 'output', 'file=/tmp/tmpswmsreqz/prophet_model_jnqzlee/prophet_model-20221012083141_2.csv', 'method=sample', 'num_samples=500', 'num_warmup=500', 'algorithm=hmc', 'adapt', 'engaged=1']
DEBUG:cmdstanpy:idx 2
DEBUG:cmdstanpy:running CmdStan, num_threads: 1
DEBUG:cmdstanpy:CmdStan args: ['/usr/local/lib/python3.7/dist-packages/prophet/stan_model/prophet_model.bin', 'id=3', 'random', 'seed=11862', 'data', 'file=/tmp/tmpswmsreqz/857g2udc.json', 'init=/tmp/tmpswmsreqz/wwkwqfc0.json', 'output', 'file=/tmp/tmpswmsreqz/prophet_model_jnqzlee/prophet_model-20221012083141_3.csv', 'method=sample', 'num_samples=500', 'num_warmup=500', 'algorithm=hmc', 'adapt', 'engaged=1']
DEBUG:cmdstanpy:idx 3
DEBUG:cmdstanpy:running CmdStan, num_threads: 1
DEBUG:cmdstanpy:CmdStan args: ['/usr/local/lib/python3.7/dist-packages/prophet/stan_model/prophet_model.bin', 'id=4', 'random', 'seed=11862', 'data', 'file=/tmp/tmpswmsreqz/857g2udc.json', 'init=/tmp/tmpswmsreqz/wwkwqfc0.json', 'output', 'file=/tmp/tmpswmsreqz/prophet_model_jnqzlee/prophet_model-20221012083141_4.csv', 'method=sample', 'num_samples=500', 'num_warmup=500', 'algorithm=hmc', 'adapt', 'engaged=1']
08:31:43 - cmdstanpy - INFO - CmdStan done processing.
INFO:cmdstanpy:CmdStan done processing.
DEBUG:cmdstanpy:runset
RunSet: chains=4, chain_ids=[1, 2, 3, 4], num_processes=4
 cmd (chain 1):
	['/usr/local/lib/python3.7/dist-packages/prophet/stan_model/prophet_model.bin', 'id=1', 'random', 'seed=11862', 'data', 'file=/tmp/tmpswmsreqz/857g2udc.json', 'init=/tmp/tmpswmsreqz/wwkwqfc0.json', 'output', 'file=/tmp/tmpswmsreqz/prophet_model_jnqzlee/prophet_model-20221012083141_1.csv', 'method=sample', 'num_samples=500', 'num_warmup=500', 'algorithm=hmc', 'adapt', 'engaged=1']
 retcodes=[0, 0, 0, 0]
 per-chain output files (showing chain 1 only):
 csv_file:
	/tmp/tmpswmsreqz/prophet_model_jnqzlee/prophet_model-20221012083141_1.csv
 console_msgs (if any):
	/tmp/tmpswmsreqz/prophet_model_jnqzlee/prophet_model-20221012083141_0-stdout.txt
DEBUG:cmdstanpy:Chain 1 console:
method = sample (Default)
  sample
    num_samples = 500
    num_warmup = 500
    save_warmup = 0 (Default)
    thin = 1 (Default)
    adapt
      engaged = 1 (Default)
      gamma = 0.050000000000000003 (Default)
      delta = 0.80000000000000004 (Default)
      kappa = 0.75 (Default)
      t0 = 10 (Default)
      init_buffer = 75 (Default)
      term_buffer = 50 (Default)
      window = 25 (Default)
    algorithm = hmc (Default)
      hmc
        engine = nuts (Default)
          nuts
            max_depth = 10 (Default)
        metric = diag_e (Default)
        metric_file =  (Default)
        stepsize = 1 (Default)
        stepsize_jitter = 0 (Default)
id = 1
data
  file = /tmp/tmpswmsreqz/857g2udc.json
init = /tmp/tmpswmsreqz/wwkwqfc0.json
random
  seed = 11862
output
  file = /tmp/tmpswmsreqz/prophet_model_jnqzlee/prophet_model-20221012083141_1.csv
  diagnostic_file =  (Default)
  refresh = 100 (Default)
  sig_figs = -1 (Default)
  profile_file = profile.csv (Default)


Gradient evaluation took 0.001375 seconds
1000 transitions using 10 leapfrog steps per transition would take 13.75 seconds.
Adjust your expectations accordingly!


Iteration:   1 / 1000 [  0%]  (Warmup)
Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:
Exception: normal_id_glm_lpdf: Scale vector is 0, but must be positive finite! (in '/project/python/stan/prophet.stan', line 137, column 2 to line 142, column 4)
If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,
but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.

Iteration: 100 / 1000 [ 10%]  (Warmup)
Iteration: 200 / 1000 [ 20%]  (Warmup)
Iteration: 300 / 1000 [ 30%]  (Warmup)
Iteration: 400 / 1000 [ 40%]  (Warmup)
Iteration: 500 / 1000 [ 50%]  (Warmup)
Iteration: 501 / 1000 [ 50%]  (Sampling)
Iteration: 600 / 1000 [ 60%]  (Sampling)
Iteration: 700 / 1000 [ 70%]  (Sampling)
Iteration: 800 / 1000 [ 80%]  (Sampling)
Iteration: 900 / 1000 [ 90%]  (Sampling)
Iteration: 1000 / 1000 [100%]  (Sampling)

 Elapsed Time: 0.48 seconds (Warm-up)
               0.315 seconds (Sampling)
               0.795 seconds (Total)


08:31:43 - cmdstanpy - WARNING - Non-fatal error during sampling:
Exception: normal_id_glm_lpdf: Scale vector is 0, but must be positive finite! (in '/project/python/stan/prophet.stan', line 137, column 2 to line 142, column 4)
Exception: normal_id_glm_lpdf: Scale vector is inf, but must be positive finite! (in '/project/python/stan/prophet.stan', line 137, column 2 to line 142, column 4)
Exception: normal_id_glm_lpdf: Scale vector is 0, but must be positive finite! (in '/project/python/stan/prophet.stan', line 137, column 2 to line 142, column 4)
Consider re-running with show_console=True if the above output is unclear!
WARNING:cmdstanpy:Non-fatal error during sampling:
Exception: normal_id_glm_lpdf: Scale vector is 0, but must be positive finite! (in '/project/python/stan/prophet.stan', line 137, column 2 to line 142, column 4)
Exception: normal_id_glm_lpdf: Scale vector is inf, but must be positive finite! (in '/project/python/stan/prophet.stan', line 137, column 2 to line 142, column 4)
Exception: normal_id_glm_lpdf: Scale vector is 0, but must be positive finite! (in '/project/python/stan/prophet.stan', line 137, column 2 to line 142, column 4)
Consider re-running with show_console=True if the above output is unclear!

future
           ds
0  2021-09-29
1  2021-10-13
2  2021-10-29
3  2021-11-12
4  2021-11-29
5  2021-12-14
6  2021-12-28
7  2022-01-12
8  2022-01-26
9  2022-02-10
10 2022-02-24
11 2022-03-10
12 2022-03-24
13 2022-04-07
14 2022-04-21
15 2022-05-05
16 2022-05-23
17 2022-06-07
18 2022-06-21
19 2022-07-05
20 2022-07-19
21 2022-08-02
22 2022-08-16
23 2022-08-17
24 2022-08-18
25 2022-08-19
26 2022-08-20
27 2022-08-21
28 2022-08-22
29 2022-08-23
30 2022-08-24
31 2022-08-25
32 2022-08-26
33 2022-08-27
34 2022-08-28
35 2022-08-29
36 2022-08-30
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
[<ipython-input-7-3e1d0a7a7109>](https://localhost:8080/#) in <module>
     43 print('future', future, sep='\n')
     44 
---> 45 forecast = m.predict(future)
     46 
     47 print(forecast[['ds', 'yhat', 'yhat_lower', 'yhat_upper']])

1 frames
[/usr/local/lib/python3.7/dist-packages/prophet/forecaster.py](https://localhost:8080/#) in predict_seasonal_components(self, df)
   1355             beta_c = self.params['beta'] * component_cols[component].values
   1356 
-> 1357             comp = np.matmul(X, beta_c.transpose())
   1358             if component in self.component_modes['additive']:
   1359                 comp *= self.y_scale

ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 2000 is different from 1)

rajanant avatar Oct 12 '22 08:10 rajanant

I ran into a very similar issue. From what I can tell, it is related to seasonalities: in my case we had monthly data covering less than two years and had not explicitly added a custom seasonality.

The code appears to assume that the data always has at least one seasonality: either daily, weekly, or yearly, as selected by the set_auto_seasonalities method, or a custom one. If none is set, the array dimensions do not match.

Example array dimensions: X.shape==(10,1) and beta_c.transpose().shape==(2000,)
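The shapes above are enough to reproduce the same ValueError with NumPy alone, without Prophet. This is only a sketch: it assumes the sampled beta ends up as a flat 1-D array of length 2000, which the reported shapes suggest but which nothing in this thread confirms. The names beta_flat and beta_2d are illustrative, and the 2-D variant is a hypothetical showing what shape would multiply cleanly, not a claim about how Prophet should be fixed.

```python
import numpy as np

n_rows, n_draws = 10, 2000  # shapes reported in the comment above

# One placeholder seasonal feature column, as in X.shape == (10, 1).
X = np.zeros((n_rows, 1))

# Assumption: the draws arrive as a flat vector of shape (2000,).
beta_flat = np.zeros(n_draws)

try:
    # Transposing a 1-D array is a no-op, so matmul sees (10, 1) x (2000,).
    np.matmul(X, beta_flat.transpose())
    mismatch = None
except ValueError as exc:
    mismatch = str(exc)

print(mismatch)

# Hypothetical: draws kept as 2-D, (n_draws, n_features) == (2000, 1).
beta_2d = beta_flat.reshape(n_draws, 1)
comp = np.matmul(X, beta_2d.transpose())  # (10, 1) x (1, 2000)
print(comp.shape)
```

With the flat vector, matmul treats the 2000 entries as the inner dimension and compares it against the single column of X, which is the "size 2000 is different from 1" in the traceback.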

We were able to get around this by setting a custom monthly seasonality: model.add_seasonality(name="monthly", period=30.5, fourier_order=5)

It seems like there should be better error handling when seasonality cannot be computed automatically.

cerebralmind avatar Oct 12 '22 20:10 cerebralmind

Seasonality cannot be calculated automatically if you don't have enough data.
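To check in advance whether a dataset is likely to end up with no automatic seasonality, something like the following may help. It is a rough approximation of Prophet's rule as I understand it, not the library's code: auto_seasonality_flags is a hypothetical helper, and the thresholds (730 days for yearly, two weeks of history with sub-weekly spacing for weekly, two days of history with sub-daily spacing for daily) are assumptions to verify against your installed version.

```python
import pandas as pd


def auto_seasonality_flags(ds: pd.Series) -> dict:
    """Hypothetical helper: guess which seasonalities would be auto-enabled."""
    ds = pd.to_datetime(ds).sort_values()
    span = ds.max() - ds.min()   # total history covered
    min_gap = ds.diff().min()    # smallest spacing between observations
    return {
        # Assumed thresholds, see the note above.
        "yearly": span >= pd.Timedelta(days=730),
        "weekly": span >= pd.Timedelta(weeks=2) and min_gap < pd.Timedelta(weeks=1),
        "daily": span >= pd.Timedelta(days=2) and min_gap < pd.Timedelta(days=1),
    }


# Roughly the shape of the data in this issue: 24 points, 14 days apart.
ds = pd.Series(pd.date_range("2021-09-29", periods=24, freq="14D"))
flags = auto_seasonality_flags(ds)
print(flags)

if not any(flags.values()):
    print("No automatic seasonality expected; consider add_seasonality().")
```

For biweekly data spanning under a year, all three flags come out False, which matches the three "Disabling ... seasonality" lines in the log above.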

SultanovAR avatar Jan 15 '24 17:01 SultanovAR