gramex
GRAMEX-182 ⁃ MLHandler TimeSeries throws a KeyError
I created this gramex.yaml:
url:
  mlhandler/forecast:
    pattern: /$YAMLURL/forecast
    handler: MLHandler
    kwargs:
      data:
        url: $YAMLPATH/inflation.csv # Inflation dataset
      model:
        index_col: index # Use index column as timestamps
        target_col: R
        class: SARIMAX
        params:
          order: [7, 1, 0] # Creates ARIMA estimator with (p,d,q)=(7,1,0)
          # Add other parameters similarly
and copied this inflation.csv.
When I ran Gramex and clicked the template's "Predict" button,
I got an error with this log:
INFO 19-Apr 07:57:08 __init__ PORT Gramex 1.78.0 | D:\temp\ts | Python 3.7.3 (default, Apr 24 2019, 15:29:51) [MSC v.1915 64 bit (AMD64)]
DEBUG 19-Apr 07:57:08 config PORT Loading config: d:\site\gramener.com\viz\async-gramex\gramex\gramex.yaml
DEBUG 19-Apr 07:57:08 config PORT Loading config: D:\temp\ts\gramex.yaml
DEBUG 19-Apr 07:57:08 __init__ PORT Loading service: version
DEBUG 19-Apr 07:57:08 __init__ PORT Loading service: mime
DEBUG 19-Apr 07:57:08 __init__ PORT Loading service: threadpool
DEBUG 19-Apr 07:57:08 __init__ PORT Loading service: cache
DEBUG 19-Apr 07:57:08 __init__ PORT Loading service: handlers
DEBUG 19-Apr 07:57:08 __init__ PORT Loading service: log
DEBUG 19-Apr 07:57:08 __init__ PORT Loading service: eventlog
DEBUG 19-Apr 07:57:08 __init__ PORT Loading service: app
DEBUG 19-Apr 07:57:08 __init__ PORT Loading service: otp
DEBUG 19-Apr 07:57:08 __init__ PORT Loading service: schedule
INFO 19-Apr 07:57:08 __init__ PORT Initialising schedule:gramex_update
DEBUG 19-Apr 07:57:08 scheduler PORT schedule:gramex_update: Next run in 57771.3s
DEBUG 19-Apr 07:57:08 __init__ PORT Loading service: url
DEBUG 19-Apr 07:57:08 __init__ PORT url:mlhandler/forecast (MLHandler)
DEBUG 19-Apr 07:57:08 __init__ PORT Gramex update ran recently. Deferring check.
DEBUG 19-Apr 07:57:09 config PORT Loading config: d:\site\gramener.com\viz\async-gramex\gramex\apps.yaml
DEBUG 19-Apr 07:57:09 config PORT Loading config: C:\Users\anand\AppData\Local\Gramex Data\apps\apps.yaml
DEBUG 19-Apr 07:57:09 config PORT Loading config: d:\site\gramener.com\viz\async-gramex\gramex\handlers\openapiconfig.yaml
DEBUG 19-Apr 07:57:09 cache PORT Flushing C:\Users\anand\AppData\Local\Gramex Data\apps\mlhandler\mlhandler-forecast\config.json
DEBUG 19-Apr 07:57:09 cache PORT Flushing C:\Users\anand\AppData\Local\Gramex Data\apps\mlhandler\mlhandler-forecast\config.json
d:\site\gramener.com\viz\async-gramex\gramex\ml_api.py:230: UserWarning: Model changed, removing old parameters.
warnings.warn("Model changed, removing old parameters.")
DEBUG 19-Apr 07:57:09 cache PORT Flushing C:\Users\anand\AppData\Local\Gramex Data\apps\mlhandler\mlhandler-forecast\config.json
DEBUG 19-Apr 07:57:09 cache PORT Flushing C:\Users\anand\AppData\Local\Gramex Data\apps\mlhandler\mlhandler-forecast\config.json
DEBUG 19-Apr 07:57:10 __init__ PORT url:favicon (FileHandler) -90
DEBUG 19-Apr 07:57:10 __init__ PORT url:default (FileHandler) -100
DEBUG 19-Apr 07:57:10 __init__ PORT Running callback: app
INFO 19-Apr 07:57:10 __init__ PORT Listening on port 9988
INFO 19-Apr 07:57:10 __init__ 9988 <Ctrl-B> opens the browser. <Ctrl-D> starts the debugger.
INFO 19-Apr 07:57:15 __init__ 9988 200 GET / (127.0.0.1) 0.00ms default
INFO 19-Apr 07:57:15 __init__ 9988 200 GET /favicon.ico (127.0.0.1) 0.00ms favicon
INFO 19-Apr 07:57:25 __init__ 9988 200 GET / (127.0.0.1) 0.00ms default
INFO 19-Apr 07:57:25 __init__ 9988 200 GET /favicon.ico (127.0.0.1) 0.00ms favicon
INFO 19-Apr 07:57:30 __init__ 9988 200 GET /forecast (127.0.0.1) 10.07ms mlhandler/forecast
INFO 19-Apr 07:57:30 __init__ 9988 200 GET /forecast?_cache&_limit=5&_format=json&_meta=y (127.0.0.1) 0.00ms mlhandler/forecast
INFO 19-Apr 07:57:30 __init__ 9988 200 GET /forecast?_cache&_opts (127.0.0.1) 0.00ms mlhandler/forecast
ERROR 19-Apr 07:57:34 mlhandler 9988 'The `start` argument could not be matched to a location related to the index of the data.'
Traceback (most recent call last):
File "D:\anaconda\3.7\lib\site-packages\pandas\core\indexes\base.py", line 3361, in get_loc
return self._engine.get_loc(casted_key)
File "pandas\_libs\index.pyx", line 76, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\index.pyx", line 108, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\hashtable_class_helper.pxi", line 5198, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas\_libs\hashtable_class_helper.pxi", line 5206, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: ''
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "d:\site\gramener.com\viz\async-gramex\gramex\handlers\mlhandler.py", line 191, in _predict
target = data.pop(score_col)
File "D:\anaconda\3.7\lib\site-packages\pandas\core\frame.py", line 5226, in pop
return super().pop(item=item)
File "D:\anaconda\3.7\lib\site-packages\pandas\core\generic.py", line 870, in pop
result = self[item]
File "D:\anaconda\3.7\lib\site-packages\pandas\core\frame.py", line 3458, in __getitem__
indexer = self.columns.get_loc(key)
File "D:\anaconda\3.7\lib\site-packages\pandas\core\indexes\base.py", line 3363, in get_loc
raise KeyError(key) from err
KeyError: ''
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "pandas\_libs\index.pyx", line 460, in pandas._libs.index.DatetimeEngine.get_loc
File "pandas\_libs\hashtable_class_helper.pxi", line 2131, in pandas._libs.hashtable.Int64HashTable.get_item
File "pandas\_libs\hashtable_class_helper.pxi", line 2140, in pandas._libs.hashtable.Int64HashTable.get_item
KeyError: 0
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "D:\anaconda\3.7\lib\site-packages\pandas\core\indexes\base.py", line 3361, in get_loc
return self._engine.get_loc(casted_key)
File "pandas\_libs\index.pyx", line 429, in pandas._libs.index.DatetimeEngine.get_loc
File "pandas\_libs\index.pyx", line 462, in pandas._libs.index.DatetimeEngine.get_loc
KeyError: Timestamp('1970-01-01 00:00:00')
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "D:\anaconda\3.7\lib\site-packages\pandas\core\indexes\datetimes.py", line 703, in get_loc
return Index.get_loc(self, key, method, tolerance)
File "D:\anaconda\3.7\lib\site-packages\pandas\core\indexes\base.py", line 3363, in get_loc
raise KeyError(key) from err
KeyError: Timestamp('1970-01-01 00:00:00')
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "D:\anaconda\3.7\lib\site-packages\statsmodels\tsa\base\tsa_model.py", line 357, in get_prediction_index
start, base_index, data.row_labels
File "D:\anaconda\3.7\lib\site-packages\statsmodels\tsa\base\tsa_model.py", line 279, in get_index_label_loc
raise e
File "D:\anaconda\3.7\lib\site-packages\statsmodels\tsa\base\tsa_model.py", line 243, in get_index_label_loc
loc, index, index_was_expanded = get_index_loc(key, index)
File "D:\anaconda\3.7\lib\site-packages\statsmodels\tsa\base\tsa_model.py", line 176, in get_index_loc
loc = index.get_loc(key)
File "D:\anaconda\3.7\lib\site-packages\pandas\core\indexes\datetimes.py", line 705, in get_loc
raise KeyError(orig_key) from err
KeyError: Timestamp('1970-01-01 00:00:00')
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "d:\site\gramener.com\viz\async-gramex\gramex\handlers\mlhandler.py", line 199, in _predict
data = self.model.predict(data, target_col=tcol)
File "d:\site\gramener.com\viz\async-gramex\gramex\sm_api.py", line 78, in predict
return self.res.predict(start, end, exog=exog, **kwargs)
File "D:\anaconda\3.7\lib\site-packages\statsmodels\base\wrapper.py", line 113, in wrapper
obj = data.wrap_output(func(results, *args, **kwargs), how)
File "D:\anaconda\3.7\lib\site-packages\statsmodels\tsa\statespace\mlemodel.py", line 3403, in predict
prediction_results = self.get_prediction(start, end, dynamic, **kwargs)
File "D:\anaconda\3.7\lib\site-packages\statsmodels\tsa\statespace\mlemodel.py", line 3287, in get_prediction
self.model._get_prediction_index(start, end, index))
File "D:\anaconda\3.7\lib\site-packages\statsmodels\tsa\base\tsa_model.py", line 843, in _get_prediction_index
data=self.data,
File "D:\anaconda\3.7\lib\site-packages\statsmodels\tsa\base\tsa_model.py", line 361, in get_prediction_index
"The `start` argument could not be matched to a"
KeyError: 'The `start` argument could not be matched to a location related to the index of the data.'
INFO 19-Apr 07:57:34 __init__ 9988 200 GET /forecast?Dp=-0.00313258&index=1972-04-01 (127.0.0.1) 36.61ms mlhandler/forecast
@sanand0 The template does not yet support SARIMAX. Forecasting requires a different interface:
- Different kwargs from sklearn
- Different evaluation metrics from sklearn
- Prediction / forecasting needs a "start" and "end" timestamp + any exogenous data.
This needs some work. There was a PR that did this, but it's too stale to reuse directly.
Moreover, to do this properly, we shouldn't conform to the current template, which so heavily favours sklearn. We can create a new one instead. Either of us can come up with a mock, and I'll build it.