Estimator.create_predictor triggers RuntimeError with DeepAREstimator

Open pbruneau opened this issue 3 years ago • 3 comments

Description

When trying to adapt the custom callback as described at https://ts.gluon.ai/master/tutorials/mxnet_models/trainer_callbacks.html to DeepAREstimator, I get a RuntimeError.

I guess this is not exactly a bug, as there must be additional steps to have DeepAREstimator properly initialized so that create_predictor is happy, but I can't figure out what those are.

To Reproduce

The following gist contains a reproducible example (it may need updating for recent API changes): https://gist.github.com/pbruneau/51a57ab799bb95651cbaa58d1d6e9bb6. As is, with the SimpleFeedForwardEstimator block, it runs as expected. But commenting that block out and uncommenting the DeepAREstimator block raises a RuntimeError.

Error message or code output

Traceback (most recent call last):
  File "create_predictor_error.py", line 167, in <module>
    predictor = estimator.create_predictor(transformation=transformation, trained_network=training_network)
  File "/opt/conda/envs/rapids/lib/python3.7/site-packages/gluonts/model/deepar/_estimator.py", line 483, in create_predictor
    copy_parameters(trained_network, prediction_network)
  File "/opt/conda/envs/rapids/lib/python3.7/site-packages/gluonts/mx/util.py", line 115, in copy_parameters
    net_source.save_parameters(model_dir_path)
  File "/opt/conda/envs/rapids/lib/python3.7/site-packages/mxnet/gluon/block.py", line 449, in save_parameters
    arg_dict = {key: val._reduce() for key, val in params.items()}
  File "/opt/conda/envs/rapids/lib/python3.7/site-packages/mxnet/gluon/block.py", line 449, in <dictcomp>
    arg_dict = {key: val._reduce() for key, val in params.items()}
  File "/opt/conda/envs/rapids/lib/python3.7/site-packages/mxnet/gluon/parameter.py", line 391, in _reduce
    block = self.list_data()
  File "/opt/conda/envs/rapids/lib/python3.7/site-packages/mxnet/gluon/parameter.py", line 589, in list_data
    return self._check_and_get(self._data, list)
  File "/opt/conda/envs/rapids/lib/python3.7/site-packages/mxnet/gluon/parameter.py", line 241, in _check_and_get
    "nested child Blocks"%(self.name))
RuntimeError: Parameter 'deepartrainingnetwork0_None_distr_mu_weight' has not been initialized. Note that you should initialize parameters and create Trainer with Block.collect_params() instead of Block.params because the later does not include Parameters of nested child Blocks

Environment

  • Operating system: Ubuntu 18.04
  • Python version: 3.7.10
  • GluonTS version: 0.8.1
  • MXNet version: 1.8.0
  • CUDA version: 11.0

pbruneau, Sep 22 '21 14:09

In the meantime I found a way (see https://gist.github.com/pbruneau/04c0dce4bdfb66ffac3f554f1b98c706)

Basically:

  • I now pass the estimator to the callback at instantiation.
  • In on_epoch_end, I call self.estimator.create_predictor() with the training_network received by that hook, and use the resulting predictor inside the callback.
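
A minimal sketch of that pattern, for illustration only; it is not the code from the gist. The class name PredictorRefreshingCallback is hypothetical, and the on_epoch_end argument list is assumed to mirror the hook used in the callbacks tutorial. In real use the class would presumably subclass the GluonTS Callback base class; the base class is left out here so the sketch runs without GluonTS installed.

```python
from typing import Any, Dict, Optional


class PredictorRefreshingCallback:
    """Hypothetical callback that rebuilds its predictor after each epoch.

    Assumption: in real use this subclasses the GluonTS trainer Callback
    class; here it is a plain class so the sketch has no dependencies.
    """

    def __init__(self, estimator: Any) -> None:
        # Keep the estimator itself instead of a predictor built up front,
        # so no predictor is created from an uninitialized training network.
        self.estimator = estimator
        self.transformation = estimator.create_transformation()
        self.predictor: Optional[Any] = None

    def on_epoch_end(
        self,
        epoch_no: int,
        epoch_loss: float,
        training_network: Any,
        trainer: Any,
        best_epoch_info: Dict[str, Any],
        ctx: Any,
    ) -> bool:
        # The network handed to this hook has been through at least one
        # epoch of training, so it is the one passed to create_predictor.
        self.predictor = self.estimator.create_predictor(
            transformation=self.transformation,
            trained_network=training_network,
        )
        # A validation metric would be computed with self.predictor here.
        return True  # True tells the training loop to continue
```

The design choice is to defer create_predictor until the trainer supplies a network, rather than building the predictor from a freshly created training network before training starts.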

I have the impression it works as expected, but I would welcome feedback from someone who knows the GluonTS internals, as my approach may interfere with them.

pbruneau, Sep 23 '21 10:09

This worked for me as well.

cgoliver, Aug 15 '22 09:08

This also works for me. It makes sense to pass the estimator when instantiating the callback.

feay1234, Apr 11 '23 08:04