d3rlpy [BUG] Dimension error when trying to fit Probabilistic Ensemble Dynamics model to discrete action dataset

When I try to fit a Probabilistic Ensemble Dynamics (PED) model to any discrete-action dataset, I get the following error: RuntimeError: torch.cat(): Tensors must have same number of dimensions: got 2 and 1

This is caused by the forward method of the encoder, specifically x = torch.cat([x, action], dim=1)

As I understand it, no action-conditioned encoder can be used with discrete actions because of this.

After experimenting with it and with the pendulum example in the docs, I can only deduce that this comes from how MDPDataset stores actions: one axis is removed for discrete actions, which leads to the dimension mismatch.
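The mismatch described above can be reproduced with plain PyTorch, independent of d3rlpy. The sketch below is only an illustration: the tensor sizes are made up, and one-hot encoding the actions is shown as one common way to obtain a 2-D action tensor, not as a statement of what the library does internally.

```python
import torch
import torch.nn.functional as F

# Hypothetical sizes, chosen only for illustration.
batch_size, obs_dim, n_actions = 4, 3, 2

x = torch.zeros(batch_size, obs_dim)   # encoded observations, shape (4, 3)
action = torch.tensor([0, 1, 1, 0])    # discrete action indices, shape (4,)

# Concatenating a 2-D tensor with a 1-D tensor along dim=1 fails.
try:
    torch.cat([x, action], dim=1)
    cat_failed = False
except RuntimeError as err:
    cat_failed = True
    print(f"torch.cat failed: {err}")

# One-hot encoding turns the indices into a 2-D float tensor, shape (4, 2),
# which can then be concatenated with the observations.
one_hot = F.one_hot(action.long(), num_classes=n_actions).float()
combined = torch.cat([x, one_hot], dim=1)
print(combined.shape)
```

The first concatenation raises because the two tensors have different numbers of dimensions. The second succeeds because both are 2-D with the same batch size.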

Is this on purpose?

Sep 08 '21 17:09 siomvas

@siomvas Hello, thanks for the issue. You should set discrete_action=True on the dynamics model. https://github.com/takuseno/d3rlpy/blob/747b1747ad3e41eae8a93a8e02ca02c4d9e0ccb0/d3rlpy/dynamics/probabilistic_ensemble_dynamics.py#L90

Sep 09 '21 23:09 takuseno

@takuseno Thanks for getting back to me, silly mistake! Changing that flag lets training start, but the training loss is NaN (loss=nan), and because of that, evaluation throws ValueError: The parameter loc has invalid values.
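For reference, this kind of ValueError is what torch.distributions raises when a distribution is built from NaN parameters with argument validation enabled. The sketch below uses a hand-made NaN as a stand-in for a mean predicted by a model whose loss has diverged. It is an assumption that the evaluation code fails in this way, and the exact error wording depends on the PyTorch version.

```python
import torch
from torch.distributions import Normal

# Stand-in for a predicted mean after training has produced NaNs.
loc = torch.tensor([float("nan")])
scale = torch.tensor([1.0])

# With validation enabled, Normal rejects a NaN mean at construction time.
try:
    Normal(loc, scale, validate_args=True)
    raised = False
except ValueError as err:
    raised = True
    print(f"Normal rejected the parameters: {err}")

# A quick check that can be run on model outputs or dataset arrays.
has_nan = bool(torch.isnan(loc).any())
print(has_nan)
```

Running the same torch.isnan check on the observations, actions, and rewards before training may help tell whether the NaNs come from the data or appear during optimization.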

Sep 10 '21 01:09 siomvas

@siomvas That sounds odd. Could you share a minimal code example that reproduces the issue?

Sep 10 '21 12:09 takuseno