machin
len(tmp_observations) < 2 on PPO raises ValueError: The parameter probs has invalid values
It seems that the code produces an error if the trajectory is shorter than 2 steps (len(tmp_observations) < 2). I tested this on PPO; I don't know whether it happens with all algorithms.
The error:
ValueError: The parameter probs has invalid values
Could you please provide a minimal reproducible example and full stack trace, other log outputs, etc?
Here is a minimal example that reproduces the error: testPPO.txt.
Setting max_steps = 0 in the provided example gives len(tmp_observations) == 1 and raises the following error:
Traceback (most recent call last):
File ".../PPO issues.py", line 94, in
Process finished with exit code 1
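Until this is fixed, a guard along the following lines might work as a stopgap. It is only a sketch: `store_if_long_enough` and `MIN_TRAJECTORY_LEN` are hypothetical names introduced here, not part of machin, and the threshold of 2 is taken from the observation above rather than from the library's internals. It skips any trajectory that is too short instead of passing it on for training.

```python
from typing import Callable, Dict, List

# Assumption: 2 is the shortest trajectory length that avoids the error
# reported in this issue; it is not a documented machin constant.
MIN_TRAJECTORY_LEN = 2


def store_if_long_enough(
    tmp_observations: List[Dict],
    store_episode: Callable[[List[Dict]], None],
    min_len: int = MIN_TRAJECTORY_LEN,
) -> bool:
    """Store the trajectory only if it has at least `min_len` transitions.

    Returns True if the trajectory was stored, False if it was skipped.
    """
    if len(tmp_observations) < min_len:
        # Too short: skip it so the update step never sees it.
        return False
    store_episode(tmp_observations)
    return True
```

In the training loop this would replace the direct store call, for example `if store_if_long_enough(tmp_observations, ppo.store_episode): ppo.update()`, assuming the example stores episodes through a `store_episode` method on the PPO object. Dropping short episodes discards some data, so this is a workaround rather than a fix.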
Sorry for the late reply. That's definitely a boundary case that needs to be fixed. I've been working on a paper recently and my computer also got hacked, so this problem might hang around a little longer. I'm really sorry for the inconvenience.