mushroom-rl
[Categorical DQN/Rainbow] Inconsistent behavior of Categorical DQN for an even number of atoms
For an even number of atoms, the calculation of self._a_values
(see here) does not seem to be fully correct. The behavior can be reproduced with
import torch
v_min = -5
v_max = 5
n_atoms = 20
delta = (v_max - v_min) / (n_atoms - 1) # delta = 0.5263157894736842
torch.arange(v_min, v_max + delta, delta)
which yields
tensor([-5.0000, -4.4737, -3.9474, -3.4211, -2.8947, -2.3684, -1.8421, -1.3158,
-0.7895, -0.2632, 0.2632, 0.7895, 1.3158, 1.8421, 2.3684, 2.8947,
3.4211, 3.9474, 4.4737, 5.0000, 5.5263])
which has 21 elements, one more than n_atoms. The expected result would be this tensor:
tensor([-5.0000, -4.4737, -3.9474, -3.4211, -2.8947, -2.3684, -1.8421, -1.3158,
-0.7895, -0.2632, 0.2632, 0.7895, 1.3158, 1.8421, 2.3684, 2.8947,
3.4211, 3.9474, 4.4737, 5.0000])
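The extra element is presumably a floating-point rounding effect. The torch.arange documentation gives the output size as ceil((end - start) / step), so an end point that lands even slightly above a multiple of the step adds one element. The sketch below mirrors that size rule in plain Python; arange_length is a hypothetical helper for illustration, not part of torch or mushroom-rl.

```python
import math


def arange_length(start, end, step):
    """Element count of a half-open range [start, end) with the given step.

    Mirrors the size rule documented for torch.arange:
    ceil((end - start) / step). Hypothetical helper, for illustration only.
    """
    return math.ceil((end - start) / step)


v_min, v_max, n_atoms = -5, 5, 20
delta = (v_max - v_min) / (n_atoms - 1)

# End point used in the snippet above: v_max + delta.
# Mathematically the ratio is exactly 20, so any upward rounding
# in the floating-point division pushes the ceiling to 21.
length_with_delta = arange_length(v_min, v_max + delta, delta)

# End point nudged by a small eps: the ratio is just above 19,
# so the ceiling is 20 regardless of rounding.
length_with_eps = arange_length(v_min, v_max + 1e-8, delta)

print(length_with_delta, length_with_eps)
```

Because the first ratio sits right at an integer boundary, its result depends on rounding, while the eps variant stays well clear of the boundary.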
According to the torch.arange documentation, an easy solution would be to add a small eps
value to the end point instead of delta, e.g.
self._a_values = torch.arange(self._v_min, self._v_max + 10e-9, delta)
Other options are to cut off the last value when the tensor is too long, or to use some internal eps
value instead of a hard-coded one.
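A further option, not taken from the library code, is to avoid the end-point comparison entirely and build the support by index, so the length is n_atoms by construction. The sketch below does this in plain Python; atom_support is a hypothetical name, and in the actual class the result would still need to be converted to a tensor. torch.linspace(v_min, v_max, n_atoms) should give an equivalent grid directly.

```python
def atom_support(v_min, v_max, n_atoms):
    """Return n_atoms evenly spaced values from v_min to v_max.

    Hypothetical helper: each value is computed from its index, so the
    number of elements never depends on floating-point rounding of the
    end point.
    """
    if n_atoms < 2:
        raise ValueError("n_atoms must be at least 2")
    delta = (v_max - v_min) / (n_atoms - 1)
    return [v_min + i * delta for i in range(n_atoms)]


support_even = atom_support(-5.0, 5.0, 20)  # even number of atoms
support_odd = atom_support(-5.0, 5.0, 51)   # odd number of atoms

print(len(support_even), len(support_odd))
```

The last element may differ from v_max by a rounding error on the order of machine epsilon, which is one reason a linspace-style routine can be preferable.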