[BUG] `target_value_network_params` initialization bug in `convert_to_functional()`
## Bug Description

- During the initialization of `LossModule` and execution of `convert_to_functional()`, the first-layer parameters are left as `UninitializedParameter(shape=torch.Size(-1))`. Consequently, when `target_value_network_params` is cloned from them, it becomes `Parameter(torch.Size(0))`, leading to the error `RuntimeError: mat2 must be a matrix, got 1-D tensor`. A minimal sketch of the lazy-initialization mechanism follows this list.
- In the `DiscreteCQLLoss` class, when calling `value_estimate`, the `params` argument should reference `target_params` instead, as indicated in this line of code.
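To make the first point concrete, here is a minimal sketch of how the lazy first layer behaves. The sizes mirror the repro below, and the printed results are assumptions based on PyTorch's lazy-module semantics, not output quoted from the library:

```python
import torch
from torchrl.modules import MLP

# Sizes mirror the repro below (for illustration only).
nbins, action_dim, obs_dim = 3, 2, 4

# No in_features -> torchrl builds the first layer lazily.
module = MLP(out_features=(nbins, action_dim), depth=2)
first_param = next(module.parameters())
print(isinstance(first_param, torch.nn.parameter.UninitializedParameter))
# Expected: True -- the weight has no shape yet, so any clone taken now
# is degenerate and later fails inside F.linear with
# "RuntimeError: mat2 must be a matrix, got 1-D tensor".

# A single forward pass materializes the lazy layer:
module(torch.randn(5, obs_dim))
print(next(module.parameters()).shape)  # now a concrete 2-D weight
```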
## To Reproduce

Steps to reproduce the behavior:
```python
import torch
from tensordict import TensorDict
from torchrl.data import OneHotDiscreteTensorSpec
from torchrl.modules import DistributionalQValueActor, MLP
from torchrl.objectives import DistributionalDQNLoss

nbins = 3
batch_size = 5
action_dim = 2

# No in_features: the first layer stays an UninitializedParameter
module = MLP(out_features=(nbins, action_dim), depth=2)
action_spec = OneHotDiscreteTensorSpec(action_dim)
qvalue_actor = DistributionalQValueActor(
    module=module,
    spec=action_spec,
    support=torch.arange(nbins),
)
loss_module = DistributionalDQNLoss(
    qvalue_actor,
    gamma=0.99,
    delay_value=True,
)
td = TensorDict(
    {
        "observation": torch.randn(batch_size, 4),
        "action": torch.nn.functional.one_hot(
            torch.randint(0, action_dim, (batch_size,)), action_dim
        ).float(),
        "next": {
            "observation": torch.randn(batch_size, 4),
            "reward": torch.randn(batch_size, 1),
            "done": torch.zeros(batch_size, 1, dtype=torch.bool),
        },
    },
    batch_size=[batch_size],
)
loss = loss_module(td)  # raises: mat2 must be a matrix, got 1-D tensor
print("Computed loss:", loss)
```
```
  File "../../python3.10/site-packages/torch/nn/modules/linear.py", line 117, in forward
    return F.linear(input, self.weight, self.bias)
RuntimeError: mat2 must be a matrix, got 1-D tensor
```
## System info

python=3.10.15, torchrl=0.5.0, torch=2.4.1

```python
import torchrl, numpy, sys
print(torchrl.__version__, numpy.__version__, sys.version, sys.platform)
```

```
0.5.0 2.1.2 3.10.15 | packaged by conda-forge | (main, Oct 16 2024, 01:24:20) [Clang 17.0.6 ] darwin
```
## Possible fixes

Specify `in_features` in the `MLP` module so the first layer is built eagerly:

```python
module = MLP(in_features=4, out_features=(nbins, action_dim), depth=2)
```

This prevents the module parameters from being `UninitializedParameter`.
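Alternatively, a workaround sketch (assuming the `qvalue_actor` and imports from the repro above): run a dummy forward pass to materialize the lazy parameters before constructing the loss module, so `convert_to_functional()` never sees an `UninitializedParameter`.

```python
import torch
from tensordict import TensorDict

# Dummy batch: the observation dim (4) must match the real data.
dummy = TensorDict({"observation": torch.randn(1, 4)}, batch_size=[1])
qvalue_actor(dummy)  # forward pass initializes the lazy first layer in-place

# Now the loss module clones fully materialized parameters.
loss_module = DistributionalDQNLoss(
    qvalue_actor,
    gamma=0.99,
    delay_value=True,
)
```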
## Checklist
- [x] I have checked that there is no similar issue in the repo (required)
- [x] I have read the documentation (required)
- [x] I have provided a minimal working example to reproduce the bug (required)
Thanks to @BY571 and @vmoens