Skander Moalla

27 comments by Skander Moalla

This happens on two different clusters with different CPUs and GPUs (same Docker image, though: the NVIDIA NGC PyTorch image).

On MPS it's no longer a segfault but the original arbitrary-number bug: ```python from tensordict.nn import TensorDictModule from torch import nn from torchrl.envs import ( EnvCreator, ExplorationType, StepCounter, TransformedEnv, SerialEnv,...

I'll poke at it a bit now and give my feedback soon. So I should test with this branch on TorchRL and main on TensorDict?

All good for CUDA! Awesome! (tested some scripts but didn't check the PR code) ```bash ❯ python Python 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0] on linux Type "help",...

Not yet for MPS. For SerialEnv (https://github.com/skandermoalla/TorchRL/blob/34c8abf19fd5a5177a2d5eadd5a5b1f57d51ab6c/tests/issue_env_device_serial.py) I have different errors that appear arbitrarily: ```bash python(22437,0x1d9879300) malloc: tiny_free_list_remove_ptr: Internal invariant broken (next ptr of prev): ptr=0x139ced580, prev_next=0x0 python(22437,0x1d9879300) malloc: ***...

Not for me. Running the above script gives me the same transient errors I described. Which commits are you using? I'm on the main branches of both TorchRL and TensorDict....

Almost solved. It works with SerialEnv and ParallelEnv, but somehow breaks when a TransformedEnv is added on top of the ParallelEnv. ```python from tensordict.nn import TensorDictModule from torch...