rl icon indicating copy to clipboard operation
rl copied to clipboard

[BugFix] step_mdp keep_others=False in losses

Open vmoens opened this issue 2 years ago • 5 comments

cc @Blonck

vmoens avatar Jun 06 '23 10:06 vmoens

All failing tests are from non-tensordict so this seems safe to land when the tests are fixed

vmoens avatar Jun 06 '23 14:06 vmoens

The proposed changes could potentially result in unpredictable outcomes for torchrl users. Users might employ other TensorDict keys, expecting them to transition from ("next", "some_key") to "some_key" after one step. This anticipated behavior will subtly shift, which may go unnoticed. Users might not realize that the switch from ("next", "some_key") to "some_key" no longer occurs since both will still yield valid values, and there is no certainty that this alteration will cause any code to fail.

However, I'm unsure if this is a real problem or only a theoretical one.

Blonck avatar Jun 06 '23 17:06 Blonck

I'm not sure: in which case should things fail? The keeps_other option is there for when there is a context, non time-dependent tensor in the root tensordict that is being copied at every step. There are two usages for losses:

  • the data originates from a TorchRL environment. In that case we just need to make sure that the data in "next" is appropriate. Context tensors will be created via TensorDictPrimer and we should be safe
  • the data is created by the users. In that case too, one would expect that "next" contains all the data necessary to be used as input to the model IMO.

vmoens avatar Jun 06 '23 20:06 vmoens

You are right. I got the meaning of the keyword wrong.

I was also wrong with the BC, the input tensordict will be cloned in all loss_module.forward() instances. So I see no problems with this PR.

Blonck avatar Jun 07 '23 04:06 Blonck

One issue could be with RNNs if the next states is not placed in ("next", "hidden") but overwritten over ("hidden"). But even in this case there's a confusion between the state at t and the state at t+1. To me it's safe to assume that entries will be present in "next" if they need to. Since it's more a matter of security to avoid silent bugs, we can wait a bit until we merge this...

vmoens avatar Jun 07 '23 08:06 vmoens

Hi @vmoens!

Thank you for your pull request.

We require contributors to sign our Contributor License Agreement, and yours needs attention.

You currently have a record in our system, but the CLA is no longer valid, and will need to be resubmitted.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at [email protected]. Thanks!

facebook-github-bot avatar May 23 '25 17:05 facebook-github-bot