ray icon indicating copy to clipboard operation
ray copied to clipboard

[RLlib] Chaining sub-models in RLModules with dynamic spec keys inside forward methods ("Solution 1")

Open ArturNiederfahrenhorst opened this issue 2 years ago • 0 comments

Why are these changes needed?

As discussed via Slack, we need to match the future ModelCatalog, the Models and RLModule in the sense of where to define specs and keys for these specs, and also what information to use when calling a model inside an RLModule. "Solution 1" has some advantages:

  1. We don't end up with a "static graph" pattern where the chaining of models happens at configuration time or Model-instantiation-time, but not at call-time
  2. ModelCatalog code does not know/set the structure of what is to be constructed inside the RLModule

This PR introduces "Solution #1", with the addition of ModelIOMapping, which dynamically creates distinct IO keys for different sub-models. It therefore avoids much of the information leakage of implementation detail from keys used in input specs of these models.

Checks

  • [x] I've signed off every commit(by using the -s flag, i.e., git commit -s) in this PR.
  • [x] I've run scripts/format.sh to lint the changes in this PR.
  • [ ] I've included any doc changes needed for https://docs.ray.io/en/master/.
  • [ ] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • [x] Unit tests
    • [ ] Release tests
    • [ ] This PR is not tested :(

ArturNiederfahrenhorst avatar Dec 23 '22 08:12 ArturNiederfahrenhorst