Sven Mika issues

Results: 32 issues of Sven Mika

[RLlib] Eval workers use async req manager.

Add new config option: `enable_evaluation_v2`. This is an experimental setting. If True: * Evaluation workers will be organized inside an AsyncRequestsManager object, no matter what the eval settings are (e.g....

Add stateful-support to LSTMLayers.

LSTMLayers should be able to store the last returned internal-state and use it for the next pass through (similar to Keras). Requires a reset method. Requires batch-size checking between a...
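The behaviour requested here (keep the last internal state, reuse it on the next pass, offer a reset, and check batch sizes) can be sketched without any real LSTM math. The class below is a hypothetical toy, not the library's `LSTMLayer`: `StatefulLayer`, `last_state`, and the running-sum "cell" are all assumptions made for illustration.

```python
class StatefulLayer:
    """Toy stand-in for a stateful recurrent layer (hypothetical, no real LSTM math)."""

    def __init__(self):
        # One state value per batch row; None means "no state stored yet".
        self.last_state = None

    def reset(self):
        """Forget the stored state so the next pass starts from zeros."""
        self.last_state = None

    def __call__(self, batch):
        batch_size = len(batch)
        if self.last_state is None:
            state = [0.0] * batch_size
        elif len(self.last_state) != batch_size:
            # Batch-size check between consecutive passes.
            raise ValueError(
                f"Stored state has batch size {len(self.last_state)}, "
                f"but got a batch of size {batch_size}. Call reset() first."
            )
        else:
            state = self.last_state
        # Fake "cell": add the sum of each sequence to that row's state.
        new_state = [s + float(sum(seq)) for s, seq in zip(state, batch)]
        self.last_state = new_state
        return new_state


layer = StatefulLayer()
first = layer([[1, 2], [3]])   # starts from zeros
second = layer([[1], [1]])     # continues from the stored state
layer.reset()
third = layer([[1], [1]])      # starts from zeros again
```

Raising on a batch-size mismatch is one possible policy; silently resetting the state would be another, and the truncated issue text does not say which is intended.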

[Core] A Component should keep a registry for its optimizers.

Such that Agents can access all optimizers for the build process without explicitly knowing its components or their sub-components. E.g.: An Agent contains a PPOAlgorithmComponent, which in turn contains two...
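A minimal sketch of such a registry, assuming a tree of components where each node can register optimizers and a parent can collect them recursively. The `Component` class, its method names, and the optimizer keys `opt_a` and `opt_b` are hypothetical; the issue text is truncated and does not specify the actual design.

```python
class Component:
    """Hypothetical component that tracks its own optimizers and sub-components."""

    def __init__(self, name):
        self.name = name
        self.sub_components = []
        self.optimizers = {}

    def add_component(self, component):
        self.sub_components.append(component)
        return component

    def register_optimizer(self, key, optimizer):
        self.optimizers[key] = optimizer

    def get_all_optimizers(self):
        """Return own and nested optimizers, keyed by their path in the tree."""
        collected = {f"{self.name}/{k}": o for k, o in self.optimizers.items()}
        for sub in self.sub_components:
            for path, optimizer in sub.get_all_optimizers().items():
                collected[f"{self.name}/{path}"] = optimizer
        return collected


# The "agent" only talks to its root component; it never names the nested ones.
root = Component("agent")
algo = root.add_component(Component("ppo"))
algo.register_optimizer("opt_a", "optimizer-object-a")  # placeholder objects
inner = algo.add_component(Component("inner"))
inner.register_optimizer("opt_b", "optimizer-object-b")

all_optimizers = root.get_all_optimizers()
```

Keying by path keeps names unique when two sub-components register optimizers under the same local key.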

[Algorithms] Change return values of `action_from_preprocessed_state`.

It's very confusing that `action_from_preprocessed_state` returns a tuple: (action, preprocessed_state). This should rather be a dict with keys "action", "preprocessed_state" or only the actions as that's what the API's name...

[Algorithms] Return dict (instead of 2-tuple) from API method: get_action.

The Agent API method `get_action` should return a dict, instead of currently a 2-tuple (action, preprocessed_state). The dict would have the keys "action" and "preprocessed_state". This is already good practice...
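The proposed change can be illustrated with a toy agent. Only the method name `get_action` and the keys "action" and "preprocessed_state" come from the issue text; `ToyAgent` and its decision rule are made up for this sketch.

```python
class ToyAgent:
    """Hypothetical agent showing the dict-shaped return value."""

    def get_action(self, state):
        # Stand-in "preprocessing": cast every entry to float.
        preprocessed_state = [float(x) for x in state]
        # Stand-in policy: action 1 if the inputs sum to >= 0, else action 0.
        action = 1 if sum(preprocessed_state) >= 0 else 0
        return {"action": action, "preprocessed_state": preprocessed_state}


result = ToyAgent().get_action([1, -3])
chosen_action = result["action"]  # access by key, no positional unpacking needed
```

With a dict, callers that only need the action do not have to remember the position of each element, and new keys can be added later without breaking existing call sites.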

[Core] Make it optional for Components to describe which input-args they need in order to create their variables.

Currently this is done purely implicitly by the naming and organization of the set of all API-input args. E.g. a Component has the API-methods: api_a(self, arg1, arg2) and api_b(arg2). It...

[Core] API-methods should return named tuples, not dicts.

Instead of returning dicts in some cases, API-methods should return a named tuple object, which can then be used for: - keyed indexing (just like a dict): a = self.[some-API-method]()...
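As a rough illustration of what a named tuple offers, here is a sketch using Python's standard `collections.namedtuple`. The type `ActionOutput`, its fields, and the stand-in `get_action` function are assumptions for this example. Note that a plain `namedtuple` gives attribute access (`out.action`) and `_asdict()` rather than true `out["action"]` indexing, so the keyed indexing mentioned in the issue would need a custom type or a conversion step.

```python
from collections import namedtuple

# Hypothetical return type; the field names are illustrative only.
ActionOutput = namedtuple("ActionOutput", ["action", "preprocessed_state"])


def get_action(state):
    """Stand-in for an API method: doubles the inputs, then picks the argmax."""
    preprocessed = [2 * x for x in state]
    action = max(range(len(preprocessed)), key=preprocessed.__getitem__)
    return ActionOutput(action=action, preprocessed_state=preprocessed)


out = get_action([0.1, 0.7, 0.2])

by_name = out.action                  # named access
by_position = out[0]                  # positional access, like a tuple
action, preprocessed_state = out      # tuple unpacking still works
as_dict = out._asdict()               # dict view when keyed access is needed
```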

[RLlib] `Algorithm.add_policy()` should alternatively accept an already instantiated policy object.

Signed-off-by: sven1977

`Algorithm.add_policy()` should alternatively accept an already instantiated policy object.
* Same for `RolloutWorker.add_policy()`.
* Enhanced existing test case to cover this behavior.

## Why are these changes needed?...

tests-ok

[RLlib] `before_sub_environment_reset()` callback enhancements (add `next_episode` arg).

## Why are these changes needed? ## Related issue number ## Checks - [x] I've signed off every commit(by using the -s flag, i.e., `git commit -s`) in this PR....

[RLlib] Run all algorithms "compilation" tests on a GPU as well (besides the existing CPU tests).

Run all algorithms "compilation" tests on a GPU as well (besides the existing CPU tests). ## Why are these changes needed? ## Related issue number ## Checks - [x] I've...