Sven Mika
Add new config option: `enable_evaluation_v2`. This is an experimental setting. If True: * Evaluation workers will be organized inside an AsyncRequestsManager object, no matter what the eval settings are (e.g....
LSTMLayers should be able to store the last returned internal state and reuse it for the next pass (similar to Keras). Requires a reset method. Requires batch-size checking between a...
Such that Agents can access all optimizers for the build process without explicitly knowing their components or sub-components. E.g.: An Agent contains a PPOAlgorithmComponent, which in turn contains two...
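One way to picture this request is a recursive lookup that walks the component tree. The sketch below is only an illustration: the `Component` class, its `get_optimizers` method, and the component layout are hypothetical and not taken from the library.

```python
class Component:
    """Hypothetical stand-in for a component that may own sub-components."""

    def __init__(self, name, optimizer=None):
        self.name = name
        self.optimizer = optimizer  # None if this component owns no optimizer
        self.sub_components = []

    def add(self, *components):
        """Attach sub-components and return self to allow chaining."""
        self.sub_components.extend(components)
        return self

    def get_optimizers(self):
        """Collect this component's optimizer and those of all descendants (depth-first)."""
        found = [] if self.optimizer is None else [self.optimizer]
        for sub in self.sub_components:
            found.extend(sub.get_optimizers())
        return found


# Made-up layout: an agent holding one algorithm component with two children.
# Strings stand in for real optimizer objects.
policy = Component("policy", optimizer="policy-optimizer")
value_fn = Component("value-function", optimizer="value-optimizer")
ppo = Component("ppo-algorithm").add(policy, value_fn)
agent = Component("agent").add(ppo)

optimizers = agent.get_optimizers()
```

With this shape, the agent never needs to know how deeply the optimizers are nested; it only calls `get_optimizers()` on itself.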
It's very confusing that `action_from_preprocessed_state` returns a tuple: (action, preprocessed_state). This should rather be a dict with keys "action", "preprocessed_state" or only the actions as that's what the API's name...
The Agent API method `get_action` should return a dict instead of the current 2-tuple (action, preprocessed_state). The dict would have the keys "action" and "preprocessed_state". This is already good practice...
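A minimal sketch of the proposed return shape follows. The `Agent` class, its `preprocess` step, and the argmax "policy" are placeholders invented for illustration; only the method name `get_action` and the two dict keys come from the description above.

```python
class Agent:
    """Hypothetical agent; preprocessing and policy are placeholders."""

    def preprocess(self, state):
        # Placeholder preprocessing: scale values, assuming inputs in 0..255.
        return [x / 255.0 for x in state]

    def get_action(self, state):
        preprocessed = self.preprocess(state)
        # Placeholder policy: pick the index of the largest preprocessed value.
        action = max(range(len(preprocessed)), key=preprocessed.__getitem__)
        # Return a dict so callers address results by name, not by position.
        return {"action": action, "preprocessed_state": preprocessed}


result = Agent().get_action([0, 255, 51])
action = result["action"]
```

Callers that only need the action read `result["action"]` and are unaffected if further keys are added later, which a positional 2-tuple cannot offer.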
Currently this is done purely implicitly by the naming and organization of the set of all API-input args. E.g. a Component has the API-methods: api_a(self, arg1, arg2) and api_b(arg2). It...
Instead of returning dicts in some cases, API-methods should return a named tuple object, which can then be used for: - keyed indexing (just like a dict): a = self.[some-API-method]()...
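As a rough sketch of the idea: a plain `collections.namedtuple` supports attribute access, positional indexing, and unpacking, but not string keys, so dict-like keyed indexing needs a small subclass. The helper `keyed_namedtuple` and the `ActionOut` type below are hypothetical names chosen for this example.

```python
from collections import namedtuple


def keyed_namedtuple(typename, field_names):
    """Build a namedtuple type that also accepts field names as string keys."""
    base = namedtuple(typename, field_names)

    class Keyed(base):
        __slots__ = ()  # keep instances as lightweight as plain tuples

        def __getitem__(self, key):
            if isinstance(key, str):
                if key not in self._fields:
                    raise KeyError(key)
                return getattr(self, key)
            # Integers and slices fall through to normal tuple indexing.
            return super().__getitem__(key)

    Keyed.__name__ = typename
    return Keyed


ActionOut = keyed_namedtuple("ActionOut", ["action", "preprocessed_state"])
out = ActionOut(action=2, preprocessed_state=[0.1, 0.9])

by_key = out["action"]        # keyed indexing, like a dict
by_attr = out.action          # attribute access
by_pos = out[0]               # positional indexing
action, preprocessed = out    # tuple unpacking still works
```

The trade-off of this design is that existing callers that unpack a tuple keep working, while new callers can use names instead of positions.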
Signed-off-by: sven1977 `Algorithm.add_policy()` should alternatively accept an already instantiated policy object. * Same for `RolloutWorker.add_policy()`. * Enhanced existing test case to cover this behavior. ## Why are these changes needed?...
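The behavior described here can be illustrated with a toy registry. This is not the RLlib implementation or its signature: `Policy`, `PolicyRegistry`, and the parameter names are hypothetical stand-ins that only show the "class or ready-made instance" pattern.

```python
class Policy:
    """Hypothetical policy that stores its config."""

    def __init__(self, config=None):
        self.config = config or {}


class PolicyRegistry:
    """Toy stand-in for an object that manages policies by ID."""

    def __init__(self):
        self.policies = {}

    def add_policy(self, policy_id, policy_cls=None, policy=None, config=None):
        """Register a policy, given either a class to instantiate or an instance."""
        if (policy_cls is None) == (policy is None):
            raise ValueError("Provide exactly one of `policy_cls` or `policy`.")
        if policy_id in self.policies:
            raise KeyError(f"Policy ID {policy_id!r} is already registered.")
        if policy is None:
            # Class given: build the instance here.
            policy = policy_cls(config)
        self.policies[policy_id] = policy
        return policy


registry = PolicyRegistry()
built = registry.add_policy("p0", policy_cls=Policy, config={"lr": 0.001})
ready_made = Policy({"lr": 0.01})
added = registry.add_policy("p1", policy=ready_made)
```

Accepting an instance lets the caller construct or restore the policy elsewhere and hand it over unchanged; the registry stores the same object rather than building a copy.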
## Why are these changes needed? ## Related issue number ## Checks - [x] I've signed off every commit (by using the -s flag, i.e., `git commit -s`) in this PR....
Run all algorithms "compilation" tests on a GPU as well (besides the existing CPU tests). ## Why are these changes needed? ## Related issue number ## Checks - [x] I've...