[RLlib] make_multi_callbacks with new API stack error
What happened + What you expected to happen
When make_multi_callbacks is used with the new API stack, the current validation compares the source code of DefaultCallbacks.on_episode_created against the source code of the same method on the _MultiCallbacks class returned by make_multi_callbacks:
if (
    self.uses_new_env_runners
    and not self.is_multi_agent()
    and self.callbacks_class is not DefaultCallbacks
):
    default_src = inspect.getsource(DefaultCallbacks.on_episode_created)
    try:
        user_src = inspect.getsource(self.callbacks_class.on_episode_created)
    # In case user has setup a `partial` instead of an actual Callbacks class.
    except AttributeError:
        user_src = default_src
    if default_src != user_src:
        raise ValueError(
            "When using the new API stack in single-agent and with EnvRunners, "
            "you cannot override the `DefaultCallbacks.on_episode_created()` "
            "method anymore! This particular callback is no longer supported "
            "b/c we are using `gym.vector.Env`, which automatically resets "
            "individual sub-environments when they are terminated. Instead, "
            "override the `on_episode_start` method, which gets fired right "
            "after the `env.reset()` call."
        )
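For context, a minimal sketch (illustrative class names, not RLlib code) of why this string comparison is fragile: inspect.getsource returns each method with its original indentation, so an identical body defined at a deeper nesting level, as in the locally defined _MultiCallbacks class, produces a different string:

import inspect

class Outer:
    def on_episode_created(self, **kwargs):
        pass

def make():
    # Same body, but defined one nesting level deeper.
    class Inner:
        def on_episode_created(self, **kwargs):
            pass
    return Inner

# Identical logic, different leading whitespace -> the equality check fails.
print(inspect.getsource(Outer.on_episode_created)
      == inspect.getsource(make().on_episode_created))  # False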
The straightforward solution, copying the DefaultCallbacks source code into the _MultiCallbacks class, doesn't work: as shown above, the strings still differ because of indentation. Not overriding the method at all raises a different error instead:
TypeError: Found missing callback method: {'on_episode_created'}
So not overriding is not an option either. After hitting this error, I commented out the on_episode_created validation, which surfaced yet another error:
TypeError: make_multi_callbacks.<locals>._MultiCallbacks.on_environment_created() missing 1 required keyword-only argument: 'env_context'
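The root cause is a plain signature mismatch: the EnvRunner calls the callback with the keyword argument env_config, while the generated _MultiCallbacks.on_environment_created declares the keyword-only parameter env_context. A stripped-down illustration (hypothetical function, not RLlib code):

def on_environment_created(*, env_context, **kwargs):
    pass

try:
    # `env_config` lands in **kwargs, leaving the required
    # `env_context` unbound:
    on_environment_created(env_config={})
except TypeError as e:
    print(e)  # missing 1 required keyword-only argument: 'env_context'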
This last one is easy to fix by renaming env_context to env_config in the on_environment_created method generated by make_multi_callbacks:
@override(DefaultCallbacks)
def on_environment_created(
    self,
    *,
    env_runner: "EnvRunner",
    env: gym.Env,
    env_config: EnvContext,
    env_index: Optional[int] = None,
    **kwargs,
) -> None:
    for callback in self._callback_list:
        callback.on_environment_created(
            env_runner=env_runner,
            env=env,
            env_config=env_config,
            env_index=env_index,
            **kwargs,
        )
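Until a fix lands, one possible workaround (a sketch only; the class and attribute names are mine, not RLlib's) is to avoid make_multi_callbacks and fan out to the child callbacks from a single DefaultCallbacks subclass, deliberately leaving on_episode_created untouched so the source comparison above passes:

from ray.rllib.algorithms.callbacks import DefaultCallbacks

class FanOutCallbacks(DefaultCallbacks):
    # Hypothetical: the child callback instances to combine.
    _children = [DefaultCallbacks(), DefaultCallbacks()]

    # `on_episode_created` is intentionally NOT overridden, so the
    # inherited source matches DefaultCallbacks' and validation passes.

    def on_episode_start(self, **kwargs):
        for child in self._children:
            child.on_episode_start(**kwargs)

    def on_environment_created(self, **kwargs):
        for child in self._children:
            child.on_environment_created(**kwargs)

The subclass can then be passed directly via config.callbacks(FanOutCallbacks).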
Versions / Dependencies
ray: 2.10
gymnasium: 0.28.1
python: 3.10.11
OS: Windows 11
Reproduction script
from ray.rllib.algorithms import ppo
from ray.rllib.examples.env.stateless_cartpole import StatelessCartPole
from ray.rllib.env.single_agent_env_runner import SingleAgentEnvRunner
from ray.rllib.algorithms.callbacks import DefaultCallbacks, make_multi_callbacks

config = (
    ppo.PPOConfig()
    .environment(StatelessCartPole)
    .experimental(
        _enable_new_api_stack=True,
    )
    .framework("torch")
    .training(model={"uses_new_env_runners": True})
    .rollouts(
        num_rollout_workers=0,
        env_runner_cls=SingleAgentEnvRunner,
    )
    .resources(
        num_learner_workers=0,
        num_gpus_per_learner_worker=0,
        num_cpus_for_local_worker=1,
    )
    .callbacks(make_multi_callbacks([DefaultCallbacks, DefaultCallbacks]))
)

config.build()
Issue Severity
Medium: It is a significant difficulty but I can work around it.