[RLlib]: SimpleQ TF2 is broken
What happened + What you expected to happen
The SimpleQ action distribution is broken under the TF2 framework: the reproduction script below misbehaves when computing action log-likelihoods, but I can't track down the bug. The same code works under TF1 and Torch.
Versions / Dependencies
ray 3.0.0dev0 (master)
Ubuntu 20.04
tensorflow 2.7.0 (PyPI)
Reproduction script
from ray.rllib.algorithms.simple_q import SimpleQConfig
config = (
    SimpleQConfig()
    .environment(env="CartPole-v0")
    .rollouts(num_rollout_workers=0)
    .framework("tf2")
    .exploration(exploration_config={"type": "SoftQ", "temperature": 1.0})
)
algo = config.build()
policy = algo.get_policy()
batch = algo.workers.local_worker().sample()
log_likelihoods = policy.compute_log_likelihoods(batch["actions"], batch["obs"])
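For context on what that last call should return: under SoftQ exploration the action distribution is a categorical softmax over the temperature-scaled Q-values, so an action's log-likelihood is its scaled Q-value minus the log-partition term. A minimal pure-Python sketch of that math (hypothetical Q-values; this is not RLlib's implementation):

```python
import math

def soft_q_log_likelihood(q_values, action, temperature=1.0):
    """Log-likelihood of `action` under softmax(Q / temperature)."""
    scaled = [q / temperature for q in q_values]
    m = max(scaled)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(s - m) for s in scaled))
    return scaled[action] - log_z

# Hypothetical Q-values for CartPole's two actions:
q = [1.2, 0.7]
ll = soft_q_log_likelihood(q, action=0)
```

Exponentiating the log-likelihoods over all actions should recover a proper probability distribution, which is a quick sanity check for whatever the TF2 path is actually returning.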
Issue Severity
Medium: It is a significant difficulty but I can work around it.
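For anyone hitting this in the meantime, a plausible workaround (given that TF1 and Torch are reported unaffected) is to run the identical config under the Torch framework. Untested sketch of the same reproduction with only the framework swapped:

```python
from ray.rllib.algorithms.simple_q import SimpleQConfig

config = (
    SimpleQConfig()
    .environment(env="CartPole-v0")
    .rollouts(num_rollout_workers=0)
    .framework("torch")  # swap "tf2" -> "torch"; the bug appears TF2-specific
    .exploration(exploration_config={"type": "SoftQ", "temperature": 1.0})
)
algo = config.build()
```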
@Rohan138 This seems related to the policy_v1 vs. policy_v2 migration we looked at the other day, so I'll leave you assigned for now. Notably, we don't seem to test the compute_log_likelihoods method at all, which is probably why this regressed silently.
Hello! This P2 issue has seen no activity in at least 2 years. It will be closed in 2 weeks as part of ongoing cleanup efforts.
Please remove the pending-cleanup label if you believe this issue should remain open.
Thanks for contributing to Ray!