寇谦
I want to know whether there is any research on leveraging RLHF in MARL problems.

Hello! The class MediumLevelPlanner used in overcooked.py seems to be deprecated in the latest overcooked_ai version. Which class can be used instead?
In the get_episode() function, the rewards are converted into rewards-to-go, which is not described in the paper: for agent_trajectory in episode: rtgs = 0 for i in reversed(range(len(agent_trajectory))): rtgs...
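For readers unfamiliar with the term, here is a minimal sketch of how a reward-to-go is usually computed from one agent's per-step rewards. It is not the repository's get_episode() code, which is truncated above. The function name rewards_to_go and the gamma discount parameter are assumptions made for illustration only.

```python
from typing import List


def rewards_to_go(rewards: List[float], gamma: float = 1.0) -> List[float]:
    """Return, for each step t, the (optionally discounted) sum of rewards from t onward.

    Hypothetical helper for illustration; gamma=1.0 gives the undiscounted sum.
    """
    rtgs = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step reuses the total already accumulated after it.
    for i in reversed(range(len(rewards))):
        running = rewards[i] + gamma * running
        rtgs[i] = running
    return rtgs


if __name__ == "__main__":
    print(rewards_to_go([1.0, 0.0, 2.0]))
```

Iterating in reverse keeps the computation linear in the trajectory length, which matches the reversed(range(...)) loop quoted in the question.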
Real interview questions are missing
As the title says.
Could you please provide the api server.py for chunk convert?