LightZero

feature(rjy): add crowd md env new, and multi-head policy

Open • nighood opened this issue 8 months ago • 0 comments

  1. New Environment: CrowdSim

    • Description: The CrowdSim environment is a grid world simulation where robots navigate through an environment populated with humans. The primary task for the robots is to minimize the average age of information (AoI) of the humans by moving to their locations and collecting data. Key features of the environment include:
      • Dynamic Interaction: Humans generate data at a constant rate, and robots must manage their limited energy supply while moving to collect this data.
      • Modes:
        • Easy Mode: Robots can only collect data from humans within a certain range, and collecting data resets the AoI of a human to zero.
        • Hard Mode: Robots can collect data from humans even when not within range, and collecting data does not reset the total AoI.
      • Initialization: The environment starts with a dataset of human locations and timestamps. Robots aim to minimize the average AoI by efficiently collecting data.
      • Completion Criteria: The environment is considered solved when the average AoI falls to a given threshold or the time limit is reached.
      • Additional Features: The environment provides methods for resetting, closing, and stepping; seeding for reproducibility; saving replay videos; and generating random actions. Properties expose the observation space, action space, and reward space.
  2. Multi-Head Policy Version for MuZero, EfficientZero, and Sampled EfficientZero

    • Modification: Introduced multi-head policy versions for the MuZero, EfficientZero, and Sampled EfficientZero algorithms.
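
To make the easy-mode rule from the CrowdSim description concrete, here is a minimal sketch of how per-human AoI might evolve over one time step. It is an illustration, not the environment's actual code: the `Human` class, the `step_aoi` function, the `sense_range` parameter, and the use of Manhattan distance on the grid are all assumptions, and robot energy is not modeled.

```python
from dataclasses import dataclass


@dataclass
class Human:
    """Hypothetical container for one human on the grid."""
    x: int
    y: int
    aoi: int = 0  # time steps since this human's data was last collected


def step_aoi(humans, robots, sense_range):
    """Advance one time step under the easy-mode rule; return the mean AoI.

    humans: list of Human
    robots: list of (x, y) grid positions
    sense_range: assumed collection radius, measured in Manhattan distance
    """
    for h in humans:
        h.aoi += 1  # data ages by one step (constant generation rate)
        in_range = any(
            abs(h.x - rx) + abs(h.y - ry) <= sense_range for rx, ry in robots
        )
        if in_range:
            h.aoi = 0  # easy mode: collecting resets this human's AoI to zero
    return sum(h.aoi for h in humans) / len(humans)


humans = [Human(0, 0), Human(5, 5)]
print(step_aoi(humans, [(0, 1)], sense_range=1))  # robot next to the first human
print(step_aoi(humans, [(5, 5)], sense_range=1))  # robot moved to the second human
```

The returned mean is the quantity the robots try to keep low. A human that no robot visits accumulates AoI linearly, which is why the robots must keep moving between humans.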

nighood • Jun 07 '24 08:06