Anirban Santara

Results: 9 issues by Anirban Santara

Hi, could you please provide detailed documentation of how the code is organized and which algorithms are available? Also, a "getting started" tutorial on running `im_pipeline.py` with standard/default...

## Pending work: [Do not merge with master until the list is empty]
1. Add randomization features
2. Add training code for rllib
3. Add done handling for the case...

enhancement

The following architecture must be implemented:
1. The buffer should be organized as a queue of dictionaries. Each dictionary must hold the variables of one time step.
2. Each agent...

enhancement
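A minimal sketch of the buffer layout described in the issue above, assuming a FIFO queue with a fixed capacity; the field names inside each per-time-step dictionary are placeholders, not the project's actual variable names:

```python
from collections import deque

class TimeStepBuffer:
    """Buffer organized as a queue of dictionaries, one dictionary per time step."""

    def __init__(self, maxlen=10000):
        # deque gives FIFO queue semantics with an optional capacity bound
        self.queue = deque(maxlen=maxlen)

    def append(self, **step_vars):
        # Each entry holds the variables of exactly one time step,
        # e.g. append(obs=obs, action=a, reward=r, done=done)
        self.queue.append(dict(step_vars))

    def pop(self):
        # The oldest time step leaves the queue first
        return self.queue.popleft()
```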

# Utilities for running RL experiments using MADRaS
* Custom track widths
* Learning curve analysis
* Torque-RPM characterization of cars
* Evaluation of RL agents
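As a hypothetical illustration of the "learning curve analysis" utility listed above (the log file name and format are assumptions, not part of MADRaS), per-episode returns could be smoothed and plotted like this:

```python
import numpy as np
import matplotlib.pyplot as plt

# Assumed format: one scalar return per episode, one value per line
returns = np.loadtxt("episode_returns.csv")

window = 20
smoothed = np.convolve(returns, np.ones(window) / window, mode="valid")

plt.plot(returns, alpha=0.3, label="raw return")
plt.plot(np.arange(window - 1, len(returns)), smoothed, label=f"{window}-episode mean")
plt.xlabel("episode")
plt.ylabel("return")
plt.legend()
plt.show()
```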

On some tracks and for some cars, the initial value of `self.ob.damage` is `>0`. In these cases, the environment detects a collision right at the outset of an episode. To prevent...

bug
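One possible way to handle this, sketched below under the assumption that the fix baselines the damage reading at reset (the actual approach in the truncated issue text may differ):

```python
class DamageBaselineWrapper:
    """Hypothetical wrapper: only damage increases over the reset value count as collisions."""

    def __init__(self, env):
        self.env = env
        self._initial_damage = 0.0

    def reset(self):
        ob = self.env.reset()
        # On some track/car combinations ob.damage starts > 0
        self._initial_damage = ob.damage
        return ob

    def collided(self, ob):
        # A collision is flagged only when damage grows beyond the episode-start baseline
        return ob.damage > self._initial_damage
```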

The imports section of `MADRaS/traffic/example_usage.py` may need a fix: it seems `import traffic.const_vel as playGame_const_vel_s` should be `import traffic.const_vel_s as playGame_const_vel_s`.

bug
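Assuming the rest of the file is unchanged, the corrected import would then read:

```python
# MADRaS/traffic/example_usage.py -- proposed import fix:
# the module appears to be traffic.const_vel_s, not traffic.const_vel
import traffic.const_vel_s as playGame_const_vel_s
```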

Cleaned up the code and restructured it into an object-oriented design.
## Why?
Clutter in `agent2.py` made it almost impossible to debug.
## Test:
```
cd $RECOVERYPATH
python agent_trainer.py...
```

## Location:
https://github.com/Santara/safeRL/blob/c52382977616075971de68b56e031192e388ce6c/safe_recovery/agent_config.yml#L18-L19
## Issue:
Setting these options to `true` throws the following TensorFlow reuse error
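The specific error text is truncated above, but as a generic illustration (not the repository's code) of how a TensorFlow 1.x "reuse" error typically arises and how it is usually silenced:

```python
import tensorflow as tf  # TF 1.x style graph-building API assumed

def build_net(x):
    # tf.get_variable raises a ValueError ("Variable ... already exists")
    # if the same variable name is created twice in one scope without reuse
    w = tf.get_variable("w", shape=[4, 2])
    return tf.matmul(x, w)

x = tf.placeholder(tf.float32, [None, 4])
with tf.variable_scope("policy"):
    y1 = build_net(x)
with tf.variable_scope("policy", reuse=True):
    # reuse=True makes the second build share the existing variables
    # instead of raising the reuse error
    y2 = build_net(x)
```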

In the following lines, the reward is specified as a list with a single element:
https://github.com/hari-sikchi/safeRL/blob/b4f0443b109d5d3290771528115087eb5dd763ce/safe_recovery/agent2.py#L352-L354
For multiple parallel rollouts, this list should be filled asynchronously by different instances of...
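A minimal sketch of filling the reward list asynchronously from several parallel rollouts; `run_rollout` is a hypothetical stand-in for whatever produces one rollout's return, not a function from the repository:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
import random

def run_rollout(seed):
    # Hypothetical rollout: returns one episode's total reward
    rng = random.Random(seed)
    return sum(rng.random() for _ in range(100))

num_rollouts = 4
rewards = []  # shared list, one entry appended per finished rollout

with ThreadPoolExecutor(max_workers=num_rollouts) as pool:
    futures = [pool.submit(run_rollout, seed) for seed in range(num_rollouts)]
    for future in as_completed(futures):
        rewards.append(future.result())  # filled as each rollout completes

print(rewards)
```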