lagom
lagom: A PyTorch infrastructure for rapid prototyping of reinforcement learning algorithms.
e.g.

```python
from pathlib import Path

# only read through the raw file without importing any modules
def _get_version(dirpath):
    var_name = '__version__'
    file_path = Path(dirpath)/'__init__.py'
    with open(file_path, 'r') as fp:
        for line in fp:
            if line.startswith(var_name):
                # e.g. __version__ = '0.1.0'
                return line.split('=')[1].strip().strip('\'"')
    raise RuntimeError(f'{var_name} not found in {file_path}')
```
Reference paper: Deep Reinforcement Learning at the Edge of the Statistical Precipice. GitHub repo: https://github.com/google-research/rliable
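Among the paper's recommendations is aggregating scores across runs with robust statistics such as the interquartile mean (IQM) with bootstrap confidence intervals; rliable implements these. A self-contained NumPy sketch of the IQM alone (an illustration, not rliable's API):

```python
import numpy as np

def iqm(scores):
    """Interquartile mean: mean of the middle 50% of the sorted scores.

    More robust to outlier runs than the mean, less wasteful than the median.
    """
    scores = np.sort(np.asarray(scores, dtype=float).ravel())
    n = scores.size
    # drop the bottom 25% and top 25% of the sorted scores
    lo, hi = int(np.floor(n * 0.25)), int(np.ceil(n * 0.75))
    return scores[lo:hi].mean()
```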
Synchronous RL baselines: switch between training and evaluation by calling `agent.train()` and `agent.eval()` explicitly.
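A hypothetical sketch of that pattern, with the agent subclassing `nn.Module` so the explicit `train()`/`eval()` calls propagate to every submodule (the class and method names here are illustrative, not lagom's actual API):

```python
import torch
import torch.nn as nn

class Agent(nn.Module):
    """Illustrative agent: train()/eval() flips the mode of all submodules."""
    def __init__(self, obs_dim, act_dim):
        super().__init__()
        self.policy = nn.Sequential(
            nn.Linear(obs_dim, 32), nn.ReLU(), nn.Linear(32, act_dim))

    def choose_action(self, obs):
        with torch.no_grad():
            logits = self.policy(obs)
        if self.training:
            # stochastic action during training
            return torch.distributions.Categorical(logits=logits).sample()
        # greedy action during evaluation
        return logits.argmax(dim=-1)
```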
This will dramatically increase reusability and reduce boilerplate code, e.g.

```python
snapshot_nn(net, ['total_parameters', 'trainable_parameters'])
```

Networks:
- Total parameters: trainable parameters, un-trainable parameters
- Extract the number of such parameters...
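A minimal sketch of what `snapshot_nn` could look like, assuming it returns the requested counts as a dict (the key names mirror the call above; this is an illustration, not lagom's actual implementation):

```python
import torch.nn as nn

def snapshot_nn(net, keys):
    """Illustrative helper: report parameter counts for the requested keys."""
    trainable = sum(p.numel() for p in net.parameters() if p.requires_grad)
    frozen = sum(p.numel() for p in net.parameters() if not p.requires_grad)
    stats = {
        'total_parameters': trainable + frozen,
        'trainable_parameters': trainable,
        'untrainable_parameters': frozen,
    }
    return {k: stats[k] for k in keys}
```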
This will ease checkpointing: we don't have to serialize and load the entire environment within the agent. The `env_spec` object has sufficient information to build the agent, e.g. the...
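A minimal sketch of such an `env_spec`, with assumed field names (the actual fields depend on what the agent needs):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class EnvSpec:
    """Illustrative spec: just enough information to build an agent,
    so checkpoints store the spec instead of the environment itself."""
    obs_shape: tuple
    action_dim: int
    discrete: bool

# e.g. a CartPole-like environment
spec = EnvSpec(obs_shape=(4,), action_dim=2, discrete=True)
```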
Load the entire logging folder, including multiple configurations and multiple random runs. All post-processing is performed on the DataFrame, e.g. smoothing the episode returns. e.g. ID | lr |...
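A sketch of the smoothing step on such a DataFrame, assuming hypothetical columns `ID` (configuration), `seed`, and `return` (column names are illustrative):

```python
import pandas as pd

def smooth_returns(df, window=10):
    """Rolling-mean smoothing of episode returns, per configuration and seed."""
    df = df.sort_values(['ID', 'seed']).copy()
    df['smoothed_return'] = (
        df.groupby(['ID', 'seed'])['return']
          .transform(lambda s: s.rolling(window, min_periods=1).mean())
    )
    return df
```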
Inspired by Amazon SageMaker and Ray, try to refactor the classes for hyperparameters, e.g.
- Categorical
- Continuous
- Logarithmic: for small scales, e.g. learning rate
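A minimal sketch of what those three classes might look like; the `sample()` interface is an assumption for illustration:

```python
import math
import random

class Categorical:
    """Sample uniformly from a discrete set of values."""
    def __init__(self, values):
        self.values = list(values)
    def sample(self):
        return random.choice(self.values)

class Continuous:
    """Sample uniformly from the interval [low, high]."""
    def __init__(self, low, high):
        self.low, self.high = low, high
    def sample(self):
        return random.uniform(self.low, self.high)

class Logarithmic(Continuous):
    """Sample uniformly in log-space: suited to small scales such as
    learning rates, where 1e-4 and 1e-3 should be equally likely."""
    def sample(self):
        return math.exp(random.uniform(math.log(self.low), math.log(self.high)))
```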
- In the script folder: remove the `typing` package; it is no longer necessary.
- Make the online statistics `nn.Parameter`s registered inside the module, so they become trackable
- Similar in style to how BatchNorm is implemented in PyTorch
- Different behavior between train/eval...
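The BatchNorm-style bookkeeping above can be sketched as follows. Here the statistics are kept as registered buffers rather than `nn.Parameter`s, since they are updated online rather than learned, but they still land in the `state_dict` and follow the module across devices; the class name and update rule are illustrative:

```python
import torch
import torch.nn as nn

class RunningNorm(nn.Module):
    """Sketch: online mean/variance as registered buffers (saved in the
    state_dict like BatchNorm's running stats); updates only in train mode."""
    def __init__(self, num_features, eps=1e-8):
        super().__init__()
        self.eps = eps
        self.register_buffer('mean', torch.zeros(num_features))
        self.register_buffer('var', torch.ones(num_features))
        self.register_buffer('count', torch.zeros(()))

    def forward(self, x):
        if self.training:
            with torch.no_grad():
                batch_mean = x.mean(dim=0)
                batch_var = x.var(dim=0, unbiased=False)
                n = x.size(0)
                total = self.count + n
                delta = batch_mean - self.mean
                # Chan et al. parallel update of mean and variance
                self.mean += delta * n / total
                m_a = self.var * self.count
                m_b = batch_var * n
                self.var = (m_a + m_b + delta.pow(2) * self.count * n / total) / total
                self.count = total
        return (x - self.mean) / torch.sqrt(self.var + self.eps)
```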