Daniel Filan

13 issues by Daniel Filan

(apologies for the bad branch name) Includes: - A new class of CNN reward functions - A reward function wrapper to ensure inputs are always channel-first - Modifications to normalization...

### 🚀 Feature When the reward function is trained by the `CrossEntropyRewardTrainer` in `preference_comparisons.py`, it currently takes a batch of trajectory fragments and preferences, calculates its reward for each pair...

enhancement
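For readers unfamiliar with this kind of trainer, here is a minimal sketch of a cross-entropy loss over pairs of trajectory fragments. It assumes the common Bradley-Terry form, where the probability that fragment 1 is preferred is a sigmoid of the difference in summed rewards. The function name `preference_cross_entropy` and its array layout are hypothetical, this is not the code in `preference_comparisons.py`, and it does not try to reproduce the change the truncated issue goes on to propose.

```python
import numpy as np


def preference_cross_entropy(rews1, rews2, prefs):
    """Cross-entropy loss for pairwise preferences (illustrative sketch).

    rews1, rews2: arrays of shape (batch, fragment_len) holding the reward
        model's per-step rewards for the two fragments in each pair.
    prefs: array of shape (batch,), the probability that fragment 1 is
        preferred (1.0, 0.0, or something in between for ties).
    """
    # Assumption: a fragment's score is the plain sum of its per-step rewards.
    diff = rews1.sum(axis=1) - rews2.sum(axis=1)
    # log(sigmoid(x)) = -log(1 + exp(-x)), computed stably with logaddexp.
    log_p1 = -np.logaddexp(0.0, -diff)
    log_p2 = -np.logaddexp(0.0, diff)
    return float(np.mean(-(prefs * log_p1 + (1.0 - prefs) * log_p2)))


# Two pairs with identical rewards: the model is indifferent, so the loss is log(2).
loss_indifferent = preference_cross_entropy(
    np.zeros((2, 3)), np.zeros((2, 3)), np.array([1.0, 0.0])
)

# One pair where the preferred fragment scores much higher: the loss is near zero.
loss_confident = preference_cross_entropy(
    np.full((1, 3), 10.0), np.zeros((1, 3)), np.array([1.0])
)
```

In this sketch the loss is averaged over the whole batch of pairs at once.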

## Description Fixes #560. ## Testing Ran all tests. Results: 3543 passed, 595 skipped, 4234 warnings.

## Bug description Attempting to load SB3 models from Hugging Face in `serialize.py` often raises a `FileExistsError` that tells us "Outdated policy format: we do not support restoring normalization statistics from...

bug

## Description Add config options to train CNNs on image environments. This also involves letting configs set environment wrappers, so that environments like Atari can be appropriately wrapped. ## Testing...

## Bug description When installing dependencies in master, seemingly every version of `awscli` is being installed. See [here](https://imgur.com/a/x8JkQt1) for illustration. ## Steps to reproduce Run the following: ``` git clone...

bug

## Bug description The `save` function in `data/types.py` saves sequences of trajectories in numpy's `.npz` format. That said, some scripts save these with the name `final.pkl` or similar, making it...

bug
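The mismatch between file name and file format can be illustrated with a small sketch. It writes NumPy's `.npz` format to a file named `final.pkl` and then shows that the contents are still a zip archive that `np.load` can read. The helper names `save_npz_under_name` and `looks_like_npz` are hypothetical and are not the `save` function in `data/types.py`; how the scripts in question actually write their files is an assumption here.

```python
import os
import tempfile
import zipfile

import numpy as np


def save_npz_under_name(path, **arrays):
    """Write arrays in .npz format to `path`, whatever its extension."""
    # Passing an open file object (rather than a string path) stops NumPy
    # from appending ".npz" to the file name.
    with open(path, "wb") as f:
        np.savez(f, **arrays)


def looks_like_npz(path):
    """An .npz file is a zip archive, so check the contents, not the name."""
    return zipfile.is_zipfile(path)


tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "final.pkl")  # misleading name, as in the issue
save_npz_under_name(path, obs=np.arange(6).reshape(2, 3))

# np.load inspects the file's leading bytes, so it reads the archive
# even though the extension suggests a pickle.
with np.load(path) as data:
    loaded_obs = data["obs"]
```

Checking the contents rather than the extension is one way to detect such files.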

## Problem In `algorithms/dagger.py`, there is a function called `_save_dagger_demo` that saves trajectories in `.npz` format. This is redundant with the `save` function in `data/types.py` which also saves trajectories in...

enhancement

Loosens up requirements, allowing some newer versions of things. NB: I tried to follow the advice of running `cd test && pytest --gpu` but got an "unrecognized arguments" error: ```...

Currently, the code requires an old version of PyTorch: 1.8.1. Upgrading PyTorch to 1.12.1 doesn't break any tests, so the requirement should probably be loosened.