seals
Benchmark environments for reward modelling and imitation learning algorithms.
When trying to load the swimmer environment, I get a `ValueError: XML Error: Schema violation: unrecognized attribute: 'collision'` error when mujoco 3.1.1 is installed. Downgrading to `mujoco==2.3.7` fixes...
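A minimal sketch of a guard that reports which situation from this issue the installed `mujoco` falls into. The helper names `parse_version` and `mujoco_status` are hypothetical, and the only version facts used are the two named in the report; every other version is treated as untested rather than guessed at.

```python
import re
from importlib import metadata

# The only two data points given in the report above.
REPORTED_WORKING = (2, 3, 7)
REPORTED_FAILING = (3, 1, 1)


def parse_version(text):
    """Return the first three numeric components of a version string."""
    return tuple(int(part) for part in re.findall(r"\d+", text)[:3])


def mujoco_status(installed=None):
    """Classify a mujoco version string against the report (hypothetical helper).

    When no version is passed, look up the installed distribution.
    """
    if installed is None:
        try:
            installed = metadata.version("mujoco")
        except metadata.PackageNotFoundError:
            return "not installed"
    version = parse_version(installed)
    if version == REPORTED_WORKING:
        return "reported working"
    if version == REPORTED_FAILING:
        return "reported failing"
    # Anything else is unknown from the report alone.
    return "untested"
```

Calling `mujoco_status()` with no argument checks the current environment; passing a string makes the helper easy to test without MuJoCo installed.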
I found that the README was still referencing `gym` instead of `gymnasium`.
`gymnasium` has an [AutoResetWrapper](https://gymnasium.farama.org/api/wrappers/misc_wrappers/#gymnasium.wrappers.AutoResetWrapper) that behaves similarly to ours, but it does not hide the termination conditions. We should use the upstream AutoResetWrapper and just provide one that does nothing...
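To illustrate the difference being discussed, here is a rough sketch of an auto-reset wrapper that hides termination. It is not the seals or gymnasium implementation: `CountdownEnv` and `HidingAutoReset` are hypothetical names, and the toy environment only imitates the five-tuple `step` return used by gymnasium so that the example runs without any dependencies.

```python
class CountdownEnv:
    """Toy environment (hypothetical) that terminates after `horizon` steps."""

    def __init__(self, horizon=3):
        self.horizon = horizon
        self.t = 0

    def reset(self, seed=None, options=None):
        self.t = 0
        return self.t, {}

    def step(self, action):
        self.t += 1
        terminated = self.t >= self.horizon
        # (obs, reward, terminated, truncated, info)
        return self.t, 1.0, terminated, False, {}


class HidingAutoReset:
    """Resets the wrapped env at episode end and never reports termination."""

    def __init__(self, env):
        self.env = env

    def reset(self, **kwargs):
        return self.env.reset(**kwargs)

    def step(self, action):
        obs, reward, terminated, truncated, info = self.env.step(action)
        if terminated or truncated:
            # Keep the final observation available, then start a new episode.
            info = dict(info, terminal_observation=obs)
            obs, _ = self.env.reset()
        # The caller always sees an ongoing episode.
        return obs, reward, False, False, info
```

A wrapper that forwards `terminated` and `truncated` unchanged would differ only in the final `return`, which is the distinction the issue draws between the upstream wrapper and ours.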
In `base_envs.py`, the class `TabularModelPOMDP` has a method `obs_dtype` that returns the data type of observation vectors - specifically, it returns `self.observation_matrix.dtype`. However, it is typed as returning an int....
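A sketch of what the corrected annotation might look like. `TabularModelSketch` is a hypothetical stand-in rather than the real `TabularModelPOMDP`, and exposing `obs_dtype` as a property is an assumption; the point is only that the return annotation matches what `observation_matrix.dtype` actually yields.

```python
import numpy as np


class TabularModelSketch:
    """Hypothetical stand-in holding only the observation matrix."""

    def __init__(self, observation_matrix: np.ndarray) -> None:
        self.observation_matrix = observation_matrix

    @property
    def obs_dtype(self) -> np.dtype:
        """Data type of observation vectors (a NumPy dtype, not an int)."""
        return self.observation_matrix.dtype
```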
Old tests:
```python
def test_model_envs(env):
    """Smoke test for each of the ModelBasedEnv methods with type checks.

    Args:
        env: The environment to test.

    Raises:
        AssertionError if test fails.
    """
    state =...
```
We should have a base wrapper class that inherits from `ResettablePOMDP` so that subclasses of this environment can be wrapped type-safely.
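One way such a wrapper could be shaped, as a sketch only. `ResettablePOMDPStandIn` is a simplified, hypothetical substitute for the real `ResettablePOMDP` (its three methods are assumptions chosen for illustration), and `POMDPWrapperSketch`, `Counter`, and `PrefixedObs` are likewise invented names. The idea shown is that the wrapper subclasses the same generic base and delegates to the wrapped environment, so a type checker treats a wrapped environment as a POMDP with the same state, observation, and action types.

```python
import abc
from typing import Generic, TypeVar

StateT = TypeVar("StateT")
ObsT = TypeVar("ObsT")
ActT = TypeVar("ActT")


class ResettablePOMDPStandIn(abc.ABC, Generic[StateT, ObsT, ActT]):
    """Simplified, hypothetical stand-in for the real base class."""

    @abc.abstractmethod
    def initial_state(self) -> StateT: ...

    @abc.abstractmethod
    def transition(self, state: StateT, action: ActT) -> StateT: ...

    @abc.abstractmethod
    def obs_from_state(self, state: StateT) -> ObsT: ...


class POMDPWrapperSketch(ResettablePOMDPStandIn[StateT, ObsT, ActT]):
    """Base wrapper: is itself a POMDP and delegates every method."""

    def __init__(self, env: ResettablePOMDPStandIn[StateT, ObsT, ActT]) -> None:
        self.env = env

    def initial_state(self) -> StateT:
        return self.env.initial_state()

    def transition(self, state: StateT, action: ActT) -> StateT:
        return self.env.transition(state, action)

    def obs_from_state(self, state: StateT) -> ObsT:
        return self.env.obs_from_state(state)


class Counter(ResettablePOMDPStandIn[int, str, int]):
    """Toy environment: the state is an integer, observations are strings."""

    def initial_state(self) -> int:
        return 0

    def transition(self, state: int, action: int) -> int:
        return state + action

    def obs_from_state(self, state: int) -> str:
        return f"s{state}"


class PrefixedObs(POMDPWrapperSketch[int, str, int]):
    """Example wrapper that overrides only the observation function."""

    def obs_from_state(self, state: int) -> str:
        return "-" + self.env.obs_from_state(state)
```

Because `PrefixedObs` is itself a `ResettablePOMDPStandIn[int, str, int]`, wrappers of this kind can be stacked or passed anywhere the base type is expected.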
```python
from imitation.algorithms.adversarial.airl import AIRL
from imitation.rewards.reward_nets import BasicShapedRewardNet
from imitation.util.networks import RunningNorm
from stable_baselines3 import PPO
from stable_baselines3.common.evaluation import evaluate_policy
from stable_baselines3.common.vec_env import DummyVecEnv, SubprocVecEnv
import gym
import seals...
```