Erik Jenner comments

Results: 2 comments by Erik Jenner

Randomness control for different `exploration_frac` in preference comparisons

Thanks for spotting and describing this. I'm not sure why I set `deterministic_policy=True` in the exploration wrapper, so it should be fine to make that `False`; at least I agree it...

Load expert models for testing from huggingface hub

> In the past, I've seen policies and reward networks sometimes have the number of envs baked into their expected observation/action shape, although I thought that was no longer...