es_pytorch
High-performance implementation of deep neuroevolution in PyTorch using mpi4py. Intended for use on HPC clusters.
- When an individual becomes stuck, reset its params to those of the best performer seen so far
- Try both types of recombination as a means to _reset_ the params and explore a different area
- Self-adaptive theta, as seen in *Evolution strategies: A comprehensive introduction*
- Use multiple different theta values for each noise index. This allows one to search along the trajectory...
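
The first idea in the list above (resetting a stuck individual to the best performer seen so far) could be sketched roughly as follows. This is a minimal illustration, not code from this repo: the `StagnationReset` class, its `patience` threshold, and the use of plain lists for params are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class StagnationReset:
    """Hypothetical helper: remembers the best params and resets on stagnation."""

    patience: int  # assumed knob: generations without improvement before a reset
    best_reward: float = float("-inf")
    best_params: List[float] = field(default_factory=list)
    stale: int = 0  # consecutive generations without a new best reward

    def step(self, params: List[float], reward: float) -> List[float]:
        """Return the params to continue from after observing `reward`."""
        if reward > self.best_reward:
            # New best: remember a copy and clear the stagnation counter.
            self.best_reward = reward
            self.best_params = list(params)
            self.stale = 0
            return params
        self.stale += 1
        if self.stale >= self.patience:
            # Stuck for `patience` generations: jump back to the best params.
            self.stale = 0
            return list(self.best_params)
        return params
```

In a real run the params would be a flat tensor and the reset would presumably be broadcast to all MPI ranks; that part is left out here.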
From testing, novelty search seems to behave as programmed, but in practice it performs poorly. Options:
- [x] Compare to OpenAI's method of calculating the novelty metric...
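
For reference when comparing novelty metrics, a common formulation scores a behaviour by its mean distance to the k nearest neighbours in an archive of past behaviours. The sketch below assumes that formulation, Euclidean distance, and a hypothetical `novelty` function name; it is not taken from this repo or from OpenAI's code.

```python
import math
from typing import List, Sequence


def novelty(behaviour: Sequence[float],
            archive: List[Sequence[float]],
            k: int = 10) -> float:
    """Mean Euclidean distance from `behaviour` to its k nearest archive entries."""
    if not archive:
        # Assumption: an empty archive gives 0.0; other conventions are possible.
        return 0.0
    dists = sorted(math.dist(behaviour, other) for other in archive)
    nearest = dists[:k]  # fewer than k entries if the archive is small
    return sum(nearest) / len(nearest)
```

For example, `novelty([0, 0], [[3, 4], [0, 1], [6, 8]], k=2)` averages the two smallest distances (1 and 5). Differences in `k`, in the distance function, or in how the archive is filled can change results noticeably, so those are the things to check when comparing implementations.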
Move params in the direction of the best-ever reward. This needs to be done with a population.
* Use a temperature param that decreases over time to control the size of the noise, similar to the decaying epsilon in epsilon-greedy exploration
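
One way the temperature idea above might look is an exponentially decaying multiplier on the noise standard deviation, with a floor so exploration never stops entirely. The function names, the exponential schedule, and the default values are all assumptions for illustration; a linear or step schedule would fit the same description.

```python
import random
from typing import List


def temperature(gen: int, start: float = 1.0, end: float = 0.05,
                decay: float = 0.99) -> float:
    """Assumed schedule: exponential decay from `start`, floored at `end`."""
    return max(end, start * decay ** gen)


def perturb(params: List[float], base_std: float, gen: int,
            rng: random.Random) -> List[float]:
    """Add Gaussian noise whose std shrinks as the generation count grows."""
    scale = base_std * temperature(gen)
    return [p + rng.gauss(0.0, scale) for p in params]
```

Usage would be along the lines of `perturb(params, base_std=0.02, gen=g, rng=random.Random(seed))`, called once per individual each generation.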