rustneat
Some algorithm considerations
I will keep updating it. Some of these are quite subtle, but I think we should address everything.
- [x] 1. Should `mutate_add_connection()` be allowed to add `i -> i` connections? (Right now I would say yes. These can have an effect on the NN.) Edit: I will allow it for now.
- [x] 2. Adding a connection that already exists: should it keep the old weight or the new one? Edit: I will keep the new weight for now.
- [ ] 3. Creating offspring: there is a 25% chance to just mutate the parent and a 75% chance to mate two organisms, but in the latter case no mutations happen. Is this inspired by the literature? Another idea is to always mate two organisms, followed by mutation (say, with 25% probability).
- [x] 4. Interspecies mating: is this supported by the literature? (It doesn't have to be, just wondering about the justification.)
- [ ] 5. `Specie::generate_offspring()` currently just picks N organisms randomly, but the NEAT paper seems to say that we should pick the N best-performing organisms. Also, the champion organism within the species is currently added (if species size > 5). Why is that?
- [ ] 6. It doesn't seem like we use shared fitness (see "explicit fitness sharing" in the NEAT paper).
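For items 1 and 2, the decisions recorded above (allow `i -> i`, keep the new weight) can be illustrated with a minimal sketch. `SketchGenome` and its `HashMap` storage are hypothetical and only serve the illustration; they are not rustneat's actual `Genome` type.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified genome: connection weights keyed by (source, target).
/// This is an illustration only, not rustneat's real data structure.
#[derive(Default)]
struct SketchGenome {
    connections: HashMap<(usize, usize), f64>,
}

impl SketchGenome {
    /// Add a connection. Self-connections (source == target) are accepted
    /// (item 1), and an existing connection is overwritten with the new
    /// weight (item 2). Returns the previous weight, if there was one.
    fn add_connection(&mut self, source: usize, target: usize, weight: f64) -> Option<f64> {
        self.connections.insert((source, target), weight)
    }
}

fn main() {
    let mut genome = SketchGenome::default();
    genome.add_connection(0, 0, 0.5); // self-connection i -> i
    genome.add_connection(0, 1, 1.0);
    let old = genome.add_connection(0, 1, -2.0); // duplicate: the new weight wins
    println!("replaced weight: {:?}, connections: {}", old, genome.connections.len());
}
```

Returning the old weight makes it easy to switch to "keep the old weight" later if experiments favour that choice.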
1. Yes. I don't know exactly how they affect the NN.
2. I don't believe the paper addresses this. I think it doesn't matter much; at most it is an opportunity for some optimization.
3. This is part of the configuration. I don't remember where these values come from. You should be able to change them to improve performance.
4. Yes, it's included in the original paper. I think it's a very important part of NEAT; as far as I understand, it's used to avoid getting stuck in local maxima.
5. Makes sense. If you found this in the paper, we should change it. The champion is added later, probably because we saw it in some implementations.
6. I'm not sure. I think it's implicit in the algorithm.
Thanks for your comments!
Regarding shared fitness, I can't see how it is implicit in the algorithm.
In my understanding, the shared fitness of an organism is its fitness divided by the number of organisms in its species. So I wrote this in `Specie`:

    /// Get the average shared fitness of the organisms in the species.
    /// This is the same as the average of the real fitness of the members,
    /// divided by the number of members.
    pub fn average_shared_fitness(&self) -> f64 {
        let n_organisms = self.organisms.len().value_as::<f64>().unwrap();
        if n_organisms == 0.0 {
            return 0.0;
        }
        let avg_fitness = self.organisms.iter().map(|o| o.fitness)
            .sum::<f64>() / n_organisms;
        avg_fitness / n_organisms
    }
Edit: Looking at this C# implementation, they just take the mean fitness as well, not the shared fitness. I wonder, is the "explicit shared fitness" really implicit?
Edit 2: I just observed that with `average_shared_fitness` as above, we start with one species of size, say, 150. In the next generation, one organism may have evolved into a new species, so the species sizes are 1 and 149. In the generation after that they are 149 and 1, and it keeps alternating like that, which obviously punishes big species far too much.
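One possible reading of why sharing looks "implicit": if each organism's adjusted fitness is its raw fitness divided by the species size, then the sum of adjusted fitnesses over a species is already the species' mean raw fitness. Dividing that mean by the size once more applies the penalty twice. The sketch below assumes offspring are allotted in proportion to a per-species score; all function names and the equal-fitness population are hypothetical.

```rust
/// Sum of adjusted fitnesses: each raw fitness divided by the species size.
/// Algebraically this equals the mean raw fitness of the species.
fn adjusted_fitness_sum(fitnesses: &[f64]) -> f64 {
    let n = fitnesses.len() as f64;
    fitnesses.iter().map(|f| f / n).sum()
}

/// Same arithmetic as `average_shared_fitness` above: the mean, divided by
/// the species size a second time.
fn double_divided(fitnesses: &[f64]) -> f64 {
    if fitnesses.is_empty() {
        return 0.0;
    }
    adjusted_fitness_sum(fitnesses) / fitnesses.len() as f64
}

/// Split `population` offspring across species in proportion to `scores`.
fn offspring_quotas(scores: &[f64], population: f64) -> Vec<f64> {
    let total: f64 = scores.iter().sum();
    if total == 0.0 {
        return vec![0.0; scores.len()];
    }
    scores.iter().map(|s| population * s / total).collect()
}

fn main() {
    // Hypothetical population: every organism has raw fitness 1.0.
    let big = vec![1.0; 149];
    let small = vec![1.0; 1];

    let summed = [adjusted_fitness_sum(&big), adjusted_fitness_sum(&small)];
    let divided_twice = [double_divided(&big), double_divided(&small)];

    println!("sum of adjusted fitness: {:?}", offspring_quotas(&summed, 150.0));
    println!("divided twice:           {:?}", offspring_quotas(&divided_twice, 150.0));
}
```

Under these assumptions the double division hands the 149-member species roughly 1 offspring and the single-member species roughly 149, which matches the alternating 1/149 and 149/1 sizes described above. Using the plain sum gives both species about the same quota when their mean fitness is equal.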
Regarding question 3: What do you think about this "Another idea is to always mate pairs of organisms, followed by mutation (say, by 25% chance)"? Maybe we can try both and see which performs best.
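To make the two options in question 3 concrete, here is a minimal sketch that only models the decision, not the genetics. The `Plan` enum and both functions are hypothetical; `roll` stands for a uniform random sample in [0, 1) that the caller would draw, so the logic stays deterministic and easy to test.

```rust
/// How a single offspring is produced.
#[derive(Debug, PartialEq)]
enum Plan {
    /// Clone one parent and mutate it.
    MutateOnly,
    /// Cross two parents, optionally mutating the child afterwards.
    Mate { mutate: bool },
}

/// Current behaviour as described in question 3: either mutate a single
/// parent, or mate two parents without any mutation.
fn current_plan(roll: f64, mutate_only_prob: f64) -> Plan {
    if roll < mutate_only_prob {
        Plan::MutateOnly
    } else {
        Plan::Mate { mutate: false }
    }
}

/// Proposed alternative: always mate, then mutate with `mutation_prob`.
fn alternative_plan(roll: f64, mutation_prob: f64) -> Plan {
    Plan::Mate { mutate: roll < mutation_prob }
}

fn main() {
    for roll in [0.1, 0.6].iter().copied() {
        println!(
            "roll {}: current {:?}, alternative {:?}",
            roll,
            current_plan(roll, 0.25),
            alternative_plan(roll, 0.25)
        );
    }
}
```

Keeping both behind the same signature would make it straightforward to benchmark them against each other on the same task.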
I found a great resource here http://www.automatonsadrift.com/neat/ that answers question 5.
> If no members of a species rise above their existing champion in fitness for a set number of generations, the entire species is terminated, unless its champion is the population champion.
>
> The lowest-performing fraction of each species does not reproduce, and the highest performer from each species carries over to the next generation via per-species elitism. Any remaining reproduction spots are filled through random selection.
I was wrong when I said that the paper says that we should take the N best-performing organisms -- it only says that we should cull the worst-performing ones.
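The selection rules quoted above could be sketched roughly as follows. `survival_fraction`, `limit`, and both function names are hypothetical parameters chosen for illustration; filling the remaining spots by random selection from the pool is left out because it needs a random number generator.

```rust
use std::cmp::Ordering;

/// Indices of the organisms allowed to reproduce, best first, after culling
/// the lowest-performing part of the species. The first index is the species
/// champion, which would be carried over unchanged (per-species elitism).
fn breeding_pool(fitnesses: &[f64], survival_fraction: f64) -> Vec<usize> {
    let mut order: Vec<usize> = (0..fitnesses.len()).collect();
    // Sort indices by descending fitness; NaN compares as equal to avoid a panic.
    order.sort_by(|&a, &b| {
        fitnesses[b]
            .partial_cmp(&fitnesses[a])
            .unwrap_or(Ordering::Equal)
    });
    // Keep at least one organism, and never more than the species holds.
    let keep = ((fitnesses.len() as f64 * survival_fraction).ceil() as usize)
        .max(1)
        .min(fitnesses.len());
    order.truncate(keep);
    order
}

/// Stagnation rule: a species that has not improved for `limit` generations
/// is removed, unless it holds the population champion.
fn should_terminate(
    generations_without_improvement: u32,
    limit: u32,
    has_population_champion: bool,
) -> bool {
    generations_without_improvement >= limit && !has_population_champion
}

fn main() {
    let fitnesses = [0.2, 0.9, 0.5, 0.1];
    let pool = breeding_pool(&fitnesses, 0.5);
    println!("breeding pool (best first): {:?}", pool);
    println!("terminate stagnant species: {}", should_terminate(15, 15, false));
}
```

Note that this only culls the worst performers; the survivors are all eligible parents, rather than only the N best being picked.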
Feel free to test mating pairs followed by mutation. Let's see what happens.