
learning.csv doesn't contain average loss per epoch

Open · redsphinx opened this issue 8 years ago · 0 comments

I don't know whether this is a real issue, and I don't know of a more appropriate place to ask, but it caused some confusion for me. I assumed the file contained the loss per epoch, since it isn't stated explicitly what it is. I'm also still not sure whether my reading below is correct, and I'd like to find out how to plot the mean_loss per epoch.

The header says the first column contains mean_loss. If this were per epoch, the file should contain 100 values when run with the default parameters. However, it contains many more. The number of values seems to depend on the number of episodes over the whole training run: the more episodes, the more mean_loss values.
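For reference, this is how I inspected the file (a minimal sketch; it assumes learning.csv has a header row with mean_loss as the first column, and uses pandas purely for inspection, which is not part of deep_q_rl itself):

```python
import pandas as pd

# Load learning.csv and count the rows: with the default parameters I see far
# more than the 100 rows I would expect if one row were written per epoch.
learning = pd.read_csv("learning.csv")
print(len(learning))        # number of mean_loss values
print(learning.columns[0])  # first column header, which reads mean_loss
```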

So I followed the loss all the way down the rabbit hole. With the default parameters, each epoch is at most 50000 steps. Within an epoch, the agent runs as many episodes as it can, and each episode runs for as many steps as possible. At the start of a new episode, the loss buffer is reset with self.loss_averages = []. Then self.loss_averages.append(loss) is called every time step inside def step() in ale_agent.py. This continues until the 50000-step budget is used up or the agent dies; at that point, def end_episode() takes the mean of self.loss_averages and appends it to learning.csv via self._update_learning_file(). If steps remain in the epoch, a new episode starts and self.loss_averages is reset again. So learning.csv should contain the mean_loss per episode. However, when I sum the number of episodes per epoch over all epochs, that total does not match the number of loss values, so it can't be the mean_loss per episode either.
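To make the bookkeeping I traced above concrete, here is a minimal sketch of how I understand the flow (paraphrased, not the actual ale_agent.py code; only loss_averages, step(), end_episode(), and _update_learning_file() are names from the project, the rest is my simplification):

```python
class AgentSketch:
    """Simplified view of the per-episode loss bookkeeping I describe above."""

    def __init__(self, learning_file):
        self.learning_file = learning_file
        self.loss_averages = []

    def start_episode(self):
        # The loss buffer is reset at the start of every episode.
        self.loss_averages = []

    def step(self, loss):
        # Each time step, the current training loss is appended to the buffer.
        self.loss_averages.append(loss)

    def end_episode(self):
        # When the episode ends (the agent dies or the epoch's step budget
        # runs out), the mean of the buffered losses is written as one row
        # of learning.csv, i.e. what _update_learning_file() does.
        if self.loss_averages:
            mean_loss = sum(self.loss_averages) / len(self.loss_averages)
            self.learning_file.write("{}\n".format(mean_loss))
```

If this reading were right, the row count of learning.csv should equal the total number of episodes, which is exactly the mismatch I see.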

So what exactly is mean_loss in learning.csv, and how can I plot the mean_loss per epoch?

redsphinx · Jul 10 '16 23:07