Danijar Hafner

165 comments by Danijar Hafner

@miyosuda That video seems like the agent memorized the environment. I think the paper authors use random starts to create a fairer evaluation. They sample a number from 0 to...

I see, so the agent just learned a good behavior that results in very repetitive episodes.

Doesn't subtracting the mean from the advantages have the effect of an entropy regularizer? Ignoring the clipping, the objective is `logp * (adv - mean) / std = logp *...
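The algebra behind this question can be checked numerically. The sketch below is only an illustration with made-up numbers, not code from the thread: it ignores clipping, assumes the mean and standard deviation are batch statistics of the advantages, and uses hypothetical helper names (`normalized_terms`, `decomposed_terms`). It shows that `logp * (adv - mean) / std` splits into a scaled-advantage term plus a term `-(mean / std) * logp`. When the batch mean is positive, that second term rewards low log-probabilities of sampled actions, which is how an entropy bonus behaves; when the mean is negative, the sign flips.

```python
import math
import statistics


def normalized_terms(logps, advs):
    """Per-sample objective with mean/std-normalized advantages (no clipping)."""
    mean = statistics.fmean(advs)
    std = statistics.pstdev(advs)
    return [lp * (a - mean) / std for lp, a in zip(logps, advs)]


def decomposed_terms(logps, advs):
    """Same objective, split into a scaled-advantage part and a -logp part."""
    mean = statistics.fmean(advs)
    std = statistics.pstdev(advs)
    scaled = [lp * a / std for lp, a in zip(logps, advs)]
    # This part is proportional to -logp, the quantity an entropy bonus averages.
    entropy_like = [-(mean / std) * lp for lp in logps]
    return scaled, entropy_like


# Made-up batch: action probabilities and advantages with a positive mean.
logps = [math.log(p) for p in (0.5, 0.2, 0.9, 0.1)]
advs = [2.0, -1.0, 3.5, 0.5]

lhs = normalized_terms(logps, advs)
scaled, entropy_like = decomposed_terms(logps, advs)
for total, s, e in zip(lhs, scaled, entropy_like):
    print(f"{total:+.4f} = {s:+.4f} + {e:+.4f}")
```

Whether this term matters in practice depends on how large `mean / std` is in a given batch, which the sketch does not address.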

What's the reason for dumping the buffer to a text file before searching it? I'd be happy to help make this faster. The plugin is a great idea but right...

Would it be possible to bind the keys (maybe different ones to avoid conflicts) once during startup rather than every time copy mode is entered?

I see how it makes sense for code cells that Ctrl+Return runs the cell and stays in Vim command mode. But for markdown cells, the command doesn't do anything right...

Still occurs sometimes: `(sqlite3.OperationalError) database is locked`
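For readers unfamiliar with this error, here is a minimal sketch of one common way it arises, using only Python's standard `sqlite3` module. It is an assumption that the report involves competing writers; the actual cause in the thread is not shown here. The file name, table, and helper names (`try_write`, `demo`) are hypothetical. One connection holds a write lock while a second connection tries to write with `timeout=0`, so the second fails immediately instead of waiting.

```python
import os
import sqlite3
import tempfile


def try_write(path, timeout):
    """Attempt one insert; return 'ok' or the error message."""
    conn = sqlite3.connect(path, timeout=timeout)
    try:
        conn.execute("INSERT INTO t VALUES (1)")
        conn.commit()
        return "ok"
    except sqlite3.OperationalError as error:
        return str(error)
    finally:
        conn.close()


def demo():
    path = os.path.join(tempfile.mkdtemp(), "demo.db")
    # isolation_level=None lets us issue BEGIN/COMMIT explicitly.
    holder = sqlite3.connect(path, isolation_level=None)
    holder.execute("CREATE TABLE t (x INTEGER)")
    holder.execute("BEGIN IMMEDIATE")  # take the write lock and keep it
    holder.execute("INSERT INTO t VALUES (0)")
    blocked = try_write(path, timeout=0)  # fails: lock is held elsewhere
    holder.execute("COMMIT")  # release the lock
    after = try_write(path, timeout=0)  # succeeds once the lock is gone
    holder.close()
    return blocked, after


blocked, after = demo()
print(blocked)
print(after)
```

A larger `timeout` makes the second writer wait for the lock rather than fail at once, which can reduce how often the error appears but does not rule it out if a writer holds the lock longer than the timeout.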

Please email me with requests for the flights dataset. I'm not sure if we're allowed to publish the pre-processed dataset.

Thanks for the detailed comment! I agree that averaging is nicer than taking the maximum, but at this point it's more important to be compatible with the vast existing literature...

Hi @JesseFarebro, do you have an idea for a workaround here? It's the only issue holding me back from switching to `ale-py` and the new V5 envs. To implement max...