Marco Pleines

Results 54 comments of Marco Pleines

As of now, I'm struggling with the issue that the computed action values grow exponentially towards positive or negative infinity.

Adjusting the learning rate only slightly delays this outcome. Nevertheless, I expect the output values to be less than 2, simply because the maximum reward for the...
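For context on why the outputs should stay small, here is a hedged sketch (the concrete numbers `r_max=0.2` and `gamma=0.9` are illustrative, not taken from the original discussion): with per-step rewards bounded by `r_max` and discount factor `gamma`, the optimal Q-value is bounded by the geometric series `r_max / (1 - gamma)`. Values running off toward infinity therefore indicate a bug rather than a tuning problem.

```python
# Illustrative upper bound on Q-values for bounded rewards.
# With reward <= r_max per step and discount gamma < 1, the optimal
# action value satisfies Q* <= r_max / (1 - gamma) (geometric series).
# Divergence past this bound points to a bug, not to the learning rate.

def q_upper_bound(r_max: float, gamma: float) -> float:
    """Geometric-series bound on the optimal action value."""
    assert 0.0 <= gamma < 1.0, "bound only holds for gamma in [0, 1)"
    return r_max / (1.0 - gamma)

# Example numbers (hypothetical):
print(q_upper_bound(r_max=0.2, gamma=0.9))  # approx. 2.0
```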

I'm still trying to figure out the issue. Maybe I'm misusing the volume class, or I might not have enough experience with the actual implementation of neural nets (like understanding...

And this is a flow chart of the implementation of the Q Learning part ![brainforwardbackward](https://user-images.githubusercontent.com/6996955/28888368-a4db130a-77c0-11e7-8129-c5bd0b893de9.png)
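For reference, the backward pass in a flow chart like this typically propagates the standard Bellman (TD) target; the function below is a generic sketch of that target, not the exact implementation from the chart.

```python
# Hedged sketch of the Q-learning target usually used in the backward pass:
# target = r + gamma * max_a' Q(s', a'), with no bootstrap at terminal states.

def td_target(reward: float, gamma: float,
              next_q_values: list, done: bool) -> float:
    """Bellman target for one transition."""
    if done:
        # Terminal state: no future value to bootstrap from.
        return reward
    return reward + gamma * max(next_q_values)

print(td_target(1.0, 0.9, [0.5, 2.0, -1.0], done=False))  # 2.8
```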

If used as shown below, the environment cannot establish a connection to Python. The socket somehow fails. This is tested on two Windows machines. Surprisingly, if set to one environment,...

I just looked into updating Obstacle Tower. Updating ml-agents is not a problem. However, updating Unity messes up the lighting. I don't really have time to debug this and I...

I guess I'll implement a video recorder to observe the agent's performance during inference. I'm using a stochastic policy.
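A minimal sketch of such a recorder (the interface is assumed, since the comment does not specify one): frames are buffered per episode and could then be flushed to a video file with, e.g., OpenCV's `cv2.VideoWriter`.

```python
# Hypothetical frame recorder for observing a stochastic policy at inference.
# Frames are buffered in memory; encoding to an actual video file is left to
# an external library (e.g. OpenCV's cv2.VideoWriter).

class VideoRecorder:
    def __init__(self) -> None:
        self.frames: list = []

    def capture(self, frame) -> None:
        """Store one rendered frame (e.g. an RGB array from env.render())."""
        self.frames.append(frame)

    def reset(self) -> None:
        """Drop buffered frames, e.g. at an episode boundary."""
        self.frames.clear()

# Usage with a placeholder rendering loop:
recorder = VideoRecorder()
for step in range(3):
    recorder.capture(f"frame-{step}")  # stand-in for env.render() output
print(len(recorder.frames))  # 3
```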

Hi, your Obstacle Tower build is most likely outdated:

- Linux (x86_64): https://storage.googleapis.com/obstacle-tower-build/v3.1/obstacletower_v3.1_linux.zip
- Mac OS X: https://storage.googleapis.com/obstacle-tower-build/v3.1/obstacletower_v3.1_osx.zip
- Windows: https://storage.googleapis.com/obstacle-tower-build/v3.1/obstacletower_v3.1_windows.zip

Hi @X-DDDDD, the most reliable way to test your custom algorithm is to use only one agent inside one build instance. If you need multiple agents/environments, you have to...
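The usual workaround when a build supports only one agent per instance is to launch one environment per worker process. The sketch below illustrates that pattern with a `DummyEnv` placeholder standing in for the actual Unity/ML-Agents environment; names and structure are assumptions, not the author's setup.

```python
# Hedged sketch: one environment instance per worker process.
# DummyEnv is a placeholder; a real setup would launch one build here.

import multiprocessing as mp

class DummyEnv:
    """Placeholder for an environment that allows only one agent."""
    def __init__(self, worker_id: int) -> None:
        self.worker_id = worker_id

    def step(self) -> int:
        # Stand-in for env.step(action); just echoes the worker id.
        return self.worker_id

def run_worker(worker_id: int, queue: mp.Queue) -> None:
    env = DummyEnv(worker_id)  # each process owns exactly one environment
    queue.put(env.step())

if __name__ == "__main__":
    queue = mp.Queue()
    workers = [mp.Process(target=run_worker, args=(i, queue))
               for i in range(4)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    print(sorted(queue.get() for _ in range(4)))  # [0, 1, 2, 3]
```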