qlearning4k

Q-learning for Keras

Results: 10 qlearning4k issues

Hi, I have two problems, in `test_catch.py` and `test_snake.py`.
- Python version: Anaconda3/python3.5
- Platform: macOS
- Backend: TensorFlow

In `test_catch.py`: Traceback (most recent call...

Using TensorFlow backend.
Traceback (most recent call last):
  File "C:\Users\Me\Anaconda3\envs\tf_rl\lib\site-packages\tensorflow\python\framework\common_shapes.py", line 654, in _call_cpp_shape_fn_impl
    input_tensors_as_shapes, status)
  File "C:\Users\Me\Anaconda3\envs\tf_rl\lib\contextlib.py", line 66, in __exit__
    next(self.gen)
  File "C:\Users\Me\Anaconda3\envs\tf_rl\lib\site-packages\tensorflow\python\framework\errors_impl.py", line 466, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))...

We should have `self.grid_size-2` instead of `self.grid_size-1` here: https://github.com/farizrahman4u/qlearning4k/blob/4c029dd60f0809c689808826ddc40d5e57f5ea7d/qlearning4k/games/catch.py#L36

Hi, I am running `$ python test_catch.py` and this is what I get: Epoch 1000/1000 | Loss 0.0104 | Epsilon 0.10 | Win count 793 Traceback...

Will this repo be the place where multiple reinforcement learning algorithms (Q-learning, A3C, ...) are implemented for Keras?

I have been getting NaN losses when I try to train my agent on a game. I tracked it back to the get_batch function in memory - Y (the output...
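
The report above is cut off, so the actual cause is not known here. As a rough illustration only, the sketch below shows how Q-learning regression targets are typically built for a batch, with a check that fails as soon as a non-finite value appears. The function name `q_targets` and all of its parameters are hypothetical, and this is not qlearning4k's `get_batch`.

```python
import math


def q_targets(q_now, q_next, actions, rewards, dones, gamma=0.9):
    """Build Q-learning targets for a batch of transitions.

    q_now[i] and q_next[i] are per-action Q-value lists for the state
    before and after transition i. All names are hypothetical.
    """
    targets = []
    for q, qn, a, r, done in zip(q_now, q_next, actions, rewards, dones):
        row = list(q)  # keep predictions for actions that were not taken
        # Terminal transitions use the bare reward; others bootstrap
        # from the best Q-value of the next state.
        row[a] = r if done else r + gamma * max(qn)
        # Fail early instead of letting NaN/inf reach the loss.
        if not all(math.isfinite(v) for v in row):
            raise ValueError("non-finite target: %r" % (row,))
        targets.append(row)
    return targets
```

A check like this only narrows down where the bad values first appear (network outputs or rewards); it does not by itself explain why they occur.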

I noticed that training on CPU-only is much faster than on GPU. With GPU, usage is about 18% according to nvidia-smi. Is this because it trains for each epoch, and...

It would be nice to have a Dyna-Q learner for training.
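
For readers unfamiliar with the idea, here is a minimal tabular sketch of Dyna-Q, assuming a deterministic environment and hashable states. The class `DynaQ` and its parameters are made up for illustration; nothing here reflects an existing or planned qlearning4k API, which works with Keras models rather than tables.

```python
import random
from collections import defaultdict


class DynaQ:
    """Tabular Dyna-Q sketch (hypothetical, deterministic-model variant)."""

    def __init__(self, n_actions, alpha=0.1, gamma=0.9, planning_steps=5, seed=0):
        self.q = defaultdict(lambda: [0.0] * n_actions)
        self.model = {}  # (state, action) -> (reward, next_state, done)
        self.alpha = alpha
        self.gamma = gamma
        self.planning_steps = planning_steps
        self.rng = random.Random(seed)

    def _update(self, s, a, r, s2, done):
        # One-step Q-learning backup.
        target = r if done else r + self.gamma * max(self.q[s2])
        self.q[s][a] += self.alpha * (target - self.q[s][a])

    def observe(self, s, a, r, s2, done):
        self._update(s, a, r, s2, done)      # learn from the real transition
        self.model[(s, a)] = (r, s2, done)   # remember it in the model
        for _ in range(self.planning_steps):  # replay simulated transitions
            (ps, pa), (pr, ps2, pdone) = self.rng.choice(list(self.model.items()))
            self._update(ps, pa, pr, ps2, pdone)
```

Each real step is followed by `planning_steps` extra backups drawn from the learned model, which is what distinguishes Dyna-Q from plain Q-learning.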

Currently all state transitions are allowed. It would be nice if the base class had a method we could override, which allows us to return the set of allowed next...
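
The request is truncated, so the exact interface being asked for is unclear. One possible shape, assuming it is about restricting which actions may be taken from the current state, is an overridable hook on the game class. The classes `Game` and `NoLeftGame`, the method `allowed_actions`, and the helper `greedy_action` below are all hypothetical and are not part of qlearning4k.

```python
class Game:
    """Hypothetical base class; not qlearning4k's actual Game API."""

    nb_actions = 3

    def allowed_actions(self):
        # Default: every action is legal, matching the behaviour
        # described in the request.
        return range(self.nb_actions)


class NoLeftGame(Game):
    """Toy subclass that forbids action 0."""

    def allowed_actions(self):
        return [1, 2]


def greedy_action(game, q_values):
    # Pick the best-scoring action among those the game allows.
    allowed = list(game.allowed_actions())
    return max(allowed, key=lambda a: q_values[a])
```

With this hook, an agent consults `allowed_actions()` before choosing, so subclasses can rule out illegal moves without changing the agent.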

…zers As per https://github.com/fchollet/keras/issues/2417 the train mode is explicitly passed to K.function in order to avoid errors.