
Errors in the Agents vignette

Open russellcameronthomas opened this issue 6 years ago • 7 comments

I have been working through the Agents vignette (https://cran.rstudio.com/web/packages/reinforcelearn/vignettes/agents.html) and I found a few errors.

  1. In the "Value Functions" section, halfway down, there is sample code following "For a neural network you can use the keras package." The 3rd line of the current sample code is:

layer_dense(shape = 10L, input_shape = 4L, activation = "linear") %>%

There is no such parameter named shape. Instead, use units:

layer_dense(units = 10L, input_shape = 4L, activation = "linear") %>%

  2. The 4th line of the same sample code is currently:

makeValueFunction("neural.network", model)

This fails because model is not passed as a named argument. Instead, it should be:

makeValueFunction("neural.network", model = model)

... assuming that 'model' is the correct name of the parameter. I can't tell, because there is no complete working example (one way to check is sketched after this list).

  3. It appears that there is no complete working sample code for the neural network. I would like to see one or more complete working examples somewhere, including experience replay (see the sketch after this list for how I read the replay API).
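
For reference, two sketches of what I'm after. The first only inspects the argument names using base R, so it should be safe; the second is my reading of the replay API from the package documentation (using a table value function to keep it independent of the keras questions above), so please treat makeReplayMemory and the replay.memory argument as assumptions to be verified:

library(reinforcelearn)

# check which argument names makeValueFunction actually accepts
names(formals(makeValueFunction))

# assumed replay API, based on my reading of the package docs
env = makeEnvironment("gridworld", shape = c(3, 3), goal.states = 0L)
policy = makePolicy("epsilon.greedy", epsilon = 0.1)
algorithm = makeAlgorithm("qlearning")
val.fun = makeValueFunction("table", n.states = env$n.states, n.actions = 4L)
memory = makeReplayMemory(size = 100L, batch.size = 16L)
agent = makeAgent(policy, val.fun, algorithm, replay.memory = memory)
interact(env, agent, n.episodes = 5L)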

russellcameronthomas avatar May 28 '18 23:05 russellcameronthomas

I agree with your findings, @russellcameronthomas. And I would also like to see a complete working example solving e.g. the Gridworld environment using a Keras defined neural network. There's at least one hiccup when I try to construct the code myself:

env = makeEnvironment("gridworld", shape = c(3, 3), goal.states = 0L)

library(keras)
model = keras_model_sequential() %>% 
  layer_dense(units = 10L, input_shape = 4L, activation = "linear") %>%
  compile(optimizer = optimizer_sgd(lr = 0.1), loss = "mae")
val.fun <- makeValueFunction("neural.network", model=model)

policy = makePolicy("epsilon.greedy", epsilon = 0.2)
algorithm = makeAlgorithm("qlearning")

agent = makeAgent(policy, val.fun, algorithm)

interact(env, agent, n.episodes = 5L)
Error in py_call_impl(callable, dots$args, dots$keywords) : 
  ValueError: Error when checking : expected dense_18_input to have shape (4,) but got array with shape (1,)

Using debug (successively) on the functions interact, agent$act, agent$act2 and agent$val.fun$predictQ, it becomes clear that the error occurs in the predict function.
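
For anyone retracing this, the debugging session looked roughly like the following sketch (it assumes the env and agent objects from the snippet above):

# flag the functions, then trigger the error and step through with 'n'
debug(interact)
debug(agent$val.fun$predictQ)
interact(env, agent, n.episodes = 1L)
undebug(interact)
undebug(agent$val.fun$predictQ)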

The error can be reproduced by simply running, e.g.:

state <- 8
predict(model,state)

So, the input expected by predict is an array with 4 rows instead of 1? Strangely enough, setting

state <- array(1:4,dim=c(4,1))
predict(model,state)

still gives the same error, while reversing rows and columns (so that the first dimension is the batch dimension keras expects, i.e. a 1 x 4 matrix for a single observation) does give a result:

> state <- array(1:4,dim=c(1,4))
> predict(model,state)
          [,1]        [,2]      [,3]     [,4]     [,5]      [,6]      [,7]       [,8]      [,9]   [,10]
[1,] 0.2218989 -0.05123037 -3.889088 2.001848 -1.76616 -1.108999 -0.616395 -0.2599579 -1.725355 2.20175

@markusdumke: would be great if you could help me/us out on this one, because your code and documentation are a pleasure to work with, and they clarify a lot with respect to Reinforcement Learning and Function Approximation using neural nets (and coding R in an object-oriented way). Thanks for that!

dkd58 avatar Aug 27 '18 10:08 dkd58

Thanks for the report! I will have to look into this more deeply. It can take a few days until I find time though.

The neural network part is clearly experimental in its current state (as are some of the other parts). One problem I experienced when working with neural networks in R is that keras weight updates were very slow, making it very hard to learn anything useful, at least with the typical online update in RL.
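
To illustrate what I mean, a rough sketch with dummy data (not package code; train_on_batch is the plain keras function):

library(keras)

# same architecture as the vignette snippet above
model = keras_model_sequential() %>%
  layer_dense(units = 10L, input_shape = 4L, activation = "linear") %>%
  compile(optimizer = optimizer_sgd(lr = 0.1), loss = "mae")

# dummy data standing in for preprocessed states and Q-value targets
X = matrix(rnorm(32 * 4), 32, 4)
Y = matrix(rnorm(32 * 10), 32, 10)

# online-style updates: one keras call per transition; the per-call
# R/Python overhead is what makes this slow inside an RL loop
for (i in 1:32) {
  train_on_batch(model, x = X[i, , drop = FALSE], y = Y[i, , drop = FALSE])
}

# batched update: a single call on the whole minibatch
train_on_batch(model, x = X, y = Y)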

markusdumke avatar Aug 28 '18 17:08 markusdumke

Thanks for your reaction, @markusdumke, I really appreciate it. In the meantime I've been investigating a bit more, and I think I've found a solution. It seems there are two problems:

  1. The definition of the neural net does not fit the gridworld problem at hand. The state space has cardinality 9 (3 x 3) and can be represented either by 1 integer (0-8) or by 9 binaries, but not by 4 values (as currently specified by input_shape = 4L). If we change the model definition accordingly (note that units has also been changed, to 4, to reflect the number of possible actions):
model = keras_model_sequential() %>% 
  layer_dense(units = 4L, input_shape = 1L, activation = "linear") %>%
  compile(optimizer = optimizer_sgd(lr = 0.1), loss = "mae")

and we redefine the value function and the agent, then everything works fine:

val.fun <- makeValueFunction("neural.network", model=model)
agent = makeAgent(policy, val.fun, algorithm)
interact(env, agent, n.episodes = 5L)
  2. The predict function expects an array as input, not a vector (unless, it seems, the input is a scalar, as above). So, if we have input_shape > 1L, we need to convert the input to an array first. This can be done using the convenient agent preprocess function:
preprocess <- function(x) {
  # one-hot encode the state and force a 1 x n.states matrix (batch dim first)
  array(to_categorical(x, num_classes = env$n.states), dim = c(1, env$n.states))
}

Now, if the input to our NN model has a separate binary for each cell in the grid (which seems better than using the state number as an integer value), so input_shape = 9L (the model from point 1 has to be redefined accordingly), and if we then set


agent = makeAgent(policy, val.fun, algorithm, preprocess=preprocess)
interact(env, agent, n.episodes = 5L)

the problem is also solved.
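
Putting both points together, here is the complete sketch I ended up with (the model redefinition with input_shape = 9L is implied but not shown above, so that line is my own addition):

library(reinforcelearn)
library(keras)

# 3 x 3 gridworld: 9 states (0-8), 4 actions, goal in state 0
env = makeEnvironment("gridworld", shape = c(3, 3), goal.states = 0L)

# one input per state (one-hot encoding), one output unit per action
model = keras_model_sequential() %>%
  layer_dense(units = 4L, input_shape = 9L, activation = "linear") %>%
  compile(optimizer = optimizer_sgd(lr = 0.1), loss = "mae")

# one-hot encode the integer state as a 1 x 9 matrix (batch dimension first)
preprocess <- function(x) {
  array(to_categorical(x, num_classes = env$n.states), dim = c(1, env$n.states))
}

val.fun <- makeValueFunction("neural.network", model = model)
policy = makePolicy("epsilon.greedy", epsilon = 0.2)
algorithm = makeAlgorithm("qlearning")
agent = makeAgent(policy, val.fun, algorithm, preprocess = preprocess)

interact(env, agent, n.episodes = 5L)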

dkd58 avatar Aug 29 '18 08:08 dkd58

I think I will remove the neural network code until I find a better solution, because, as you discovered, it is not working reliably.

markusdumke avatar Mar 10 '19 20:03 markusdumke

@markusdumke: It would be a pity if your great code examples were removed, Markus. They really helped me a lot. Maybe you could restructure them using my suggestions?

dkd58 avatar Mar 13 '19 09:03 dkd58

Any update on this issue? @dkd58, have you tried to make a pull request?

mg64ve avatar Jun 29 '19 19:06 mg64ve

@mg64ve No, I haven't done that yet, but it looks like a good idea. I'll have to look into the mechanics of pull requests first, though. Don't want to mess things up...

dkd58 avatar Jul 05 '19 11:07 dkd58