Deep-Q-Learning
DQN with RGB image as input
Hello, I am trying to build a DQN that takes an RGB image as input. Python code:
import tensorflow as tf
from tensorflow.keras.initializers import VarianceScaling
from tensorflow.keras.layers import (Add, Conv2D, Dense, Flatten, Input,
                                     Lambda, Subtract)
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam

def build_QNetwork_RGB(n_actions, learning_rate, history_length, input_shape):
    """Builds a dueling DQN as a Keras model.

    Arguments:
        n_actions: Number of possible actions the agent can take
        learning_rate: Learning rate
        history_length: Number of historical frames the agent can see
        input_shape: Shape of the preprocessed frame the model sees

    Returns:
        A compiled Keras model
    """
    model_input = Input(shape=(history_length, input_shape[0], input_shape[1], input_shape[2]))
    x = Lambda(lambda layer: layer / 255)(model_input)  # normalize pixel values to [0, 1]
    x = Conv2D(32, (8, 8), strides=4, kernel_initializer=VarianceScaling(scale=2.), activation='relu', use_bias=False)(x)
    x = Conv2D(64, (4, 4), strides=2, kernel_initializer=VarianceScaling(scale=2.), activation='relu', use_bias=False)(x)
    x = Conv2D(64, (3, 3), strides=1, kernel_initializer=VarianceScaling(scale=2.), activation='relu', use_bias=False)(x)
    x = Conv2D(1024, (7, 7), strides=1, kernel_initializer=VarianceScaling(scale=2.), activation='relu', use_bias=False)(x)

    # Split into value and advantage streams along the channel axis
    val_stream, adv_stream = Lambda(lambda w: tf.split(w, 2, 4))(x)  # custom splitting layer

    val_stream = Flatten()(val_stream)
    val = Dense(1, kernel_initializer=VarianceScaling(scale=2.))(val_stream)

    adv_stream = Flatten()(adv_stream)
    adv = Dense(n_actions, kernel_initializer=VarianceScaling(scale=2.))(adv_stream)

    # Combine streams into Q-values: Q = V + (A - mean(A))
    reduce_mean = Lambda(lambda w: tf.reduce_mean(w, axis=1, keepdims=True))  # custom layer for reduce mean
    q_vals = Add()([val, Subtract()([adv, reduce_mean(adv)])])

    # Build model
    model = Model(model_input, q_vals)
    model.compile(Adam(learning_rate), loss=tf.keras.losses.Huber())
    model.summary()
    return model
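For clarity, the dueling aggregation at the end of the function computes Q = V + (A - mean(A)). A small numeric check of that formula (a sketch with made-up values, independent of the Keras code):

```python
import numpy as np

# Dueling aggregation Q = V + (A - mean(A)) on toy values.
val = np.array([[2.0]])                    # state value, shape (1, 1)
adv = np.array([[1.0, -1.0, 0.5, -0.5]])   # advantages, shape (1, n_actions)
q = val + (adv - adv.mean(axis=1, keepdims=True))  # mean(A) is 0 here
print(q)  # [[3.  1.  2.5 1.5]]
```

Subtracting the mean advantage makes the decomposition identifiable, which is why the original code applies `reduce_mean` along the action axis before the `Subtract`/`Add`.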
When I use this function to build the network:
INPUT_SHAPE = (84, 84, 3)  # Size of the preprocessed input frame
HISTORY_LENGTH = 5
Num_Actions = 4
BATCH_SIZE = 32            # Number of samples the agent learns from at once
LEARNING_RATE = 0.00001
MAIN_DQN = build_QNetwork_RGB(Num_Actions, LEARNING_RATE, HISTORY_LENGTH, INPUT_SHAPE)
I got:

Model: "model"
Layer (type)            Output Shape              Param #    Connected to
input_1 (InputLayer)    [(None, 5, 84, 84, 3)]    0
lambda (Lambda)         (None, 5, 84, 84, 3)      0          input_1[0][0]
conv2d (Conv2D)         (None, 5, 20, 20, 32)     6144       lambda[0][0]
conv2d_1 (Conv2D)       (None, 5, 9, 9, 64)       32768      conv2d[0][0]
conv2d_2 (Conv2D)       (None, 5, 7, 7, 64)       36864      conv2d_1[0][0]
conv2d_3 (Conv2D)       (None, 5, 1, 1, 1024)     3211264    conv2d_2[0][0]
lambda_1 (Lambda)       [(None, 5, 1, 1, 512),    0          conv2d_3[0][0]
                         (None, 5, 1, 1, 512)]
flatten_1 (Flatten)     (None, 2560)              0          lambda_1[0][1]
dense_1 (Dense)         (None, 4)                 10244      flatten_1[0][0]
flatten (Flatten)       (None, 2560)              0          lambda_1[0][0]
lambda_2 (Lambda)       (None, 1)                 0          dense_1[0][0]
dense (Dense)           (None, 1)                 2561       flatten[0][0]
subtract (Subtract)     (None, 4)                 0          dense_1[0][0]
                                                             lambda_2[0][0]
add (Add)               (None, 4)                 0          dense[0][0]
                                                             subtract[0][0]

Total params: 3,299,845
Trainable params: 3,299,845
Non-trainable params: 0
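The spatial sizes in this summary are internally consistent: with `padding='valid'` (the Conv2D default), each layer's output size follows out = (in - kernel) // stride + 1. A quick arithmetic check of the four convolutions:

```python
# Verify the spatial sizes from the summary (84 -> 20 -> 9 -> 7 -> 1)
# using the VALID-padding formula: out = (in - kernel) // stride + 1.
size = 84
sizes = []
for kernel, stride in [(8, 4), (4, 2), (3, 1), (7, 1)]:
    size = (size - kernel) // stride + 1
    sizes.append(size)
print(sizes)  # [20, 9, 7, 1]
```

Note that the leading 5 (the history axis) is carried through every layer unchanged: Keras `Conv2D` accepts rank-4+ input and treats extra leading dimensions as batch-like, so the frames are convolved independently rather than combined.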
I feel that this model is not built correctly. Do you have any idea how to correct it, if that is the case?
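For context, a common alternative (a sketch assuming the intent is the standard Atari-style DQN setup; the variable names below are illustrative, not from the post): stack the history frames along the channel axis so the network input is rank-4 and the convolutions mix information across frames instead of processing them as independent batch entries.

```python
import numpy as np

# Hypothetical preprocessing sketch: concatenate HISTORY_LENGTH RGB
# frames along the channel axis, giving a single rank-3 state that a
# plain Conv2D stack can consume (batch dim added by Keras).
HISTORY_LENGTH = 5
frames = [np.zeros((84, 84, 3), dtype=np.uint8) for _ in range(HISTORY_LENGTH)]
state = np.concatenate(frames, axis=-1)
print(state.shape)  # (84, 84, 15)
```

With this layout the model's input would be `Input(shape=(84, 84, 3 * history_length))`, and the summary would no longer carry the extra leading 5 through every layer.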