Wrong PPO Model architecture.

Open alirezakazemipour opened this issue 4 years ago • 2 comments

According to the DQN Nature paper and the PPO1 implementation, this line:

X = activ(conv(X, 'c3', nf=64, rf=4, stride=1, init_scale=np.sqrt(2), data_format=data_format))

should be changed to:

X = activ(conv(X, 'c3', nf=64, rf=3, stride=1, init_scale=np.sqrt(2), data_format=data_format))

In short, the kernel size of the third convolutional layer (`rf`) is wrong: it should be 3, not 4.

alirezakazemipour, Oct 06 '20 16:10

What is the difference between these two lines?

xiaioding, Apr 10 '23 04:04

@xiaioding The difference is in the kernel size (the `rf` argument): the current code uses rf=4, while the proposed line uses rf=3.
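
To make the effect of `rf` concrete, here is a rough sketch of how the kernel size changes the shape of the last convolutional feature map. It is not code from the repository: the helper names `conv_out` and `feature_map_width` are made up for illustration, and it assumes an 84x84 input, no padding, and the first two layers from the DQN Nature paper (8x8 kernel with stride 4, then 4x4 kernel with stride 2). Only the third layer's kernel size, 64 filters, and stride 1 come from the lines quoted in this issue.

```python
def conv_out(size, kernel, stride):
    """Output width of a convolution with no padding."""
    return (size - kernel) // stride + 1


def feature_map_width(c3_kernel, input_size=84):
    """Width of the feature map after the third conv layer.

    Assumption: layers 1 and 2 follow the DQN Nature paper
    (8x8 stride 4, then 4x4 stride 2) and the input is 84x84.
    """
    x = conv_out(input_size, 8, 4)    # assumed first layer
    x = conv_out(x, 4, 2)             # assumed second layer
    return conv_out(x, c3_kernel, 1)  # 'c3': stride 1, kernel size rf


for rf in (4, 3):
    w = feature_map_width(rf)
    # nf=64 filters in 'c3', so the flattened size is 64 * w * w
    print(f"rf={rf}: {w}x{w}x64 feature map, {64 * w * w} flattened features")
```

Under these assumptions, rf=4 gives a 6x6x64 map (2304 features) and rf=3 gives a 7x7x64 map (3136 features), so the two lines produce differently shaped inputs for the layers that follow.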

alirezakazemipour, Apr 11 '23 14:04