
Why normalize predicted probabilities?

Open abhigenie92 opened this issue 8 years ago • 1 comments

prob = aprob / np.sum(aprob) https://github.com/keon/policy-gradient/blob/master/pg.py#L46

I am not sure this line is really required, since the probabilities should already be normalized by the softmax. Please let me know if I am missing something.

abhigenie92 · Jun 24 '17
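A small sketch of the point being discussed (the `softmax` helper and `logits` values here are illustrative, not taken from pg.py): a softmax output already sums to 1 up to floating-point rounding, so the extra division is mathematically a no-op. One plausible reason to keep it is that `np.random.choice` rejects a `p` vector whose sum drifts outside a small tolerance, and renormalizing guards against that drift.

```python
import numpy as np

def softmax(x):
    # Subtract the max for numerical stability before exponentiating.
    z = np.exp(x - np.max(x))
    return z / np.sum(z)

logits = np.array([2.0, 1.0, 0.1])
aprob = softmax(logits)

# Already normalized (up to floating-point error):
print(np.sum(aprob))

# The questioned line from pg.py: mathematically redundant,
# but it re-anchors the sum to 1 before sampling.
prob = aprob / np.sum(aprob)
action = np.random.choice(len(prob), p=prob)
```

In practice the drift from a single softmax is far smaller than `np.random.choice`'s tolerance, which supports the commenters' view that the line is unnecessary here.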

I think this line of code serves no purpose.

helloiss · Nov 25 '17