
Wrong gradients in NormalizingPlanarFlowLayer

Open Era-Dorta opened this issue 8 years ago • 0 comments

If I understand correctly, equation 11 in the paper is computed here. For a batch of 5 with 3 features, h'(w^T z + b) should have shape (5,) and w shape (3,), so psi should be (5, 3) and psi_u should be (5,). However, in the current implementation psi is (5,) and psi_u is a scalar. So the solution would be to change the dot product to an element-wise product. Is that right, or did I make a mistake?
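To make the expected shapes concrete, here is a minimal NumPy sketch of the per-sample computation. It is not the layer's actual code: the names `z`, `w`, `u`, `b`, the random values, and the choice of tanh for h are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
batch, features = 5, 3

# Hypothetical inputs and parameters (not taken from the library).
z = rng.normal(size=(batch, features))  # (5, 3) batch of latent samples
w = rng.normal(size=(features,))        # (3,)
u = rng.normal(size=(features,))        # (3,)
b = 0.1                                 # scalar bias

# Assuming h = tanh, its derivative is h'(a) = 1 - tanh(a)^2.
a = z @ w + b                           # (5,) one pre-activation per sample
h_prime = 1.0 - np.tanh(a) ** 2         # (5,)

# psi(z) = h'(w^T z + b) * w, evaluated per sample via broadcasting.
psi = h_prime[:, None] * w[None, :]     # (5, 3)

# u^T psi(z), one value per sample.
psi_u = psi @ u                         # (5,)

# log|det Jacobian| = log|1 + u^T psi(z)|, one value per sample.
log_det = np.log(np.abs(1.0 + psi_u))   # (5,)

print(psi.shape, psi_u.shape, log_det.shape)
```

Under these assumptions, `psi_u` equals `h_prime * (w @ u)`, so each sample keeps its own Jacobian term instead of the batch collapsing to a single scalar.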

Era-Dorta, Jan 10 '18 12:01