convnet-aig

Is the Gumbel-Softmax formulation accurate?

Open atiorh opened this issue 4 years ago • 0 comments

Thanks for releasing the code!

I have been reviewing how the Gumbel-Softmax trick [1] is used. Both the paper and the code suggest that the "relevance scores are interpreted as log probabilities" [2], but log probabilities are non-positive, while the output of a convolutional layer is unconstrained in sign. How can that output be interpreted as a log probability? (This is unlikely to break training, but it could silently yield suboptimal performance due to inaccurate approximate sampling from the discrete distribution.)
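For anyone who wants to probe this empirically, here is a minimal sketch of the check, using only the Python standard library. It is not code from the repository, and every function name in it is hypothetical. It assumes the scores enter the trick as in Equation 1 of [1], i.e. Gumbel(0, 1) noise is added to the scores before an argmax (or a temperature softmax). Under that assumption the samples follow `softmax(scores)`, and because softmax is unchanged by adding a constant to every score, the scores only need to be log probabilities up to an additive constant. Whether the repository's gating code matches this setup is something the sketch does not establish.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sample_gumbel(rng):
    """Draw one Gumbel(0, 1) sample via g = -log(-log(u)), u ~ Uniform(0, 1)."""
    u = rng.random()
    while u == 0.0:  # random() can return 0.0, which would make log(u) fail
        u = rng.random()
    return -math.log(-math.log(u))


def gumbel_max_sample(scores, rng):
    """Hard sample: index of the largest noise-perturbed score."""
    perturbed = [s + sample_gumbel(rng) for s in scores]
    return max(range(len(scores)), key=perturbed.__getitem__)


def gumbel_softmax_sample(scores, tau, rng):
    """Soft (relaxed) sample: softmax of perturbed scores at temperature tau."""
    return softmax([(s + sample_gumbel(rng)) / tau for s in scores])


def empirical_frequencies(scores, n, seed=0):
    """Fraction of n Gumbel-max draws that land on each index."""
    rng = random.Random(seed)
    counts = [0] * len(scores)
    for _ in range(n):
        counts[gumbel_max_sample(scores, rng)] += 1
    return [c / n for c in counts]


if __name__ == "__main__":
    # Hypothetical "relevance scores" with mixed signs, as a conv layer could emit.
    scores = [2.0, -1.0, 0.5]
    print("softmax(scores):      ", softmax(scores))
    print("empirical frequencies:", empirical_frequencies(scores, 20000))
    # Shifting all scores by a constant leaves the target distribution unchanged.
    print("softmax(scores - 10): ", softmax([s - 10.0 for s in scores]))
```

If the empirical frequencies track `softmax(scores)` for mixed-sign inputs, the sign of the raw scores is not by itself a source of sampling error. Any remaining inaccuracy would have to come from elsewhere, for example from how the temperature or the straight-through estimator is applied.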

Please let me know if there is a subtle intuition or training dynamic at play here that I am missing. Thanks!

[1] https://arxiv.org/pdf/1611.01144.pdf (Equation 1)
[2] https://arxiv.org/pdf/1711.11503.pdf (Section 3.3, page 5)

atiorh, May 12 '20 23:05