bysowhat
warning: enumeration value ‘CUDNN_STATUS_RUNTIME_FP_OVERFLOW’ not handled in switch [-Wswitch]
Hello, do you use Gumbel-softmax? I wonder why you don't add random noise in the Gumbel-softmax, and why the temperature doesn't change during the whole training process. Thanks.
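For reference, here is a minimal sketch of what I understand by the standard Gumbel-softmax: Gumbel(0, 1) noise is added to the logits, and the temperature is lowered as training goes on. This is not code from this repository. The function names, the exponential schedule, and the values of `tau_start`, `tau_min`, and `decay` are my own assumptions, used only to illustrate the question.

```python
import math
import random


def gumbel_softmax(logits, temperature, rng=random):
    """Draw one relaxed one-hot sample from the categorical given by logits."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    noisy = []
    for logit in logits:
        u = rng.random()
        # Clamp away from 0 and 1 so both logarithms are defined.
        u = min(max(u, 1e-12), 1.0 - 1e-12)
        # Gumbel(0, 1) noise: g = -log(-log(u)), u ~ Uniform(0, 1).
        g = -math.log(-math.log(u))
        noisy.append((logit + g) / temperature)
    # Numerically stable softmax over the perturbed, scaled logits.
    m = max(noisy)
    exps = [math.exp(x - m) for x in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def annealed_temperature(step, tau_start=1.0, tau_min=0.1, decay=1e-3):
    """Exponentially decayed temperature (hypothetical schedule and values)."""
    return max(tau_min, tau_start * math.exp(-decay * step))


if __name__ == "__main__":
    logits = [1.0, 2.0, 3.0]
    for step in (0, 1000, 5000):
        tau = annealed_temperature(step)
        print(step, round(tau, 3), gumbel_softmax(logits, tau))
```

With a high temperature the samples stay close to uniform, and as the temperature shrinks they approach one-hot vectors, which is why I expected to see both the noise and a schedule.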
Could you explain what the f is in Eq. 4 in your paper? Thanks a lot.
Hi, what is normalization_scaling for? Thanks