autoregressive
Reconsider the softmax distribution
PixelCNN++ argues that a softmax over the 256 intensity values does not capture the fact that neighboring intensities are usually correlated: the softmax treats each value as an unrelated class. Instead, the authors propose a mixture of discretized logistic distributions (the logistic is like the normal but with heavier tails). https://arxiv.org/pdf/1701.05517.pdf
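To make the idea concrete, here is a minimal NumPy sketch of a discretized logistic mixture over 8-bit intensities. It is not the PixelCNN++ code: the function name, the rescaling of intensities to [-1, 1], and the example parameters are assumptions for illustration. Each bin's probability is the difference of the logistic CDF at the bin edges, and the two edge bins absorb the remaining tail mass.

```python
import numpy as np


def sigmoid(z):
    # Numerically stable logistic CDF, written via tanh.
    return 0.5 * (1.0 + np.tanh(0.5 * z))


def discretized_logistic_mixture_pmf(weights, means, scales, n_levels=256):
    """Probability of each of the n_levels intensities (hypothetical helper).

    weights, means, scales: one entry per mixture component; weights sum to 1.
    Means and scales live on the rescaled [-1, 1] intensity axis (assumption).
    """
    weights = np.asarray(weights, dtype=np.float64)
    means = np.asarray(means, dtype=np.float64)
    scales = np.asarray(scales, dtype=np.float64)

    # Map integer intensities 0..n_levels-1 onto [-1, 1].
    x = np.arange(n_levels) / ((n_levels - 1) / 2.0) - 1.0
    half_bin = 1.0 / (n_levels - 1)

    # CDF at the upper and lower edge of every bin, shape (n_levels, n_components).
    upper = sigmoid((x[:, None] + half_bin - means[None, :]) / scales[None, :])
    lower = sigmoid((x[:, None] - half_bin - means[None, :]) / scales[None, :])

    # Edge bins take all mass beyond the representable range.
    upper[-1, :] = 1.0
    lower[0, :] = 0.0

    # Mix the per-component bin probabilities.
    return (upper - lower) @ weights


pmf = discretized_logistic_mixture_pmf(
    weights=[0.7, 0.3], means=[-0.5, 0.4], scales=[0.05, 0.1]
)
log_likelihood_of_64 = np.log(pmf[64])
```

Because the bin probabilities come from a smooth CDF, neighboring intensities automatically receive similar probability, which is the property the softmax lacks.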
Also add a bits-per-dim metric, which measures how many bits are required to encode each pixel intensity. See E.2 of https://arxiv.org/pdf/1705.07057.pdf and https://uvadlc-notebooks.readthedocs.io/en/latest/tutorial_notebooks/tutorial11/NF_image_modeling.html
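As a rough sketch of the conversion, assuming the model reports a negative log-likelihood in nats for a whole image under a discrete distribution over pixel values: divide by the number of dimensions and by ln 2. The function name and the 3x32x32 image size are illustrative assumptions.

```python
import math


def bits_per_dim(nll_nats, num_dims):
    """Convert a total negative log-likelihood in nats to bits per dimension."""
    return nll_nats / (num_dims * math.log(2.0))


# Sanity check: a uniform model over 256 levels costs exactly 8 bits per dimension.
num_dims = 3 * 32 * 32  # assumed image shape (channels x height x width)
uniform_nll = num_dims * math.log(256.0)
uniform_bpd = bits_per_dim(uniform_nll, num_dims)
```

A model is only useful if it scores below this 8-bit baseline; lower is better.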
See also this discussion of how to optimize based on the CDF:
https://github.com/Rayhane-mamah/Tacotron-2/issues/155