Multiscale-Parametric-t-SNE
Query about KL divergence calculation
Hello, and thank you for the implementation! I am trying to implement this in PyTorch and I am stuck on the KL loss. Could you explain the tensor dimensions in the KL loss, and especially on this line?
I am having a hard time following it, and an explanation would help, as I am not familiar with the t-SNE implementation. From my understanding and from tests in my own implementation, P should be (batch_size, 2) and Q should be (batch_size, batch_size). How can the division work with matrices of different sizes?
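For reference while reading the question, here is a minimal PyTorch sketch of the KL loss as it appears in standard, single-scale t-SNE. It is not the repository's code. In this formulation P and Q are both (batch_size, batch_size), and it is the network output, called `Y` here, that is (batch_size, 2), so the division P / Q is elementwise between matrices of the same shape. The function names `low_dim_affinities` and `tsne_kl_loss`, the `eps` value, and the assumption that P is precomputed per batch are all illustrative. The multiscale variant may combine P over several perplexities, which this sketch does not model.

```python
import torch


def low_dim_affinities(Y):
    """Student-t joint probabilities Q for an embedding Y of shape (n, d).

    Returns a tensor of shape (n, n) with a zero diagonal that sums to 1.
    """
    n = Y.shape[0]
    # Pairwise squared Euclidean distances via broadcasting: (n, 1, d) - (1, n, d)
    diff = Y[:, None, :] - Y[None, :, :]
    d2 = (diff ** 2).sum(dim=-1)                      # (n, n)
    # Student-t kernel with one degree of freedom
    num = 1.0 / (1.0 + d2)                            # (n, n)
    # A point is not its own neighbour, so zero out the diagonal
    off_diag = 1.0 - torch.eye(n, dtype=Y.dtype, device=Y.device)
    num = num * off_diag
    return num / num.sum()                            # (n, n)


def tsne_kl_loss(P, Y, eps=1e-12):
    """KL(P || Q) for one batch.

    P: (n, n) joint probabilities from the input space (assumed precomputed,
       zero diagonal, summing to 1).
    Y: (n, d) network output, e.g. d = 2.
    """
    Q = low_dim_affinities(Y)
    # Clamp both so that log and division stay finite on the zero diagonal
    P = torch.clamp(P, min=eps)
    Q = torch.clamp(Q, min=eps)
    # P and Q have the same shape, so P / Q is an elementwise division
    return torch.sum(P * torch.log(P / Q))


# Usage with random stand-in data (hypothetical, for shape checking only)
torch.manual_seed(0)
n = 8
Y = torch.randn(n, 2, requires_grad=True)             # (batch_size, 2)
A = torch.rand(n, n)
A = A + A.T                                           # symmetric affinities
A.fill_diagonal_(0)
P = A / A.sum()                                       # (batch_size, batch_size)
loss = tsne_kl_loss(P, Y)
loss.backward()                                       # gradient flows back to Y
```

If a (batch_size, 2) tensor shows up where P is expected, one possibility is that the embedding is being passed in place of the affinity matrix. Checking `P.shape` and `Y.shape` just before the loss call would confirm or rule that out.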