TransVAE
Update loss.py
The dimensions of mu and logvar are (batch x d_latent). If we just take the mean over all elements, as previously implemented, the result is a mean over all d_latent dimensions regardless of the individual data points. Instead, we should sum over the latent dimensions first and then take the mean along the batch. I think this is closer to the original meaning of the KL divergence.
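For clarity, here is a minimal sketch of the proposed computation (assuming the standard closed-form KL term against a unit Gaussian prior; the function name `kld_loss` is mine, not necessarily the one used in `loss.py`):

```python
import torch

def kld_loss(mu, logvar):
    """KL(N(mu, sigma^2) || N(0, 1)) summed per sample, then averaged over the batch.

    mu, logvar: tensors of shape (batch, d_latent).
    """
    # Sum the KL contribution of every latent dimension for each sample...
    kld_per_sample = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=1)
    # ...then average across the batch, so each sample contributes equally.
    return torch.mean(kld_per_sample)
```

Taking `torch.mean` over the whole (batch x d_latent) tensor instead divides the per-sample KL by d_latent, silently rescaling the KL term relative to the reconstruction loss.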
P.S. I'm writing this comment because I'm genuinely interested in your research, and I'm expanding on some concepts of exploration based on it.
Thank you for your contribution.
I think you're right, but I'm going to hold off on merging for now until I have a chance to test the behavior myself. I think this bug may have actually become a feature in some ways. I see you've forked the repo, so I assume you've already modified it in your fork and can still run all the code you want?
Also I'm excited to see how you expand on the concept of exploration! I'm working on an update that will include the option to append a set of property predictor layers to the latent space. Depending on how you are approaching it, this could give you a way to probe exploration that is not 100% reliant on purely structural fingerprints.
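For anyone following along, a property predictor head of that kind might look roughly like the sketch below. This is purely illustrative and assumes a small MLP on the latent vector; the layer sizes and single-property output are assumptions, not the actual layers in the upcoming TransVAE update:

```python
import torch.nn as nn

class PropertyPredictor(nn.Module):
    """Illustrative MLP head attached to the latent vector z.

    The hidden size and output dimension are placeholder choices for
    the sketch, not TransVAE's actual implementation.
    """
    def __init__(self, d_latent, d_hidden=64, n_props=1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(d_latent, d_hidden),
            nn.ReLU(),
            nn.Linear(d_hidden, n_props),
        )

    def forward(self, z):
        # z: (batch, d_latent) -> predicted properties: (batch, n_props)
        return self.net(z)
```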
Yes, it's running fine! :)