Lda2vec-Tensorflow

Improvement in Learning Rate and Topics Learned?

Open dbl001 opened this issue 6 years ago • 0 comments

I have experimented with adjustments to the 'lda_loss' function, e.g. in Lda2vec.py:

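            # Normalize each topic vector (row) so that dot products are cosine similarities.
            normalized = tf.nn.l2_normalize(self.mixture.topic_embedding, axis=1)
            topic_similarity = tf.reduce_sum(tf.matmul(normalized, normalized, adjoint_b=True, name="topic_matrix"))
            loss_lda = self.lmbda * fraction * self.prior() + self.learning_rate * topic_similarity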

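The added term is the sum of the pairwise cosine similarities between the topic vectors, scaled by the learning rate. Penalizing it in the LDA loss reduces the correlation between topics in the topic_embedding matrix.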
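To make the penalty concrete, here is a minimal NumPy sketch of the same quantity computed outside the TensorFlow graph. The function names and the toy embeddings are hypothetical and are not part of Lda2vec.py; the sketch only assumes that each row of the matrix is one topic vector.

```python
import numpy as np


def topic_similarity(topic_embedding):
    """Pairwise cosine-similarity matrix of the topic vectors (one per row)."""
    norms = np.linalg.norm(topic_embedding, axis=1, keepdims=True)
    # Guard against division by zero for an all-zero topic vector.
    normalized = topic_embedding / np.maximum(norms, 1e-12)
    return normalized @ normalized.T


def similarity_penalty(topic_embedding):
    """Sum of all entries of the similarity matrix (the added loss term, unscaled)."""
    return float(np.sum(topic_similarity(topic_embedding)))


# Hypothetical toy embeddings: 2 topics in 2 dimensions.
identical = np.array([[1.0, 0.0], [2.0, 0.0]])   # same direction
orthogonal = np.array([[1.0, 0.0], [0.0, 3.0]])  # uncorrelated directions

print(similarity_penalty(identical))   # larger penalty
print(similarity_penalty(orthogonal))  # smaller penalty
```

The diagonal of the similarity matrix is always 1 for non-zero vectors, so it only adds a constant equal to the number of topics. The gradient comes entirely from the off-diagonal entries, i.e. from how similar distinct topics are to each other.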

Also, this NIPS paper describes a methodology for evaluating the quality of topic models such as LDA by measuring two things: word intrusion and topic intrusion.

http://users.umiacs.umd.edu/~jbg/docs/nips2009-rtl.pdf
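As a rough illustration of the word intrusion task, the sketch below builds one question from a topic-word matrix: the top words of a topic plus one "intruder" that is unlikely under that topic but ranks highly in another. The function name, the parameters, the "lower half" cutoff, and the toy data are all assumptions made for illustration. The actual evaluation in the paper relies on human judges picking out the intruder, which this sketch does not cover.

```python
import numpy as np


def word_intrusion_question(topic_word, vocab, topic, n_top=5, seed=0):
    """Return (shuffled_words, intruder) for one word-intrusion question.

    topic_word: array of shape (n_topics, n_words) with per-topic word scores.
    """
    rng = np.random.default_rng(seed)
    order = np.argsort(topic_word[topic])[::-1]  # word ids, most probable first
    top = [int(w) for w in order[:n_top]]
    # Assumption: "low probability" means the lower half of this topic's ranking.
    low = {int(w) for w in order[len(order) // 2:]}

    # Candidate intruders: top words of other topics that are unlikely here.
    candidates = []
    for other in range(topic_word.shape[0]):
        if other == topic:
            continue
        for w in np.argsort(topic_word[other])[::-1][:n_top]:
            if int(w) in low:
                candidates.append(int(w))
    if not candidates:
        raise ValueError("no suitable intruder word found")

    intruder = candidates[int(rng.integers(len(candidates)))]
    words = [vocab[w] for w in top] + [vocab[intruder]]
    perm = rng.permutation(len(words))  # hide the intruder's position
    return [words[i] for i in perm], vocab[intruder]


# Hypothetical toy model: 2 topics over a 12-word vocabulary.
vocab = ["w%d" % i for i in range(12)]
topic_word = np.array([
    [.20, .18, .16, .14, .12, .10, .02, .02, .02, .02, .01, .01],
    [.01, .01, .02, .02, .02, .02, .10, .12, .14, .16, .18, .20],
])
question, intruder = word_intrusion_question(topic_word, vocab, topic=0)
print(question, intruder)
```

If the topics are coherent, a human reader should find the intruder easy to spot; topics where the intruder is hard to identify are the ones worth inspecting.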

Please experiment and let me know what you find.

Topic Similarity Matrix after 33 Epochs: image

dbl001 • Jun 21 '19 19:06