Results 4 comments of ffftzh

The maplist stores all [wordId, weight_in_the_topic] pairs for a topic. This line only sorts maplist by weight_in_the_topic. After sorting the list, we can retrieve the top N words with...
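To illustrate the idea, here is a minimal Python sketch of sorting such pairs by weight and taking the top N. It is not the repository's code: the function name `top_n_words` and the sample values are made up for this example.

```python
def top_n_words(pairs, n):
    """Return the n pairs with the largest weight, highest weight first."""
    # Sort only on the second element of each pair (the weight), descending.
    ranked = sorted(pairs, key=lambda pair: pair[1], reverse=True)
    return ranked[:n]


# Hypothetical (word_id, weight_in_the_topic) pairs for a single topic.
example_pairs = [(0, 0.10), (1, 0.45), (2, 0.05), (3, 0.40)]
top_two = top_n_words(example_pairs, 2)
print(top_two)
```

The word ids in the result can then be mapped back to the actual words through whatever vocabulary table the code keeps.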

In the README, you will find Dirichlet(α) as alpha and Dirichlet(β) as beta. Both are parameters supplied by the user when running the code.

@ShanceWang BTM trains topics on word co-occurrence. Documents are treated as a mixture of co-occurring word pairs. So the document is meaningless when it only contains one word and doesn't have...
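A small Python sketch may make this concrete. It assumes every unordered pair of words in a short document counts as one co-occurring pair. The actual implementation may restrict pairs to a window, and `extract_biterms` is a name invented for this example.

```python
from itertools import combinations


def extract_biterms(word_ids):
    """Return every unordered pair of positions' words in the document."""
    # combinations() yields nothing when there are fewer than two items,
    # so a one-word document contributes no pairs at all.
    return list(combinations(word_ids, 2))


single_word_doc = extract_biterms([7])
three_word_doc = extract_biterms([7, 3, 9])
print(single_word_doc)
print(three_word_doc)
```

Under this assumption a one-word document produces an empty list, so it gives the model nothing to train on.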

Maybe there is an empty line in the dataset, or a line that contains only a special character that looks like a space.
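One way to check is a quick script like the sketch below. It assumes the dataset has one document per line. The helper names and the set of invisible characters are my own choices for illustration, not part of the project.

```python
# Zero-width characters that str.strip() does not treat as whitespace.
INVISIBLE_CHARS = {"\u200b", "\u200c", "\u200d", "\ufeff"}


def is_effectively_empty(line):
    """True if the line has no visible content."""
    # strip() removes ordinary and Unicode whitespace, e.g. a no-break space.
    stripped = line.strip()
    if not stripped:
        return True
    # What remains may still consist only of zero-width characters.
    return all(ch in INVISIBLE_CHARS for ch in stripped)


def find_suspect_lines(lines):
    """Return the 1-based numbers of lines that look empty."""
    return [
        number
        for number, line in enumerate(lines, start=1)
        if is_effectively_empty(line)
    ]


sample = ["hello world", "", "\u00a0", "\u200b", "a\n"]
suspects = find_suspect_lines(sample)
print(suspects)
```

To run it on a real file, pass the open file object (or `f.readlines()`) to `find_suspect_lines` and inspect the reported line numbers.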