Feature - API could allow the corpus to be created accretively

Open 0o-de-lally opened this issue 7 years ago • 2 comments

I'd like to be able to add sentences to the corpus over time rather than all at once. Something like:

var index = lda.addSentence('string') // returns an array index or unique id

which could later be used like this:

var topicModel = lda.process(index, numTopics, termsPer)

@primaryobjects do you have any thoughts on this?

0o-de-lally, Apr 29 '17 22:04

lda works by building a dictionary of all unique terms, so it needs to know all of the words ahead of time. What you could do, though, is rebuild the dictionary and recalculate the topics whenever a new word is added.
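
To illustrate the rebuild step, here is a minimal sketch. `buildDictionary` is a hypothetical helper, not part of the lda package, and its regex tokenizer is an assumption; the library's own tokenization may differ. It only shows that the term dictionary has to be recomputed from the full corpus each time a sentence is added.

```javascript
// Hypothetical helper (not part of the lda package): rebuilds the term
// dictionary from every sentence collected so far.
function buildDictionary(sentences) {
  const dictionary = new Map(); // term -> integer id
  for (const sentence of sentences) {
    // Assumed tokenizer: lowercase runs of letters, digits, and apostrophes.
    const terms = sentence.toLowerCase().match(/[a-z0-9']+/g) || [];
    for (const term of terms) {
      if (!dictionary.has(term)) {
        dictionary.set(term, dictionary.size);
      }
    }
  }
  return dictionary;
}

const sentences = ['Cats chase mice.'];
let dictionary = buildDictionary(sentences);

// Adding a sentence introduces a new term ("dogs"), so the dictionary is
// rebuilt before the topics would be recalculated.
sentences.push('Dogs chase cats.');
dictionary = buildDictionary(sentences);
console.log(dictionary.size);
```

Rebuilding from scratch costs time proportional to the whole corpus on every addition, so it may make sense to defer the rebuild until topics are actually requested.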

Feel free to fork! :)

primaryobjects, Apr 29 '17 22:04

Similar to what @lpgeiger suggested above, one option would be to separate the logic that runs the calculation from the logic that constructs the lda object and builds the document list.

So you could add new functions such as:

addDocument(doc)
addDocuments(docsArray)
execute(numTopics, termCount)
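
A rough sketch of what that separation could look like follows. `TopicCorpus` and `stubModel` are hypothetical names used only for illustration; nothing here exists in the lda package. The model function is injected and assumed to accept (documents, numTopics, termCount), so a stand-in is used here to keep the example self-contained.

```javascript
// Hypothetical wrapper: collects documents over time and only runs the
// topic model when execute() is called.
class TopicCorpus {
  constructor(model) {
    this.model = model;   // assumed signature: (documents, numTopics, termCount)
    this.documents = [];
  }

  // Adds one document and returns its array index.
  addDocument(doc) {
    this.documents.push(doc);
    return this.documents.length - 1;
  }

  // Adds several documents and returns their indexes.
  addDocuments(docsArray) {
    return docsArray.map((doc) => this.addDocument(doc));
  }

  // Runs the model over a copy of everything collected so far.
  execute(numTopics, termCount) {
    return this.model(this.documents.slice(), numTopics, termCount);
  }
}

// Stand-in for the real model; it just reports what it was given.
const stubModel = (docs, numTopics, termCount) => ({
  docCount: docs.length,
  numTopics,
  termCount,
});

const corpus = new TopicCorpus(stubModel);
const firstIndex = corpus.addDocument('Cats chase mice.');
const moreIndexes = corpus.addDocuments(['Dogs chase cats.', 'Mice eat cheese.']);
const result = corpus.execute(2, 5);
console.log(result);
```

Because the whole document list is handed to the model on each execute() call, the dictionary is effectively rebuilt every time, which matches the rebuild-and-recalculate approach described earlier in the thread.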

mikelax, Sep 21 '17 18:09