tomotopy
Python package of Tomoto, the Topic Modeling Tool
I was calculating c_v topic coherence for several topic models against a reference corpus of 100,000 Wikipedia articles and got fairly high coherence scores (around 0.65). For my final results I wanted...
Is support for the M1 on the roadmap? I get the following when trying to install on M1: $ pip install tomotopy Collecting tomotopy Using cached tomotopy-0.12.2.tar.gz (1.1 MB) Preparing...
When training a simple LDA model, it runs fine on one machine (Windows 11, WSL2, Ubuntu) but does not work on another (Google Cloud Platform). On GCP, it stops in...
Hi, I was wondering if there's a way to get word embedding vectors in topic space after training a tomotopy LDA model? Thank you for your amazing work~
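One possible reading of "word vectors in topic space" is the posterior p(topic | word), obtained from the topic-word distributions with Bayes' rule. The sketch below is an illustration under that assumption, not a tomotopy feature: `word_topic_vectors` is a hypothetical helper, the matrix is made-up data, and the rows are assumed to come from something like the model's per-topic word distributions.

```python
def word_topic_vectors(topic_word, topic_weights=None):
    """Return p(topic | word) for every word via Bayes' rule.

    topic_word[k][w] is p(word w | topic k).
    topic_weights[k] is p(topic k); a uniform prior is assumed if omitted.
    """
    num_topics = len(topic_word)
    num_words = len(topic_word[0])
    if topic_weights is None:
        topic_weights = [1.0 / num_topics] * num_topics

    vectors = []
    for w in range(num_words):
        # Joint p(word w, topic k) for each topic k
        joint = [topic_word[k][w] * topic_weights[k] for k in range(num_topics)]
        total = sum(joint)
        # Normalize over topics; fall back to zeros for a word with no mass
        vectors.append([j / total for j in joint] if total > 0 else [0.0] * num_topics)
    return vectors


# Made-up example: 2 topics over a 3-word vocabulary
topic_word = [
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
]
vecs = word_topic_vectors(topic_word)
```

Each entry of `vecs` is a vector of length `num_topics` that sums to 1, so words can be compared by, for example, cosine similarity between their vectors.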
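Hello, I am currently an undergraduate student at a university in Korea. Because my knowledge is limited, I have been relying heavily on the example code, and I am trying to do Dynamic Topic Modeling with tomotopy. However, when I try to calculate coherence with a model trained with DTM...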
I have text in a dataframe and was adding it in like this:
```python
for text in df['text']:
    mdl.add_doc(text.strip().split())
```
This works fine. However, when I tried to remove stopwords...
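A minimal sketch of stopword filtering before documents are added, for illustration only. The stopword set and the helper `tokens_without_stopwords` are hypothetical and not taken from the issue. One pitfall worth guarding against, which may or may not be the one reported here, is that filtering can leave a document with no tokens at all.

```python
# Hypothetical stopword list; substitute your own
STOPWORDS = {"the", "a", "an", "and", "of", "to", "is"}


def tokens_without_stopwords(text, stopwords=STOPWORDS):
    """Lowercase, split on whitespace, and drop stopwords."""
    return [tok for tok in text.strip().lower().split() if tok not in stopwords]


# Stand-in for df['text']
texts = ["The cat sat on a mat", "to the of", "  Topic models and LDA  "]

docs = []
for text in texts:
    tokens = tokens_without_stopwords(text)
    if tokens:  # skip documents that became empty after filtering
        docs.append(tokens)
        # mdl.add_doc(tokens)  # with a model named mdl, as in the snippet above
```

Note that the filter must return a list of tokens, as `split()` does; passing a single joined string would not match the original call.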
I'm working with tweets and want to weight them by likes; I couldn't find an obvious way to do this going over the docs. Is this possible?
Tomotopy currently loads all documents before training and then trains on them. However, I have a very large corpus (about 750,000 documents)...
Hello @bab2min, I am trying to use your implementation of the C_v coherence measure to evaluate both topic models that are included in tomotopy and some that are not. Therefore...
It appears that the perplexity reported by mdl.perplexity is fixed to the training set. Would it be possible to add a function to calculate perplexity on a pre-defined holdout set?
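For reference, perplexity is the exponential of the negative average log-likelihood per token, so a holdout value can be computed once per-document log-likelihoods for unseen documents are available. The sketch below assumes those log-likelihoods (natural log) have already been obtained, for example from inference on held-out documents. `holdout_perplexity` is a hypothetical helper and the numbers are made up.

```python
import math


def holdout_perplexity(doc_log_likelihoods, doc_lengths):
    """Perplexity = exp(-(sum of log-likelihoods) / (total token count))."""
    total_tokens = sum(doc_lengths)
    if total_tokens == 0:
        raise ValueError("held-out set contains no tokens")
    return math.exp(-sum(doc_log_likelihoods) / total_tokens)


# Illustrative values only: three held-out documents
lls = [-35.0, -52.5, -17.5]   # log-likelihood of each document
lengths = [10, 15, 5]         # number of tokens in each document
ppl = holdout_perplexity(lls, lengths)
```

Log-likelihoods obtained by inference on new documents are approximations, so the result is best compared only against other models evaluated the same way.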