tomotopy issues

bug: cv topic coherence dependent on size of reference corpus

2

I was calculating c_v topic coherence for several topic models with a reference corpus of 100,000 Wikipedia articles, and got fairly high coherence scores (around 0.65). For my final results I wanted...

RutgerEttes

bug

Support for Apple Silicon M1

5

Is support for the M1 on the roadmap? I get the following when trying to install on M1:

    $ pip install tomotopy
    Collecting tomotopy
    Using cached tomotopy-0.12.2.tar.gz (1.1 MB)
    Preparing...

stephangreene

enhancement

GCP problems. Can I turn performance maximization off?

3

When training a simple LDA model, it runs fine on one machine (Windows 11, WSL2, Ubuntu) but does not work on another (Google Cloud Platform). On GCP, it stops in...

benreaves

bug

Is there a way to get topic vector?

4

Hi, I was wondering if there's a way to get word embedding vectors in topic space after training a tomotopy LDA model? Thank you for your amazing work~

Sixy1204

question

Computing coherence for DTModel

1

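Hello, I am currently an undergraduate student at a university in Korea. Because my knowledge is limited, I am relying heavily on the example code, and I am trying to do Dynamic Topic Modeling with tomotopy. However, when I try to calculate coherence with a model trained with DTM...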

Kwon-subin

bug

"RuntimeError: Either `words` or `rawWords` must be filled" using `add_doc` sometimes

6

I have text in a dataframe and was adding it in like this:

```python
for text in df['text']:
    mdl.add_doc(text.strip().split())
```

This works fine. However, when I tried to remove stopwords...

batmanscode

enhancement
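A minimal sketch related to the `add_doc` report above. The preview is truncated, so the cause is not stated; this assumes the error appears when a document has no tokens left after stopword removal. The helper name `non_empty_token_lists` and the `stopwords` set are hypothetical and not part of tomotopy.

```python
def non_empty_token_lists(texts, stopwords):
    """Yield one token list per text, skipping texts that end up empty.

    Assumption: an empty token list is what triggers the RuntimeError
    quoted in the issue title.
    """
    for text in texts:
        # Whitespace tokenization, as in the issue's own snippet.
        tokens = [w for w in text.strip().split() if w.lower() not in stopwords]
        if tokens:  # drop documents with nothing left to add
            yield tokens


# Hypothetical usage, assuming `mdl` is a tomotopy model and `df` a dataframe:
# for tokens in non_empty_token_lists(df['text'], stopwords):
#     mdl.add_doc(tokens)
```

Filtering before `add_doc` changes document indices relative to the dataframe, so keep a separate index mapping if the rows need to be matched up later.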

Is there a way to 'weight' docs?

4

I'm working with tweets and want to weight them by likes; I couldn't find an obvious way to do this going over the docs. Is this possible?

batmanscode

enhancement

Ability to stream corpus data to LDAModel (or any other model)

Tomotopy currently loads all documents before training, and then trains on them. However, I have a very large corpus (about 750,000 documents)...

jalustig

enhancement

Question: Calculating Coherence. What words are expected as Targets?

8

Hello @bab2min, I am trying to use your implementation of the C_v coherence measure to evaluate both topic models that are included in tomotopy and some that are not. Therefore...

hhagedorn

question

documentation

holdout perplexity

4

It appears that the perplexity reported by `mdl.perplexity` is fixed to the training set. Would it be possible to add a function to calculate perplexity on a pre-defined holdout set?

PearlOnyx08

enhancement
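A rough sketch related to the holdout perplexity request above, not a description of any built-in feature. It assumes per-document log-likelihoods for the holdout set are already available (for example, from inferring unseen documents with a trained model) and uses the standard definition: perplexity is the exponential of the negative total log-likelihood divided by the total token count. The function name `holdout_perplexity` is hypothetical.

```python
import math


def holdout_perplexity(log_likelihoods, token_counts):
    """Perplexity over a holdout set.

    log_likelihoods: per-document log-likelihoods (natural log).
    token_counts: number of tokens in each corresponding document.
    """
    total_tokens = sum(token_counts)
    if total_tokens == 0:
        raise ValueError("holdout set contains no tokens")
    # exp(-(sum of log-likelihoods) / (number of tokens))
    return math.exp(-sum(log_likelihoods) / total_tokens)
```

If every token has probability 0.25, the result is 4, which matches the usual reading of perplexity as an effective branching factor. Tokens that were dropped from the vocabulary should be excluded from `token_counts`, otherwise the score is understated.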
