
Python package of Tomoto, the Topic Modeling Tool

Results: 67 tomotopy issues, sorted by recently updated.

```python
# Data preparation: build the vocab and preprocess
df = list()
with open("result/mecab_lda_corpus.csv", mode="r", encoding="UTF-8") as f:
    df = f.readlines()
df = [i.rstrip() for i in df]
df = list(reversed(df))
time_point_list...
```

Hello. I'm a user who switched over from another topic modeling library because tomotopy is fast and easy to use. Having used it for a while, there is one thing I find lacking, so I'm leaving a suggestion. [sklearn's TfidfVectorizer](https://scikit-learn.org/stable/modules/generated/sklearn.feature_extraction.text.TfidfVectorizer.html) has a `vocabulary` parameter, so even without preprocessing such as removing stopwords or excluding specific words...
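The behavior being requested can be approximated today by filtering tokens before they reach the model. This is a minimal stdlib-only sketch, not a tomotopy feature; the vocabulary and document below are made up for illustration.

```python
# Sketch: emulate sklearn's `vocabulary` parameter by restricting each
# document to a fixed vocabulary before adding it to a topic model.
# ALLOWED_VOCAB is a hypothetical whitelist.
ALLOWED_VOCAB = {"topic", "model", "word", "corpus"}

def restrict_to_vocab(tokens, vocab=ALLOWED_VOCAB):
    """Keep only tokens that appear in the fixed vocabulary, preserving order."""
    return [t for t in tokens if t in vocab]

doc = ["the", "topic", "model", "uses", "word", "counts"]
filtered = restrict_to_vocab(doc)
print(filtered)  # ['topic', 'model', 'word']
```

The filtered token list would then be passed to `add_doc` in place of the raw tokens.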

Analysis [here](https://github.com/bab2min/tomotopy/issues/182#issuecomment-1715862530) suggests that the queue size is too small to be effective for parallel processing. It's a bit hard to follow the code path, but it seems to be...

I'm using tomotopy-0.12.5...

```python
process_corpus = tp.utils.Corpus()
load_corpus = tp.utils.Corpus()
.....
process_corpus.save('save.corpus')
load_corpus.load('save.corpus')
...
```

`process_corpus` yields results from `extract_ngrams`, but `load_corpus` returns only empty values. Just in case, I also tried pickle and got the same result... pickle.dump(data,...
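The round-trip check being attempted can be illustrated without tomotopy. This is a stdlib-only sketch with made-up data: whatever attribute is expected to survive save/load is compared the same way before and after.

```python
import pickle, tempfile, os

# Sketch: save data to disk, load it back, and verify it survived intact.
# The dict below stands in for whatever the corpus is expected to retain.
data = {"ngrams": [("topic", "model"), ("word", "count")]}

with tempfile.NamedTemporaryFile(delete=False) as f:
    pickle.dump(data, f)
    path = f.name

with open(path, "rb") as f:
    loaded = pickle.load(f)
os.remove(path)

print(loaded == data)  # True: the data survives the round-trip
```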

I encountered the same problem: 100 GB of memory is occupied, 40 cores are in use, and inference is run on texts of fewer than 5,000 words, 2...

I have a dataset of 13,000 documents with 20 categories and trained SLDA with those labels with K=16. After training, I first call the infer function, then estimate to predict a...

After training my SLDA model with labels, I called SLDA_model.get_regression_coef(). The result contains an L x K matrix, where L is the number of unique labels in my dataset,...
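As a hedged sketch of how an L x K coefficient matrix is typically used (not tomotopy's internals): a document's score for label l is the dot product of its topic distribution with coefficient row l, and the predicted label is the highest-scoring row. The numbers below are made up for illustration.

```python
# Sketch: scoring labels with an L x K regression-coefficient matrix.
# coef[l][k] weights topic k for label l (hypothetical values).
coef = [
    [0.9, 0.1, 0.0],  # label 0
    [0.1, 0.8, 0.1],  # label 1
]
theta = [0.2, 0.7, 0.1]  # per-document topic distribution (sums to 1)

# score_l = sum_k coef[l][k] * theta[k]
scores = [sum(c * t for c, t in zip(row, theta)) for row in coef]
best_label = max(range(len(scores)), key=scores.__getitem__)
print(best_label)  # 1, since label 1's row aligns with the dominant topic
```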

When I train a PLDAModel and then save and load it, the model's properties have changed after loading. For instance:

```python
from tomotopy import PLDAModel
docs = [['foo'], ['bar'], ['baz'],...
```

What is the log_ll returned by the infer method? I thought it was the logarithm of the document's generative likelihood, no? If not, please tell me an easy...
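One common convention (hedged; check tomotopy's documentation for its exact definition) is the log of the document's likelihood under the topic mixture: log p(doc) = Σ_w log(Σ_k θ_k · φ_{k,w}). A stdlib-only sketch with made-up θ and φ values:

```python
import math

# Sketch: log-likelihood of a document under a 2-topic mixture.
# theta: the document's topic distribution; phi[w]: per-topic
# probabilities of word w. All values are hypothetical.
theta = [0.6, 0.4]
phi = {"cat": [0.5, 0.1], "dog": [0.2, 0.7]}

def doc_log_likelihood(tokens, theta, phi):
    """Sum over tokens of log( sum_k theta[k] * phi[k][w] )."""
    ll = 0.0
    for w in tokens:
        p_w = sum(t * p for t, p in zip(theta, phi[w]))
        ll += math.log(p_w)
    return ll

print(doc_log_likelihood(["cat", "dog"], theta, phi))
```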

I used a correlated topic model on a 4,500-document corpus to learn the type and frequency of topics. The results were very good, but unfortunately one of the topics (#14)...
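One post-hoc workaround for an unwanted topic (a sketch, not a tomotopy feature) is to zero it out in each document's topic distribution and renormalize the remaining mass. The topic index and distribution below are made up for illustration.

```python
# Sketch: drop topic `drop_k` from a per-document topic distribution
# by zeroing its probability and renormalizing the rest.
def drop_topic(theta, drop_k):
    kept = [p if k != drop_k else 0.0 for k, p in enumerate(theta)]
    total = sum(kept)
    return [p / total for p in kept]

theta = [0.5, 0.3, 0.2]
print(drop_topic(theta, drop_k=1))  # ≈ [0.714, 0.0, 0.286]
```

This changes only how the fitted distributions are reported; retraining without the offending topic's vocabulary is the cleaner fix.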