
How to use LdaModel with Callback

Open • hellpanderrr opened this issue 4 years ago • 1 comment

Problem description

I wonder whether I can implement early stopping while training LdaModel by using callbacks and throwing an exception. But when I try to use the Callback class, gensim throws an error about a missing logger attribute. If I add a logger, it then requires a get_value method, i.e. it treats Callback as if it were a Metric class. So how do you use it correctly?

Steps/code/corpus to reproduce

from gensim import corpora, models
from gensim.models.callbacks import Callback

# Two toy documents; each one is a list holding a single long string
texts = [
    ['Lorem ipsum dolor sit amet, consectetur adipiscing elit'],
    ['sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation '],
]

# Bigram and trigram phrase models
bigram = models.Phrases(texts, min_count=5, threshold=100)
trigram = models.Phrases(bigram[texts], threshold=100)
bigram_mod = models.phrases.Phraser(bigram)
trigram_mod = models.phrases.Phraser(trigram)
dictionary = corpora.Dictionary(trigram_mod[bigram_mod[texts]])

corpus = [dictionary.doc2bow(text) for text in texts]

# Passing a Callback instance here is what triggers the error below
callback = Callback(metrics=['DiffMetric'])
lda_model = models.LdaModel(
    corpus, num_topics=3, id2word=dictionary, callbacks=[callback])

AttributeError                            Traceback (most recent call last)
     13
     14 lda_model = models.LdaModel(
---> 15     corpus, num_topics=3, id2word=dictionary, callbacks=[callback] )

~\Anaconda3\lib\site-packages\gensim\models\ldamodel.py in __init__(self, corpus, num_topics, id2word, distributed, chunksize, passes, update_every, alpha, eta, decay, offset, eval_every, iterations, gamma_threshold, minimum_probability, random_state, ns_conf, minimum_phi_value, per_word_topics, callbacks, dtype)
    517         if corpus is not None:
    518             use_numpy = self.dispatcher is not None
--> 519             self.update(corpus, chunks_as_numpy=use_numpy)
    520
    521     def init_dir_prior(self, prior, name):

~\Anaconda3\lib\site-packages\gensim\models\ldamodel.py in update(self, corpus, chunksize, decay, offset, passes, update_every, eval_every, iterations, gamma_threshold, chunks_as_numpy)
    945             # pass the list of input callbacks to Callback class
    946             callback = Callback(self.callbacks)
--> 947             callback.set_model(self)
    948             # initialize metrics list to store metric values after every epoch
    949             self.metrics = defaultdict(list)

~\Anaconda3\lib\site-packages\gensim\models\callbacks.py in set_model(self, model)
    482             # store diff diagonals of previous epochs
    483             self.diff_mat = Queue()
--> 484         if any(metric.logger == "visdom" for metric in self.metrics):
    485             if not VISDOM_INSTALLED:
    486                 raise ImportError("Please install Visdom for visualization")

~\Anaconda3\lib\site-packages\gensim\models\callbacks.py in <genexpr>(.0)
    482             # store diff diagonals of previous epochs
    483             self.diff_mat = Queue()
--> 484         if any(metric.logger == "visdom" for metric in self.metrics):
    485             if not VISDOM_INSTALLED:
    486                 raise ImportError("Please install Visdom for visualization")

AttributeError: 'Callback' object has no attribute 'logger'

Versions

Windows-10-10.0.18362-SP0
Python 3.5.6 |Anaconda custom (64-bit)| (default, Aug 26 2018, 16:05:27) [MSC v.1900 64 bit (AMD64)]
NumPy 1.15.2
SciPy 1.1.0
gensim 3.8.1
FAST_VERSION 1

hellpanderrr, Feb 22 '20 13:02

Callbacks in gensim are indeed a bit limited; they are not the same thing as callbacks in, say, TensorFlow.

maciejskorski, Jun 22 '23 06:06