gensim
How to use LdaModel with Callback
Problem description
I wonder whether I can implement early stopping while training LdaModel by using callbacks and throwing an exception. But when I try to use the Callback class, gensim throws an error about a missing logger attribute. If I add logger, it then requires a get_value method, i.e. it treats the Callback as if it were a Metric class. So how do you use it correctly?
Steps/code/corpus to reproduce
from gensim import corpora, models
from gensim.models.callbacks import Callback
texts = [['Lorem ipsum dolor sit amet, consectetur adipiscing elit'],['sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ']]
bigram = models.Phrases(texts, min_count=5, threshold=100)
trigram = models.Phrases(bigram[texts], threshold=100)
bigram_mod = models.phrases.Phraser(bigram)
trigram_mod = models.phrases.Phraser(trigram)
dictionary = corpora.Dictionary(trigram_mod[bigram_mod[texts]])
corpus = [dictionary.doc2bow(text) for text in texts]
callback = Callback(metrics=['DiffMetric'])
lda_model = models.LdaModel(
corpus, num_topics=3, id2word=dictionary, callbacks=[callback] )
AttributeError                            Traceback (most recent call last)
in <module>
     13
     14 lda_model = models.LdaModel(
---> 15     corpus, num_topics=3, id2word=dictionary, callbacks=[callback] )

~\Anaconda3\lib\site-packages\gensim\models\ldamodel.py in __init__(self, corpus, num_topics, id2word, distributed, chunksize, passes, update_every, alpha, eta, decay, offset, eval_every, iterations, gamma_threshold, minimum_probability, random_state, ns_conf, minimum_phi_value, per_word_topics, callbacks, dtype)
    517         if corpus is not None:
    518             use_numpy = self.dispatcher is not None
--> 519             self.update(corpus, chunks_as_numpy=use_numpy)
    520
    521     def init_dir_prior(self, prior, name):

~\Anaconda3\lib\site-packages\gensim\models\ldamodel.py in update(self, corpus, chunksize, decay, offset, passes, update_every, eval_every, iterations, gamma_threshold, chunks_as_numpy)
    945             # pass the list of input callbacks to Callback class
    946             callback = Callback(self.callbacks)
--> 947             callback.set_model(self)
    948             # initialize metrics list to store metric values after every epoch
    949             self.metrics = defaultdict(list)

~\Anaconda3\lib\site-packages\gensim\models\callbacks.py in set_model(self, model)
    482             # store diff diagonals of previous epochs
    483             self.diff_mat = Queue()
--> 484         if any(metric.logger == "visdom" for metric in self.metrics):
    485             if not VISDOM_INSTALLED:
    486                 raise ImportError("Please install Visdom for visualization")

~\Anaconda3\lib\site-packages\gensim\models\callbacks.py in <genexpr>(.0)
    482             # store diff diagonals of previous epochs
    483             self.diff_mat = Queue()
--> 484         if any(metric.logger == "visdom" for metric in self.metrics):
    485             if not VISDOM_INSTALLED:
    486                 raise ImportError("Please install Visdom for visualization")

AttributeError: 'Callback' object has no attribute 'logger'
Versions
Windows-10-10.0.18362-SP0
Python 3.5.6 |Anaconda custom (64-bit)| (default, Aug 26 2018, 16:05:27) [MSC v.1900 64 bit (AMD64)]
NumPy 1.15.2
SciPy 1.1.0
gensim 3.8.1
FAST_VERSION 1
Callbacks in gensim are indeed a bit limited; they are not the same thing as callbacks in TensorFlow, say.
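For what it's worth, the traceback suggests that update() wraps the callbacks list in its own Callback object and then reads logger on every entry, so the entries themselves are expected to behave like metrics. Under that reading, one possible route to early stopping is a small metric-like object whose get_value raises once the score stops improving. The sketch below is plain Python and does not import gensim. The names StopTraining, EarlyStoppingMetric, score_fn, patience and min_delta are all made up for illustration, and the assumption that gensim will accept such a duck-typed entry and pass the model as a keyword argument has not been verified against 3.8.1.

```python
class StopTraining(Exception):
    """Hypothetical exception used to abort training early."""


class EarlyStoppingMetric:
    """Duck-typed metric exposing `logger` and `get_value`, the two members
    the report says gensim asks for on each `callbacks` entry."""

    def __init__(self, score_fn, patience=2, min_delta=1e-3):
        self.score_fn = score_fn    # callable(model) -> float, lower is better
        self.patience = patience    # epochs without improvement before stopping
        self.min_delta = min_delta  # smallest decrease that counts as progress
        self.logger = None          # neither "shell" nor "visdom": no logging
        self.best = float("inf")
        self.stale_epochs = 0

    def __str__(self):
        return "EarlyStopping"

    def get_value(self, **kwargs):
        # Assumption: the caller passes the model as the keyword `model`.
        score = self.score_fn(kwargs.get("model"))
        if score < self.best - self.min_delta:
            self.best = score
            self.stale_epochs = 0
        else:
            self.stale_epochs += 1
        if self.stale_epochs >= self.patience:
            raise StopTraining("no improvement for %d epochs" % self.stale_epochs)
        return score


def simulate_training(metric, scores):
    """Stand-in for a training loop: one get_value call per epoch."""
    completed = 0
    try:
        for score in scores:
            metric.get_value(model=score)  # the "model" here is just a number
            completed += 1
    except StopTraining:
        pass
    return completed


stopper = EarlyStoppingMetric(score_fn=lambda m: m, patience=2, min_delta=0.01)
epochs_run = simulate_training(stopper, [5.0, 4.0, 3.5, 3.499, 3.498, 3.497])
print(epochs_run, stopper.best)

# Hypothetical gensim usage (untested assumption, see above):
#   lda = models.LdaModel(id2word=dictionary, num_topics=3, passes=50,
#                         callbacks=[stopper])
#   try:
#       lda.update(corpus)
#   except StopTraining:
#       pass
```

Building the model without a corpus and calling update() inside the try block matters here: if the exception escaped from the LdaModel constructor, the variable would never be assigned and the partially trained model would be lost. Subclassing gensim's own Metric class instead of duck typing may be the safer choice if the library checks for more attributes than the traceback shows.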