Gennady Shtech
What I've tried: + set the same dictionary for both vectorizers with
```
wiki_batches.dictionary.load_text('./train_data/full.dict.filtered')
pubs_batches.dictionary.load_text('./train_data/full.dict.filtered')
```
It doesn't help. As for the problem, I'd like to explain what is meant by "it loses...
Some results of my investigation. I found that I had missed two parameters:
```
SQLALCHEMYBACKEND_DROP_ALL_TABLES = False
SQLALCHEMYBACKEND_CLEAR_CONTENT = False
```
Now I see strange behavior: if I start worker.db it...
Now I see. When `MessageBus` starts, it does `self.spider_feed_partitions = [i for i in range(settings.get('SPIDER_FEED_PARTITIONS'))]`. Then, in `SpiderFeedStream`:
```
self.partitions = messagebus.spider_feed_partitions
self.ready_partitions = set(self.partitions)
```
So, worker at start...
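To make the two quoted assignments easier to follow, here is a minimal standalone sketch. The classes below are simplified stand-ins written for illustration, not the real library classes, and the plain `dict` used as `settings` and the partition count of 2 are assumptions. It only shows that, with these two assignments, every partition is already in `ready_partitions` right after construction.

```python
# Simplified stand-ins for illustration only; the real classes do much more.
class MessageBus:
    def __init__(self, settings):
        # One integer id per spider feed partition: 0, 1, ..., N-1.
        self.spider_feed_partitions = [
            i for i in range(settings.get('SPIDER_FEED_PARTITIONS'))
        ]


class SpiderFeedStream:
    def __init__(self, messagebus):
        # The stream reuses the partition list built by the message bus.
        self.partitions = messagebus.spider_feed_partitions
        # All partitions start out in the "ready" set.
        self.ready_partitions = set(self.partitions)


# Hypothetical settings: a plain dict with an assumed partition count.
settings = {'SPIDER_FEED_PARTITIONS': 2}
stream = SpiderFeedStream(MessageBus(settings))
print(stream.partitions, stream.ready_partitions)
```

The sketch says nothing about how partitions later leave or re-enter the ready set; that part depends on code not quoted here.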
No, the problem is understanding what I should do to be sure that my crawling state is saved between runs. So, at first I found two parameters which prevent...
@simon-lund thank you! You saved me 2 hours of my life! I spent 1 hour figuring out that the problem occurs AFTER LLM initialization. But there is still a lot to dig...