
Mismatched IDs error when using nlp.rehearse with listeners

Open thomashacker opened this issue 2 years ago • 0 comments

Discussed in https://github.com/explosion/spaCy/discussions/10861

Using nlp.rehearse on a pipeline with a tok2vec listener results in ValueError: [E953] Mismatched IDs.

Originally posted by nashcaps2255 on May 27, 2022: I have a textcat multilabel model that I am trying to update with nlp.rehearse to alleviate the catastrophic forgetting problem.

import spacy
from spacy.training import Example

nlp = spacy.load('my_model')

examples = []
for line in file_:  # file_: an open file with one "text|label" pair per line
    text, label = line.strip().split("|")
    doc = nlp(text)
    gold_dict = {"cats": {label: 1.0}}
    example = Example.from_dict(doc, gold_dict)
    examples.append(example)

optimizer = nlp.resume_training()
nlp.rehearse(examples, sgd=optimizer)

This results in:

ValueError: [E953] Mismatched IDs received by the Tok2Vec listener: 179568814531392983158587824 vs. 2172509679243279887229
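
For readers unfamiliar with this kind of check, the following is a simplified toy model in plain Python of how a listener-style layer might compare batch IDs. It is only a sketch under stated assumptions: `ToyListener` and `batch_id` are hypothetical names made up for illustration, they are not spaCy classes or functions, and the sketch does not claim to reproduce the cause of the failure in nlp.rehearse. It only shows the general pattern in which an upstream component hands over outputs for one batch and the listener is then asked about a different batch.

```python
import zlib


def batch_id(docs):
    """Hypothetical content-based ID: sum of a checksum over every token."""
    return sum(zlib.crc32(tok.encode("utf8")) for doc in docs for tok in doc)


class ToyListener:
    """Toy stand-in for a listener layer (not spaCy's implementation)."""

    def __init__(self):
        self._batch_id = None
        self._outputs = None

    def receive(self, docs, outputs):
        # The upstream component stores its outputs together with the
        # ID of the batch they were computed for.
        self._batch_id = batch_id(docs)
        self._outputs = outputs

    def forward(self, docs):
        # The downstream component asks for outputs; the listener first
        # verifies that it is being asked about the same batch.
        got = batch_id(docs)
        if got != self._batch_id:
            raise ValueError(f"Mismatched IDs: {got} vs. {self._batch_id}")
        return self._outputs


listener = ToyListener()
batch_a = [["I", "like", "cats"]]
batch_b = [["I", "like", "dogs"]]

# Same batch upstream and downstream: the check passes.
listener.receive(batch_a, outputs=["vectors for batch_a"])
same_batch_result = listener.forward(batch_a)

# Different batch downstream: the check fails with a mismatch error.
try:
    listener.forward(batch_b)
    mismatch_message = None
except ValueError as err:
    mismatch_message = str(err)
```

In this toy version the IDs depend only on token content, so any difference between the documents the upstream component processed and the documents the listener is called with is enough to trigger the error.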

thomashacker, Jan 02 '23 20:01