trankit
KeyError: 'lemma'
Following the code from https://trankit.readthedocs.io/en/latest/training.html#training-a-lemmatizer, I get a KeyError: 'lemma' when training starts:
Setting up training config...
Initialized lemmatizer trainer
Training dictionary-based lemmatizer
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-9-a90867cc5ef3> in <module>()
11
12 # start training
---> 13 trainer.train()
/content/trankit/trankit/tpipeline.py in train(self)
680 self._train_posdep()
681 elif self._task == 'lemmatize':
--> 682 self._train_lemma()
683 elif self._task == 'ner':
684 self._train_ner()
/content/trankit/trankit/tpipeline.py in _train_lemma(self)
581
582 def _train_lemma(self):
--> 583 self._lemma_model.train()
584
585 def _train_ner(self):
/content/trankit/trankit/models/lemma_model.py in train(self)
379 self.config.logger.info("Training dictionary-based lemmatizer")
380 self.trainer.train_dict(
--> 381 [[token[TEXT], token[UPOS], token[LEMMA]] for sentence in self.train_batch.doc for token in sentence if
382 not (
383 type(token[ID]) == tuple and len(token[ID]) == 2)])
/content/trankit/trankit/models/lemma_model.py in <listcomp>(.0)
381 [[token[TEXT], token[UPOS], token[LEMMA]] for sentence in self.train_batch.doc for token in sentence if
382 not (
--> 383 type(token[ID]) == tuple and len(token[ID]) == 2)])
384 dev_preds = self.trainer.predict_dict(
385 [[token[TEXT], token[UPOS]] for sentence in self.dev_batch.doc for token in sentence if
KeyError: 'lemma'
The most recent version of https://github.com/UniversalDependencies/UD_Thai-PUD is used as training and development data.
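Whether a treebank actually contains lemmas can be checked by counting non-placeholder values in the LEMMA column (the third field) of its CoNLL-U file. The following is only a minimal sketch, assuming the standard 10-column, tab-separated CoNLL-U layout in which `_` marks an empty field. The function name `lemma_coverage` and the sample lines are made up for illustration and are not part of trankit.

```python
from collections import Counter
from typing import Iterable


def lemma_coverage(lines: Iterable[str]) -> Counter:
    """Count word lines with and without a lemma in CoNLL-U formatted text."""
    counts = Counter(with_lemma=0, without_lemma=0)
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank sentence separators and comment lines.
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 10:
            continue  # not a well-formed CoNLL-U word line
        tok_id, form, lemma = cols[0], cols[1], cols[2]
        # Skip multiword-token ranges (e.g. "1-2") and empty nodes (e.g. "1.1").
        if "-" in tok_id or "." in tok_id:
            continue
        # "_" in the LEMMA column means "no lemma", unless the word itself is "_".
        if lemma == "_" and form != "_":
            counts["without_lemma"] += 1
        else:
            counts["with_lemma"] += 1
    return counts


if __name__ == "__main__":
    # Illustrative sample: one word with a lemma, one without.
    sample = [
        "# text = dogs หมา",
        "1\tdogs\tdog\tNOUN\t_\t_\t0\troot\t_\t_",
        "2\tหมา\t_\tNOUN\t_\t_\t1\tdep\t_\t_",
        "",
    ]
    print(lemma_coverage(sample))
```

To run this on a real treebank, pass an open file instead of the sample list, for example `lemma_coverage(open(path, encoding="utf-8"))`, where `path` points to the downloaded `.conllu` file.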
There are no lemmas in the training data, so a lemmatizer can't be trained?! Can't I use the other parts of the pipeline? When I run
from trankit import Pipeline
p = Pipeline(lang='customized', cache_dir='./save_dir')
the following error occurs:
BadZipFile: File is not a zip file
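`BadZipFile` is raised by Python's standard `zipfile` module when a file it is asked to open is not a valid zip archive. One way to narrow this down is to look for files with a `.zip` extension under the cache directory that fail `zipfile.is_zipfile`. This is only a sketch: it assumes that the offending file lives somewhere under `./save_dir` and has a `.zip` extension, which I have not verified against trankit's code. The helper name `find_bad_zips` is hypothetical.

```python
import zipfile
from pathlib import Path
from typing import List


def find_bad_zips(cache_dir: str) -> List[Path]:
    """Return all *.zip files under cache_dir that are not valid zip archives."""
    root = Path(cache_dir)
    if not root.is_dir():
        return []
    # zipfile.is_zipfile checks for the zip end-of-central-directory record.
    return [p for p in sorted(root.rglob("*.zip")) if not zipfile.is_zipfile(p)]


if __name__ == "__main__":
    # './save_dir' is the cache_dir used in the Pipeline call above.
    for path in find_bad_zips("./save_dir"):
        print(f"not a zip archive: {path} ({path.stat().st_size} bytes)")
```

A very small file in that listing would suggest that an error page or an incomplete download was saved instead of an actual archive.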