Tim Schopf
Results: 5 comments by Tim Schopf
Hi @mdsutter, the current KeyphraseVectorizers release `v0.0.10` no longer excludes any spaCy pipeline components by default. Instead, you can now define which components to exclude via the `spacy_exclude` parameter....
I'm closing this issue now and consider it solved, since I did not receive a reply stating otherwise.
I encountered the same issue with BERTopic using a large dataset. The way BERTopic uses the vectorizer somehow results in huge memory consumption. I suspect the reason for this is...
Feel free to open a PR in the [lemmatizer](https://github.com/TimSchopf/KeyphraseVectorizers/tree/lemmatizer) branch. I will then add this feature in a later release.
Solved with the `v0.0.12` release.