Tim Schopf

5 comments by Tim Schopf

Hi @mdsutter, the current KeyphraseVectorizers release `v0.0.10` no longer excludes any spaCy pipeline components by default. Instead, you can now define which components to exclude with the `spacy_exclude` parameter....

I'm closing this issue now and consider it solved, since I did not receive a reply stating otherwise.

I encountered the same issue with BERTopic using a large dataset. The way BERTopic uses the vectorizer somehow results in huge memory consumption. I suspect the reason for this is...

Feel free to open a PR in the [lemmatizer](https://github.com/TimSchopf/KeyphraseVectorizers/tree/lemmatizer) branch. I will then add this feature in a later release.

Solved with the `v0.0.12` release.