kwx

BERT, LDA, and TFIDF based keyword extraction in Python

Results 13 kwx issues

Later versions of [spaCy](https://github.com/explosion/spaCy) use new loading mechanisms that produce errors in data preparation within [kwx.utils](https://github.com/andrewtavis/kwx/blob/main/src/kwx/utils.py). The scripts should be changed to check the spaCy version so that these...

bug
good first issue

The current translation feature found in [kwx.utils.translate_output()](https://github.com/andrewtavis/kwx/blob/main/src/kwx/utils.py) is based on [py-googletrans](https://github.com/ssut/py-googletrans), which is becoming less and less actively maintained. A better option would be if the translation feature could be...

bug
enhancement
good first issue

A major difference between the BERT and LDA implementations in kwx is that there are no visualization methods for BERT. It would be good to add a [pyLDAvis](https://github.com/bmabey/pyLDAvis) style visualization of topic...

enhancement
help wanted

Hi Andrew, it's me again :) I want to ask two questions about the algorithm. When using the first BERT model, why do we remove n-grams, and can't we use them...

question

Hi Andrew, I was trying the Keyword Extraction API with TF-IDF, the code is: bert_kws = extract_kws( method="TFIDF", # "BERT", "LDA", "TFIDF", "frequency" bert_st_model="xlm-r-bert-base-nli-stsb-mean-tokens", text_corpus=corpus_no_ngrams, # automatically tokenized if using...

question

This issue is for discussing and eventually implementing key-phrase extraction for BERT in kwx. It would be best to first collect code snippets and documentation links for how to best...

enhancement
help wanted

This issue is for discussing and eventually implementing key-phrase extraction for LDA in kwx. It would be best to first collect code snippets and documentation links for how to best...

enhancement
help wanted

This issue is for discussing and eventually implementing key-phrase extraction for TFIDF in kwx. It would be best to first collect code snippets and documentation links for how to best...

enhancement
help wanted

Please use this issue to suggest other methods for keyword extraction that could be included in kwx. Suggestions would ideally include some of the following: - A blogpost or other...

good first issue
question

### **1st changes**

- In this modified code, the `spacy_version` variable is used to store the version of the `spaCy` library. Inside the loop, the code checks whether the...
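
To make the idea of a spaCy version check concrete, here is a minimal sketch. It is not the code from this issue or from kwx. The helper names (`installed_spacy_version`, `major_version`, `model_name_for`) and the `FULL_MODEL_NAMES` mapping are hypothetical. It assumes the relevant difference is that spaCy 3 removed shortcut links such as `en`, so a full pipeline name such as `en_core_web_sm` must be passed to `spacy.load()` instead.

```python
from importlib import metadata
from typing import Optional

# Hypothetical mapping from a language code to a full spaCy pipeline name.
# A real project would extend this for every language it supports.
FULL_MODEL_NAMES = {
    "en": "en_core_web_sm",
    "de": "de_core_news_sm",
}


def installed_spacy_version() -> Optional[str]:
    """Return the installed spaCy version string, or None if spaCy is absent."""
    try:
        return metadata.version("spacy")
    except metadata.PackageNotFoundError:
        return None


def major_version(version: str) -> int:
    """Extract the major version number from a string such as '3.4.1'."""
    return int(version.split(".")[0])


def model_name_for(language: str, spacy_version: str) -> str:
    """Choose the name to pass to spacy.load() for the given spaCy version.

    Assumption: spaCy 3 and later need the full pipeline name, while
    earlier versions can use a shortcut link such as 'en'.
    """
    if major_version(spacy_version) >= 3:
        return FULL_MODEL_NAMES[language]
    return language


if __name__ == "__main__":
    version = installed_spacy_version()
    if version is None:
        print("spaCy is not installed")
    else:
        print(f"spaCy {version}: load '{model_name_for('en', version)}'")
```

Keeping the version lookup separate from the name selection means the selection logic can be checked with plain version strings, without spaCy or any model being installed.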