pke issues

Adding RAKE?

Would it be good to add RAKE implementations to this repo? - https://github.com/aneesha/RAKE - https://github.com/csurfer/rake-nltk

DonaldTsang

Create Document Frequency matrix accept only input_dir

I would like to process a corpus of documents with the TF-IDF model. My corpus is a single txt file where each line is a document. It is fine as input for any models...

Benja1972

enhancement
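
One possible workaround, sketched here with the standard library only: write each line of the single-file corpus to its own file, so that a directory-based tool can read it. The function name `split_corpus`, the `doc` prefix, and the file naming scheme are made up for illustration and are not part of pke.

```python
from pathlib import Path


def split_corpus(corpus_path, output_dir, prefix="doc"):
    """Write each non-empty line of corpus_path to its own .txt file.

    Hypothetical helper, not part of pke. Returns the list of paths written.
    """
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    with open(corpus_path, encoding="utf-8") as handle:
        for index, line in enumerate(handle):
            text = line.strip()
            if not text:
                continue  # skip blank lines, keep the original line index
            target = out / f"{prefix}_{index:06d}.txt"
            target.write_text(text + "\n", encoding="utf-8")
            written.append(target)
    return written
```

The resulting directory can then be passed wherever an `input_dir` is expected. The line index in the file name makes it possible to map results back to the original line.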

AttributeError: module 'scipy.sparse' has no attribute 'coo_array'

Installed using `!pip install git+https://github.com/boudinfl/pke.git` Made sure spacy is installed and the 'en' model is downloaded. Similar error is posted for pyg - https://github.com/pyg-team/pytorch_geometric/issues/4378 Tried upgrading scipy and networkx as...

rxnandakumar
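
A quick diagnostic sketch for this kind of error: check whether the SciPy version that is actually imported is recent enough to ship the sparse array classes. To my knowledge `coo_array` first appeared in SciPy 1.8, but treat that threshold as an assumption and confirm it against the SciPy release notes. The helper name `has_sparse_arrays` is hypothetical.

```python
import re


def has_sparse_arrays(version):
    """Return True if a SciPy version string is at least 1.8.

    The 1.8 threshold is an assumption; check the SciPy release notes.
    """
    parts = []
    for piece in version.split(".")[:2]:
        match = re.match(r"\d+", piece)  # leading digits only, e.g. "8rc1" -> 8
        parts.append(int(match.group()) if match else 0)
    while len(parts) < 2:
        parts.append(0)
    return tuple(parts) >= (1, 8)


if __name__ == "__main__":
    try:
        import scipy
    except ImportError:
        print("SciPy is not installed in this environment")
    else:
        print(scipy.__version__, has_sparse_arrays(scipy.__version__))
```

If the printed version is older than expected after an upgrade, the running interpreter or notebook kernel may still be using a previously loaded copy of the package.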

How to manipulate length of key-phrases?

I am applying the multipartite and topical rank methods for phrase extraction and was wondering whether there is a parameter I can adjust to get longer phrases...

shyambhu-mukherjee
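
Independently of any model parameter, one simple option is to post-filter the ranked output by word count. The sketch below assumes the extractor returns a list of `(phrase, score)` pairs already sorted by score; `filter_by_length` and its arguments are hypothetical names, not a pke function.

```python
def filter_by_length(keyphrases, min_words=2, max_words=None):
    """Keep only phrases whose word count lies in [min_words, max_words].

    keyphrases: list of (phrase, score) pairs, assumed sorted by score.
    The input order is preserved.
    """
    kept = []
    for phrase, score in keyphrases:
        n_words = len(phrase.split())
        if n_words < min_words:
            continue
        if max_words is not None and n_words > max_words:
            continue
        kept.append((phrase, score))
    return kept
```

Because this only filters, it cannot produce phrases the candidate selection step never generated. Request more candidates than you need before filtering so that enough long phrases remain.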

Keyword dataset

Can anyone suggest a dataset on which unsupervised keyword detection algorithms such as multipartite graph, BERT, etc. can be applied to check accuracy, precision, etc.?

aradhana298
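
Whatever dataset is chosen, the scoring itself can be sketched as exact-match precision, recall and F1 over the top-k predictions. This is a minimal sketch with a hypothetical function name; published benchmarks often stem phrases before matching, which this version does not do.

```python
def precision_recall_f1(predicted, gold, k=10):
    """Exact-match precision, recall and F1 for the top-k predicted phrases.

    Matching is case-insensitive; no stemming is applied (a simplification).
    """
    # Normalize, then drop duplicate predictions while keeping rank order.
    top_k = list(dict.fromkeys(p.lower().strip() for p in predicted[:k]))
    gold_set = {g.lower().strip() for g in gold}
    matched = sum(1 for p in top_k if p in gold_set)
    precision = matched / len(top_k) if top_k else 0.0
    recall = matched / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

Averaging these three values over all documents in a test set gives the usual macro-averaged scores.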

TfIdf still state of the art?

This is more a question: From looking at the benchmark results https://github.com/boudinfl/pke/blob/master/results.md it seems simple TfIdf outperforms every other algorithm on the inspec dataset not only in speed, but also...

asmaier

extractor.load_document (Spacy) limitation of 1000000 characters

While using `extractor.load_document()`, I am encountering this error: ValueError: [E088] Text of length 1717453 exceeds maximum of 1000000. The parser and NER models require roughly 1GB of temporary memory per 100,000 characters...

MainakMaitra
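
One way around the limit is to split the text into pieces that each stay under it and process them separately. The sketch below uses only the standard library and cuts at whitespace where possible; `split_text` is a hypothetical helper, and the default of 1,000,000 characters is taken from the error message above.

```python
def split_text(text, max_chars=1_000_000):
    """Split text into chunks of at most max_chars characters.

    Cuts after the last space or newline inside each window when one exists,
    so words are not split. Joining the chunks reproduces the input exactly.
    """
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + max_chars, len(text))
        if end < len(text):
            # Look for the last whitespace character inside the window.
            cut = max(text.rfind(" ", start, end), text.rfind("\n", start, end))
            if cut > start:
                end = cut + 1
        chunks.append(text[start:end])
        start = end
    return chunks
```

Keyphrases extracted per chunk then have to be merged, and graph-based models will not see co-occurrences that span a chunk boundary. The alternative is to raise spaCy's `nlp.max_length` on the loaded pipeline, at the memory cost the error message describes.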

Adds RAKE

1

I implemented and tested RAKE within the pke framework, as requested by [138](https://github.com/boudinfl/pke/issues/138)

alexzvk

KP-Miner: why candidate_df is 1 for n-grams except unigram?

1

In the KP-Miner implementation, n-gram candidates with `n>1` are assigned `candidate_df=1`. See https://github.com/boudinfl/pke/blob/8f1d05dcc52041c9920ba0f9d5231fe6086d12c4/pke/unsupervised/statistical/kpminer.py#L143

```python
....
# loop through the candidates
for k, v in self.candidates.items():
    # get candidate document frequency
    candidate_df...
```

atalnarayan
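
To make the effect of that choice concrete, here is a toy sketch, not the pke code. It assumes an IDF of the form `log2(N / df)` and hypothetical names (`candidate_idf`, `df_table`, `force_multiword_df_one`); with `df` forced to 1, every multi-word candidate receives the maximum possible IDF regardless of how often it occurs in the corpus.

```python
import math


def candidate_idf(words, df_table, n_docs, force_multiword_df_one=True):
    """Toy IDF for a candidate given as a list of words.

    df_table maps a space-joined candidate to its document frequency.
    Assumes idf = log2(n_docs / df); this formula is an illustration only.
    """
    key = " ".join(words)
    df = max(1, df_table.get(key, 0))  # unseen candidates fall back to df = 1
    if force_multiword_df_one and len(words) > 1:
        df = 1  # the behaviour the issue asks about
    return math.log2(n_docs / df)


df_table = {"graph": 4, "keyphrase extraction": 4}
print(candidate_idf(["graph"], df_table, 8))
print(candidate_idf(["keyphrase", "extraction"], df_table, 8))
print(candidate_idf(["keyphrase", "extraction"], df_table, 8,
                    force_multiword_df_one=False))
```

In this toy corpus of 8 documents, the unigram and the bigram both occur in 4 documents, yet the bigram's IDF is three times larger when its `df` is forced to 1.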
