pytextrank

Is `biasedtextrank` implemented?

Open ahmed-moubtahij opened this issue 2 years ago • 4 comments

https://github.com/DerwenAI/pytextrank/blob/9ab64507a26f946191504598f86021f511245cd7/pytextrank/base.py#L305

`self.focus_tokens` is initialized to an empty set, but I don't see where it can be set.

e.g.

nlp = spacy.load("en_core_web_sm")
nlp.add_pipe("biasedtextrank")
focus = "my example focus"
doc = nlp(text)

At what point can I inform the model of the focus?

ahmed-moubtahij, May 30 '22 19:05

Hi @Ayenem,

Thanks for checking out pytextrank.

Here's how you can use biasedtextrank:

import spacy
import pytextrank  # importing pytextrank makes its pipeline components available to spaCy

nlp = spacy.load("en_core_web_sm")
nlp.add_pipe("biasedtextrank")

text = "your text here"
focus = "your focus here"
doc = nlp(text)

print(doc._.phrases)  # default phrases, before any focus is set

# Provide the focus; adjust bias to control the disparity between focus and non-focus nodes.
doc._.textrank.change_focus(focus, bias=10.0, default_bias=0.0)
print(doc._.phrases)  # phrases re-ranked according to the focus

Ankush-Chander, May 31 '22 11:05

That worked, thank you!

If I understand correctly, 10.0 is the highest bias value, which means the phrases will be as "focused" as they can be? And default_bias is simply irrelevant when bias is specified?

ahmed-moubtahij, Jun 07 '22 17:06

Biased TextRank is a special case of personalized TextRank in which we add more weight to the focus nodes.

In base TextRank each node has a weight of 1, and those weights are then normalized (each is multiplied by 1 / sum of all weights) so that they fall in the range [0, 1].

`bias` is the weight assigned to the focus nodes. `default_bias` is the weight assigned to the non-focus nodes (1 if the user does not provide it explicitly).

In the example, each focus node gets a weight of 10 and each non-focus node gets a weight of 0.
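To make the weighting concrete, here is a minimal sketch in plain Python of the scheme described above. It is not pytextrank's actual code: `normalized_weights` and the node names are hypothetical, and it assumes the normalization is a plain division by the sum of all weights. Under that assumption, when `default_bias` is 0.0 the absolute size of `bias` cancels out, so 10.0 is not an upper limit; only the ratio between `bias` and `default_bias` matters.

```python
def normalized_weights(nodes, focus_nodes, bias=1.0, default_bias=1.0):
    """Hypothetical helper, not part of pytextrank.

    Assigns `bias` to focus nodes and `default_bias` to all other nodes,
    then divides by the total so that the weights sum to 1.
    """
    raw = {n: (bias if n in focus_nodes else default_bias) for n in nodes}
    total = sum(raw.values())
    if total == 0:
        raise ValueError("at least one node needs a positive weight")
    return {n: w / total for n, w in raw.items()}


# Made-up node names, purely for illustration.
nodes = ["focus_a", "focus_b", "other_a", "other_b"]
focus = {"focus_a", "focus_b"}

# Settings from the example above: non-focus nodes get no weight at all.
strict = normalized_weights(nodes, focus, bias=10.0, default_bias=0.0)

# With default_bias=0.0, a larger bias normalizes to the same weights.
stricter = normalized_weights(nodes, focus, bias=100.0, default_bias=0.0)

# Keeping default_bias at 1 leaves some weight on the non-focus nodes.
soft = normalized_weights(nodes, focus, bias=10.0, default_bias=1.0)

print(strict)
print(soft)
```

With `default_bias=1.0`, raising `bias` shifts more of the total weight toward the focus nodes without ever removing the non-focus nodes entirely.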

Please refer to this PR for a detailed discussion: https://github.com/DerwenAI/pytextrank/pull/132

Ankush-Chander, Jun 08 '22 03:06

So I'm already doing what I intended by replicating the default_bias=0.0 in your example. Cool feature, thanks!

Also, do you know if there has been progress on the implementation of an optional input KG for biased textrank? @ceteri https://github.com/DerwenAI/pytextrank/pull/132#issuecomment-803681104

ahmed-moubtahij, Jun 11 '22 19:06