pytextrank

sentence scores for summarization

Open shyamcody opened this issue 5 years ago • 3 comments

The TextRank extension has a summarization method, but it would be more useful if it exposed each sentence's score, for example as a (sentence, score) tuple for every sentence returned in the summary. An input parameter could let the user choose between just the summary and the summary with sentence scores. Is there a way to get the sentence scores with the current settings? If not, I would like to try adding this in a PR, if that is allowed. Thanks.
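To make the request concrete, here is a minimal pure-Python sketch of the interface shape being asked for: a flag that switches the output between plain sentences and (sentence, score) tuples. Everything in it is hypothetical and for illustration only. The function name `summarize`, the `with_scores` parameter, and the substring-based scoring are assumptions, not pytextrank's API or its ranking method.

```python
from typing import Dict, List, Tuple, Union


def summarize(
    sentences: List[str],
    phrase_weights: Dict[str, float],
    limit: int = 2,
    with_scores: bool = False,
) -> Union[List[str], List[Tuple[str, float]]]:
    """Hypothetical helper: return the top `limit` sentences,
    optionally paired with their scores.

    A sentence's score here is simply the sum of the weights of the
    phrases it contains (a stand-in for a real ranking step).
    """
    scored: List[Tuple[str, float]] = []
    for sent in sentences:
        lowered = sent.lower()
        # Naive substring match; a real implementation would use token spans.
        score = sum(
            (w for phrase, w in phrase_weights.items() if phrase in lowered),
            0.0,
        )
        scored.append((sent, score))

    # Highest score first; sorted() is stable, so ties keep document order.
    top = sorted(scored, key=lambda pair: pair[1], reverse=True)[:limit]

    if with_scores:
        return top
    return [sent for sent, _ in top]
```

Keeping the default at `with_scores=False` means existing callers that expect a plain list of sentences would not be affected by such a change.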

shyamcody avatar Nov 03 '20 14:11 shyamcody

Thank you @shyamcody - we just did a relatively large refactoring of the code base, heading toward the spaCy 3.x release, and now this should be much simpler. I may be able to get it into the next release.

ceteri avatar Feb 15 '21 17:02 ceteri

Hi @shyamcody,

In the current main branch, there are two new methods for the TextRank object:

  • add get_unit_vector() method to expose the characteristic unit vector
  • add calc_sent_dist() method to expose the sentence distance measures (for summarization)

Will these provide what you were asking about?

These haven't been pushed to a new PyPI release just yet. It would be great to get your feedback first.
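For readers unfamiliar with the two ideas named above, the following is a rough pure-Python sketch of how a unit vector over phrase ranks and a per-sentence distance from it could be computed. It is an assumption about the general approach, not code taken from this thread or confirmed as the library's implementation; the function names `unit_vector` and `sentence_distances` and their inputs are hypothetical.

```python
from math import sqrt
from typing import List, Set


def unit_vector(phrase_ranks: List[float]) -> List[float]:
    """Scale the top phrase ranks so that they sum to 1.0."""
    total = sum(phrase_ranks)
    if total == 0:
        raise ValueError("phrase ranks must not sum to zero")
    return [rank / total for rank in phrase_ranks]


def sentence_distances(
    sent_phrases: List[Set[int]],
    coords: List[float],
) -> List[float]:
    """For each sentence, measure how much of the unit vector it misses.

    `sent_phrases[i]` holds the indices of the ranked phrases found in
    sentence i. The distance is the Euclidean norm of the coordinates of
    the phrases the sentence does NOT contain, so a lower distance means
    the sentence covers more of the highly ranked phrases.
    """
    return [
        sqrt(sum(c ** 2 for idx, c in enumerate(coords) if idx not in phrases))
        for phrases in sent_phrases
    ]
```

Under this sketch, sorting sentences by ascending distance gives a summarization order, and the distance itself can serve as the per-sentence score requested at the top of the thread.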

ceteri avatar Mar 01 '21 00:03 ceteri

Hey @ceteri thanks for the mention; I will check and give you my feedback

shyamcody avatar Mar 01 '21 09:03 shyamcody