CX_DB8

Support *actual* textrank

Open Hellisotherpeople opened this issue 4 years ago • 2 comments

I'm not actually running the proper TextRank algorithm, and I should experiment with it to see how effective it is.

I'll most likely implement it with networkx, which shouldn't be difficult. It might be slow for large documents with word-level models.
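For anyone following along, here is a minimal sketch of what TextRank on top of networkx could look like. It is not the code from this repo: the `cosine` and `textrank` helpers, the toy two-dimensional vectors, and the choice to connect every pair of units with a positive cosine similarity are all assumptions made for illustration. The only library call used is `nx.pagerank` with its `alpha` and `weight` parameters.

```python
import math

import networkx as nx


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def textrank(vectors, damping=0.85):
    """Score each text unit (word or sentence) by PageRank over a similarity graph.

    `vectors` is a list of embedding vectors, one per unit. This helper is
    hypothetical; real embeddings would come from whatever model is in use.
    """
    graph = nx.Graph()
    graph.add_nodes_from(range(len(vectors)))
    # Connect every pair of units, weighting the edge by embedding similarity.
    # This is O(n^2) in the number of units, which is why word-level models
    # on long documents can get slow.
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            sim = cosine(vectors[i], vectors[j])
            if sim > 0:
                graph.add_edge(i, j, weight=sim)
    return nx.pagerank(graph, alpha=damping, weight="weight")


# Toy embeddings: three similar units and one outlier.
vectors = [[1.0, 0.0], [0.9, 0.1], [0.8, 0.3], [0.0, 1.0]]
scores = textrank(vectors)
# Keep the two highest-scoring units as the "summary".
top = sorted(scores, key=scores.get, reverse=True)[:2]
```

The returned dictionary maps each unit's index to its score, so a summary is just the top-scoring units put back in document order.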

Hellisotherpeople Sep 28 '19 04:09

Wow! Implementing TextRank properly dramatically increased the coherence of my summaries. I guess it makes sense that a walk through the word-embedding-powered graph would give more coherent summaries.

Unfortunate side effect: summarization speed takes a sizeable hit unless I can find a better implementation of PageRank.
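One possible direction for the speed issue, sketched under assumptions: run the PageRank power iteration directly on a dense similarity matrix with NumPy instead of building a graph object. The `pagerank_dense` function and the example matrix are hypothetical, and this has not been benchmarked against networkx, so treat it as something to try rather than a known fix.

```python
import numpy as np


def pagerank_dense(sim, damping=0.85, tol=1e-6, max_iter=100):
    """Power-iteration PageRank over a dense, non-negative similarity matrix.

    Hypothetical helper: `sim[i][j]` is the edge weight between units i and j.
    """
    sim = np.asarray(sim, dtype=float).copy()
    np.fill_diagonal(sim, 0.0)  # ignore self-similarity
    n = sim.shape[0]
    col_sums = sim.sum(axis=0)
    has_edges = col_sums > 0
    # Normalize each column to sum to 1; columns with no edges spread
    # their mass uniformly over all nodes.
    transition = np.where(has_edges, sim / np.where(has_edges, col_sums, 1.0), 1.0 / n)
    scores = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        updated = (1.0 - damping) / n + damping * (transition @ scores)
        if np.abs(updated - scores).sum() < tol:
            return updated
        scores = updated
    return scores


# Example similarity matrix: units 0-2 are close to each other, unit 3 is an outlier.
sim_matrix = [
    [0.00, 0.99, 0.94, 0.00],
    [0.99, 0.00, 0.97, 0.11],
    [0.94, 0.97, 0.00, 0.35],
    [0.00, 0.11, 0.35, 0.00],
]
dense_scores = pagerank_dense(sim_matrix)
```

Each iteration is a single matrix-vector product, so the cost per step is quadratic in the number of units; for very large word-level graphs a sparse matrix would likely be needed instead.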

Hellisotherpeople Oct 01 '19 04:10

I haven't actually merged that code into the repo yet. I'll do that soon so that other people can try TextRank or other graph algorithms.

Hellisotherpeople Dec 08 '19 05:12