bert-extractive-summarization

Limiting the source length of the text for the summary

Open alex-romanovskii opened this issue 3 years ago • 0 comments

Hello, I dug deeper into your code because I was interested in seeing the weight assigned to each sentence. I found that not all sentences receive a weight. I started to analyze each line, and here is what's interesting:

src_subtoken_idxs = src_subtoken_idxs[:-1][:max_pos]

Here the token indexes are cut off at max_pos, which is 512 by default. As I understand it, only the first 512 tokens of the source text are used for the summary, which would explain why sentences beyond that limit get no weight. Is that correct?
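A minimal sketch of what I mean, assuming each sentence is encoded as a [CLS] id, some word ids, and a [SEP] id. The token ids, the helper names (truncate_like_issue, count_sentences, build_document) and the toy document are made up for illustration and are not taken from the repository; only the slicing expression is the one quoted above.

```python
# Illustrative ids only; the real values depend on the tokenizer vocabulary.
CLS_ID, SEP_ID, WORD_ID = 101, 102, 1000


def truncate_like_issue(src_subtoken_idxs, max_pos=512):
    """Apply the quoted slicing: drop the last id, then keep the first max_pos ids."""
    return src_subtoken_idxs[:-1][:max_pos]


def count_sentences(token_ids):
    """Count sentences by their leading [CLS] markers."""
    return sum(1 for t in token_ids if t == CLS_ID)


def build_document(num_sentences, words_per_sentence):
    """Build a toy id sequence: [CLS] word ... word [SEP] for every sentence."""
    ids = []
    for _ in range(num_sentences):
        ids += [CLS_ID] + [WORD_ID] * words_per_sentence + [SEP_ID]
    return ids


doc = build_document(num_sentences=40, words_per_sentence=18)  # 40 * 20 = 800 ids
kept = truncate_like_issue(doc)

# Sentences whose [CLS] marker falls past position 511 are gone after slicing,
# so nothing downstream can assign them a weight.
print(len(doc), len(kept), count_sentences(doc), count_sentences(kept))
# prints: 800 512 40 26
```

In this toy case only 26 of the 40 sentences still have a [CLS] marker after the cut, and the 26th is itself cut short, which matches what I observe: the later sentences never get a weight.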

I would be glad to get an answer.

alex-romanovskii, Jun 26 '21 19:06