bert-extractive-summarization
Limiting the source length of the text for the summary
Hello
I dug deeper into your code because I wanted to see the weight assigned to each sentence. I found that not all sentences receive a weight. I then went through the code line by line, and here is what I noticed:
src_subtoken_idxs = src_subtoken_idxs[:-1][:max_pos]
Here the token indexes are truncated to `max_pos`, which defaults to 512. As I understand it, the summary is built only from the first 512 tokens of the source text, which would explain why sentences beyond that limit receive no weight. Is that correct?
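To illustrate what I mean, here is a minimal sketch of how that slice would drop later sentences. It uses plain Python lists instead of the real preprocessing, and it assumes that each sentence starts with a `[CLS]` token. The token ids, the toy sentence length, and the helper names `truncate` and `surviving_sentences` are my own and are not taken from the repository.

```python
def truncate(src_subtoken_idxs, max_pos=512):
    # Same slicing as the quoted line: drop the last token,
    # then keep at most max_pos tokens.
    return src_subtoken_idxs[:-1][:max_pos]


def surviving_sentences(src_subtoken_idxs, cls_id, max_pos=512):
    # Count the sentence-start markers that remain after truncation.
    kept = truncate(src_subtoken_idxs, max_pos)
    return sum(1 for tok in kept if tok == cls_id)


# Illustrative ids only (assumption): sentence start, sentence end, ordinary word.
CLS_ID, SEP_ID, WORD_ID = 101, 102, 2000

# 60 toy sentences, each [CLS] + 10 word tokens + [SEP] = 12 tokens.
tokens = []
for _ in range(60):
    tokens += [CLS_ID] + [WORD_ID] * 10 + [SEP_ID]

total = tokens.count(CLS_ID)
kept = surviving_sentences(tokens, CLS_ID)
print(f"{kept} of {total} sentences start inside the first 512 tokens")
```

In this toy input only the sentences that start inside the first 512 tokens could be scored; the remaining ones never reach the model.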
I would be glad to get an answer.