PreSumm
(DEV) [-mode test_text] [-task abs]: not abstracting entire document
Command line input
python3 train.py -task abs -mode test_text -text_src ../raw_data/input_text.txt -test_from ../models/model_step_148000.pt -log_file ../logs/xsum -result_path ../results/xsum -visible_gpus -1
Problem: The result consistently comes from extracting a single sentence from the beginning of the document. I thought changing the '-max_len' or 'max_pos' argument would resolve this issue, so I tried:
-max_len 3000
-max_len 30000
Both give the same result: regardless of the value, the abstractor output does not change.
I have also tried changing 'max_pos' with the inputs below:
-max_pos 3000
-max_pos 30000
but the value doesn't matter. I just get a torch error, for example:
torch.Size([512, 768]) from checkpoint, the shape in current model is torch.Size([3000, 768]).
Questions
- How do I know if I am correctly inputting the text file for analysis?
- How should I change my code to analyze a large text document?
- Could the result be a consequence of an incoherently written document?
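On the second question, one possible workaround (not a confirmed fix) follows from the error above, which shows that the checkpoint holds only 512 position embeddings: split the long document into smaller pieces and summarize each piece separately. The sketch below is a minimal illustration under stated assumptions. The helper name split_into_chunks and the max_tokens budget are hypothetical and not part of PreSumm, and whitespace-separated words are only a rough proxy for the model's subword tokens, so a budget well below 512 is probably safer.

```python
import re


def split_into_chunks(text, max_tokens=300):
    """Group whole sentences into chunks of at most max_tokens words.

    Hypothetical helper, not part of PreSumm. Words are counted by
    splitting on whitespace, which undercounts subword tokens.
    """
    # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

    chunks, current, count = [], [], 0
    for sentence in sentences:
        n_words = len(sentence.split())
        # Close the current chunk if adding this sentence would exceed the budget.
        if current and count + n_words > max_tokens:
            chunks.append(" ".join(current))
            current, count = [], 0
        # A single sentence longer than the budget still becomes its own chunk.
        current.append(sentence)
        count += n_words
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Each chunk could then be written to the -text_src file and summarized on its own, with the partial summaries joined afterwards. Whether the file is read as one document per line is an assumption worth checking against the repository's data loader before relying on this.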
I have the same problem: I can't summarize a text file larger than the sample input provided in the repository. At first I got IndexError: tensors used as indices must be long, byte or bool tensors. Then I tried different values for -max_pos and got the same error as yours.
hey, I'm facing the same issue. Did anyone figure out a solution for that?
@gandharvsuri Nothing yet. Still waiting.
Hi, I tried this command to generate a summary for src_text.txt, but I can't find any result.
python /users/omri/workspace/Trainbert/PreSumm/src/train.py -mode test_text -task ext -test_from /users/omri/workspace/Trainbert/PreSumm/models/ext_model/model_step_39000.pt -text_src /users/omri/workspace/Trainbert/PreSumm/raw_data/src_text.txt -min_length 200 -max_length 1000 -result_path /users/omri/workspace/Trainbert/PreSumm/results/ext_bert_cnndm -visible_gpus 1
Can anyone please help me get the summary?
@dhouhaomri Check the results folder for the ext_bert_cnndm.candidate file.
Hi, thank you. But I got two files (.candidate and .gold). The .gold file is empty; do you know why?
@STEMlib Hi, I am facing the same issue. The result is one or a few sentences extracted from the beginning of the document. Did you find a solution?
Still facing this issue. Any updates would be appreciated. Thank you!
@cdd-grc20 @lavanaythakral
No solution yet. We may need to look deeper into the code to understand the problem, because I suspect we will not get help anytime soon.