Using Model for Inference

Open r-vi opened this issue 5 years ago • 29 comments

Hi, thanks for your work here. How do we use the pretrained models for inference (summarization) on article text? I have downloaded the trained cnndm model (.pt), but how do I load this in a python program to use? Thanks!

r-vi avatar Aug 27 '19 23:08 r-vi

use -mode test -test_from PT_FILE

nlpyang avatar Aug 29 '19 12:08 nlpyang

Hi! I am also trying to use the model for inference, but something goes wrong when I use the above mode with a single checkpoint. I am using this command: python train.py -task abs -mode test -test_from ../bert_data/cnndm.test.0.bert.pt -batch_size 3000 -test_batch_size 500 -bert_data_path ../bert_data/ -log_file ../logs/val_abs_bert_cnndm -model_path ../models/model_step_148000.pt -sep_optim true -use_interval true -visible_gpus 0 -max_pos 512 -max_length 200 -alpha 0.95 -min_length 50 -result_path ../logs/abs_bert_cnndm

I get this error: opt = vars(checkpoint['opt']) TypeError: list indices must be integers or slices, not str

d1sharma avatar Aug 29 '19 20:08 d1sharma

@d1sharma With -test_from PT_FILE, PT_FILE should be the model file, not the bert_data file.

nlpyang avatar Aug 29 '19 20:08 nlpyang

Oh. Got it. So I need both -model_path and -test_from to point to the same model ?

d1sharma avatar Aug 29 '19 20:08 d1sharma

With test mode, -model_path is not required.

nlpyang avatar Aug 29 '19 21:08 nlpyang
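For anyone else hitting the TypeError above, here is a minimal sketch (not code from the repo; the paths are examples) of the difference between the two kinds of .pt files: a trained checkpoint loads as a dict that train.py indexes with checkpoint['opt'], while a preprocessed bert_data shard loads as a plain list of examples, which is exactly why pointing -test_from at a data file fails with "list indices must be integers or slices, not str".

```python
# Rough check of what kind of .pt file a path points to (sketch, not part of PreSumm).
import torch

path = "../models/model_step_148000.pt"  # try a ../bert_data/*.bert.pt shard to compare
obj = torch.load(path, map_location="cpu")

if isinstance(obj, dict):
    print("looks like a model checkpoint, top-level keys:", list(obj.keys()))
else:
    print("looks like a preprocessed data shard with", len(obj), "examples")
```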

Thank you so much! Just one last question: when I try

python train.py -task abs -mode test -test_from ../models/model_step_148000.pt -batch_size 3000 -test_batch_size 500 -bert_data_path ../bert_data/ -log_file ../logs/val_abs_bert_cnndm -sep_optim true -use_interval true -visible_gpus 1 -max_pos 512 -max_length 200 -alpha 0.95 -min_length 50 -result_path ../logs/abs_bert_cnndm

I still get an error about a missing test file: FileNotFoundError: [Errno 2] No such file or directory: '../bert_data/.test.pt'

d1sharma avatar Aug 29 '19 21:08 d1sharma

Try the same command but replace:

-bert_data_path ../bert_data/

by

-bert_data_path ../bert_data/cnndm

astariul avatar Sep 02 '19 01:09 astariul
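The trailing-slash path fails because -bert_data_path is treated as a filename prefix rather than a directory (an inference from the FileNotFoundError messages in this thread, not a quote of the loader code). A quick, hedged way to see which test shards a given prefix would match:

```python
# Sketch: '../bert_data/' degenerates to looking for '../bert_data/.test.pt',
# while '../bert_data/cnndm' matches files such as '../bert_data/cnndm.test.0.bert.pt'.
import glob

for prefix in ("../bert_data/", "../bert_data/cnndm"):
    matches = sorted(glob.glob(prefix + ".test*.pt"))
    print(prefix, "->", matches if matches else "no matching test shards")
```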

Hi, when I run the code with the following command I get an error. When you have some free time, I hope you can give me some help. Thanks, looking forward to your reply.

Command: python3 train.py -task ext -mode validate -test_all -batch_size 3000 -test_batch_size 500 -bert_data_path ../bert_data/cnndm -log_file ../logs/val_abs_bert_cnndm -model_path ../models -sep_optim true -use_interval true -visible_gpus 1 -max_pos 512 -max_length 200 -alpha 0.95 -min_length 50 -result_path ../logs/abs_bert_cnndm

Error:

Traceback (most recent call last): File "train.py", line 145, in validate_ext(args, device_id) File "/home/wy/PreSumm-master/src/train_extractive.py", line 129, in validate_ext test_ext(args, device_id, cp, step) File "/home/wy/PreSumm-master/src/train_extractive.py", line 203, in test_ext trainer.test(test_iter, step) File "/home/wy/PreSumm-master/src/models/trainer_ext.py", line 289, in test rouges = test_rouge(self.args.temp_dir, can_path, gold_path) File "/home/wy/PreSumm-master/src/others/utils.py", line 79, in test_rouge r = pyrouge.Rouge155(temp_dir=temp_dir) File "/home/wy/PreSumm-master/src/others/pyrouge.py", line 123, in init self.__set_rouge_dir(rouge_dir) File "/home/wy/PreSumm-master/src/others/pyrouge.py", line 439, in __set_rouge_dir self._home_dir = self.__get_rouge_home_dir_from_settings() File "/home/wy/PreSumm-master/src/others/pyrouge.py", line 453, in __get_rouge_home_dir_from_settings with open(self._settings_file) as f: FileNotFoundError: [Errno 2] No such file or directory: '/home/wy/.pyrouge/settings.ini'

hi-wangyan avatar Sep 12 '19 07:09 hi-wangyan

Encountering the same error as @hi-wangyan.

EDIT: I noticed I have not installed ROUGE, but that is because the download service is currently unavailable. Apart from that, do we need to create a settings file for it?

adriantomas avatar Sep 12 '19 08:09 adriantomas

So even though it is not available on the main website, it is possible to get a working copy by following: https://stackoverflow.com/questions/45894212/installing-pyrouge-gets-error-in-ubuntu

adriantomas avatar Sep 12 '19 18:09 adriantomas
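For the settings.ini error above: after a local copy of ROUGE-1.5.5 is installed, pyrouge still has to be told where it lives, which is what creates ~/.pyrouge/settings.ini. A hedged sketch using the stock pyrouge API (the ROUGE path is a placeholder; the pyrouge_set_rouge_path command-line tool does the same job):

```python
# Assumes the standard pyrouge package: passing rouge_dir should record the ROUGE home
# directory in ~/.pyrouge/settings.ini, the file the traceback reports as missing.
from pyrouge import Rouge155

Rouge155(rouge_dir="/path/to/ROUGE-1.5.5")  # placeholder path to your ROUGE-1.5.5 install
print("ROUGE home recorded in ~/.pyrouge/settings.ini")
```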

Hi, but I have installed the pyrouge package successfully. Now I cannot find where the error comes from:

My first command is: python3 train.py -task ext -mode train -bert_data_path ../bert_data/cnndm -ext_dropout 0.1 -model_path ../models -lr 2e-3 -visible_gpus 0,1,2 -report_every 50 -save_checkpoint_steps 1000 -batch_size 3000 -train_steps 50000 -accum_count 2 -log_file ../logs/ext_bert_cnndm -use_interval true -warmup_steps 10000 -max_pos 512

My second command is: python train.py -task ext -mode validate -batch_size 3000 -test_batch_size 500 -bert_data_path ../bert_data/cnndm -log_file ../logs/val_ext_bert_cnndm -model_path ../models/model_step_50000.pt -sep_optim true -use_interval true -visible_gpus 1 -max_pos 512 -max_length 200 -alpha 0.95 -min_length 50 -result_path ../logs/ext_result_bert_cnndm

but now I don't know whether these commands are right.

hi-wangyan avatar Sep 13 '19 03:09 hi-wangyan

Hello, When I ran the command

python train.py -task abs -mode test -test_from ../models/model_step_148000.pt -batch_size 3000 -test_batch_size 500 -bert_data_path ../bert_data/ -log_file ../logs/val_abs_bert_cnndm -sep_optim true -use_interval true -visible_gpus 0 -max_pos 512 -max_length 200 -alpha 0.95 -min_length 50 -result_path ../logs/abs_bert_cnndm

with the pre-trained weights model_step_148000.pt, an error occurred:

File "train.py", line 135, in test_abs(args, device_id, cp, step) File "D:\summariser\PreSumm-master\src\train_abstractive.py", line 208, in test_abs checkpoint = torch.load(test_from, map_location=lambda storage, loc: storage) File "C:\Users\Suraj.Maurya\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\serialization.py", line 386, in load return _load(f, map_location, pickle_module, **pickle_load_args) File "C:\Users\Suraj.Maurya\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\serialization.py", line 580, in _load deserialized_objects[key]._set_from_file(f, offset, f_should_read_directly) OSError: [Errno 22] Invalid argument

AyushSoral avatar Sep 25 '19 03:09 AyushSoral
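No resolution for this OSError appears later in the thread. One thing worth ruling out first (purely an assumption on my part, not a diagnosis) is an incomplete or corrupted download of the checkpoint, which torch.load on Windows can surface as a low-level error like this:

```python
# Hedged sanity check: confirm the checkpoint file is complete before digging deeper.
import os
import torch

path = r"..\models\model_step_148000.pt"  # example path from the command above
print("size on disk:", os.path.getsize(path), "bytes")  # compare against the download size
ckpt = torch.load(path, map_location="cpu")
print("top-level keys:", list(ckpt.keys()))
```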

OK, thank you very much!

---Original--- From: "fatmalearning"<[email protected]> Date: Tue, Oct 15, 2019 18:02 PM To: "nlpyang/PreSumm"<[email protected]>; Cc: "Mention"<[email protected]>;"wangyan"<[email protected]>; Subject: Re: [nlpyang/PreSumm] Using Model for Inference (#11)

Try with same command but replace :

-bert_data_path ../bert_data/

by

-bert_data_path ../bert_data/cnndm

when I did like you this error displayed from me

FileNotFoundError: [Errno 2] No such file or directory: '/content/drive/My Drive/Abstractive/PreSumm/bert_data/cnndm.test.pt' my commany is

!python '/content/drive/My Drive/Abstractive/PreSumm/src/train.py' -task abs -mode test -test_from "/content/drive/My Drive/Abstractive/PreSumm/models/model_step_148000.pt" -batch_size 3000 -test_batch_size 500 -bert_data_path "/content/drive/My Drive/Abstractive/PreSumm/bert_data/cnndm" -log_file "/content/drive/My Drive/Abstractive/PreSumm/logs/val_abs_bert_cnndm" -model_path "/content/drive/My Drive/Abstractive/PreSumm-master/models/" -sep_optim true -use_interval true -visible_gpus 1 -max_pos 512 -max_length 200 -alpha 0.95 -min_length 50 -result_path "/content/drive/My Drive/Abstractive/PreSumm-master/results/abs_bert_cnndm" -temp_dir "/content/drive/My Drive/Abstractive/PreSumm/temp/" -visible_gpus True

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or unsubscribe.

hi-wangyan avatar Oct 15 '19 11:10 hi-wangyan

I used this command from my Google Colab:

!python '/content/drive/My Drive/Abstractive/PreSumm/src/train.py' -task abs -mode test -test_from "/content/drive/My Drive/Abstractive/PreSumm/models/model_step_148000.pt" -batch_size 3000 -test_batch_size 500 -bert_data_path "/content/drive/My Drive/Abstractive/PreSumm/bert_data/cnndm/" -log_file "/content/drive/My Drive/Abstractive/PreSumm/logs/val_abs_bert_cnndm" -model_path "/content/drive/My Drive/Abstractive/PreSumm-master/models/" -sep_optim true -use_interval true -visible_gpus 1 -max_pos 512 -max_length 200 -alpha 0.95 -min_length 50 -result_path "/content/drive/My Drive/Abstractive/PreSumm/logs/abs_bert_cnndm" -temp_dir "/content/drive/My Drive/Abstractive/PreSumm/temp/" -visible_gpus True

It shows this error:

Traceback (most recent call last): File "/content/drive/My Drive/Abstractive/PreSumm/src/train.py", line 135, in test_abs(args, device_id, cp, step) File "/content/drive/My Drive/Abstractive/PreSumm/src/train_abstractive.py", line 225, in test_abs predictor.translate(test_iter, step) File "/content/drive/My Drive/Abstractive/PreSumm/src/models/predictor.py", line 144, in translate for batch in data_iter: File "/content/drive/My Drive/Abstractive/PreSumm/src/models/data_loader.py", line 143, in iter for batch in self.cur_iter: File "/content/drive/My Drive/Abstractive/PreSumm/src/models/data_loader.py", line 284, in iter for idx, minibatch in enumerate(self.batches): File "/content/drive/My Drive/Abstractive/PreSumm/src/models/data_loader.py", line 262, in create_batches for buffer in self.batch_buffer(data, self.batch_size * 300): File "/content/drive/My Drive/Abstractive/PreSumm/src/models/data_loader.py", line 230, in batch_buffer ex = self.preprocess(ex, self.is_test) File "/content/drive/My Drive/Abstractive/PreSumm/src/models/data_loader.py", line 198, in preprocess tgt = ex['tgt'][:self.args.max_tgt_len][:-1]#+[2] KeyError: 'tgt'

fatmalearning avatar Oct 15 '19 18:10 fatmalearning

Actually, I printed ex:

ex {'src': [101, 1037, 2118, 1997, 5947, 3076, 2038, 2351, 3053, 2093, 2706, 2044, 1037, 2991, 1999, 4199, 1999, 1037, 6878, 13742, 2886, 1999, 4199, 1012, 102, 101, 4080, 9587, 29076, 1010, 2322, 1010, 2013, 8904, 3449, 9644, 1010, 4307, 1010, 2018, 2069, 2074, 3369, 2005, 1037, 13609, 2565, 1999, 3304, 2043, 1996, 5043, 3047, 1999, 2254, 1012, 102, 101, 2002, 2001, 10583, 2067, 2000, 3190, 3081, 2250, 10771, 2006, 2233, 2322, 1010, 2021, 2002, 2351, 2006, 4465, 1012, 102, 101, 4080, 9587, 29076, 1010, 2322, 1010, 2013, 8904, 3449, 9644, 1010, 4307, 1010, 1037, 2118, 1997, 5947, 3076, 2038, 2351, 3053, 2093, 2706, 2044, 1037, 2991, 1999, 4199, 1999, 1037, 6878, 13742, 102, 101, 2002, 2001, 2579, 2000, 1037, 2966, 4322, 1999, 1996, 3190, 2181, 1010, 2485, 2000, 2010, 2155, 2188, 1999, 8904, 3449, 9644, 1012, 102, 101, 2002, 2351, 2006, 4465, 2012, 7855, 3986, 2902, 1011, 2966, 19684, 1005, 1055, 2436, 14056, 3581, 18454, 6199, 2319, 2758, 1037, 3426, 1997, 2331, 24185, 1050, 1005, 1056, 2022, 2207, 2127, 6928, 2012, 1996, 5700, 1012, 102, 101, 3988, 2610, 4311, 5393, 1996, 2991, 2001, 2019, 4926, 2021, 4614, 2024, 11538, 1996, 6061, 2008, 9587, 29076, 2001, 20114, 1012, 102, 101, 2006, 4465, 1010, 2010, 5542, 9460, 2626, 3784, 1024, 1036, 2023, 2851, 2026, 5542, 4080, 1005, 1055, 3969, 2001, 4196, 2039, 2000, 6014, 1012, 102, 101, 3988, 2610, 4311, 5393, 1996, 2991, 2001, 2019, 4926, 2021, 4614, 2024, 11538, 1996, 6061, 2008, 9587, 29076, 2001, 20114, 102, 101, 1036, 2012, 1996, 2927, 1997, 2254, 2002, 2253, 2000, 4199, 2000, 2817, 7548, 1998, 2006, 1996, 2126, 2188, 2013, 1037, 2283, 2002, 2001, 23197, 4457, 1998, 6908, 2125, 1037, 2871, 6199, 2958, 1998, 2718, 1996, 5509, 2917, 1012, 102, 101, 1036, 2002, 2001, 1999, 1037, 16571, 1998, 1999, 4187, 4650, 2005, 2706, 1012, 1005, 102, 101, 13723, 20073, 1010, 2040, 2056, 2016, 2003, 1037, 2485, 2155, 2767, 1010, 2409, 2026, 9282, 2166, 1010, 2008, 9587, 29076, 2018, 2069, 2042, 1999, 1996, 2406, 2005, 2416, 2847, 2043, 1996, 5043, 3047, 1012, 102, 101, 2016, 2056, 2002, 2001, 2001, 2894, 2012, 1996, 2051, 1997, 1996, 6884, 6101, 1998, 3167, 5167, 2020, 7376, 1012, 102, 101, 2016, 2794, 2008, 2002, 2001, 1999, 1037, 2512, 1011, 2966, 2135, 10572, 16571, 1010, 2383, 4265, 3809, 8985, 1998, 4722, 9524, 1012, 102, 101, 9587, 29076, 2001, 1037, 2353, 1011, 2095, 5446, 2350, 2013, 8904, 3449, 9644, 1010, 5665, 1012, 1010, 2040, 2001, 8019, 1999, 1037, 13609, 1011, 2146, 2565, 2012, 2198, 9298, 4140, 2118, 1012, 102, 101, 9587, 29076, 6272, 2000, 1996, 2082, 1005, 1055, 3127, 1997, 1996, 13201, 16371, 13577, 1010, 4311, 1996, 3190, 10969, 2040, 6866, 1037, 3696, 2648, 1037, 2311, 3752, 1036, 11839, 2005, 9587, 29076, 1012, 1005, 102, 101, 1996, 13577, 1005, 1055, 5947, 3127, 2623, 4465, 5027, 3081, 10474, 2008, 1037, 3986, 2326, 2097, 2022, 2218, 2006, 3721, 2000, 3342, 9587, 29076, 1012, 102],

'labels': [0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], 'segs': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], 'clss': [0, 25, 57, 78, 112, 136, 174, 197, 223, 245, 285, 301, 337, 358, 382, 416, 452],

'src_txt': ['a university of iowa student has died nearly three months after a fall in rome in a suspected robbery attack in rome .', 'andrew mogni , 20 , from glen ellyn , illinois , had only just arrived for a semester program in italy when the incident happened in january .', 'he was flown back to chicago via air ambulance on march 20 , but he died on sunday .', 'andrew mogni , 20 , from glen ellyn , illinois , a university of iowa student has died nearly three months after a fall in rome in a suspected robbery', 'he was taken to a medical facility in the chicago area , close to his family home in glen ellyn .', "he died on sunday at northwestern memorial hospital - medical examiner 's office spokesman frank shuftan says a cause of death wo n't be released until monday at the earliest .", 'initial police reports indicated the fall was an accident but authorities are investigating the possibility that mogni was robbed .', "on sunday , his cousin abby wrote online : this morning my cousin andrew 's soul was lifted up to heaven .", 'initial police reports indicated the fall was an accident but authorities are investigating the possibility that mogni was robbed', ' at the beginning of january he went to rome to study aboard and on the way home from a party he was brutally attacked and thrown off a 40ft bridge and hit the concrete below .', "he was in a coma and in critical condition for months . '", 'paula barnett , who said she is a close family friend , told my suburban life , that mogni had only been in the country for six hours when the incident happened .', 'she said he was was alone at the time of the alleged assault and personal items were stolen .', 'she added that he was in a non-medically induced coma , having suffered serious infection and internal bleeding .', 'mogni was a third-year finance major from glen ellyn , ill. , who was participating in a semester-long program at john cabot university .', "mogni belonged to the school 's chapter of the sigma nu fraternity , reports the chicago tribune who posted a sign outside a building reading pray for mogni . '", "the fraternity 's iowa chapter announced sunday afternoon via twitter that a memorial service will be held on campus to remember mogni ."],

'tgt_txt': 'andrew mogni , 20 , from glen ellyn , illinois , had only just arrived for a semester program when the incident happened in januaryhe was flown back to chicago via air on march 20 but he died on sundayinitial police reports indicated the fall was an accident but authorities are investigating the possibility that mogni was robbedhis cousin claims he was attacked and thrown 40ft from a bridge'}

It does not include tgt. Could you help me?

fatmalearning avatar Oct 15 '19 18:10 fatmalearning
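The printed example only has src, labels, segs, clss, src_txt and tgt_txt, while the traceback shows the data loader indexing ex['tgt'], so the shard simply lacks the field the abs pipeline expects (further down the thread this turns out to happen with the older BertSum-format data). Here is a small sketch for checking a shard before launching train.py (the glob pattern is an assumption based on the file names seen in this thread):

```python
# Print the fields of the first example in a preprocessed test shard,
# so you can see whether 'tgt' is present before running abs/test mode.
import glob
import torch

shards = sorted(glob.glob("../bert_data/cnndm.test*.pt"))  # adjust the prefix to your setup
examples = torch.load(shards[0], map_location="cpu")
print(shards[0], "->", sorted(examples[0].keys()))
```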

How did you preprocess the data ?
Did you do it by yourself or use the already processed data ?

astariul avatar Oct 15 '19 23:10 astariul

use the already processed data

---Original--- From: "Cola"<[email protected]> Date: Wed, Oct 16, 2019 07:36 AM To: "nlpyang/PreSumm"<[email protected]>; Cc: "Mention"<[email protected]>;"wangyan"<[email protected]>; Subject: Re: [nlpyang/PreSumm] Using Model for Inference (#11)

How did you preprocess the data ? Did you do it by yourself or use the already processed data ?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or unsubscribe.

hi-wangyan avatar Oct 16 '19 01:10 hi-wangyan

It's weird, because when I download the data, I have the tgt key inside the dictionary.

Try to download the data again and check if tgt is inside this time?

astariul avatar Oct 16 '19 01:10 astariul

Sorry @hi-wangyan, I thought you were someone else ^^

Your error is about pyrouge.

To check if your pyrouge installation is working:

python -m pyrouge.test


If some tests are not passing, I advise you to reinstall pyrouge by following this tutorial.

astariul avatar Oct 16 '19 01:10 astariul

It doesn't matter, I can't do this task now, the code can't run, the server isn't working yet

---Original--- From: "Cola"<[email protected]> Date: Wed, Oct 16, 2019 09:39 AM To: "nlpyang/PreSumm"<[email protected]>; Cc: "Mention"<[email protected]>;"wangyan"<[email protected]>; Subject: Re: [nlpyang/PreSumm] Using Model for Inference (#11)

Sorry @hi-wangyan I thought you are someone else ^^

Your error is about pyrouge.

To check if your pyrouge installation is working :

python -m pyrouge.test

If some tests are not passing, I advise you to reinstall pyrouge by following this tutorial

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or unsubscribe.

hi-wangyan avatar Oct 16 '19 01:10 hi-wangyan

@nlpyang @Colanim

How did you preprocess the data ? Did you do it by yourself or use the already processed data ?

I used the already processed data from the PreSumm project for CNN/DailyMail. Now with this command:

!python '/content/drive/My Drive/Abstractive/PreSumm/src/train.py' -task abs -mode test -test_from "/content/drive/My Drive/Abstractive/PreSumm/models/model_step_148000.pt" -batch_size 3000 -test_batch_size 500 -bert_data_path "/content/drive/My Drive/Abstractive/PreSumm/bert_data/bert_data_cnndm_final/" -log_file "/content/drive/My Drive/Abstractive/PreSumm/logs/val_abs_bert_cnndm" -model_path "/content/drive/My Drive/Abstractive/PreSumm-master/models/" -sep_optim true -use_interval true -visible_gpus 1 -max_pos 512 -max_length 200 -alpha 0.95 -min_length 50 -result_path "/content/drive/My Drive/Abstractive/PreSumm/logs/abs_bert_cnndm" -temp_dir "/content/drive/My Drive/Abstractive/PreSumm/temp/" -visible_gpus True

it shows this error:

FileNotFoundError: [Errno 2] No such file or directory: '/content/drive/My Drive/Abstractive/PreSumm/bert_data/bert_data_cnndm_final/.test.pt'

could you help me ?

fatmalearning avatar Oct 16 '19 11:10 fatmalearning

Replace:

-bert_data_path ../bert_data/

by

-bert_data_path ../bert_data/cnndm

---Original--- From: "fatmalearning"<[email protected]> Date: Wed, Oct 16, 2019 19:42 PM To: "nlpyang/PreSumm"<[email protected]>; Cc: "Mention"<[email protected]>;"wangyan"<[email protected]>; Subject: Re: [nlpyang/PreSumm] Using Model for Inference (#11)

How did you preprocess the data ? Did you do it by yourself or use the already processed data ?

I used the already processed data data from the project PreSumm for CNN/Dailymail now with this command !python '/content/drive/My Drive/Abstractive/PreSumm/src/train.py' -task abs -mode test -test_from "/content/drive/My Drive/Abstractive/PreSumm/models/model_step_148000.pt" -batch_size 3000 -test_batch_size 500 -bert_data_path "/content/drive/My Drive/Abstractive/PreSumm/bert_data/bert_data_cnndm_final/" -log_file "/content/drive/My Drive/Abstractive/PreSumm/logs/val_abs_bert_cnndm" -model_path "/content/drive/My Drive/Abstractive/PreSumm-master/models/" -sep_optim true -use_interval true -visible_gpus 1 -max_pos 512 -max_length 200 -alpha 0.95 -min_length 50 -result_path "/content/drive/My Drive/Abstractive/PreSumm/logs/abs_bert_cnndm" -temp_dir "/content/drive/My Drive/Abstractive/PreSumm/temp/" -visible_gpus True Now it show this error

FileNotFoundError: [Errno 2] No such file or directory: '/content/drive/My Drive/Abstractive/PreSumm/bert_data/bert_data_cnndm_final/.test.pt'

could you help me ?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or unsubscribe.

hi-wangyan avatar Oct 16 '19 12:10 hi-wangyan

Thanks @hi-wangyan, that fixed this error, but after I did this step it shows another error:

Traceback (most recent call last):
  File "/content/drive/My Drive/Abstractive/PreSumm/src/train.py", line 135, in <module>
    test_abs(args, device_id, cp, step)
  File "/content/drive/My Drive/Abstractive/PreSumm/src/train_abstractive.py", line 225, in test_abs
    predictor.translate(test_iter, step)
  File "/content/drive/My Drive/Abstractive/PreSumm/src/models/predictor.py", line 144, in translate
    for batch in data_iter:
  File "/content/drive/My Drive/Abstractive/PreSumm/src/models/data_loader.py", line 143, in __iter__
    for batch in self.cur_iter:
  File "/content/drive/My Drive/Abstractive/PreSumm/src/models/data_loader.py", line 290, in __iter__
    batch = Batch(minibatch, self.device, self.is_test)
  File "/content/drive/My Drive/Abstractive/PreSumm/src/models/data_loader.py", line 34, in __init__
    mask_src = 1 - (src == 0)
  File "/usr/local/lib/python3.6/dist-packages/torch/tensor.py", line 325, in __rsub__
    return _C._VariableFunctions.rsub(self, other)
RuntimeError: Subtraction, the - operator, with a bool tensor is not supported. If you are trying to invert a mask, use the ~ or bitwise_not() operator instead.

Could you help me?

fatmalearning avatar Oct 16 '19 14:10 fatmalearning

Thanks, the problem was solved by downgrading the PyTorch version to 1.1 instead of 1.2 or 1.3.

fatmalearning avatar Oct 16 '19 16:10 fatmalearning
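Downgrading works; an alternative that keeps a newer PyTorch, suggested by the error message itself, is to invert the masks in data_loader.py with ~ instead of 1 - (this is a sketch of the idea, not a patch taken from the repo):

```python
# Demonstrates the incompatibility behind the RuntimeError above and the ~ fix.
import torch

src = torch.tensor([[101, 2023, 2003, 0, 0]])  # toy batch with padding zeros
mask_src = ~(src == 0)        # works on recent PyTorch versions
# mask_src = 1 - (src == 0)   # raises the RuntimeError on PyTorch >= 1.2
print(mask_src)
```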

Hey, I used this command from my Google Colab for model training:

!python train.py -task ext -mode train -bert_data_path "/content/drive/My Drive/BERTSUMEXT/bert_data/cnndm" -ext_dropout 0.1 -model_path "/content/drive/My Drive/BERTSUMEXT/models" -lr 2e-3 -visible_gpus 0 -report_every 50 -save_checkpoint_steps 1000 -batch_size 3000 -train_steps 50000 -accum_count 2 -log_file ../logs/ext_bert_cnndm -use_interval true -warmup_steps 10000 -max_pos 512

but it shows this error. I need help please:

[2019-11-20 20:10:35,503 INFO] * number of parameters: 120512513
[2019-11-20 20:10:35,503 INFO] Start training...
[2019-11-20 20:10:35,673 INFO] Loading train dataset from /content/drive/My Drive/BERTSUMEXT/bert_data/cnndm.train.123.bert.pt, number of examples: 2001
Traceback (most recent call last):
  File "/content/drive/My Drive/BERTSUMEXT/src/train.py", line 146, in <module>
    train_ext(args, device_id)
  File "/content/drive/My Drive/BERTSUMEXT/src/train_extractive.py", line 203, in train_ext
    train_single_ext(args, device_id)
  File "/content/drive/My Drive/BERTSUMEXT/src/train_extractive.py", line 245, in train_single_ext
    trainer.train(train_iter_fct, args.train_steps)
  File "/content/drive/My Drive/BERTSUMEXT/src/models/trainer_ext.py", line 137, in train
    for i, batch in enumerate(train_iter):
  File "/content/drive/My Drive/BERTSUMEXT/src/models/data_loader.py", line 142, in __iter__
    for batch in self.cur_iter:
  File "/content/drive/My Drive/BERTSUMEXT/src/models/data_loader.py", line 278, in __iter__
    for idx, minibatch in enumerate(self.batches):
  File "/content/drive/My Drive/BERTSUMEXT/src/models/data_loader.py", line 256, in create_batches
    for buffer in self.batch_buffer(data, self.batch_size * 300):
  File "/content/drive/My Drive/BERTSUMEXT/src/models/data_loader.py", line 224, in batch_buffer
    ex = self.preprocess(ex, self.is_test)
  File "/content/drive/My Drive/BERTSUMEXT/src/models/data_loader.py", line 195, in preprocess
    tgt = ex['tgt'][:self.args.max_tgt_len][:-1]+[2]
KeyError: 'tgt'

Oussamamt avatar Nov 20 '19 20:11 Oussamamt

I have the same problem!

nimahassanpour avatar Feb 06 '20 22:02 nimahassanpour

I had the same error when I used the data downloaded from the BertSum README (https://github.com/nlpyang/BertSum). I replaced it with the data from PreSumm and I no longer have the error. Hope it helps!

LauraGheoldan avatar Mar 11 '20 23:03 LauraGheoldan

I have the same problem!

@LauraGheoldan is right. Or you can process the data yourself (follow the 5 steps of Option2)

nimahassanpour avatar Mar 12 '20 01:03 nimahassanpour

Hi, you may need to create the .log file in the directory yourself.

On Mon, May 11, 2020 at 3:34 AM dhouhaomri wrote:

Hi, I tried this command for training mode: python /users/omri/workspace/Trainbert/PreSumm/src/train.py -task ext -mode train -bert_data_path /users/omri/workspace/Trainbert/PreSumm/bert_data -ext_dropout 0.1 -model_path /users/omri/workspace/Trainbert/PreSumm/models -lr 2e-3 -visible_gpus -1 -report_every 50 -save_checkpoint_steps 1000 -batch_size 3000 -train_steps 50000 -accum_count 2 -log_file /workspace/Trainbert/PreSumm/logs/ext_bert_cnndm -use_interval true -warmup_steps 10000 -max_pos 512

but when I run it, the error is that there is no "ext_bert_cnndm" log file. Where can I find the log files, because this directory is empty? Any help please?

nimahassanpour avatar May 14 '20 03:05 nimahassanpour
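As a footnote to the last answer: the -log_file argument appears to be opened for writing without creating missing directories first, so the logs folder (and, if needed, an empty log file) has to exist before train.py starts. A minimal sketch, using the path from the quoted command:

```python
# Create the logs directory and an empty log file before launching train.py.
import os

log_file = "/workspace/Trainbert/PreSumm/logs/ext_bert_cnndm"  # the value passed to -log_file
os.makedirs(os.path.dirname(log_file), exist_ok=True)
open(log_file, "a").close()
```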