Training data query

Open riddhirdasani opened this issue 7 years ago • 5 comments

Hello Shashi, I don't understand why we need these three directories or what role each one plays in training: 1. preprocessed_data_directory 2. gold_summary_directory 3. doc_sentence_directory. Could you explain them in a little more detail? When I tried training, one epoch finished and then an error appeared.

riddhirdasani avatar Apr 02 '18 01:04 riddhirdasani

For training you only need (1). (2) is used to estimate ROUGE scores. (1) and (3) are used during decoding. What error do you get after the first epoch?

shashiongithub avatar Apr 02 '18 09:04 shashiongithub
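
The split described in the reply above can be sketched as a small pre-flight check. This is only an illustration and not code from the sidenet repository: the stage names and the `missing_directories` helper are hypothetical, and only the three directory names come from the thread.

```python
import os

# Which directories each stage needs, following the reply above.
# The stage names are hypothetical labels chosen for this sketch.
REQUIRED = {
    "training": ["preprocessed_data_directory"],
    "rouge_evaluation": ["gold_summary_directory"],
    "decoding": ["preprocessed_data_directory", "doc_sentence_directory"],
}


def missing_directories(stage, paths):
    """Return the required directory names for `stage` that are unset or not on disk.

    `paths` maps a directory name (e.g. "doc_sentence_directory") to a filesystem path.
    """
    return [
        name
        for name in REQUIRED[stage]
        if not paths.get(name) or not os.path.isdir(paths[name])
    ]
```

Calling `missing_directories("decoding", paths)` before starting a run would report a missing doc_sentence_directory up front instead of after the first epoch.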

File "/home/beast/riddhi/main/data_utils.py", line 116, in process_predictions_rankedtopthree
    docsents = open(sent_filename).readlines()
FileNotFoundError: [Errno 2] No such file or directory: '/home/beast/riddhi/main/JP_herman/cnn/validation-sent/8f6b39e6c63b0ae3546cdfeb8209693f292b060e.summary.final.org_sents'

sent_filename, which is built on line 115 from FLAGS.doc_sentence_directory, cannot be opened on line 116. I just guessed and created the directory myself. What exactly should be in this directory?

riddhirdasani avatar Apr 03 '18 00:04 riddhirdasani
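
A quick way to see how many sentence files are missing, rather than hitting them one at a time, is sketched below. It assumes that each document has a file named `<document id>.summary.final.org_sents` directly inside the sentence directory, which is inferred only from the path in the traceback above. The helper name `find_missing_sentence_files` is hypothetical and is not part of sidenet.

```python
import os

# File suffix taken from the path in the traceback above.
SENT_SUFFIX = ".summary.final.org_sents"


def find_missing_sentence_files(sent_dir, doc_ids):
    """Return the document ids that have no '<id><suffix>' file inside sent_dir."""
    return [
        doc_id
        for doc_id in doc_ids
        if not os.path.isfile(os.path.join(sent_dir, doc_id + SENT_SUFFIX))
    ]
```

An empty directory created by hand would make every id show up in the result, which matches the error reported here.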

It should point to (3), the doc_sentence_directory.

shashiongithub avatar Apr 03 '18 07:04 shashiongithub

Yes, but what should be in it, and where can I get it?

riddhirdasani avatar Apr 03 '18 15:04 riddhirdasani

Please check: https://github.com/shashiongithub/sidenet/issues/2

shashiongithub avatar Apr 03 '18 15:04 shashiongithub