
Purpose of doc_dict.txt and sum_dict.txt

Open hannesb0 opened this issue 6 years ago • 2 comments

Hey,

I am quite new to machine learning and would like to use your code to train my own text summarization model. What I am currently wondering is: what exactly do you need the doc_dict.txt and sum_dict.txt files for?

How would I have to adapt them if I wanted to generate my own training data?

Thanks for any help!

hannesb0 avatar Jun 21 '18 17:06 hannesb0

They contain all the unique words in the documents and the summaries, respectively.
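
For your own data, a minimal sketch of building two such word lists could look like the following. This is only an illustration: the `build_vocab` helper, the whitespace tokenization, the frequency ordering, and the sample sentences are all assumptions, and the exact file format the repository expects should be checked against its existing doc_dict.txt and sum_dict.txt.

```python
from collections import Counter


def build_vocab(lines, min_count=1):
    """Return the unique whitespace-separated tokens found in `lines`.

    Hypothetical helper: tokens are ordered by descending frequency,
    with ties broken alphabetically so the result is deterministic.
    """
    counts = Counter(tok for line in lines for tok in line.split())
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, count in ordered if count >= min_count]


# Toy data standing in for your own training documents and summaries.
docs = ["the prime minister met reporters on monday", "the minister spoke"]
sums = ["pm meets reporters"]

doc_vocab = build_vocab(docs)  # unique words seen in the documents
sum_vocab = build_vocab(sums)  # unique words seen in the summaries

print(doc_vocab)
print(sum_vocab)
```

Each list could then be written to its own file, for example one word per line, if that matches the format of the files shipped with the repository.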

Pamulapati13 avatar Jun 27 '18 01:06 Pamulapati13

Thanks for your answer!

Am I right in assuming that the sum_dict.txt file also contains some abbreviations, e.g. pm for prime minister?

If not, why would I need two separate dictionaries instead of using one and the same for both the documents and the summaries?

hannesb0 avatar Jun 29 '18 16:06 hannesb0