TensorFlow-Summarization
Purpose of doc_dict.txt and sum_dict.txt
Hey,
I am quite new to machine learning and I would like to use your code to train my own text summarization model. What I am currently wondering is: what exactly do you need the doc_dict.txt and sum_dict.txt files for?
How would I have to adapt them if I wanted to generate my own training data?
Thanks for any help!
They contain all the unique words in the documents and summaries.
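If you want to build these files from your own data, a rough sketch could look like the following. This is not the script used in this repo; it assumes whitespace tokenization and a one-word-per-line file layout, and the names `build_vocab` and `write_vocab` are hypothetical, so check the format of the existing files before relying on it.

```python
from collections import Counter


def build_vocab(lines, max_size=None):
    """Collect the unique whitespace-separated tokens, most frequent first."""
    counts = Counter(tok for line in lines for tok in line.split())
    # Sort by descending frequency, then alphabetically so the order is stable.
    ordered = sorted(counts, key=lambda tok: (-counts[tok], tok))
    return ordered[:max_size] if max_size else ordered


def write_vocab(vocab, path):
    """Write one token per line (assumed layout, not confirmed by the repo)."""
    with open(path, "w", encoding="utf-8") as f:
        f.write("\n".join(vocab) + "\n")


# Toy data standing in for your own documents and summaries.
docs = ["the prime minister met the press", "the press asked questions"]
sums = ["pm meets press", "press asks questions"]

doc_vocab = build_vocab(docs)
sum_vocab = build_vocab(sums)
```

You would then call `write_vocab(doc_vocab, "doc_dict.txt")` and `write_vocab(sum_vocab, "sum_dict.txt")`. In this toy data the two vocabularies differ, because words such as "pm" only occur on the summary side.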
Thanks for your answer!
Am I right in assuming that the sum_dict.txt file also contains some abbreviations, e.g. pm for prime minister?
If not, why would I need two separate dictionaries instead of using a single one for both the documents and the summaries?