
Fine-tuning or Retraining

Open · mahsayedsalem opened this issue 6 years ago · 2 comments

Thank you for the great work!

If I wanted to fine-tune the model or retrain it, how should I pre-process my data? Especially regarding the network's output: what does the "28" vector represent?

mahsayedsalem · Jan 25 '19 08:01

To preprocess the data for training or prediction, use:

```python
def prepare_input(self, input_sent):
```
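For illustration, here is a minimal sketch of how that preprocessing step might fit into a prediction call. Only prepare_input and logits_to_text are named in this thread; the import path, constructor arguments, and get_model are assumptions based on the repository's demo, so verify them against your installed version.

```python
from shakkala import Shakkala

# Sketch only: constructor arguments and get_model are assumptions taken from
# the repository's demo code, not a guaranteed API.
sh = Shakkala(version=3)
model, graph = sh.get_model()

text = 'نص عربي بدون تشكيل'                      # any undiacritized Arabic input
input_int = sh.prepare_input(text)               # text -> padded integer sequence
logits = model.predict(input_int)[0]             # one 28-dim vector per character position
predicted_harakat = sh.logits_to_text(logits)    # predicted indices -> diacritic symbols
```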

If you want information about the max input length, check the following code part (each model version tried a different length):

```python
if version == 1:
    self.max_sentence = 495
elif version == 2:
    self.max_sentence = 315
elif version == 3:
    self.max_sentence = 315
```
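For context, a hypothetical illustration of how such a fixed length is typically used: integer-encoded sentences are padded (or truncated) to max_sentence before being fed to the network. The pad_sequences call below is standard Keras, not necessarily the exact call Shakkala uses internally.

```python
from tensorflow.keras.preprocessing.sequence import pad_sequences

max_sentence = 315                            # versions 2 and 3
encoded = [[12, 7, 45], [3, 9, 22, 18]]       # toy integer-encoded sentences
# Pad every sequence to the fixed length expected by the model input layer.
padded = pad_sequences(encoded, maxlen=max_sentence, padding='post')
print(padded.shape)                           # (2, 315)
```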

For more information about the outputs, print the output_int_to_vocab array:

```python
output_int_to_vocab = helper.load_binary('output_int_to_vocab', dictionary_folder)
```
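If it helps, a small sketch for inspecting that mapping, assuming it loads as a dict-like object of index to diacritic string (helper and dictionary_folder come from the snippet above; the folder path here is a placeholder):

```python
import helper                      # the helper module shipped with Shakkala (from the snippet above)

dictionary_folder = 'dictionary'   # placeholder; point this at the library's dictionary folder

output_int_to_vocab = helper.load_binary('output_int_to_vocab', dictionary_folder)
for idx, symbol in sorted(output_int_to_vocab.items()):   # assumes a dict of index -> diacritic
    print(idx, repr(symbol))
```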

Thanks for the interest in the library. I will try to include a notebook about fine-tuning to make contributing easier.

Barqawiz · Jan 25 '19 13:01

Thanks @Barqawiz

prepare_input will convert the string to integers. I guess what we are looking for is a method that takes the final, diacritized text as input and creates two files: one with the plain text and one with the list of diacritics, the same thing the Keras model outputs after using logits_to_text.
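Not the author's preparation script, but a rough sketch of what that splitting step could look like: strip the harakat from a diacritized Arabic string and record, per remaining letter, which diacritics it carried (the exact label format Shakkala expects may differ):

```python
# Unicode Arabic diacritics: fathatan through sukun (U+064B–U+0652).
HARAKAT = set('\u064b\u064c\u064d\u064e\u064f\u0650\u0651\u0652')

def split_diacritics(diacritized_text):
    """Return (plain_text, per_letter_diacritics) for a diacritized string."""
    plain_chars, labels = [], []
    for ch in diacritized_text:
        if ch in HARAKAT:
            if labels:              # attach to the preceding letter
                labels[-1] += ch    # keeps shadda + haraka pairs together
        else:
            plain_chars.append(ch)
            labels.append('')
    return ''.join(plain_chars), labels

plain, labels = split_diacritics('الْعَرَبِيَّةُ')
print(plain)    # letters only, diacritics removed
print(labels)   # one label per letter; empty string where a letter had no diacritic
```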

You must have used some preparation scripts, for sure. Having those would be helpful, if you are willing to share them. If it is okay, please :)

It may be too much to ask for the training code, I know :D

Best wishes,

AbdallahNasir · Jun 20 '19 12:06