
Format of your dataset and utilities

Open rezwanh001 opened this issue 3 years ago • 1 comments

Hello, this is a great project, but I am finding it hard to work out the exact format of your dataset. Could you give a brief overview of the dataset format and of the steps/utilities needed for offline handwritten line generation in other languages? Thanks.

rezwanh001, Nov 21 '21 10:11

The project used two datasets, IAM and RIMES. The objects in datasets/ read each one in its own format and then prepare the data the way the trainer and model expect. If you want to use this on your own dataset or a new language, I'd recommend looking at what I said here: https://github.com/herobd/handwriting_line_generation/issues/23

The data passed to the trainer from the __getitem__ function of the dataset objects is a dictionary with these elements:

  • image: Normalized to a height of 64 pixels, with values ranging from -1 (background) to 1 (foreground/ink). Can be multiple images by the same author. (They'll all get appended in the same batch, but are reshaped using a_batch_size during style extraction.)
  • gt: Plain text for the image(s)
  • label: IntTensor with the char indexes (starting at 1) of the ground truth text for the image(s). Padded to be the same length.
  • label_lengths: IntTensor with the length of each label (before padding).
  • a_batch_size: (defined in the collate function) The number of consecutive images by the same author.
  • name: ID for each image (for display purposes)
  • author: author (for display purposes)
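
To make the list above concrete for someone preparing a new language, here is a rough sketch of how such a dictionary could be assembled. It is not the repository's code: the helper names (build_char_to_index, encode, normalize_pixel, collate) are made up for illustration, plain Python lists stand in for the IntTensors, the image entry is left out, padding with 0 is an assumption (the list only says indexes start at 1), and the pixel mapping assumes 8-bit grayscale with dark ink on light paper. The actual tensor layout and dtypes in datasets/ may differ.

```python
def build_char_to_index(texts):
    """Map every character seen in the texts to an index starting at 1.

    Index 0 is left unused here so it can serve as padding (an assumption).
    """
    chars = sorted(set("".join(texts)))
    return {ch: i for i, ch in enumerate(chars, start=1)}


def encode(text, char_to_index):
    """Turn a ground-truth string into a list of char indexes."""
    return [char_to_index[ch] for ch in text]


def normalize_pixel(p):
    """Map an 8-bit gray value (assumed 0 = ink, 255 = paper) to [-1, 1].

    Background becomes -1 and ink becomes 1, matching the 'image' entry.
    """
    return 1.0 - 2.0 * (p / 255.0)


def collate(items, a_batch_size):
    """Pad labels to a common length and gather the per-image fields."""
    lengths = [len(it["label"]) for it in items]
    max_len = max(lengths)
    padded = [it["label"] + [0] * (max_len - len(it["label"])) for it in items]
    return {
        "gt": [it["gt"] for it in items],
        "label": padded,                # would be an IntTensor in practice
        "label_lengths": lengths,       # lengths before padding
        "a_batch_size": a_batch_size,   # consecutive images by one author
        "name": [it["name"] for it in items],
        "author": [it["author"] for it in items],
    }


# Two hypothetical text lines by the same (hypothetical) author.
texts = ["bonjour", "salut"]
char_to_index = build_char_to_index(texts)
items = [
    {"gt": t, "label": encode(t, char_to_index),
     "name": f"line_{i}", "author": "writer_0"}
    for i, t in enumerate(texts)
]
batch = collate(items, a_batch_size=2)
```

For a new language, the main thing that changes is the character set fed into the char-to-index mapping; the rest of the dictionary keeps the same shape.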

You'll see a couple more things returned from the actual dataset files, but they aren't used by the trainer.

herobd, Nov 22 '21 20:11