
Improve Performance on Letterpress and Other Augmentations Relying on Noise Generation

Open cs-mshah opened this issue 3 years ago • 2 comments

Augmentations that rely on Perlin noise generation, including Letterpress and others, are particularly slow.

It would be great if the slower augmentations could be made more efficient or could leverage the GPU. The ones at the bottom of the list are currently too slow to use in practice for training.

I tried training a model with Letterpress and found that one epoch took 12x longer than without the augmentation. I timed most augmentations on a set of 7 images, and here are the results:

[Image: Screenshot from 2022-12-04 12-45-04]

Here is the code for timing:

import time

from augraphy import *

# `imgs` is assumed to be a list of images (NumPy arrays) loaded beforehand.

aug_list = [
    DirtyDrum(line_concentration=0.5, noise_intensity=1.0, direction=2),
    BleedThrough(intensity_range=(0.6, 1.0), offsets=(7, 7), alpha=0.5),
    DirtyRollers(),
    Dithering(),
    Faxify(),
    InkBleed(severity=(0.5, 0.8)),
    Letterpress(),
    LowInkRandomLines(count_range=(10, 15)),
    Markup(),
    PencilScribbles(size_range=(250, 400), count_range=(1, 10), stroke_count_range=(1, 6)),
    BrightnessTexturize(),
    ColorPaper(),
    Gamma(),
    Geometric(rotate_range=(-3, 3)),
    LightingGradient(),
    PageBorder(width_range=(5, 10)),
    SubtleNoise(subtle_range=25),
    BadPhotoCopy(),
    BindingsAndFasteners(ntimes=(2, 4)),
    Folding(fold_count=4),
    Jpeg(),
    NoiseTexturize(),
]

# Total wall-clock time each augmentation takes over all images.
times = []

for aug in aug_list:
    start_time = time.perf_counter()
    aug_imgs = []
    for img in imgs:
        aug_imgs.append(aug(img))
    end_time = time.perf_counter()
    times.append(end_time - start_time)
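
In case it helps anyone reproduce the comparison, here is a minimal sketch of how the raw totals could be normalized per image and ranked. The helper names `seconds_per_image` and `rank_by_cost` are made up for illustration and are not part of Augraphy; the sketch only assumes each augmentation is a callable that takes one image.

```python
import time


def seconds_per_image(augment, images, repeats=3):
    """Return the best-of-`repeats` mean wall-clock seconds per image."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for img in images:
            augment(img)
        elapsed = time.perf_counter() - start
        best = min(best, elapsed / len(images))
    return best


def rank_by_cost(augmentations, images):
    """Map each augmentation's class name to its per-image cost, slowest first.

    Note: two instances of the same class share one entry (the last one wins).
    """
    costs = {type(a).__name__: seconds_per_image(a, images) for a in augmentations}
    return dict(sorted(costs.items(), key=lambda kv: kv[1], reverse=True))
```

Taking the best of several passes keeps one-time warm-up costs (for example, lazy initialization on the first call) from dominating the numbers. Calling `rank_by_cost(aug_list, imgs)` would give a dictionary ordered from slowest to fastest.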

cs-mshah, Dec 04 '22 07:12

Thanks for the feedback. Performance improvements are on our roadmap and should be included in the next major update.

kwcckw, Dec 04 '22 08:12

The key issue with these slower augmentations is the noise generation process. So we will use this issue to focus on approaches to speeding up noise generation while retaining an essential level of random variation in the distortions.

These augmentations should all be improved once we can improve the noise generation process:

  • Letterpress
  • BleedThrough
  • BadPhotoCopy
  • LightingGradient
  • PageBorder
  • NoiseTexturize
  • DirtyDrum
  • InkBleed
  • Faxify
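
To make the idea of faster noise generation concrete, here is a rough sketch of one possible direction: building the noise with vectorized NumPy operations instead of per-pixel Python loops. Note that this is value noise, a simpler relative of Perlin noise, and it is not how Augraphy generates noise. The function name `value_noise` and the `cell` parameter are hypothetical.

```python
import numpy as np


def value_noise(height, width, cell=32, seed=None):
    """Smooth 2-D value noise with values in [0, 1], using vectorized NumPy only.

    Hypothetical helper for illustration; not part of Augraphy.
    """
    rng = np.random.default_rng(seed)

    # Random values on a coarse lattice; the extra row/column closes the last cell.
    grid = rng.random((height // cell + 2, width // cell + 2))

    # Pixel coordinates expressed in lattice units.
    ys = np.arange(height) / cell
    xs = np.arange(width) / cell
    y0 = ys.astype(int)
    x0 = xs.astype(int)

    # Smoothstep fade so cell borders do not show up as creases.
    ty = ys - y0
    tx = xs - x0
    ty = (ty * ty * (3 - 2 * ty))[:, None]
    tx = (tx * tx * (3 - 2 * tx))[None, :]

    # Gather the four lattice corners for every pixel at once.
    yi = y0[:, None]
    xi = x0[None, :]
    c00 = grid[yi, xi]
    c01 = grid[yi, xi + 1]
    c10 = grid[yi + 1, xi]
    c11 = grid[yi + 1, xi + 1]

    # Bilinear blend of the corners using the faded weights.
    top = c00 + (c01 - c00) * tx
    bottom = c10 + (c11 - c10) * tx
    return top + (bottom - top) * ty
```

Random variation is kept by drawing a fresh lattice per call (or per seed), while the per-pixel work is done in a handful of array operations. Whether the visual result is close enough to the current distortions would need to be checked per augmentation.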

We've recently released a performance improvement via #270, which included the use of Numba to optimize loops. However, we found there remains a lot of opportunity to improve the noise generation processes, which most heavily impact augmentation performance.
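
For readers unfamiliar with the approach, here is a minimal sketch of what compiling a per-pixel loop with Numba can look like. It is an illustration only and not the code from #270; the function `apply_noise_mask` is hypothetical. The fallback decorator is there so the snippet still runs, uncompiled, when Numba is not installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Numba is not available: fall back to running the plain Python loop.
    def njit(func):
        return func


@njit
def apply_noise_mask(image, noise, threshold):
    """Set pixels to white (255) wherever `noise` exceeds `threshold`.

    Hypothetical example; expects a 2-D uint8 image and a float noise map
    of the same shape.
    """
    out = image.copy()
    rows, cols = image.shape
    for y in range(rows):
        for x in range(cols):
            if noise[y, x] > threshold:
                out[y, x] = 255
    return out
```

With Numba installed, the first call pays a one-time compilation cost and later calls run the compiled loop. For a loop this simple, a vectorized `np.where` would do the same job, so the JIT route tends to matter most for loops that are hard to vectorize.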

Recent Augraphy updates delivered performance improvements of more than 100%; see https://github.com/sparkfish/augraphy/issues/270#issuecomment-1502517272

jboarman, Apr 11 '23 00:04