
Create n augmented images for each image

Open paulfauthmayer opened this issue 7 years ago • 4 comments

With a given dataset, I would like to produce n augmented images for each image in the specified folder and save them in the output folder.

If I do something like this:

import os

dataset_size = len(os.listdir('/path/to/dataset'))  # dataset_size == 100
n = 3
p.sample(n * dataset_size)

It picks 300 images from the dataset at random and creates the augmented versions. However, this results in a disproportionate dataset, with some images being processed more than n times and some not being processed at all. I would prefer to do the random picking at a later point during training, not while generating/augmenting the dataset.
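The imbalance can be shown without Augmentor at all. In the sketch below (plain Python; the `images` list is just a stand-in for the files in the dataset folder), drawing `n * dataset_size` samples with replacement gives uneven per-image counts, whereas iterating over the folder n times guarantees exactly n copies per image:

```python
import random
from collections import Counter

random.seed(0)
images = [f"img_{i:03d}.jpg" for i in range(100)]  # stand-in for the dataset folder
n = 3

# What sample(n * dataset_size) effectively does: draw with replacement.
# Some images will be drawn more than n times, others not at all.
picks = Counter(random.choice(images) for _ in range(n * len(images)))

# Exactly n copies per image: iterate over the dataset deterministically.
even = Counter(img for img in images for _ in range(n))
assert all(count == n for count in even.values())
```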

I guess that I could also do this with the following code, but that's not the prettiest way to do it.

import itertools

n = 3
for _ in itertools.repeat(None, n):
    p.process()    # or p.sample(0) for that matter

Am I missing something, or is there a nicer way to do this? If not, a way to tell the sample function not to pick images at random would be appreciated.

Thanks!

paulfauthmayer avatar Jan 15 '19 22:01 paulfauthmayer

Hi @paulfauthmayer, yeah, that's a nice idea, I hadn't thought of that. I will add that to the process() function in the next update. For now, though, I think what you're doing is as good a workaround as there is, even if it's not very pretty, as you say :-)

mdbloice avatar Jan 16 '19 08:01 mdbloice

@mdbloice, have you added this functionality to the process() function? I can't wait to try it.

Zhang-O avatar Nov 04 '19 02:11 Zhang-O

@paulfauthmayer, thanks for your code !

Zhang-O avatar Nov 04 '19 03:11 Zhang-O
