
Support image transformations

Open · dreamflasher opened this issue 6 years ago • 10 comments

Training DNNs involves random resizing, cropping, rotation, etc. for data augmentation. How do you do this with jpeg2dct?

dreamflasher · Apr 10 '19 14:04

> Training DNNs involves random resizing, cropping, rotation, etc. for data augmentation. How do you do this with jpeg2dct?

I guess an easy solution is to first resize/crop/rotate the normally decoded JPEG picture and then encode it again.

bupticybee · Jul 17 '19 08:07

That's unfortunately not a solution. DNN training requires many random transformations, so that can't be precomputed in a meaningful way.

dreamflasher · Jul 17 '19 09:07

> That's unfortunately not a solution. DNN training requires many random transformations, so that can't be precomputed in a meaningful way.

I don't see why it can't be done in real time; the pipeline decode normally -> data augmentation -> encode normally -> jpeg2dct shouldn't take a lot of time.

Of course, a cleverer way is to somehow map the data augmentation process onto the "input image" after jpeg2dct.
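
A minimal sketch of that real-time pipeline, assuming PIL for the decode/augment steps and the library's `jpeg2dct.numpy.loads` for reading the coefficients back; the specific augmentations and the re-encode `quality` are illustrative placeholders:

```python
import io
import random

from PIL import Image
from jpeg2dct.numpy import loads  # parses DCT coefficients from JPEG bytes

def augmented_dct(jpeg_path, quality=90):
    """Decode -> random augmentation -> re-encode -> jpeg2dct.

    `quality` matters: the re-encode quality changes the quantized
    coefficients the network will see.
    """
    img = Image.open(jpeg_path).convert("RGB")

    # Illustrative random rotate / crop / resize.
    img = img.rotate(random.uniform(-15, 15))
    w, h = img.size
    side = min(w, h) * 3 // 4
    left = random.randint(0, w - side)
    top = random.randint(0, h - side)
    img = img.crop((left, top, left + side, top + side)).resize((224, 224))

    # Re-encode to JPEG in memory, then read the coefficients.
    buf = io.BytesIO()
    img.save(buf, format="JPEG", quality=quality)
    return loads(buf.getvalue())  # (dct_y, dct_cb, dct_cr)
```

Whether this is fast enough depends on whether the decode/encode round-trip, rather than the model, is the bottleneck.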

bupticybee · Jul 18 '19 05:07

I think you're missing the point of jpeg2dct… the whole point is to save the decoding/encoding time :)

dreamflasher · Jul 18 '19 07:07

> I think you're missing the point of jpeg2dct… the whole point is to save the decoding/encoding time :)

I understand that the purpose is to save decoding/encoding time at inference. All the extra encoding and decoding caused by data augmentation would only happen during training, since data augmentation is only used in training. Therefore the extra encoding/decoding time at inference is zero.

bupticybee · Jul 18 '19 09:07

Speeding up training is relevant, and that's what I personally care about.

dreamflasher · Jul 18 '19 10:07

> Speeding up training is relevant, and that's what I personally care about.

Well, then there may be some way to map the data augmentation onto the "image" after jpeg2dct (sketched below).

PS: have you done any experiments with jpeg2dct? Are the results really as good as the original paper claims?
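
For intuition, a minimal sketch of two augmentations that do have exact DCT-domain equivalents: 8-pixel-aligned crops (slice whole blocks) and horizontal flips (reverse the block order and negate odd horizontal frequencies). It assumes each block's 64 coefficients are in row-major natural order (index = 8*v + u), which is how libjpeg exposes blocks; worth verifying against jpeg2dct's actual output:

```python
import numpy as np

def hflip_dct(dct):
    """Horizontal flip on a (blocks_h, blocks_w, 64) coefficient array.

    Assumes row-major natural coefficient order (index = 8*v + u);
    verify this ordering for your jpeg2dct output.
    """
    out = dct[:, ::-1, :].copy()  # reverse the block columns
    # Mirroring an 8-point DCT-II negates odd horizontal frequencies:
    # cos((2*(7-x)+1)*u*pi/16) = (-1)**u * cos((2*x+1)*u*pi/16)
    u = np.arange(64) % 8
    out[..., u % 2 == 1] *= -1
    return out

def crop_dct(dct, top, left, height, width):
    """Crop at 8-pixel-aligned boundaries by slicing whole blocks."""
    assert top % 8 == left % 8 == height % 8 == width % 8 == 0
    return dct[top // 8:(top + height) // 8,
               left // 8:(left + width) // 8, :]
```

With 4:2:0 chroma, the same operations apply to `dct_cb`/`dct_cr` on their half-resolution block grid, so 16-pixel alignment is the safe choice for crops. Resizing and arbitrary rotations have no such clean block-wise mapping.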

bupticybee · Jul 18 '19 10:07

> Speeding up training is relevant, and that's what I personally care about.
>
> Well, then there may be some way to map the data augmentation onto the "image" after jpeg2dct.
>
> PS: have you done any experiments with jpeg2dct? Are the results really as good as the original paper claims?

Hi, I recently started experimenting with this library, and I could not get close to reproducing the results reported in the original paper (I also opened an issue asking for a training script or a trained model). Have you made any progress with this since posting your comment?

kfirgoldberg · Mar 30 '20 14:03

> Speeding up training is relevant, and that's what I personally care about.
>
> Well, then there may be some way to map the data augmentation onto the "image" after jpeg2dct. PS: have you done any experiments with jpeg2dct? Are the results really as good as the original paper claims?
>
> Hi, I recently started experimenting with this library, and I could not get close to reproducing the results reported in the original paper (I also opened an issue asking for a training script or a trained model). Have you made any progress with this since posting your comment?

No, sorry. Zero progress.

bupticybee · Mar 30 '20 16:03

> Speeding up training is relevant, and that's what I personally care about.

Dear @dreamflasher, encoding/decoding is not the part of the process that provides the speed-up. The DCT is already a compressed representation. To compress a given image down to an equivalent size, you would need a fair number of convolutional layers, which are computationally heavy. The main idea of this work is to use the already-compressed form as input and skip the first and second blocks of ResNet, which contain many convolutional layers.
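
To make that concrete, a hedged PyTorch sketch of the idea (not the exact architecture from the paper): DCT coefficients already live at stride 8, so they can enter a ResNet-50 where its stride-8 stage would normally end. The 192-channel input (64 luma coefficients plus 64+64 chroma coefficients upsampled to the luma block grid) and the 1x1 stem are assumptions:

```python
import torch
import torch.nn as nn
from torchvision.models import resnet50

class DCTResNet(nn.Module):
    """Sketch: feed DCT coefficients into ResNet-50, skipping its
    first two stages. Channel counts and the stem are assumptions."""

    def __init__(self, num_classes=1000):
        super().__init__()
        base = resnet50()
        # 192 = 64 Y + 64 Cb + 64 Cr coefficients per block (chroma
        # upsampled to the luma grid). Project to the 512 channels
        # ResNet-50 produces at the same stride-8 resolution (layer2).
        self.stem = nn.Sequential(
            nn.Conv2d(192, 512, kernel_size=1),
            nn.BatchNorm2d(512),
            nn.ReLU(inplace=True),
        )
        self.layer3 = base.layer3  # stride 8 -> 16
        self.layer4 = base.layer4  # stride 16 -> 32
        self.pool = base.avgpool
        self.fc = nn.Linear(2048, num_classes)

    def forward(self, dct):  # dct: (N, 192, H/8, W/8)
        x = self.stem(dct)
        x = self.layer4(self.layer3(x))
        return self.fc(torch.flatten(self.pool(x), 1))
```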

saitarslanboun · Nov 16 '20 08:11