
Getting weird results with STF, need help!

jjeremy40 opened this issue 2 years ago · 6 comments

Hi! Thank you for the great work!

I'm having some difficulty reproducing the results on other datasets, though... and I could really use your help!

I trained STF (the transformer version) on the Waymo & BDD100K open-source datasets (images from the front camera of an autonomous vehicle, 180,000 images in total), for almost the same lambda values [0.0009, 0.0018, 0.0035, 0.0067, 0.013, 0.025, 0.0483], using the MSE loss for 200 epochs each. According to the Weights & Biases app, the training curves look fine (see graph below). Don't they?
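For reference, here is a minimal sketch of the rate-distortion objective those lambda values plug into (the CompressAI convention that STF's train.py follows; an illustration, not the exact repo code):

```python
import math

import torch
import torch.nn as nn


class RateDistortionLoss(nn.Module):
    """loss = lambda * 255^2 * MSE(x, x_hat) + bpp (CompressAI convention)."""

    def __init__(self, lmbda=0.0035):
        super().__init__()
        self.mse = nn.MSELoss()
        self.lmbda = lmbda

    def forward(self, output, target):
        N, _, H, W = target.size()
        num_pixels = N * H * W
        # Rate term: bits per pixel, estimated from the entropy model's likelihoods.
        bpp = sum(
            torch.log(likelihoods).sum() / (-math.log(2) * num_pixels)
            for likelihoods in output["likelihoods"].values()
        )
        # Distortion term: MSE scaled to the 0-255 pixel range, weighted by lambda.
        mse = self.mse(output["x_hat"], target)
        return self.lmbda * 255 ** 2 * mse + bpp
```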

[Screenshot: Weights & Biases training curves, 2022-09-25]

And when I test my new weights on Waymo, I get weird results (see the graph below):

  • on cropped (256, 256) images (just like the validation during training), the results make sense (green curve)
  • on full-size images (with the appropriate padding, just like your compressai.utils.eval_model code), the results don't make sense (red curve)
  • the black curve shows the results I get on Waymo using the weights trained with lambda = [0.0035, 0.025] on OpenImages that you generously provide
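By "appropriate padding" I mean padding H and W up to a multiple of 64 before inference and cropping the reconstruction back afterwards, roughly like this (a sketch in the spirit of compressai.utils.eval_model, not the verbatim code):

```python
import torch.nn.functional as F


def pad(x, p=64):
    # Pad H and W up to the next multiple of p (these models downsample by 64).
    h, w = x.size(2), x.size(3)
    new_h = (h + p - 1) // p * p
    new_w = (w + p - 1) // p * p
    left = (new_w - w) // 2
    right = new_w - w - left
    top = (new_h - h) // 2
    bottom = new_h - h - top
    padding = (left, right, top, bottom)
    return F.pad(x, padding, mode="constant", value=0), padding


def crop(x, padding):
    # Undo pad() by applying negative padding.
    return F.pad(x, tuple(-p for p in padding))
```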

I have run these tests multiple times and I always get the same strange results...

[Screenshot: rate-distortion results on Waymo, 2022-09-25]

Do you have any idea where the bug could come from? Is it the training datasets? Why does the cropping change the results?

I would be extremely grateful for your help !!!

jjeremy40 · Sep 25 '22

Hi! The loss curves look fine during training, but you get weird results on 1920x1280 images at test time, especially at low bpp. I have not tried specific data domains like Waymo or BDD100K, but my intuition is that there is a domain gap between training (256x256 crops) and testing (full 1920x1280 frames in Waymo), since all the images are shot from a fixed camera angle. How about resizing the original images to a lower resolution during training? And have you tested on other datasets?

Googolxx · Sep 26 '22

One additional detail: the training data is randomly cropped to 256x256, and the validation data is center-cropped to 256x256.
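In code, roughly (a sketch mirroring the transform pattern in train.py):

```python
from torchvision import transforms

patch_size = (256, 256)

# Training: random 256x256 crops from each image.
train_transforms = transforms.Compose(
    [transforms.RandomCrop(patch_size), transforms.ToTensor()]
)

# Validation: a deterministic center 256x256 crop.
val_transforms = transforms.Compose(
    [transforms.CenterCrop(patch_size), transforms.ToTensor()]
)
```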

Googolxx · Sep 26 '22

Thanks for answering!

I have also tested on the Kodak dataset (but there are only 24 images...):

  • black curve: your open-source weights (trained on OpenImages), evaluated on Waymo
  • red curve: my weights (trained on Waymo+BDD100K), evaluated on Waymo
  • blue curve: your results on Kodak from your article
  • gray curve: my weights (trained on Waymo+BDD100K), evaluated on Kodak

[Graph: rate-distortion curves for the four settings above]

By "trying to resize the original images to a low resolution while training" you mean resize Waymo&BDD100K images to, lets say 480x320 for Waymo & 640x360 for BDD100K, instead of RandomCrop(256) while training ?

jjeremy40 · Sep 26 '22

For example, apply transforms like `train_transforms = transforms.Compose([transforms.Resize([960, 640]), transforms.RandomCrop(256), transforms.ToTensor()])` instead of https://github.com/Googolxx/STF/blob/0e2480434542af1ac9566e903c57ca0d9d9bf954/train.py#L277-L279
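As a self-contained sketch (note that torchvision's Resize takes [height, width]; the [960, 640] target is just the example size from above, to be tuned per dataset):

```python
from torchvision import transforms

# Downscale the full frames first so each 256x256 crop covers a larger
# field of view, then crop randomly as before.
train_transforms = transforms.Compose(
    [
        transforms.Resize([960, 640]),
        transforms.RandomCrop(256),
        transforms.ToTensor(),
    ]
)
```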

Googolxx · Sep 26 '22

OK, I'll try that. Thanks!

jjeremy40 · Sep 26 '22

Can you try converting the model to ONNX? Thanks.
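Something like this minimal sketch (hypothetical names; this only traces the network's forward pass, and the learned entropy-coding ops may not export cleanly to ONNX):

```python
import torch


def export_to_onnx(model: torch.nn.Module, path: str = "stf.onnx") -> None:
    # `model` is assumed to be a loaded STF network; only forward() is traced,
    # not the compress()/decompress() entropy-coding methods.
    model.eval()
    dummy_input = torch.randn(1, 3, 256, 256)
    torch.onnx.export(
        model,
        dummy_input,
        path,
        input_names=["image"],
        output_names=["output"],
        opset_version=13,
    )
```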

ld-xy · Sep 27 '22