ECO-pytorch

Some differences between the PyTorch and Caffe versions

Open ntuyt opened this issue 6 years ago • 4 comments

Hi Can, thanks so much for contributing this PyTorch version. I have tested it on the Something dataset and found that the performance is around 3-4 points worse than the Caffe version: I obtain 38.6 accuracy using the PyTorch ECO-Lite code with 16 segments, whereas the Caffe version achieves 42.4.

I have some questions about the differences between this PyTorch version and the original Caffe version.

  1. The mean is [104 117 123] in Caffe, whereas the PyTorch version uses [104 117 128].
  2. The Caffe model resizes the image to [320 240] and then crops, whereas the PyTorch version resizes the image to [256 256] and then crops.
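For reference, the two differences listed above can be written side by side as a minimal sketch. The dictionary and function names are hypothetical, the 224 crop size is an assumption that is not stated in this thread, and [320 240] is read as width x height; only the mean and resize values come from the list.

```python
# Values quoted in the list above; the names are hypothetical.
CAFFE_PREPROC = {"mean": (104, 117, 123), "resize_wh": (320, 240)}
PYTORCH_PREPROC = {"mean": (104, 117, 128), "resize_wh": (256, 256)}

CROP = 224  # assumed crop size, not confirmed in this thread


def aspect_ratio(cfg):
    """Width / height of the resized frame."""
    width, height = cfg["resize_wh"]
    return width / height


def center_crop_box(width, height, crop=CROP):
    """Return (left, top, right, bottom) of a centered crop x crop window."""
    left = (width - crop) // 2
    top = (height - crop) // 2
    return left, top, left + crop, top + crop


def subtract_mean(pixel, mean):
    """Subtract the per-channel mean from a single 3-channel pixel."""
    return tuple(p - m for p, m in zip(pixel, mean))


if __name__ == "__main__":
    for name, cfg in (("caffe", CAFFE_PREPROC), ("pytorch", PYTORCH_PREPROC)):
        width, height = cfg["resize_wh"]
        print(name, aspect_ratio(cfg), center_crop_box(width, height))
    # The same gray pixel differs only in the third channel after mean subtraction.
    print(subtract_mean((128, 128, 128), CAFFE_PREPROC["mean"]))
    print(subtract_mean((128, 128, 128), PYTORCH_PREPROC["mean"]))
```

Under these assumptions, the 320x240 resize keeps a 4:3 frame undistorted while 256x256 squeezes it to 1:1, and the mean difference shifts one channel by 5 intensity levels.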

Do you think these differences could explain the performance gap? Also, was the pretrained Kinetics model you provided obtained by training your PyTorch model, or was it converted from the pretrained Caffe model?

Looking forward to your reply. Best, Tan

ntuyt avatar Feb 14 '19 01:02 ntuyt

Hi @ntuyt,

Thanks for your interest in this repo. I also noticed this problem recently.

The pretrained model was obtained by training the PyTorch model on the Kinetics dataset.

  • Have you tried changing these settings, and did you get any improvement?
  • Did you use the latest Kinetics pretrained model (eco_lite_rgb_16F_kinetics_v3.pth.tar)?

I have been trying to figure out the problem recently. If you have any ideas, please feel free to tell me.

Thanks!

Best wishes, Can

zhang-can avatar Feb 18 '19 04:02 zhang-can

Hi Can, thanks so much for your reply. I used eco_lite_rgb_16F_kinetics_v3.pth.tar and finetuned it on the Something dataset. With the default settings in your PyTorch code, I achieved 38.6 accuracy. After I changed the resize code to first resize to [240 320] instead of [256,256], I got around 40.1 accuracy. That is still worse than the 42.4 reported in the paper. Since your pretrained Kinetics model is based on 256x256, I guess we could further improve the performance by training the model on Kinetics with 240x360 resizing. Do you have time to retrain your model on Kinetics using 240x360 rescaling?

ntuyt avatar Feb 18 '19 09:02 ntuyt

Hi @zhang-can, I used eco_lite_rgb_16F_kinetics_v3.pth.tar and saw that its best_prec1 was 61.77%. However, when I retrained on Kinetics, my score was a little worse than this. What is your training strategy? Thanks!

jslslee avatar Feb 20 '19 03:02 jslslee

@ntuyt Hi, I resized the training images to [240,320], but I get this error: "RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 1. Got 7 and 8 in dimension 3 at c:\a\w\1\s\windows\pytorch\aten\src\thc\generic/THCTensorMath.cu:83". Why does this happen? Can you share your modified code?
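One possible cause of a "7 and 8" mismatch, not confirmed anywhere in this thread, is that two parallel branches round their output size differently before being concatenated along dimension 1. The sketch below is only an illustration: the kernel, stride, and padding values are hypothetical and are not taken from the ECO-pytorch code. It uses the standard output-size formulas for a strided convolution (floor rounding) and a pooling layer with ceil rounding.

```python
import math


def conv_out(size, kernel, stride, pad):
    """Output length of a convolution along one axis (floor rounding)."""
    return (size + 2 * pad - kernel) // stride + 1


def pool_out(size, kernel, stride, pad=0, ceil_mode=True):
    """Output length of a pooling layer along one axis."""
    span = size + 2 * pad - kernel
    steps = math.ceil(span / stride) if ceil_mode else span // stride
    out = steps + 1
    # With ceil rounding, the last window must still start inside the input
    # (or its left padding); otherwise it is dropped.
    if ceil_mode and (out - 1) * stride >= size + pad:
        out -= 1
    return out


if __name__ == "__main__":
    # Hypothetical pair of branches: 3x3 conv (stride 2, pad 1) next to a
    # 3x3 pool (stride 2, no padding, ceil rounding).
    for size in (14, 15):
        print(size, conv_out(size, 3, 2, 1), pool_out(size, 3, 2))
```

With these illustrative parameters, an even feature-map side of 14 gives 7 from both branches, while an odd side of 15 gives 8 from the convolution and 7 from the pooling. If something similar is happening here, checking which spatial size reaches the failing concatenation (and whether a crop is applied after the resize) would be a reasonable first step.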

chen-ming2019 avatar May 09 '19 04:05 chen-ming2019