Final optimizer state for the model

Open shubhamjain0594 opened this issue 2 years ago • 11 comments

Hello @lyakaap

Thanks a lot for this work. I am trying to take this model and fine-tune it on another task. Could you provide the final optimizer state after the 4th stage of training? We want to try an experiment where it would be very useful.

Thank you.

shubhamjain0594 avatar May 05 '22 11:05 shubhamjain0594

Thanks for your interest in my work.

You can download the checkpoint including optimizer state here: https://drive.google.com/file/d/1Z9G2yhYep0woJuKitaLJ2W06WUHWxbAv/view?usp=sharing
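If it helps, here is a minimal sketch of restoring both the model and the optimizer state from it. The model/optimizer setup mirrors the v107 command later in this thread, the filename is a placeholder for the downloaded file, and the checkpoint key names are assumptions based on the usual torch.save layout, so inspect the file to confirm:

```python
import timm
import torch

# Hypothetical setup mirroring the v107 training command in this thread;
# adjust to your own model and optimizer.
model = timm.create_model("tf_efficientnetv2_m_in21ft1k", pretrained=False)
optimizer = torch.optim.SGD(model.parameters(), lr=0.5, weight_decay=1e-6)

# Load on CPU so this works without a GPU; the filename is a placeholder.
checkpoint = torch.load("checkpoint.pth.tar", map_location="cpu")

# Key names are assumptions from the common torch.save checkpoint layout;
# run print(checkpoint.keys()) to confirm them for this file. DDP training
# typically prefixes parameter names with "module.", hence the rename and
# strict=False.
model.load_state_dict(
    {k.replace("module.", "", 1): v for k, v in checkpoint["state_dict"].items()},
    strict=False,
)
optimizer.load_state_dict(checkpoint["optimizer"])
```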

lyakaap avatar May 07 '22 05:05 lyakaap

Thank you for your reply @lyakaap

By the fourth stage, I meant the final training phase before applying post-processing (the fourth stage as described in your paper). I believe that should be v107_0009.pth.tar.

It would be great if you could share that. Thank you.

shubhamjain0594 avatar May 11 '22 12:05 shubhamjain0594

Then, this will do :) https://drive.google.com/file/d/1ySea-NJp_J0aWvma_WmVbc3Hnwf5LHUf/view

lyakaap avatar May 11 '22 12:05 lyakaap

Thank you for this. Both files have been very useful.

Lastly, what command and what value of gem-eval-p do you use for intermediate evaluation? I am trying to replicate the results, but I get a score of 0.72 with the final model, while you report 0.755 in your paper.
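(For context, gem-eval-p is the exponent of GeM pooling applied when extracting descriptors at evaluation time; a minimal standalone sketch of generalized-mean pooling, not the repo's exact code:)

```python
import torch

def gem(x: torch.Tensor, p: float = 1.0, eps: float = 1e-6) -> torch.Tensor:
    # Generalized-mean (GeM) pooling over the spatial dims of a BxCxHxW map:
    # p = 1 is average pooling; p -> infinity approaches max pooling.
    return x.clamp(min=eps).pow(p).mean(dim=(-2, -1)).pow(1.0 / p)

feats = torch.randn(2, 512, 16, 16).abs()  # dummy feature map
desc = gem(feats, p=1.0)                   # -> shape (2, 512)
```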

shubhamjain0594 avatar May 17 '22 09:05 shubhamjain0594

That's weird. It should match the reported performance if you run the inference code as described in the README. Please make sure you evaluate on the private set of phase 1.

lyakaap avatar May 17 '22 10:05 lyakaap

Okay, I found the bug in my evaluation code. Thanks for your help.

Can you also provide the final model after stage 2, i.e. the final model for v86?

shubhamjain0594 avatar May 17 '22 11:05 shubhamjain0594

Sorry, it seems that I have deleted the weights of the models prior to stage 2...

lyakaap avatar Nov 28 '22 01:11 lyakaap

> Then, this will do :) https://drive.google.com/file/d/1ySea-NJp_J0aWvma_WmVbc3Hnwf5LHUf/view

Hey @lyakaap, thank you for your work.

I'd like to reproduce the fourth stage based on this output from stage 3, but I ran into some problems. Do you still remember the batch_size, num_negatives, and learning_rate, and how many GPUs you used to train the fourth stage?

GorillaSX avatar Dec 15 '22 01:12 GorillaSX

@GorillaSX You can check this branch for reproducing our results: https://github.com/lyakaap/ISC21-Descriptor-Track-1st/tree/reproduce

I think you can reproduce it with the following:

```bash
python v107.py \
  -a tf_efficientnetv2_m_in21ft1k --dist-url 'tcp://localhost:10001' --multiprocessing-distributed --world-size 1 --rank 0 --seed 99999 \
  --epochs 10 --lr 0.5 --wd 1e-6 --batch-size 16 --ncrops 2 \
  --gem-p 1.0 --pos-margin 0.0 --neg-margin 1.1 --weight ./v98/train/checkpoint_0001.pth.tar \
  --input-size 512 --sample-size 1000000 --memory-size 1000 \
  ../input/training_images/
```

lyakaap avatar Dec 15 '22 01:12 lyakaap

> @GorillaSX You can check this branch for reproducing our results: https://github.com/lyakaap/ISC21-Descriptor-Track-1st/tree/reproduce (full v107.py command quoted above)

I appreciate it, @lyakaap.

I have tried this, but it seems I cannot fit a batch of 16, each with 30 negative samples, on a single GPU. Would you mind telling me how many GPUs you used, or what the number of negative samples was, if you remember?

GorillaSX avatar Dec 16 '22 09:12 GorillaSX

I remember that I used 16 A100 GPUs.
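For reference, if v107.py keeps the batch-splitting logic of the stock PyTorch multiprocessing-distributed template (which its flags suggest, though this is an assumption), the global --batch-size is divided across the GPUs of each node, so each process sees only a small per-GPU slice. A sketch of that split:

```python
import torch

# Assumption: v107.py keeps the batch-splitting logic of the stock PyTorch
# multiprocessing-distributed template, where each spawned process handles
# an equal share of the global --batch-size.
ngpus_per_node = torch.cuda.device_count()
global_batch_size = 16  # the --batch-size value from the command above
per_gpu_batch_size = global_batch_size // ngpus_per_node
print(f"{per_gpu_batch_size} sample(s) per GPU across {ngpus_per_node} local GPUs")
```

Under that assumption, the per-GPU memory footprint with many GPUs is far smaller than running the same command on a single GPU, which would explain the out-of-memory behavior above.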

lyakaap avatar Dec 16 '22 13:12 lyakaap