Segment-Everything-Everywhere-All-At-Once

Training Time Estimation

Open ziqipang opened this issue 2 years ago • 3 comments

Hi,

Thank you for the excellent work! I plan to use your research as the foundation for my own agenda. However, I have limited computation resources, so I would like to ask about the estimated time to train the models.

  • I am using a 4-GPU machine with batch size 16, and each epoch takes ~3 hours. I am curious: what is the training speed on your side? If mine is too slow, I might spend time debugging my server setup.
  • For the sake of quick iteration, I am curious whether you tried using less data or fewer epochs during development. If so, could you please share some insights into your experiment settings and how much performance degrades when training with less?

Thank you again for your kind help!

ziqipang avatar Dec 31 '23 00:12 ziqipang

Your training speed doesn't seem wrong. One trick I developed for the next project is to precompute the language encoder weights: https://github.com/UX-Decoder/FIND/blob/708ddf53ab594fe6be642bae2ff54eb42cdb8a9a/configs/grin/focalt_unicl_lang.yaml#L58. If you are interested in this, I could share more details.
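
For later readers, here is a minimal sketch of the general idea, not the SEEM/FIND implementation: encode the fixed vocabulary of class names once, cache the embeddings on disk, and load that tensor during training instead of running the language encoder in every forward pass. The sketch below uses open_clip as a stand-in text encoder, and the function name, model name, and output path are illustrative assumptions.

```python
# Minimal sketch of precomputing language-encoder outputs (not the SEEM/FIND code).
# open_clip is used as a stand-in text encoder; names and paths are illustrative.
import torch
import open_clip


@torch.no_grad()
def precompute_text_embeddings(class_names,
                               out_path="lang_embeddings.pt",
                               model_name="ViT-B-32",
                               pretrained="openai"):
    """Encode a fixed vocabulary once and cache the result on disk."""
    model, _, _ = open_clip.create_model_and_transforms(model_name, pretrained=pretrained)
    tokenizer = open_clip.get_tokenizer(model_name)
    model.eval()

    tokens = tokenizer(class_names)                           # (num_classes, seq_len)
    embeddings = model.encode_text(tokens).float()            # (num_classes, embed_dim)
    embeddings = embeddings / embeddings.norm(dim=-1, keepdim=True)

    torch.save({"class_names": class_names, "embeddings": embeddings.cpu()}, out_path)
    return embeddings


# At training time, load the cache once and index into it instead of
# calling the text encoder inside the training loop:
#   cache = torch.load("lang_embeddings.pt")
#   text_feats = cache["embeddings"].to(device)   # (num_classes, embed_dim)
```

This only helps when the text encoder is frozen and the vocabulary is fixed, which is exactly why it saves time: the language encoder drops out of the per-iteration compute entirely.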

MaureenZOU avatar Dec 31 '23 01:12 MaureenZOU

@MaureenZOU Thank you so much for the prompt reply! It is good to know that my current speed is normal, but training a model still takes quite long, around 6 days (150 hours, i.e., roughly 50 epochs at ~3 hours each). I will share anything I find that accelerates this process in the future.

I am quite interested in the technique you mentioned. I would be really grateful if you could share your trick for precomputing the language encoder weights. Thank you!

ziqipang avatar Dec 31 '23 03:12 ziqipang

Hello, what kind of GPU are you using?

Jeenedo2023 avatar Mar 21 '24 14:03 Jeenedo2023