
Question on fine-tuning time

Open JunseokLee98 opened this issue 1 year ago • 1 comments

Thank you for sharing the paper and code. While reading the implementation details in Section 5.2 (Experimental Settings), I had a question about fine-tuning time.

Could you please let me know the approximate fine-tuning time for Multimodal-CoT, if you remember it?

I am trying to understand the paper and code for re-implementation. However, due to limited computing resources (no multi-GPU machine), I have to use cloud services. This has led me to estimate the approximate fine-tuning time, as cloud providers charge by the hour.

JunseokLee98 avatar Jan 19 '24 09:01 JunseokLee98

Hi, it may take about 8/24 hours to train a base/large model on an A100 GPU, respectively. This may also depend on the exact GPU. As it has been a long time since the training, I cannot guarantee I remember the numbers accurately. An efficient approach would be to run the code; the log will show the approximate fine-tuning time.
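Given those rough training times, a quick back-of-the-envelope cost estimate is straightforward. A minimal sketch, where the hourly A100 rate is a placeholder to be replaced with your cloud provider's actual price (it is not a figure from the thread):

```python
# Rough cloud-cost estimate from the training times mentioned above.
# HOURLY_RATE is an assumed example price per A100-hour (USD), not a real quote.
HOURLY_RATE = 3.00

def estimate_cost(train_hours: float, hourly_rate_usd: float) -> float:
    """Return the estimated fine-tuning cost in USD."""
    return train_hours * hourly_rate_usd

# Approximate hours reported above: base ~8 h, large ~24 h on one A100.
for model, hours in [("base", 8), ("large", 24)]:
    print(f"{model}: ~{hours} h -> ~${estimate_cost(hours, HOURLY_RATE):.2f}")
```

As the reply notes, the actual time depends on the exact GPU, so a short trial run to read the time estimate from the training log is the safer basis for budgeting.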

cooelf avatar May 19 '24 06:05 cooelf