I3D-Tensorflow

How can I modify the train file into a multi-GPU version?

Open Epiphqny opened this issue 6 years ago • 10 comments

Hello, I took the train_ucf file and modified it for multi-GPU training following your resnet-3d version, but it did not work. Could you provide a multi-GPU version of this model? Thanks.

Epiphqny, Nov 30 '18 07:11

You are right, this code only supports one GPU. I will publish the multi-GPU version within two days if you need it.

LossNAN, Dec 03 '18 02:12

Yes, I implemented a multi-GPU version myself, but it still has some problems. I would be very grateful if you could release your version!

Epiphqny, Dec 03 '18 03:12

@Epiphqny The multi-GPU version has been pushed, but I have not validated it because all my GPUs are busy. If you run into bugs you cannot solve, please contact me. The updated code is adapted from my 3D-resnet-tensorflow code and handles RGB only. Data loading now uses a TensorFlow input pipeline for speed (without it, one step takes 4 seconds, which is very slow). Best wishes!

LossNAN, Dec 03 '18 06:12
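
For readers unfamiliar with the pattern, multi-GPU scripts in TensorFlow 1.x commonly use a "tower" layout: every GPU holds the same weights, computes gradients on its own shard of the batch, and the gradients are averaged before one shared update. The thread does not show the pushed script, so the sketch below is only an assumption about the general idea, written in plain Python with hypothetical names (tower_gradient, average_gradients, train_step) and a toy one-weight model instead of I3D.

```python
# Hypothetical sketch of tower-style gradient averaging.
# None of these names come from the repository; the "model" is y = w * x.

def tower_gradient(w, shard):
    """Gradient of the mean squared error on one shard (one simulated GPU)."""
    return sum(2.0 * (w * x - y) * x for x, y in shard) / len(shard)

def average_gradients(tower_grads):
    """Average the per-tower gradients into a single gradient."""
    return sum(tower_grads) / len(tower_grads)

def train_step(w, batch, num_towers, learning_rate):
    """Split the batch across towers, average gradients, apply one shared update."""
    shard_size = len(batch) // num_towers
    shards = [batch[i * shard_size:(i + 1) * shard_size] for i in range(num_towers)]
    grads = [tower_gradient(w, shard) for shard in shards]  # same w on every tower
    return w - learning_rate * average_gradients(grads)

batch = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]  # samples of y = 2x
w = 0.0
for _ in range(200):
    w = train_step(w, batch, num_towers=2, learning_rate=0.01)
print(w)  # approaches 2.0
```

With equal-sized shards and a mean loss, the averaged tower gradient equals the full-batch gradient, so the update matches single-GPU training on the whole batch.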

@LossNAN OK, thank you very much for your help. I will try it.

Epiphqny, Dec 03 '18 06:12

@LossNAN May I ask how you handle the batch normalization variables when saving the model?

Epiphqny, Dec 03 '18 13:12

@Epiphqny If you use my I3D (Inception 3D) code, the network is built with Sonnet (snt.BatchNorm()), which already wraps batch normalization, so you do not need to create 'beta' and 'gamma' yourself. The update ops for 'mean' and 'variance' are added to tf.GraphKeys.UPDATE_OPS, which is why the training code uses 'with tf.control_dependencies(update_ops):' to update them, and they are then saved with the model. When you test with is_training set to False, Sonnet uses the saved 'mean' and 'variance' for the computation. For another version, see the bn_function in my 3d-resnet-tensorflow code.

LossNAN, Dec 04 '18 02:12
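
The train/test behaviour described in the comment above can be illustrated without TensorFlow. This is a minimal sketch in plain Python, not Sonnet's implementation: the class ScalarBatchNorm, its decay value, and the single scalar feature are assumptions made for illustration. The moving-average update inside the is_training branch plays the role of the ops collected under tf.GraphKeys.UPDATE_OPS; 'beta' and 'gamma' are omitted.

```python
class ScalarBatchNorm:
    """Toy batch normalization over one scalar feature (illustrative only)."""

    def __init__(self, decay=0.9, eps=1e-5):
        self.decay = decay          # assumed value, not taken from Sonnet
        self.eps = eps
        self.moving_mean = 0.0      # the statistics that end up in a checkpoint
        self.moving_variance = 1.0

    def __call__(self, batch, is_training):
        if is_training:
            # Normalize with the statistics of the current batch.
            mean = sum(batch) / len(batch)
            variance = sum((x - mean) ** 2 for x in batch) / len(batch)
            # Update step: if this is skipped during training, the moving
            # statistics never change and test-time outputs are wrong.
            self.moving_mean = self.decay * self.moving_mean + (1 - self.decay) * mean
            self.moving_variance = (self.decay * self.moving_variance
                                    + (1 - self.decay) * variance)
        else:
            # Test time: reuse the stored statistics, do not update them.
            mean, variance = self.moving_mean, self.moving_variance
        return [(x - mean) / (variance + self.eps) ** 0.5 for x in batch]

bn = ScalarBatchNorm()
for _ in range(200):
    bn([8.0, 10.0, 12.0], is_training=True)

saved = (bn.moving_mean, bn.moving_variance)
out = bn([10.0], is_training=False)  # normalized with the stored statistics
```

In the TensorFlow 1.x setting discussed here, wrapping the training op in 'with tf.control_dependencies(update_ops):' is what ensures the update step runs on every training iteration.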

@Epiphqny If you want to know more, I will be glad to help with any questions. My QQ number: 346925546

LossNAN, Dec 04 '18 02:12

@LossNAN Added you.

Epiphqny, Dec 04 '18 12:12

Hello, I am also trying to train on multiple GPUs, referring to your code from last December.

I have looked at the multi_gpu_train_kinetics_rgb.py file, and it seems to reference non_local-related code that is not in the repository. I also wonder whether the I3D internals were changed, e.g. to share variables in the I3D network.

bhkim1020, Jun 26 '19 09:06

Hi @LossNAN, thank you for sharing your work. Do you have code for training on a single GPU? I only have one GPU. Also, when I try the multi_gpu code, I get a NameError because learning_rate is not defined. Am I doing something wrong? Could you also please explain the non_local reference in the code?

sanolans, Oct 17 '19 08:10