CosFace
Training code should be modified if multiple GPUs are used
When I use four GPUs to train the CosFace model, the following exception occurs:
ValueError: Variable conv1_/conv2d/kernel already exists, disallowed. Did you mean to set reuse=True or reuse=tf.AUTO_REUSE in VarScope? Originally defined at:
File "networks/sphere_network.py", line 49, in first_conv
network = tf.layers.conv2d(input, num_output, kernel_size = [3, 3], strides = (2, 2), padding = 'same', kernel_initializer = xavier, bias_initializer = zero_init, kernel_regularizer = l2_regularizer, bias_regularizer = l2_regularizer)
File "networks/sphere_network.py", line 14, in infer
network = first_conv(input, 64, name = 'conv1')
File "train/train_multi_gpu.py", line 197, in main
prelogits = network.infer(batch_image_split[i], args.embedding_size)
The call prelogits = network.infer(batch_image_split[i], args.embedding_size) constructs the graph once per GPU, but there is no reuse setting, so every GPU after the first tries to create variables that already exist. It should be changed to:
    with tf.variable_scope(name_or_scope='', reuse=tf.AUTO_REUSE):
        prelogits = network.infer(batch_image_split[i], args.embedding_size)
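To illustrate why the second GPU fails and why an auto-reuse scope avoids it, here is a minimal pure-Python sketch. It is a toy model only, not TensorFlow code: `ToyVariableStore`, `build_towers`, and the `auto_reuse` flag are hypothetical names made up for this example, and the real variable-scope machinery is more involved.

```python
# Toy stand-in for TensorFlow 1.x variable scoping. NOT the real TensorFlow API;
# all names here are hypothetical and exist only for illustration.
class ToyVariableStore:
    """Maps variable names to values, loosely like a graph's variable store."""

    def __init__(self):
        self._vars = {}

    def get_variable(self, name, size, auto_reuse=False):
        if name in self._vars:
            if not auto_reuse:
                # Same failure mode as the ValueError in the traceback above.
                raise ValueError("Variable %s already exists, disallowed." % name)
            return self._vars[name]  # hand back the existing weights
        self._vars[name] = [0.0] * size  # create the variable on first use
        return self._vars[name]

    def names(self):
        return sorted(self._vars)


def build_towers(store, num_gpus, auto_reuse):
    """Builds one 'tower' per GPU; every tower asks for the same kernel name."""
    kernels = []
    for _ in range(num_gpus):
        kernels.append(
            store.get_variable("conv1_/conv2d/kernel", 9, auto_reuse=auto_reuse)
        )
    return kernels


# Without reuse, the first tower creates the kernel and the second one raises.
try:
    build_towers(ToyVariableStore(), num_gpus=4, auto_reuse=False)
    failed = False
except ValueError:
    failed = True

# With auto-reuse, all four towers share a single kernel variable.
shared_store = ToyVariableStore()
shared = build_towers(shared_store, num_gpus=4, auto_reuse=True)
```

In this sketch the first tower creates the kernel and later towers receive the same object, which is the behaviour multi-GPU training relies on so that every GPU updates one set of weights.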
If you want to use multiple GPUs to train the model, you can switch NETWORK=sphere_network to NETWORK=resface in train.sh. The resface network is the implementation for multiple GPUs. I just find that the accuracy of sphere_network is better than that of resface.
@AlexWang90 First of all, thank you and the author yule-li! Is it possible to perform multi-GPU training by modifying only this part? Looking forward to your reply.