DCLGAN
Error when training simdcl with batch_size > 1
Hi, I ran the following config,
python train.py \
--dataroot datasets/a_to_b \
--name a_to_b \
--model simdcl \
--no_flip \
--lr_policy linear \
--gpu_ids 0,1,2,3 \
--direction AtoB \
--num_threads 60 \
--batch_size 4
which resulted in
File "train.py", line 46, in <module>
model.optimize_parameters() # calculate loss functions, get gradients, update network weights
File ".../DCLGAN/models/simdcl_model.py", line 163, in optimize_parameters
self.loss_G = self.compute_G_loss()
File ".../DCLGAN/models/simdcl_model.py", line 239, in compute_G_loss
(self.real_A, self.fake_B, self.real_B, self.fake_A)
File ".../DCLGAN/models/simdcl_model.py", line 276, in calculate_Sim_loss_all
feature_realA[i] = feat_k_pool1[i]
RuntimeError: The expanded size of the tensor (256) must match the existing size (1024) at non-singleton dimension 0. Target sizes: [256, 256]. Tensor sizes: [1024, 256]
The error does not occur with --batch_size 1, and the tensor sizes in the mismatch scale with the batch size (1024 = 4 × 256).
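A minimal sketch of what seems to be going wrong, using the shapes from the traceback (patch_num = 256, nc = 256, batch_size = 4). The buffer and variable names here are assumptions for illustration, not the repository's actual code: assigning a whole-batch feature tensor into a per-image slot triggers the same broadcast failure.

```python
import torch

batch_size, patch_num, nc = 4, 256, 256

# Assumed per-image buffer, analogous to feature_realA[i]: [patch_num, nc]
feature_realA_i = torch.zeros(patch_num, nc)

# The pooled features cover all patches of the whole batch:
# [batch_size * patch_num, nc] -> [1024, 256] for batch_size 4.
feat_k_pool1_i = torch.randn(batch_size * patch_num, nc)

try:
    # Analogous to feature_realA[i] = feat_k_pool1[i]: PyTorch tries to
    # broadcast [1024, 256] into [256, 256] and fails at dimension 0.
    feature_realA_i.copy_(feat_k_pool1_i)
except RuntimeError as e:
    print(type(e).__name__)  # shape mismatch along dim 0: 1024 vs 256
```

With batch_size 1 the pooled tensor is [256, 256], the shapes agree, and the assignment succeeds, which matches the observation above.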
Cheers, Paul
Hello Paul, Oh, I didn't check the cases for batch_size > 1, since batch_size > 1 tends to degrade performance (not only for DCLGAN/SimDCL, but for a fairly large percentage of unsupervised I2I models).
To run the models: use batch_size == 1 for SimDCL, or switch to DCLGAN for batch_size > 1.
To fix this: I think we need to check the part that computes the similarity loss (especially the light networks H, netF3-F6 in the code). Could you try changing x = x.view(1, -1, self.patch_num, self.nc) to x = x.view(num_of_batch_size, -1, self.patch_num, self.nc) at line 514 of /models/networks.py?
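A quick sketch of why that reshape matters, again with the shapes from the traceback (the hard-coded leading 1 vs. the real batch size). Both views succeed as reshapes, but only the second keeps each image's patches in its own leading slot:

```python
import torch

batch_size, patch_num, nc = 4, 256, 256

# Pooled patch features for the whole batch: [batch_size * patch_num, nc].
feat = torch.randn(batch_size * patch_num, nc)

# Original code: leading dimension hard-coded to 1. The element counts
# still match, so no error is raised here, but all 4 * patch_num patches
# end up grouped under a single "image", which breaks per-image indexing
# downstream (hence the [1024, 256] vs [256, 256] mismatch later).
x_bad = feat.view(1, -1, patch_num, nc)
assert x_bad.shape == (1, batch_size, patch_num, nc)

# Suggested fix: use the actual batch size as the leading dimension, so
# each batch item keeps exactly its own patch_num patches.
x_good = feat.view(batch_size, -1, patch_num, nc)
assert x_good.shape == (batch_size, 1, patch_num, nc)
```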
Cheers, Junlin
Hi Junlin, oh, good to know! On CUT I observed fewer mode collapses with batch_size > 1, so I thought that might also benefit DCLGAN and SimDCL. :D The results with batch_size = 1 for SimDCL and DCLGAN batch_size = 4 look very nice so far. ;)
Sure! As soon as I find some time, I'll try! Thanks for the fast response! :)
Cheers, Paul
Thanks Paul! Yeah, I guess so, since a larger batch size should provide more stable gradients, helping to jump out of local minima (where mode collapse usually happens).
Thank you again for sharing this valuable information!
And thank you for sharing the code! :)