
Issue while training GAN

Open RamyaRaghuraman opened this issue 5 years ago • 3 comments

Hi everyone,

I face the following issue when running this command for the tiny imagenet dataset:

python src/train_gan.py --epochs 10

Error:

    Traceback (most recent call last):
      File "C:/Users/RAR7ABT/pj-val-ml/pjval_ml/OSR/counterfactual/src/train_gan.py", line 32, in <module>
        train_gan(networks, optimizers, dataloader, epoch=epoch, **options)
      File "C:\Users\RAR7ABT\pj-val-ml\pjval_ml\OSR\counterfactual\src\training.py", line 67, in train_gan
        logits = netD(images)[:,0]
      File "C:\Users\RAR7ABT\AppData\Local\conda\conda\envs\pjval\lib\site-packages\torch\nn\modules\module.py", line 493, in __call__
        result = self.forward(*input, **kwargs)
      File "C:\Users\RAR7ABT\pj-val-ml\pjval_ml\OSR\counterfactual\src\network_definitions.py", line 275, in forward
        x = self.fc1(x)
      File "C:\Users\RAR7ABT\AppData\Local\conda\conda\envs\pjval\lib\site-packages\torch\nn\modules\module.py", line 493, in __call__
        result = self.forward(*input, **kwargs)
      File "C:\Users\RAR7ABT\AppData\Local\conda\conda\envs\pjval\lib\site-packages\torch\nn\modules\linear.py", line 92, in forward
        return F.linear(input, self.weight, self.bias)
      File "C:\Users\RAR7ABT\AppData\Local\conda\conda\envs\pjval\lib\site-packages\torch\nn\functional.py", line 1406, in linear
        ret = torch.addmm(bias, input, weight.t())
    RuntimeError: size mismatch, m1: [64 x 16384], m2: [4096 x 20] at C:/w/1/s/tmp_conda_3.6_041836/conda/conda-bld/pytorch_1556684464974/work/aten/src\THC/generic/THCTensorMathBlas.cu:268

Process finished with exit code 1

Any help would be really appreciated. Thanks in advance!

RamyaRaghuraman avatar Aug 02 '19 08:08 RamyaRaghuraman

@lwneal the error seems to come from x = self.fc1(x) in class multiclassDiscriminator32(nn.Module). The size of m1 should be [64 x 4096], but I somehow end up with m1: [64 x 16384].

Please do take a look at the discriminator updates in training.py @lwneal @mattolson93

RamyaRaghuraman avatar Aug 09 '19 07:08 RamyaRaghuraman

@RamyaRaghuraman, I ran into a similar issue when training only the baseline classifier; for me, the error was because I had not resized the input images from 64x64 to 32x32. If this resizing doesn't happen, the flattened output of the conv layers ends up 4x bigger, which would explain why your output size is 16384 instead of the expected 4096, since 16384 = 4 * 4096.
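The arithmetic can be checked directly: the conv stack downsamples the spatial dimensions by a fixed factor before flattening, so doubling the input side length quadruples the feature count fed to fc1. A minimal sketch in plain Python (the downsampling factor of 8 and the 256 output channels are assumptions chosen to match the 32x32 -> 4096 case, not read from network_definitions.py):

```python
def flattened_size(input_side, downsample_factor=8, out_channels=256):
    # The conv stack halves the spatial dims log2(downsample_factor) times,
    # then the final feature map is flattened before fc1.
    side = input_side // downsample_factor
    return side * side * out_channels

# 32x32 input -> the 4096 features that fc1 (4096 x 20) expects
print(flattened_size(32))  # 4096
# 64x64 input -> 16384 = 4 * 4096, matching the reported mismatch
print(flattened_size(64))  # 16384
```

So a [64 x 16384] m1 with batch size 64 is exactly what an un-resized 64x64 Tiny ImageNet batch would produce.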

KevLuo avatar Sep 12 '20 00:09 KevLuo

@KevLuo Hello sir, I encountered the same problem. I see the ImageConverter is documented as resizing images to 32x32:

# Crops, resizes, normalizes, performs any desired augmentations
# Outputs images as eg. 32x32x3 np.array or eg. 3x32x32 torch.FloatTensor

but it looks like it didn't. So, do we need to rewrite the converter to add a resize transform?
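If the converter really isn't resizing, one workaround is to shrink each image to 32x32 before it reaches the discriminator. Here is a dependency-free nearest-neighbour sketch (the function name and the assumption that an image arrives as an H x W x C nested list are mine, purely for illustration; in a real pipeline you would more likely insert a resize step into the converter or the dataloader transforms):

```python
def resize_nearest(img, out_h=32, out_w=32):
    """Nearest-neighbour resize of an H x W x C nested-list image."""
    in_h, in_w = len(img), len(img[0])
    return [
        [img[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]

# A dummy 64x64 RGB image, downsized to the 32x32 the network expects.
img64 = [[[0, 0, 0] for _ in range(64)] for _ in range(64)]
img32 = resize_nearest(img64)
print(len(img32), len(img32[0]))  # 32 32
```

With the batch resized this way, the flattened feature size going into fc1 drops back from 16384 to the 4096 that the layer's weights expect.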

77flyy avatar Nov 22 '23 13:11 77flyy