idinvert
tensorflow.python.framework.errors_impl.InternalError: Blas GEMM launch failed
The training log follows.
dnnlib: Running training.training_loop.training_loop() on localhost...
GPU available: True
GPU devices: /device:GPU:0
>>>>> Create Session
Dataset directory: .
Streaming data using training.dataset.TFRecordDataset...
tfrecord_dir: .\custom-images
Dataset shape = [1, 512, 512]
Dynamic range = [0, 255]
Label size = 0
Constructing networks...
G Params OutputShape WeightShape
--- --- --- ---
latents_in - (?, 512) -
labels_in - (?, 0) -
lod - () -
dlatent_avg - (512,) -
G_mapping/latents_in - (?, 512) -
G_mapping/labels_in - (?, 0) -
G_mapping/PixelNorm - (?, 512) -
G_mapping/Dense0 262656 (?, 512) (512, 512)
G_mapping/Dense1 262656 (?, 512) (512, 512)
G_mapping/Dense2 262656 (?, 512) (512, 512)
G_mapping/Dense3 262656 (?, 512) (512, 512)
G_mapping/Dense4 262656 (?, 512) (512, 512)
G_mapping/Dense5 262656 (?, 512) (512, 512)
G_mapping/Dense6 262656 (?, 512) (512, 512)
G_mapping/Dense7 4202496 (?, 8192) (512, 8192)
G_mapping/Reshape - (?, 16, 512) -
G_mapping/dlatents_out - (?, 16, 512) -
Truncation - (?, 16, 512) -
G_synthesis/dlatents_in - (?, 16, 512) -
G_synthesis/4x4/Const 534528 (?, 512, 4, 4) (512,)
G_synthesis/4x4/Conv 2885632 (?, 512, 4, 4) (3, 3, 512, 512)
G_synthesis/ToRGB_lod7 513 (?, 1, 4, 4) (1, 1, 512, 1)
G_synthesis/8x8/Conv0_up 2885632 (?, 512, 8, 8) (3, 3, 512, 512)
G_synthesis/8x8/Conv1 2885632 (?, 512, 8, 8) (3, 3, 512, 512)
G_synthesis/ToRGB_lod6 513 (?, 1, 8, 8) (1, 1, 512, 1)
G_synthesis/Upscale2D - (?, 1, 8, 8) -
G_synthesis/Grow_lod6 - (?, 1, 8, 8) -
G_synthesis/16x16/Conv0_up 2885632 (?, 512, 16, 16) (3, 3, 512, 512)
G_synthesis/16x16/Conv1 2885632 (?, 512, 16, 16) (3, 3, 512, 512)
G_synthesis/ToRGB_lod5 513 (?, 1, 16, 16) (1, 1, 512, 1)
G_synthesis/Upscale2D_1 - (?, 1, 16, 16) -
G_synthesis/Grow_lod5 - (?, 1, 16, 16) -
G_synthesis/32x32/Conv0_up 2885632 (?, 512, 32, 32) (3, 3, 512, 512)
G_synthesis/32x32/Conv1 2885632 (?, 512, 32, 32) (3, 3, 512, 512)
G_synthesis/ToRGB_lod4 513 (?, 1, 32, 32) (1, 1, 512, 1)
G_synthesis/Upscale2D_2 - (?, 1, 32, 32) -
G_synthesis/Grow_lod4 - (?, 1, 32, 32) -
G_synthesis/64x64/Conv0_up 1442816 (?, 256, 64, 64) (3, 3, 512, 256)
G_synthesis/64x64/Conv1 852992 (?, 256, 64, 64) (3, 3, 256, 256)
G_synthesis/ToRGB_lod3 257 (?, 1, 64, 64) (1, 1, 256, 1)
G_synthesis/Upscale2D_3 - (?, 1, 64, 64) -
G_synthesis/Grow_lod3 - (?, 1, 64, 64) -
G_synthesis/128x128/Conv0_up 426496 (?, 128, 128, 128) (3, 3, 256, 128)
G_synthesis/128x128/Conv1 279040 (?, 128, 128, 128) (3, 3, 128, 128)
G_synthesis/ToRGB_lod2 129 (?, 1, 128, 128) (1, 1, 128, 1)
G_synthesis/Upscale2D_4 - (?, 1, 128, 128) -
G_synthesis/Grow_lod2 - (?, 1, 128, 128) -
G_synthesis/256x256/Conv0_up 139520 (?, 64, 256, 256) (3, 3, 128, 64)
G_synthesis/256x256/Conv1 102656 (?, 64, 256, 256) (3, 3, 64, 64)
G_synthesis/ToRGB_lod1 65 (?, 1, 256, 256) (1, 1, 64, 1)
G_synthesis/Upscale2D_5 - (?, 1, 256, 256) -
G_synthesis/Grow_lod1 - (?, 1, 256, 256) -
G_synthesis/512x512/Conv0_up 51328 (?, 32, 512, 512) (3, 3, 64, 32)
G_synthesis/512x512/Conv1 42112 (?, 32, 512, 512) (3, 3, 32, 32)
G_synthesis/ToRGB_lod0 33 (?, 1, 512, 512) (1, 1, 32, 1)
G_synthesis/Upscale2D_6 - (?, 1, 512, 512) -
G_synthesis/Grow_lod0 - (?, 1, 512, 512) -
G_synthesis/images_out - (?, 1, 512, 512) -
G_synthesis/lod - () -
G_synthesis/noise0 - (1, 1, 4, 4) -
G_synthesis/noise1 - (1, 1, 4, 4) -
G_synthesis/noise2 - (1, 1, 8, 8) -
G_synthesis/noise3 - (1, 1, 8, 8) -
G_synthesis/noise4 - (1, 1, 16, 16) -
G_synthesis/noise5 - (1, 1, 16, 16) -
G_synthesis/noise6 - (1, 1, 32, 32) -
G_synthesis/noise7 - (1, 1, 32, 32) -
G_synthesis/noise8 - (1, 1, 64, 64) -
G_synthesis/noise9 - (1, 1, 64, 64) -
G_synthesis/noise10 - (1, 1, 128, 128) -
G_synthesis/noise11 - (1, 1, 128, 128) -
G_synthesis/noise12 - (1, 1, 256, 256) -
G_synthesis/noise13 - (1, 1, 256, 256) -
G_synthesis/noise14 - (1, 1, 512, 512) -
G_synthesis/noise15 - (1, 1, 512, 512) -
images_out - (?, 1, 512, 512) -
--- --- --- ---
Total 30114536
D Params OutputShape WeightShape
--- --- --- ---
images_in - (?, 1, 512, 512) -
labels_in - (?, 0) -
lod - () -
FromRGB_lod0 64 (?, 32, 512, 512) (1, 1, 1, 32)
512x512/Conv0 9248 (?, 32, 512, 512) (3, 3, 32, 32)
512x512/Conv1_down 18496 (?, 64, 256, 256) (3, 3, 32, 64)
Downscale2D - (?, 1, 256, 256) -
FromRGB_lod1 128 (?, 64, 256, 256) (1, 1, 1, 64)
Grow_lod0 - (?, 64, 256, 256) -
256x256/Conv0 36928 (?, 64, 256, 256) (3, 3, 64, 64)
256x256/Conv1_down 73856 (?, 128, 128, 128) (3, 3, 64, 128)
Downscale2D_1 - (?, 1, 128, 128) -
FromRGB_lod2 256 (?, 128, 128, 128) (1, 1, 1, 128)
Grow_lod1 - (?, 128, 128, 128) -
128x128/Conv0 147584 (?, 128, 128, 128) (3, 3, 128, 128)
128x128/Conv1_down 295168 (?, 256, 64, 64) (3, 3, 128, 256)
Downscale2D_2 - (?, 1, 64, 64) -
FromRGB_lod3 512 (?, 256, 64, 64) (1, 1, 1, 256)
Grow_lod2 - (?, 256, 64, 64) -
64x64/Conv0 590080 (?, 256, 64, 64) (3, 3, 256, 256)
64x64/Conv1_down 1180160 (?, 512, 32, 32) (3, 3, 256, 512)
Downscale2D_3 - (?, 1, 32, 32) -
FromRGB_lod4 1024 (?, 512, 32, 32) (1, 1, 1, 512)
Grow_lod3 - (?, 512, 32, 32) -
32x32/Conv0 2359808 (?, 512, 32, 32) (3, 3, 512, 512)
32x32/Conv1_down 2359808 (?, 512, 16, 16) (3, 3, 512, 512)
Downscale2D_4 - (?, 1, 16, 16) -
FromRGB_lod5 1024 (?, 512, 16, 16) (1, 1, 1, 512)
Grow_lod4 - (?, 512, 16, 16) -
16x16/Conv0 2359808 (?, 512, 16, 16) (3, 3, 512, 512)
16x16/Conv1_down 2359808 (?, 512, 8, 8) (3, 3, 512, 512)
Downscale2D_5 - (?, 1, 8, 8) -
FromRGB_lod6 1024 (?, 512, 8, 8) (1, 1, 1, 512)
Grow_lod5 - (?, 512, 8, 8) -
8x8/Conv0 2359808 (?, 512, 8, 8) (3, 3, 512, 512)
8x8/Conv1_down 2359808 (?, 512, 4, 4) (3, 3, 512, 512)
Downscale2D_6 - (?, 1, 4, 4) -
FromRGB_lod7 1024 (?, 512, 4, 4) (1, 1, 1, 512)
Grow_lod6 - (?, 512, 4, 4) -
4x4/MinibatchStddev - (?, 513, 4, 4) -
4x4/Conv 2364416 (?, 512, 4, 4) (3, 3, 513, 512)
4x4/Dense0 4194816 (?, 512) (8192, 512)
4x4/Dense1 513 (?, 1) (512, 1)
scores_out - (?, 1) -
--- --- --- ---
Total 23075169
Building TensorFlow graph...
Setting up snapshot image grid...
Setting up run dir...
Traceback (most recent call last):
File "D:\Unet\anaconda\envs\tf112\lib\site-packages\tensorflow\python\client\", line 1334, in _do_call
return fn(*args)
File "D:\Unet\anaconda\envs\tf112\lib\site-packages\tensorflow\python\client\", line 1319, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "D:\Unet\anaconda\envs\tf112\lib\site-packages\tensorflow\python\client\", line 1407, in _call_tf_sessionrun
tensorflow.python.framework.errors_impl.InternalError: Blas GEMM launch failed : a.shape=(16, 8192), b.shape=(16, 512), m=8192, n=512, k=16
[[{{node GPU0/TrainD_grad/gradients/GPU0/D_loss/D_1/4x4/Dense0/MatMul_grad/MatMul_1}} = MatMul[T=DT_FLOAT, transpose_a=true, transpose_b=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](GPU0/D_loss/D_1/4x4/Dense0/Reshape, GPU0/TrainD_grad/gradients/GPU0/D_loss/D_1/4x4/Dense0/add_grad/Reshape)]]
[[{{node TrainD/ApplyGrads0/UpdateWeights/cond/pred_id/_1585}} = _HostRecv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_24297_TrainD/ApplyGrads0/UpdateWeights/cond/pred_id", tensor_type=DT_BOOL, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "", line 193, in <module>
File "", line 188, in main
File "D:\dy\idinvert\dnnlib\submission\", line 290, in submit_run
File "D:\dy\idinvert\dnnlib\submission\", line 242, in run_wrapper
util.call_func_by_name(func_name=submit_config.run_func_name, submit_config=submit_config, **submit_config.run_func_kwargs)
File "D:\dy\idinvert\dnnlib\", line 257, in call_func_by_name
return func_obj(*args, **kwargs)
File "D:\dy\idinvert\training\", line 231, in training_loop[D_train_op, Gs_update_op], {lod_in: sched.lod, lrate_in: sched.D_lrate, minibatch_in: sched.minibatch})
File "D:\dy\idinvert\dnnlib\tflib\", line 26, in run
return tf.get_default_session().run(*args, **kwargs)
File "D:\Unet\anaconda\envs\tf112\lib\site-packages\tensorflow\python\client\", line 929, in run
File "D:\Unet\anaconda\envs\tf112\lib\site-packages\tensorflow\python\client\", line 1152, in _run
feed_dict_tensor, options, run_metadata)
File "D:\Unet\anaconda\envs\tf112\lib\site-packages\tensorflow\python\client\", line 1328, in _do_run
File "D:\Unet\anaconda\envs\tf112\lib\site-packages\tensorflow\python\client\", line 1348, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InternalError: Blas GEMM launch failed : a.shape=(16, 8192), b.shape=(16, 512), m=8192, n=512, k=16
[[node GPU0/TrainD_grad/gradients/GPU0/D_loss/D_1/4x4/Dense0/MatMul_grad/MatMul_1 (defined at D:\dy\idinvert\dnnlib\tflib\ = MatMul[T=DT_FLOAT, transpose_a=true, transpose_b=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](GPU0/D_loss/D_1/4x4/Dense0/Reshape, GPU0/TrainD_grad/gradients/GPU0/D_loss/D_1/4x4/Dense0/add_grad/Reshape)]]
[[{{node TrainD/ApplyGrads0/UpdateWeights/cond/pred_id/_1585}} = _HostRecv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_24297_TrainD/ApplyGrads0/UpdateWeights/cond/pred_id", tensor_type=DT_BOOL, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
Caused by op 'GPU0/TrainD_grad/gradients/GPU0/D_loss/D_1/4x4/Dense0/MatMul_grad/MatMul_1', defined at:
File "", line 193, in <module>
File "", line 188, in main
File "D:\dy\idinvert\dnnlib\submission\", line 290, in submit_run
File "D:\dy\idinvert\dnnlib\submission\", line 242, in run_wrapper
util.call_func_by_name(func_name=submit_config.run_func_name, submit_config=submit_config, **submit_config.run_func_kwargs)
File "D:\dy\idinvert\dnnlib\", line 257, in call_func_by_name
return func_obj(*args, **kwargs)
File "D:\dy\idinvert\training\", line 184, in training_loop
D_opt.register_gradients(tf.reduce_mean(D_loss), D_gpu.trainables)
File "D:\dy\idinvert\dnnlib\tflib\", line 98, in register_gradients
grads = self._dev_opt[dev].compute_gradients(loss, trainable_vars, gate_gradients=tf.train.Optimizer.GATE_NONE) # disable gating to reduce memory usage
File "D:\Unet\anaconda\envs\tf112\lib\site-packages\tensorflow\python\training\", line 519, in compute_gradients
File "D:\Unet\anaconda\envs\tf112\lib\site-packages\tensorflow\python\ops\", line 630, in gradients
gate_gradients, aggregation_method, stop_gradients)
File "D:\Unet\anaconda\envs\tf112\lib\site-packages\tensorflow\python\ops\", line 814, in _GradientsHelper
lambda: grad_fn(op, *out_grads))
File "D:\Unet\anaconda\envs\tf112\lib\site-packages\tensorflow\python\ops\", line 408, in _MaybeCompile
return grad_fn() # Exit early
File "D:\Unet\anaconda\envs\tf112\lib\site-packages\tensorflow\python\ops\", line 814, in <lambda>
lambda: grad_fn(op, *out_grads))
File "D:\Unet\anaconda\envs\tf112\lib\site-packages\tensorflow\python\ops\", line 1131, in _MatMulGrad
grad_b = gen_math_ops.mat_mul(a, grad, transpose_a=True)
File "D:\Unet\anaconda\envs\tf112\lib\site-packages\tensorflow\python\ops\", line 4560, in mat_mul
File "D:\Unet\anaconda\envs\tf112\lib\site-packages\tensorflow\python\framework\", line 787, in _apply_op_helper
File "D:\Unet\anaconda\envs\tf112\lib\site-packages\tensorflow\python\util\", line 488, in new_func
return func(*args, **kwargs)
File "D:\Unet\anaconda\envs\tf112\lib\site-packages\tensorflow\python\framework\", line 3274, in create_op
File "D:\Unet\anaconda\envs\tf112\lib\site-packages\tensorflow\python\framework\", line 1770, in __init__
self._traceback = tf_stack.extract_stack()
...which was originally created as op 'GPU0/D_loss/D_1/4x4/Dense0/MatMul', defined at:
File "", line 193, in <module>
[elided 3 identical lines from previous traceback]
File "D:\dy\idinvert\dnnlib\", line 257, in call_func_by_name
return func_obj(*args, **kwargs)
File "D:\dy\idinvert\training\", line 182, in training_loop
D_loss = dnnlib.util.call_func_by_name(G=G_gpu, D=D_gpu, opt=D_opt, training_set=training_set, minibatch_size=minibatch_split, reals=reals, labels=labels, **D_loss_args)
File "D:\dy\idinvert\dnnlib\", line 257, in call_func_by_name
return func_obj(*args, **kwargs)
File "D:\dy\idinvert\training\", line 154, in D_logistic_simplegp
fake_scores_out = fp32(D.get_output_for(fake_images_out, labels, is_training=True))
File "D:\dy\idinvert\dnnlib\tflib\", line 222, in get_output_for
out_expr = self._build_func(*final_inputs, **build_kwargs)
File "D:\dy\idinvert\training\", line 654, in D_basic
scores_out = grow(2, resolution_log2 - 2)
File "D:\dy\idinvert\training\", line 651, in grow
x = block(x(), res); y = lambda: x
File "D:\dy\idinvert\training\", line 619, in block
x = act(apply_bias(dense(x, fmaps=nf(res-2), gain=gain, use_wscale=use_wscale)))
File "D:\dy\idinvert\training\", line 159, in dense
return tf.matmul(x, w)
File "D:\Unet\anaconda\envs\tf112\lib\site-packages\tensorflow\python\ops\", line 2057, in matmul
a, b, transpose_a=transpose_a, transpose_b=transpose_b, name=name)
File "D:\Unet\anaconda\envs\tf112\lib\site-packages\tensorflow\python\ops\", line 4560, in mat_mul
File "D:\Unet\anaconda\envs\tf112\lib\site-packages\tensorflow\python\framework\", line 787, in _apply_op_helper
File "D:\Unet\anaconda\envs\tf112\lib\site-packages\tensorflow\python\util\", line 488, in new_func
return func(*args, **kwargs)
File "D:\Unet\anaconda\envs\tf112\lib\site-packages\tensorflow\python\framework\", line 3274, in create_op
InternalError (see above for traceback): Blas GEMM launch failed : a.shape=(16, 8192), b.shape=(16, 512), m=8192, n=512, k=16
[[node GPU0/TrainD_grad/gradients/GPU0/D_loss/D_1/4x4/Dense0/MatMul_grad/MatMul_1 (defined at D:\dy\idinvert\dnnlib\tflib\ = MatMul[T=DT_FLOAT, transpose_a=true, transpose_b=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](GPU0/D_loss/D_1/4x4/Dense0/Reshape, GPU0/TrainD_grad/gradients/GPU0/D_loss/D_1/4x4/Dense0/add_grad/Reshape)]]
[[{{node TrainD/ApplyGrads0/UpdateWeights/cond/pred_id/_1585}} = _HostRecv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_24297_TrainD/ApplyGrads0/UpdateWeights/cond/pred_id", tensor_type=DT_BOOL, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
Is this problem caused by a large batch_size? When I turn down the batch_size, the problem still occurs.
You can try images with a resolution of 256x256 and see whether the problem still happens.
The problem still occurred.
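
On the batch_size question, the shapes printed in the error message can be checked by hand. The sketch below is plain Python and does not use TensorFlow; the helper names `matmul_shape` and `mib` are made up for illustration. It takes `a.shape`, `b.shape` and `transpose_a=true` from the log and works out the result shape and the float32 size of each operand. Reading the leading 16 as the per-GPU minibatch is an assumption, based on the `(?, 8192)` input of `4x4/Dense0` in the D table.

```python
# Shapes copied from the error message in the log:
#   a.shape=(16, 8192), b.shape=(16, 512), transpose_a=true, m=8192, n=512, k=16
a_shape = (16, 8192)   # GPU0/D_loss/D_1/4x4/Dense0/Reshape (assumed: minibatch x features)
b_shape = (16, 512)    # gradient flowing back into the Dense0 output

BYTES_PER_FLOAT32 = 4  # the op is MatMul[T=DT_FLOAT]


def matmul_shape(a, b, transpose_a=False, transpose_b=False):
    """Return the result shape of a 2-D matmul, or raise if inner dims differ."""
    a_rows, a_cols = (a[1], a[0]) if transpose_a else a
    b_rows, b_cols = (b[1], b[0]) if transpose_b else b
    if a_cols != b_rows:
        raise ValueError(f"inner dimensions differ: {a_cols} vs {b_rows}")
    return (a_rows, b_cols)


def mib(shape):
    """Size of a float32 matrix with the given 2-D shape, in MiB."""
    return shape[0] * shape[1] * BYTES_PER_FLOAT32 / 2**20


# Weight gradient of Dense0: transpose(a) @ b.
out_shape = matmul_shape(a_shape, b_shape, transpose_a=True)
m, n = out_shape
k = a_shape[0]

# Parameter count of 4x4/Dense0: weights plus one bias per output unit.
dense0_params = m * n + n

print("result shape:", out_shape)
print("m, n, k:", m, n, k)
print("MiB (a, b, result):", mib(a_shape), mib(b_shape), mib(out_shape))
print("Dense0 params:", dense0_params)
```

The result shape `(8192, 512)` matches the `WeightShape` of `4x4/Dense0` in the D table, and the parameter count matches the 4194816 listed there, so the failing node is the weight gradient of that layer. Its operands total roughly 16.5 MiB in float32, which is small for a GPU. That is consistent with the error persisting after batch_size was reduced, but it does not by itself identify the actual cause.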