dc_tts-transfer-learning

ERROR : Only when Training Text2Mel

Open sallyjoy opened this issue 5 years ago • 2 comments

I am getting errors with the command (train_transfer.py 1). Strangely, there is no problem with (train_transfer.py 2).

name: Tesla P100-PCIE-16GB major: 6 minor: 0 memoryClockRate(GHz): 1.3285
pciBusID: 0000:00:04.0
totalMemory: 15.90GiB freeMemory: 15.61GiB
2019-04-14 22:02:31.187878: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0
2019-04-14 22:02:31.604761: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-04-14 22:02:31.604834: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 0
2019-04-14 22:02:31.604847: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0: N
2019-04-14 22:02:31.605171: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 15121 MB memory) -> physical GPU (device: 0, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:04.0, compute capability: 6.0)
WARNING:tensorflow:From /opt/conda/lib/python3.6/site-packages/tensorflow/python/training/saver.py:1266: checkpoint_exists (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to check for files with this prefix.
Traceback (most recent call last):
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1334, in _do_call
    return fn(*args)
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1319, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1407, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [35,128] rhs shape= [32,128]
  [[{{node save/Assign_624}}]]
  [[{{node save/RestoreV2}}]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1276, in restore
    {self.saver_def.filename_tensor_name: save_path})
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 929, in run
    run_metadata_ptr)
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1152, in _run
    feed_dict_tensor, options, run_metadata)
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1328, in _do_run
    run_metadata)
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1348, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [35,128] rhs shape= [32,128]
  [[node save/Assign_624 (defined at train_transfer.py:171) ]]
  [[node save/RestoreV2 (defined at train_transfer.py:171) ]]

Caused by op 'save/Assign_624', defined at:
  File "train_transfer.py", line 171, in <module>
    sv = tf.train.Supervisor(logdir=logdir, save_model_secs=0, global_step=g.global_step)
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 324, in new_func
    return func(*args, **kwargs)
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/training/supervisor.py", line 320, in __init__
    self._init_saver(saver=saver)
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/training/supervisor.py", line 470, in _init_saver
    saver = saver_mod.Saver()
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 832, in __init__
    self.build()
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 844, in build
    self._build(self._filename, build_save=True, build_restore=True)
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 881, in _build
    build_save=build_save, build_restore=build_restore)
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 513, in _build_internal
    restore_sequentially, reshape)
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 354, in _AddRestoreOps
    assign_ops.append(saveable.restore(saveable_tensors, shapes))
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/training/saving/saveable_object_util.py", line 73, in restore
    self.op.get_shape().is_fully_defined())
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/ops/state_ops.py", line 223, in assign
    validate_shape=validate_shape)
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/ops/gen_state_ops.py", line 64, in assign
    use_locking=use_locking, name=name)
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 788, in _apply_op_helper
    op_def=op_def)
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3300, in create_op
    op_def=op_def)
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1801, in __init__
    self._traceback = tf_stack.extract_stack()

InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match. lhs shape= [35,128] rhs shape= [32,128] [[node save/Assign_624 (defined at train_transfer.py:171) ]] [[node save/RestoreV2 (defined at train_transfer.py:171) ]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "train_transfer.py", line 174, in <module>
    sv.saver.restore(sess, tf.train.latest_checkpoint(hp.restoredir + "-" + str(num)))
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1312, in restore
    err, "a mismatch between the current graph and the graph")
tensorflow.python.framework.errors_impl.InvalidArgumentError: Restoring from checkpoint failed. This is most likely due to a mismatch between the current graph and the graph from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

Assign requires shapes of both tensors to match. lhs shape= [35,128] rhs shape= [32,128] [[node save/Assign_624 (defined at train_transfer.py:171) ]] [[node save/RestoreV2 (defined at train_transfer.py:171) ]]

Caused by op 'save/Assign_624', defined at:
  File "train_transfer.py", line 171, in <module>
    sv = tf.train.Supervisor(logdir=logdir, save_model_secs=0, global_step=g.global_step)
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 324, in new_func
    return func(*args, **kwargs)
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/training/supervisor.py", line 320, in __init__
    self._init_saver(saver=saver)
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/training/supervisor.py", line 470, in _init_saver
    saver = saver_mod.Saver()
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 832, in __init__
    self.build()
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 844, in build
    self._build(self._filename, build_save=True, build_restore=True)
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 881, in _build
    build_save=build_save, build_restore=build_restore)
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 513, in _build_internal
    restore_sequentially, reshape)
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 354, in _AddRestoreOps
    assign_ops.append(saveable.restore(saveable_tensors, shapes))
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/training/saving/saveable_object_util.py", line 73, in restore
    self.op.get_shape().is_fully_defined())
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/ops/state_ops.py", line 223, in assign
    validate_shape=validate_shape)
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/ops/gen_state_ops.py", line 64, in assign
    use_locking=use_locking, name=name)
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 788, in _apply_op_helper
    op_def=op_def)
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3300, in create_op
    op_def=op_def)
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1801, in __init__
    self._traceback = tf_stack.extract_stack()

InvalidArgumentError (see above for traceback): Restoring from checkpoint failed. This is most likely due to a mismatch between the current graph and the graph from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

Assign requires shapes of both tensors to match. lhs shape= [35,128] rhs shape= [32,128] [[node save/Assign_624 (defined at train_transfer.py:171) ]] [[node save/RestoreV2 (defined at train_transfer.py:171) ]]

sallyjoy avatar Apr 14 '19 22:04 sallyjoy

Did you manage to resolve it?

energyanalyst avatar Jul 24 '19 16:07 energyanalyst

> Did you manage to resolve it?

I got this error because I had added a punctuation symbol to the parameter called vocab in hyperparams.py, so be careful. Modifying some parameters in this file can change variable shapes in the graph and lead to mismatch errors when the pretrained checkpoint is restored. Hope it helps.

sallyjoy avatar Jul 25 '19 14:07 sallyjoy
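For anyone hitting the same error, the mismatch described above can be checked before training starts. The following is only a minimal sketch in plain Python, not code from the repository: the helper name find_shape_mismatches and the variable names in the example dictionaries are hypothetical, and the shapes simply mirror the [35,128] (current graph) versus [32,128] (checkpoint) pair from the traceback.

```python
from typing import Dict, List, Sequence, Tuple

Shape = Tuple[int, ...]


def find_shape_mismatches(
    graph_shapes: Dict[str, Sequence[int]],
    checkpoint_shapes: Dict[str, Sequence[int]],
) -> List[Tuple[str, Shape, Shape]]:
    """Return (name, graph_shape, checkpoint_shape) for every variable that
    exists in both mappings but has a different shape in each."""
    mismatches = []
    for name in sorted(graph_shapes):
        if name not in checkpoint_shapes:
            continue  # variable is not stored in the checkpoint at all
        graph_shape = tuple(graph_shapes[name])
        ckpt_shape = tuple(checkpoint_shapes[name])
        if graph_shape != ckpt_shape:
            mismatches.append((name, graph_shape, ckpt_shape))
    return mismatches


# Hypothetical example: the variable names are made up for illustration.
# In a TensorFlow 1.x script the two dicts could plausibly be built with
#   dict(tf.train.list_variables(checkpoint_dir))
#   {v.op.name: v.shape.as_list() for v in tf.global_variables()}
# (an assumption about how one would wire this up, not the repo's code).
graph_shapes = {"text_embedding": (35, 128), "conv_kernel": (3, 128, 128)}
checkpoint_shapes = {"text_embedding": (32, 128), "conv_kernel": (3, 128, 128)}

for name, graph_shape, ckpt_shape in find_shape_mismatches(graph_shapes, checkpoint_shapes):
    print(f"{name}: graph expects {graph_shape}, checkpoint has {ckpt_shape}")
```

If a variable whose first dimension should equal the vocabulary size shows up here, the vocab in hyperparams.py most likely no longer matches the one the checkpoint was trained with.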