gpt-2-tensorflow2.0

Error while training

Open • MadRajib opened this issue 4 years ago • 1 comment

Latest checkpoint restored...............
Running in graph mode.............
Traceback (most recent call last):
  File "train_gpt2.py", line 77, in <module>
    train()
  File "/usr/lib/python3/dist-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/usr/lib/python3/dist-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/lib/python3/dist-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "train_gpt2.py", line 72, in train
    model.fit([train_dataset, test_dataset], graph_mode)
  File "/home/madrajib/workspace/projects/master_project/gpt-2-tensorflow2.0/gpt2_model.py", line 282, in fit
    step, loss, perplexity = train_func(inputs, targets)
  File "/home/madrajib/.local/lib/python3.8/site-packages/tensorflow/python/eager/def_function.py", line 780, in __call__
    result = self._call(*args, **kwds)
  File "/home/madrajib/.local/lib/python3.8/site-packages/tensorflow/python/eager/def_function.py", line 840, in _call
    return self._stateless_fn(*args, **kwds)
  File "/home/madrajib/.local/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 2829, in __call__
    return graph_function._filtered_call(args, kwargs)  # pylint: disable=protected-access
  File "/home/madrajib/.local/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 1843, in _filtered_call
    return self._call_flat(
  File "/home/madrajib/.local/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 1923, in _call_flat
    return self._build_call_outputs(self._inference_function.call(
  File "/home/madrajib/.local/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 545, in call
    outputs = execute.execute(
  File "/home/madrajib/.local/lib/python3.8/site-packages/tensorflow/python/eager/execute.py", line 59, in quick_execute
    tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
tensorflow.python.framework.errors_impl.InvalidArgumentError:  indices[2,1] = 31868 is not in [0, 24512)
	 [[node gpt2/embeddings/embedding_layer/embedding/embedding_lookup (defined at /home/madrajib/workspace/projects/master_project/gpt-2-tensorflow2.0/layers/embedding_layer.py:41) ]] [Op:__inference_train_step_6798]

Errors may have originated from an input operation.
Input Source operations connected to node gpt2/embeddings/embedding_layer/embedding/embedding_lookup:
 Inputs (defined at /home/madrajib/workspace/projects/master_project/gpt-2-tensorflow2.0/gpt2_model.py:282)

Function call stack:
train_step

MadRajib, Jan 06 '21 13:01

Have you tried the solutions on this one? https://github.com/tensorflow/tensorflow/issues/2734
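
Reading the message itself: `indices[2,1] = 31868 is not in [0, 24512)` says that a token id of 31868 reached an embedding table that only has 24512 rows. One common cause is data encoded with a larger vocabulary than the one the model was built with, though that is an assumption here and not confirmed in this thread. A minimal sketch of a sanity check you could run on a batch before training, in plain Python; the function name `find_out_of_range_ids` and the sample batch are hypothetical and not part of the repository, and only the numbers 31868, 24512 and the position [2, 1] come from the error above:

```python
from typing import List, Tuple


def find_out_of_range_ids(
    batch: List[List[int]], vocab_size: int
) -> List[Tuple[int, int, int]]:
    """Return (row, col, token_id) for every id outside [0, vocab_size)."""
    return [
        (r, c, tok)
        for r, row in enumerate(batch)
        for c, tok in enumerate(row)
        if not 0 <= tok < vocab_size
    ]


# Number of rows in the embedding table, taken from the error message.
EMBEDDING_ROWS = 24512

# Hypothetical batch of token ids; only the value at [2, 1] is from the error.
batch = [
    [10, 20, 30],
    [40, 50, 60],
    [70, 31868, 90],
]

bad = find_out_of_range_ids(batch, EMBEDDING_ROWS)
print(bad)  # [(2, 1, 31868)]
```

If a check like this reports any ids, the vocabulary size passed to the model and the vocabulary used to encode the dataset probably disagree, and one of them would need to be regenerated to match the other.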

brunodpoliveira, Mar 10 '21 15:03