gpt-2-simple
Error when setting batch size
While checking the source code, I found that the training function accepts a batch size, so I set the batch size to 12 and ran it. I then got the following error...
Errors may have originated from an input operation.
Input Source operations connected to node model/MatMul:
model/wte/read (defined at /usr/local/lib/python3.6/dist-packages/gpt_2_simple/src/model.py:185)
model/Reshape (defined at /usr/local/lib/python3.6/dist-packages/gpt_2_simple/src/model.py:202)
Original stack trace for 'model/MatMul':
File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/usr/local/lib/python3.6/dist-packages/ipykernel_launcher.py", line 16, in <module>
app.launch_new_instance()
File "/usr/local/lib/python3.6/dist-packages/traitlets/config/application.py", line 658, in launch_instance
app.start()
File "/usr/local/lib/python3.6/dist-packages/ipykernel/kernelapp.py", line 477, in start
ioloop.IOLoop.instance().start()
File "/usr/local/lib/python3.6/dist-packages/tornado/ioloop.py", line 888, in start
handler_func(fd_obj, events)
File "/usr/local/lib/python3.6/dist-packages/tornado/stack_context.py", line 277, in null_wrapper
return fn(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/zmq/eventloop/zmqstream.py", line 450, in _handle_events
self._handle_recv()
File "/usr/local/lib/python3.6/dist-packages/zmq/eventloop/zmqstream.py", line 480, in _handle_recv
self._run_callback(callback, msg)
File "/usr/local/lib/python3.6/dist-packages/zmq/eventloop/zmqstream.py", line 432, in _run_callback
callback(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/tornado/stack_context.py", line 277, in null_wrapper
return fn(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/ipykernel/kernelbase.py", line 283, in dispatcher
return self.dispatch_shell(stream, msg)
File "/usr/local/lib/python3.6/dist-packages/ipykernel/kernelbase.py", line 235, in dispatch_shell
handler(stream, idents, msg)
File "/usr/local/lib/python3.6/dist-packages/ipykernel/kernelbase.py", line 399, in execute_request
user_expressions, allow_stdin)
File "/usr/local/lib/python3.6/dist-packages/ipykernel/ipkernel.py", line 196, in do_execute
res = shell.run_cell(code, store_history=store_history, silent=silent)
File "/usr/local/lib/python3.6/dist-packages/ipykernel/zmqshell.py", line 533, in run_cell
return super(ZMQInteractiveShell, self).run_cell(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/IPython/core/interactiveshell.py", line 2718, in run_cell
interactivity=interactivity, compiler=compiler, result=result)
File "/usr/local/lib/python3.6/dist-packages/IPython/core/interactiveshell.py", line 2822, in run_ast_nodes
if self.run_code(code, result):
File "/usr/local/lib/python3.6/dist-packages/IPython/core/interactiveshell.py", line 2882, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "", line 5, in <module>
gpt2.finetune(sess, dataset="train.txt", model_name='345M', batch_size=24, steps=4000, restore_from='latest', print_every=500, sample_every=400, save_every=2000)
File "/usr/local/lib/python3.6/dist-packages/gpt_2_simple/gpt_2.py", line 170, in finetune
output = model.model(hparams=hparams, X=context)
File "/usr/local/lib/python3.6/dist-packages/gpt_2_simple/src/model.py", line 203, in model
logits = tf.matmul(h_flat, wte, transpose_b=True)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/util/dispatch.py", line 180, in wrapper
return target(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/math_ops.py", line 2647, in matmul
a, b, transpose_a=transpose_a, transpose_b=transpose_b, name=name)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/gen_math_ops.py", line 5925, in mat_mul
name=name)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/op_def_library.py", line 788, in _apply_op_helper
op_def=op_def)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
return func(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 3616, in create_op
op_def=op_def)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 2005, in __init__
self._traceback = tf_stack.extract_stack()
My environment is Colab, and the same input text runs without any error at batch_size=1. So how should batch_size be set?
The error appears whenever batch_size is set larger than 1; only batch_size=1 runs without it. Why?
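The first line of the error is cut off in the paste above, so the cause is not confirmed. If it is an out-of-memory error (an assumption on my part), a rough estimate of the tensor created by the failing `tf.matmul` in model.py may explain why only batch_size=1 fits. The sketch below assumes a training window of 1024 tokens per sample and float32 logits; 50257 is the GPT-2 vocabulary size. The helper name `logits_bytes` is made up for illustration and is not part of gpt-2-simple.

```python
# Rough size of the logits tensor produced by the final matmul
# (h_flat x wte^T), which has shape [batch_size * seq_len, n_vocab].
GPT2_VOCAB_SIZE = 50257    # GPT-2 BPE vocabulary size
TOKENS_PER_SAMPLE = 1024   # assumed training window per sample
BYTES_PER_FLOAT32 = 4


def logits_bytes(batch_size, seq_len=TOKENS_PER_SAMPLE, n_vocab=GPT2_VOCAB_SIZE):
    """Bytes needed for one float32 [batch_size * seq_len, n_vocab] tensor."""
    return batch_size * seq_len * n_vocab * BYTES_PER_FLOAT32


# Compare the batch sizes mentioned in this thread.
for bs in (1, 12, 24):
    print(f"batch_size={bs:>2}: {logits_bytes(bs) / 2**30:.2f} GiB")
```

Under those assumptions this comes to about 0.19 GiB at batch_size=1, 2.30 GiB at 12, and 4.60 GiB at 24, for the forward logits alone. Gradients, optimizer state, and the activations of the 345M model come on top of that, so a single Colab GPU could plausibly run out of memory well before batch_size=12.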
Same problem here.