gpt-2-simple

Error when setting batch size

Open chiangandy opened this issue 5 years ago • 2 comments

Checking the source code, I found that the training function lets you set the batch size, so I set the batch size to 12 and ran it. I then got the following error:

Errors may have originated from an input operation.
Input Source operations connected to node model/MatMul:
  model/wte/read (defined at /usr/local/lib/python3.6/dist-packages/gpt_2_simple/src/model.py:185)
  model/Reshape (defined at /usr/local/lib/python3.6/dist-packages/gpt_2_simple/src/model.py:202)

Original stack trace for 'model/MatMul':
  File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.6/dist-packages/ipykernel_launcher.py", line 16, in <module>
    app.launch_new_instance()
  File "/usr/local/lib/python3.6/dist-packages/traitlets/config/application.py", line 658, in launch_instance
    app.start()
  File "/usr/local/lib/python3.6/dist-packages/ipykernel/kernelapp.py", line 477, in start
    ioloop.IOLoop.instance().start()
  File "/usr/local/lib/python3.6/dist-packages/tornado/ioloop.py", line 888, in start
    handler_func(fd_obj, events)
  File "/usr/local/lib/python3.6/dist-packages/tornado/stack_context.py", line 277, in null_wrapper
    return fn(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/zmq/eventloop/zmqstream.py", line 450, in _handle_events
    self._handle_recv()
  File "/usr/local/lib/python3.6/dist-packages/zmq/eventloop/zmqstream.py", line 480, in _handle_recv
    self._run_callback(callback, msg)
  File "/usr/local/lib/python3.6/dist-packages/zmq/eventloop/zmqstream.py", line 432, in _run_callback
    callback(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tornado/stack_context.py", line 277, in null_wrapper
    return fn(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/ipykernel/kernelbase.py", line 283, in dispatcher
    return self.dispatch_shell(stream, msg)
  File "/usr/local/lib/python3.6/dist-packages/ipykernel/kernelbase.py", line 235, in dispatch_shell
    handler(stream, idents, msg)
  File "/usr/local/lib/python3.6/dist-packages/ipykernel/kernelbase.py", line 399, in execute_request
    user_expressions, allow_stdin)
  File "/usr/local/lib/python3.6/dist-packages/ipykernel/ipkernel.py", line 196, in do_execute
    res = shell.run_cell(code, store_history=store_history, silent=silent)
  File "/usr/local/lib/python3.6/dist-packages/ipykernel/zmqshell.py", line 533, in run_cell
    return super(ZMQInteractiveShell, self).run_cell(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/IPython/core/interactiveshell.py", line 2718, in run_cell
    interactivity=interactivity, compiler=compiler, result=result)
  File "/usr/local/lib/python3.6/dist-packages/IPython/core/interactiveshell.py", line 2822, in run_ast_nodes
    if self.run_code(code, result):
  File "/usr/local/lib/python3.6/dist-packages/IPython/core/interactiveshell.py", line 2882, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "", line 5, in <module>
    gpt2.finetune(sess, dataset="train.txt", model_name='345M', batch_size=24, steps=4000, restore_from='latest', print_every=500, sample_every=400, save_every=2000)
  File "/usr/local/lib/python3.6/dist-packages/gpt_2_simple/gpt_2.py", line 170, in finetune
    output = model.model(hparams=hparams, X=context)
  File "/usr/local/lib/python3.6/dist-packages/gpt_2_simple/src/model.py", line 203, in model
    logits = tf.matmul(h_flat, wte, transpose_b=True)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/util/dispatch.py", line 180, in wrapper
    return target(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/math_ops.py", line 2647, in matmul
    a, b, transpose_a=transpose_a, transpose_b=transpose_b, name=name)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/gen_math_ops.py", line 5925, in mat_mul
    name=name)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/op_def_library.py", line 788, in _apply_op_helper
    op_def=op_def)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 3616, in create_op
    op_def=op_def)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 2005, in __init__
    self._traceback = tf_stack.extract_stack()

My environment is Colab, and training on the same input text with batch_size=1 runs without any error. So how should batch_size be set?
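
The pasted trace leaves out the exception type, so the cause is not confirmed. One plausible reading, offered only as an assumption, is that the GPU ran out of memory: the failing node is the `tf.matmul(h_flat, wte, transpose_b=True)` call that builds the logits, and that tensor grows linearly with batch_size. The sketch below estimates its size, assuming float32 values, the GPT-2 vocabulary of 50257 tokens, and 1024 tokens per training sample (the sample length is an assumption about the training loop, not something shown in the trace). The helper names are made up for illustration.

```python
# Rough size of the logits tensor produced by
# tf.matmul(h_flat, wte, transpose_b=True), shape [batch_size * seq_len, n_vocab].
N_VOCAB = 50257        # GPT-2 vocabulary size
SEQ_LEN = 1024         # ASSUMPTION: tokens per training sample
BYTES_PER_FLOAT32 = 4  # ASSUMPTION: logits are stored as float32


def logits_bytes(batch_size, seq_len=SEQ_LEN, n_vocab=N_VOCAB):
    """Bytes needed for one float32 tensor of shape [batch_size * seq_len, n_vocab]."""
    return batch_size * seq_len * n_vocab * BYTES_PER_FLOAT32


def gib(num_bytes):
    """Convert a byte count to GiB."""
    return num_bytes / 2**30


if __name__ == "__main__":
    # Batch sizes mentioned in this issue: 1 (works), 12 and 24 (fail).
    for batch_size in (1, 12, 24):
        size = gib(logits_bytes(batch_size))
        print(f"batch_size={batch_size:>2}: {size:.2f} GiB for the logits alone")
```

Under these assumptions the logits alone need roughly 4.6 GiB at batch_size=24, before counting gradients, activations, and the 345M model's weights and optimizer state. If memory is indeed the cause, a smaller batch_size is the direct fix; the thread confirms that batch_size=1 runs. Some versions of `finetune` also accept an `accumulate_gradients` argument, which may give a larger effective batch without the extra memory, but check the installed version before relying on it.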

chiangandy Aug 12 '19 03:08

I found that the error occurs whenever batch_size is set larger than 1; only batch_size=1 runs without it. Why?

chiangandy Aug 12 '19 03:08

I found that the error occurs whenever batch_size is set larger than 1; only batch_size=1 runs without it. Why?

Same problem.

jemmryx Jan 18 '21 07:01