Hello, I get the error below when I try to run the code and have not been able to resolve it. I don't know what is causing it.

Open ZhengDanYang1 opened this issue 5 years ago • 11 comments

The error is as follows:

InvalidArgumentError Traceback (most recent call last)
~/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py in _do_call(self, fn, *args)
   1333 try:
-> 1334 return fn(*args)
   1335 except errors.OpError as e:

~/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py in _run_fn(feed_dict, fetch_list, target_list, options, run_metadata)
   1318 return self._call_tf_sessionrun(
-> 1319 options, feed_dict, fetch_list, target_list, run_metadata)
   1320

~/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py in _call_tf_sessionrun(self, options, feed_dict, fetch_list, target_list, run_metadata)
   1406 self._session, options, feed_dict, fetch_list, target_list,
-> 1407 run_metadata)
   1408

InvalidArgumentError: Attempted create an iterator on device "/job:localhost/replica:0/task:0/device:GPU:0" from handle defined on device "/job:localhost/replica:0/task:0/device:CPU:0" [[{{node IteratorFromStringHandleV2}}]] [[{{node add_10}}]]

During handling of the above exception, another exception occurred:

InvalidArgumentError Traceback (most recent call last)
in
    130 with tf.variable_scope('positional', reuse=tf.AUTO_REUSE):
    131 for i in range(10):
--> 132 train_step()
    133 current_step = tf.train.global_step(sess, global_step)
    134

in train_step()
     16 }
     17
---> 18 _, step, summaries, loss, accuracy= sess.run([train_op, global_step, train_summary_op, model.loss, model.accuracy], feed_dict)
     19
     20

~/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py in run(self, fetches, feed_dict, options, run_metadata)
    927 try:
    928 result = self._run(None, fetches, feed_dict, options_ptr,
--> 929 run_metadata_ptr)
    930 if run_metadata:
    931 proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)

~/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py in _run(self, handle, fetches, feed_dict, options, run_metadata)
   1150 if final_fetches or final_targets or (handle and feed_dict_tensor):
   1151 results = self._do_run(handle, final_targets, final_fetches,
-> 1152 feed_dict_tensor, options, run_metadata)
   1153 else:
   1154 results = []

~/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py in _do_run(self, handle, target_list, fetch_list, feed_dict, options, run_metadata)
   1326 if handle is None:
   1327 return self._do_call(_run_fn, feeds, fetches, targets, options,
-> 1328 run_metadata)
   1329 else:
   1330 return self._do_call(_prun_fn, handle, feeds, fetches)

~/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py in _do_call(self, fn, *args)
   1346 pass
   1347 message = error_interpolation.interpolate(message, self._graph)
-> 1348 raise type(e)(node_def, op, message)
   1349
   1350 def _extend_graph(self):

InvalidArgumentError: Attempted create an iterator on device "/job:localhost/replica:0/task:0/device:GPU:0" from handle defined on device "/job:localhost/replica:0/task:0/device:CPU:0" [[node IteratorFromStringHandleV2 (defined at :82) ]] [[node add_10 (defined at :167) ]]

Caused by op 'IteratorFromStringHandleV2', defined at:
File "/home/gong/anaconda3/lib/python3.6/runpy.py", line 193, in _run_module_as_main
  "__main__", mod_spec)
File "/home/gong/anaconda3/lib/python3.6/runpy.py", line 85, in _run_code
  exec(code, run_globals)
File "/home/gong/anaconda3/lib/python3.6/site-packages/ipykernel_launcher.py", line 16, in
  app.launch_new_instance()
File "/home/gong/anaconda3/lib/python3.6/site-packages/traitlets/config/application.py", line 658, in launch_instance
  app.start()
File "/home/gong/anaconda3/lib/python3.6/site-packages/ipykernel/kernelapp.py", line 563, in start
  self.io_loop.start()
File "/home/gong/anaconda3/lib/python3.6/site-packages/zmq/eventloop/ioloop.py", line 177, in start
  super(ZMQIOLoop, self).start()
File "/home/gong/anaconda3/lib/python3.6/site-packages/tornado/ioloop.py", line 832, in start
  self._run_callback(self._callbacks.popleft())
File "/home/gong/anaconda3/lib/python3.6/site-packages/tornado/ioloop.py", line 605, in _run_callback
  ret = callback()
File "/home/gong/anaconda3/lib/python3.6/site-packages/tornado/stack_context.py", line 277, in null_wrapper
  return fn(*args, **kwargs)
File "/home/gong/anaconda3/lib/python3.6/site-packages/tornado/gen.py", line 1152, in inner
  self.run()
File "/home/gong/anaconda3/lib/python3.6/site-packages/tornado/gen.py", line 1069, in run
  yielded = self.gen.send(value)
File "/home/gong/anaconda3/lib/python3.6/site-packages/ipykernel/kernelbase.py", line 365, in process_one
  yield gen.maybe_future(dispatch(*args))
File "/home/gong/anaconda3/lib/python3.6/site-packages/tornado/gen.py", line 307, in wrapper
  yielded = next(result)
File "/home/gong/anaconda3/lib/python3.6/site-packages/ipykernel/kernelbase.py", line 272, in dispatch_shell
  yield gen.maybe_future(handler(stream, idents, msg))
File "/home/gong/anaconda3/lib/python3.6/site-packages/tornado/gen.py", line 307, in wrapper
  yielded = next(result)
File "/home/gong/anaconda3/lib/python3.6/site-packages/ipykernel/kernelbase.py", line 542, in execute_request
  user_expressions, allow_stdin,
File "/home/gong/anaconda3/lib/python3.6/site-packages/tornado/gen.py", line 307, in wrapper
  yielded = next(result)
File "/home/gong/anaconda3/lib/python3.6/site-packages/ipykernel/ipkernel.py", line 294, in do_execute
  res = shell.run_cell(code, store_history=store_history, silent=silent)
File "/home/gong/anaconda3/lib/python3.6/site-packages/ipykernel/zmqshell.py", line 536, in run_cell
  return super(ZMQInteractiveShell, self).run_cell(*args, **kwargs)
File "/home/gong/anaconda3/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2855, in run_cell
  raw_cell, store_history, silent, shell_futures)
File "/home/gong/anaconda3/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2881, in _run_cell
  return runner(coro)
File "/home/gong/anaconda3/lib/python3.6/site-packages/IPython/core/async_helpers.py", line 68, in pseudo_sync_runner
  coro.send(None)
File "/home/gong/anaconda3/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 3058, in run_cell_async
  interactivity=interactivity, compiler=compiler, result=result)
File "/home/gong/anaconda3/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 3249, in run_ast_nodes
  if (await self.run_code(code, result, async=asy)):
File "/home/gong/anaconda3/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 3326, in run_code
  exec(code_obj, self.user_global_ns, self.user_ns)
File "", line 82, in
  iterator = tf.data.Iterator.from_string_handle(handle, train_dataset.output_types, train_dataset.output_shapes)
File "/home/gong/anaconda3/lib/python3.6/site-packages/tensorflow/python/data/ops/iterator_ops.py", line 293, in from_string_handle
  output_shapes=output_structure._flat_shapes)
File "/home/gong/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/gen_dataset_ops.py", line 1596, in iterator_from_string_handle_v2
  output_shapes=output_shapes, name=name)
File "/home/gong/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 788, in _apply_op_helper
  op_def=op_def)
File "/home/gong/anaconda3/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
  return func(*args, **kwargs)
File "/home/gong/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3300, in create_op
  op_def=op_def)
File "/home/gong/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1801, in __init__
  self._traceback = tf_stack.extract_stack()

InvalidArgumentError (see above for traceback): Attempted create an iterator on device "/job:localhost/replica:0/task:0/device:GPU:0" from handle defined on device "/job:localhost/replica:0/task:0/device:CPU:0" [[node IteratorFromStringHandleV2 (defined at :82) ]] [[node add_10 (defined at :167) ]]
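
For anyone skimming the traceback, the key information is the pair of device strings in the final message: the iterator op was placed on one device while the string handle came from another. The snippet below is only a small sketch that pulls those two devices out of the message text so the mismatch is easy to see; the helper name `devices_in_error` is hypothetical and not part of TensorFlow or this repository.

```python
import re

# Error text copied from the report above.
ERROR = (
    'Attempted create an iterator on device '
    '"/job:localhost/replica:0/task:0/device:GPU:0" from handle defined on '
    'device "/job:localhost/replica:0/task:0/device:CPU:0"'
)

# Matches a quoted TensorFlow device string and captures its type and index.
DEVICE_RE = re.compile(r'"(/job:[^"]*/device:(\w+):(\d+))"')


def devices_in_error(message):
    """Return (device_type, index) pairs in the order they appear (hypothetical helper)."""
    return [(kind, int(idx)) for _full, kind, idx in DEVICE_RE.findall(message)]


iterator_device, handle_device = devices_in_error(ERROR)
print("iterator op placed on:", iterator_device)
print("handle defined on:    ", handle_device)
print("mismatch:", iterator_device != handle_device)
```

A workaround often reported for this class of error, though not confirmed anywhere in this thread, is to build the `tf.data.Iterator.from_string_handle(...)` call inside a `with tf.device('/cpu:0'):` block so that the iterator op is placed on the same device as the handle.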

ZhengDanYang1 avatar Aug 27 '19 11:08 ZhengDanYang1

Are you running on CPU? We have only tested this code on GPU.

chongyangtao avatar Sep 19 '19 08:09 chongyangtao

@chongyangtao Hi, I'd like to know roughly how much GPU memory is needed.

bringtree avatar Dec 01 '19 12:12 bringtree

That is, with the batch_size=100 that you set.

bringtree avatar Dec 01 '19 12:12 bringtree

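OK, I can see that batch_size=100 needs about 12 GB of GPU memory. But roughly what is the peak host RAM usage? I'm curious about the machine configuration you used.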

bringtree avatar Dec 03 '19 13:12 bringtree

I used a 1080Ti, and the GPU memory was essentially fully used.

chongyangtao avatar Dec 03 '19 13:12 chongyangtao

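> I used a 1080Ti, and the GPU memory was essentially fully used.

Do you remember roughly how much host RAM it used at the time, or how much RAM the machine had? On my side I have to request a machine allocation before I can run the program. Even with 128 GB, your program seems to run out of host memory (OOM)?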

bringtree avatar Dec 03 '19 14:12 bringtree

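I didn't pay much attention to RAM usage, but it shouldn't exceed 128 GB. The program reads its data from tfrecords, so the dataset is not loaded into memory all at once.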
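
To illustrate why record-based reading keeps host memory bounded, here is a minimal sketch of streaming records from a file one at a time. It uses a simplified length-prefixed layout chosen for illustration only; it is not the actual TFRecord format (which also stores CRC checksums) and it is not this repository's input pipeline. The names `write_records` and `read_records` are hypothetical.

```python
import io
import struct


def write_records(stream, records):
    """Write each record as an 8-byte little-endian length followed by its bytes."""
    for payload in records:
        stream.write(struct.pack("<Q", len(payload)))
        stream.write(payload)


def read_records(stream):
    """Yield records one at a time, so only one payload is held in memory at once."""
    while True:
        header = stream.read(8)
        if not header:
            return  # clean end of file
        if len(header) < 8:
            raise ValueError("truncated length header")
        (length,) = struct.unpack("<Q", header)
        payload = stream.read(length)
        if len(payload) < length:
            raise ValueError("truncated record")
        yield payload


# An in-memory buffer stands in for a file on disk.
buffer = io.BytesIO()
write_records(buffer, [b"sample-%d" % i for i in range(5)])
buffer.seek(0)
for record in read_records(buffer):
    print(record)
```

Because `read_records` is a generator, memory use depends on the size of a single record (plus whatever buffering the consumer adds), not on the size of the whole file.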

chongyangtao avatar Dec 04 '19 06:12 chongyangtao

OK, my results are about the same as yours. Giving you a 👍.

Processing 1000 samples
Processing 2000 samples
Processing 3000 samples
Processing 4000 samples
Processing 5000 samples

('pred_scores: ', 500000)
recall_2_1: 0.944 recall_at_1: 0.786 recall_at_2: 0.888 recall_at_5: 0.973


bringtree avatar Dec 05 '19 11:12 bringtree

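@chongyangtao Whether I train with word and character embeddings found online or let the system initialize the vectors itself, my results on the Chinese data are much worse. Could you share the download links for the character and word embeddings you used?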

recall_2_1: 0.653 recall_at_1: 0.151 recall_at_2: 0.274 recall_at_5: 0.598
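
To make the size of the gap concrete, the two metric lines posted in this thread can be parsed and subtracted. This is just a sketch; `parse_metrics` is a hypothetical helper, and the two strings are copied from the comments above.

```python
import re

# Captures pairs such as "recall_at_1: 0.786".
METRIC_RE = re.compile(r"(recall_\w+):\s*([0-9.]+)")


def parse_metrics(line):
    """Turn a 'name: value name: value ...' line into a dict of floats (hypothetical helper)."""
    return {name: float(value) for name, value in METRIC_RE.findall(line)}


# Results from the earlier run that matched the author's numbers.
first_run = parse_metrics(
    "recall_2_1: 0.944 recall_at_1: 0.786 recall_at_2: 0.888 recall_at_5: 0.973"
)
# Results reported for the Chinese data.
chinese_run = parse_metrics(
    "recall_2_1: 0.653 recall_at_1: 0.151 recall_at_2: 0.274 recall_at_5: 0.598"
)

gaps = {name: round(first_run[name] - chinese_run[name], 3) for name in first_run}
for name, gap in gaps.items():
    print(name, gap)
```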

bringtree avatar Dec 16 '19 05:12 bringtree

> OK, my results are about the same as yours. Giving you a 👍.
> Processing 1000 samples Processing 2000 samples Processing 3000 samples Processing 4000 samples Processing 5000 samples
>
> ('pred_scores: ', 500000) recall_2_1: 0.944 recall_at_1: 0.786 recall_at_2: 0.888 recall_at_5: 0.973

Hi, I ran into this problem too: Attempted create an iterator on device "/job:localhost/replica:0/task:0/device:GPU:0" from handle defined on device "/job:localhost/replica:0/task:0/device:CPU:0". How did you solve it?

cccccs avatar Feb 26 '20 22:02 cccccs

@chongyangtao Hi, running the training script directly also gives me: Attempted create an iterator on device "/job:localhost/replica:0/task:0/device:GPU:0" from handle defined on device "/job:localhost/replica:0/task:0/device:CPU:0". My TF version is 1.14. How can this be resolved? Thanks.

JayMarx avatar Apr 02 '20 15:04 JayMarx