Attention-ocr-Chinese-Version

How can I modify the code to train on images of different sizes with batch=1?

Open z2140684 opened this issue 6 years ago • 5 comments

I'd like to ask the experts here: I want to use this project to process my document images. After segmentation, each image has shape (32, variable width, 3), and I would like to train on images of different sizes with batch=1. I want to keep the model fast, and padding the training set to a uniform size would slow things down at test time. I have already converted my data to tfrecord, but training fails at runtime with a size error:

2018-08-08 10:54:00.548359: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-08-08 10:54:00.833290: E tensorflow/stream_executor/cuda/cuda_driver.cc:397] failed call to cuInit: CUDA_ERROR_NO_DEVICE
2018-08-08 10:54:00.833429: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:158] retrieving CUDA diagnostic information for host: ksai-GPUSERVER_V100_1
2018-08-08 10:54:00.833443: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:165] hostname: ksai-GPUSERVER_V100_1
2018-08-08 10:54:00.833504: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:189] libcuda reported version is: 384.111.0
2018-08-08 10:54:00.833584: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:193] kernel reported version is: 384.111.0
2018-08-08 10:54:00.833597: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:300] kernel version seems to match DSO: 384.111.0
INFO 2018-08-08 10:54:16.000140: model.py: 581 Restoring checkpoint(s)
INFO:tensorflow:Running local_init_op.
INFO 2018-08-08 10:54:16.000140: tf_logging.py: 115 Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO 2018-08-08 10:54:16.000395: tf_logging.py: 115 Done running local_init_op.
INFO:tensorflow:Starting Session.
INFO 2018-08-08 10:54:38.000205: tf_logging.py: 115 Starting Session.
INFO:tensorflow:Saving checkpoint to path ./model.ckpt
INFO 2018-08-08 10:54:38.000788: tf_logging.py: 115 Saving checkpoint to path ./model.ckpt
INFO:tensorflow:Starting Queues.
INFO 2018-08-08 10:54:38.000832: tf_logging.py: 115 Starting Queues.
INFO:tensorflow:global_step/sec: 0
INFO 2018-08-08 10:54:51.000791: tf_logging.py: 159 global_step/sec: 0
INFO:tensorflow:Error reported to Coordinator: <class 'tensorflow.python.framework.errors_impl.InvalidArgumentError'>, Input to reshape is a tensor with 26880 values, but the requested shape has 57600 [[Node: Reshape_6 = Reshape[T=DT_UINT8, Tshape=DT_INT32, _device="/job:localhost/replica:0/task:0/device:CPU:0"](case/cond/Merge, PreprocessImage/AugmentImage/Shape)]]
INFO 2018-08-08 10:55:01.000739: tf_logging.py: 115 Error reported to Coordinator: <class 'tensorflow.python.framework.errors_impl.InvalidArgumentError'>, Input to reshape is a tensor with 26880 values, but the requested shape has 57600 [[Node: Reshape_6 = Reshape[T=DT_UINT8, Tshape=DT_INT32, _device="/job:localhost/replica:0/task:0/device:CPU:0"](case/cond/Merge, PreprocessImage/AugmentImage/Shape)]]
INFO:tensorflow:Caught OutOfRangeError. Stopping Training. RandomShuffleQueue '_3_shuffle_batch/random_shuffle_queue' is closed and has insufficient elements (requested 1, current size 0) [[Node: shuffle_batch = QueueDequeueManyV2[component_types=[DT_FLOAT, DT_UINT8, DT_INT64, DT_FLOAT], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](shuffle_batch/random_shuffle_queue, ReduceJoin/reduction_indices)]]

Caused by op 'shuffle_batch', defined at:
  File "train.py", line 211, in <module>
    app.run()
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/platform/app.py", line 125, in run
    _sys.exit(main(argv))
  File "train.py", line 198, in main
    central_crop_size=common_flags.get_crop_size())
  File "/DATA/disk1/zyt/try_att/Attention-ocr-Chinese-Version/python/data_provider.py", line 193, in get_data
    min_after_dequeue=shuffle_config.min_after_dequeue))
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/input.py", line 1300, in shuffle_batch
    name=name)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/input.py", line 846, in _shuffle_batch
    dequeued = queue.dequeue_many(batch_size, name=name)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/data_flow_ops.py", line 483, in dequeue_many
    self._queue_ref, n=n, component_types=self._dtypes, name=name)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/gen_data_flow_ops.py", line 3480, in queue_dequeue_many_v2
    component_types=component_types, timeout_ms=timeout_ms, name=name)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 3414, in create_op
    op_def=op_def)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 1740, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

OutOfRangeError (see above for traceback): RandomShuffleQueue '_3_shuffle_batch/random_shuffle_queue' is closed and has insufficient elements (requested 1, current size 0) [[Node: shuffle_batch = QueueDequeueManyV2[component_types=[DT_FLOAT, DT_UINT8, DT_INT64, DT_FLOAT], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](shuffle_batch/random_shuffle_queue, ReduceJoin/reduction_indices)]]

INFO 2018-08-08 10:55:02.000396: tf_logging.py: 115 Caught OutOfRangeError. Stopping Training. RandomShuffleQueue '_3_shuffle_batch/random_shuffle_queue' is closed and has insufficient elements (requested 1, current size 0) [[Node: shuffle_batch = QueueDequeueManyV2[component_types=[DT_FLOAT, DT_UINT8, DT_INT64, DT_FLOAT], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](shuffle_batch/random_shuffle_queue, ReduceJoin/reduction_indices)]]

Caused by op 'shuffle_batch', defined at:
  File "train.py", line 211, in <module>
    app.run()
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/platform/app.py", line 125, in run
    _sys.exit(main(argv))
  File "train.py", line 198, in main
    central_crop_size=common_flags.get_crop_size())
  File "/DATA/disk1/zyt/try_att/Attention-ocr-Chinese-Version/python/data_provider.py", line 193, in get_data
    min_after_dequeue=shuffle_config.min_after_dequeue))
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/input.py", line 1300, in shuffle_batch
    name=name)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/input.py", line 846, in _shuffle_batch
    dequeued = queue.dequeue_many(batch_size, name=name)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/data_flow_ops.py", line 483, in dequeue_many
    self._queue_ref, n=n, component_types=self._dtypes, name=name)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/gen_data_flow_ops.py", line 3480, in queue_dequeue_many_v2
    component_types=component_types, timeout_ms=timeout_ms, name=name)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 3414, in create_op
    op_def=op_def)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 1740, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

OutOfRangeError (see above for traceback): RandomShuffleQueue '_3_shuffle_batch/random_shuffle_queue' is closed and has insufficient elements (requested 1, current size 0) [[Node: shuffle_batch = QueueDequeueManyV2[component_types=[DT_FLOAT, DT_UINT8, DT_INT64, DT_FLOAT], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](shuffle_batch/random_shuffle_queue, ReduceJoin/reduction_indices)]]

INFO:tensorflow:Finished training! Saving model to disk.
INFO 2018-08-08 10:55:02.000401: tf_logging.py: 115 Finished training! Saving model to disk.
Traceback (most recent call last):
  File "train.py", line 211, in <module>
    app.run()
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/platform/app.py", line 125, in run
    _sys.exit(main(argv))
  File "train.py", line 207, in main
    train(total_loss, init_fn, hparams)
  File "train.py", line 155, in train
    init_fn=init_fn)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/contrib/slim/python/slim/learning.py", line 785, in train
    ignore_live_threads=ignore_live_threads)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/supervisor.py", line 833, in stop
    ignore_live_threads=ignore_live_threads)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/coordinator.py", line 389, in join
    six.reraise(*self._exc_info_to_raise)
  File "/usr/lib/python3/dist-packages/six.py", line 686, in reraise
    raise value
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/queue_runner_impl.py", line 252, in _run
    enqueue_callable()
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1244, in _single_operation_run
    self._call_tf_sessionrun(None, {}, [], target_list, None)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1409, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Input to reshape is a tensor with 26880 values, but the requested shape has 57600

z2140684, Aug 08 '18 04:08
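
The two numbers in the InvalidArgumentError above can be read as image widths. The following is a small sketch of that arithmetic, assuming the height of 32 and the 3 channels stated in the question; `infer_width` is a hypothetical helper, not part of the repository. Under those assumptions the decoded image is 280 pixels wide, while the reshape target corresponds to a width of 600, presumably the fixed image shape the input pipeline is configured with.

```python
def infer_width(num_values, height=32, channels=3):
    """Return the width implied by a flat value count, or None if it does not divide evenly.

    height and channels default to the values given in the question (assumption).
    """
    values_per_column = height * channels
    if num_values % values_per_column != 0:
        return None
    return num_values // values_per_column


# 26880 values are in the decoded image; 57600 is what the reshape target expects.
actual_width = infer_width(26880)
requested_width = infer_width(57600)
print(actual_width, requested_width)  # prints: 280 600
```

If this reading is right, the reshape fails because each record is reshaped to one fixed size, so a 280-pixel-wide image cannot fill a 600-pixel-wide target.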

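CNNs mostly seem to take inputs of the same size. If you want to use different sizes, I would still suggest resizing the images proportionally, or padding the area around the smaller images with a solid black background, so that the network always receives images of the same size.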

A-bone1, Aug 10 '18 11:08

@A-bone1 Hi, what is the reason for padding with a black background? Would a blank background work as well?

Cancerce1l, Aug 14 '18 09:08

@Cancerce1l Yes, that works. It just needs to be a clean, uniform background.

A-bone1, Aug 15 '18 10:08
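
The padding discussed in the replies above could look roughly like this before the images are written to tfrecord. This is a minimal NumPy sketch, not code from the repository: `pad_to_width` is a hypothetical helper, the target width of 600 is only inferred from the error message in the question, and for simplicity it pads on the right side only rather than on all sides. The `fill` value selects the background: 0 for black, or 255 if "blank" is taken to mean white.

```python
import numpy as np


def pad_to_width(image, target_width, fill=0):
    """Pad an (H, W, C) image on the right with a uniform background up to target_width."""
    height, width, channels = image.shape
    if width > target_width:
        raise ValueError("image is wider than target_width; resize it first")
    # Start from a canvas filled with the background value, then copy the image in.
    canvas = np.full((height, target_width, channels), fill, dtype=image.dtype)
    canvas[:, :width, :] = image
    return canvas


# Stand-in for one segmented text-line crop of shape (32, 280, 3).
crop = np.random.randint(0, 256, size=(32, 280, 3), dtype=np.uint8)
padded = pad_to_width(crop, 600, fill=0)
print(padded.shape, padded.size)  # prints: (32, 600, 3) 57600
```

After padding, every image has the same number of values, so a fixed-shape reshape in the input pipeline can succeed. Images wider than the target would need to be resized first, which this sketch does not attempt.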

tensorflow.python.framework.errors_impl.InvalidArgumentError: Input to reshape is a tensor with 8192 values, but the requested shape has 24576
@A-bone1 Hi, I ran into a similar problem. My sample images are three-channel grayscale images. Do I have to use color images to generate the tfrecord files? Thanks!

zzhaohao, Nov 21 '18 07:11

@zzhaohao Any three-channel image whose size matches should work.

A-bone1, Nov 24 '18 10:11
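
One possible reading of the second error is that 24576 is exactly 3 × 8192, which would fit an image that is decoded with a single channel while the pipeline expects three. The sketch below shows one way to guarantee three channels before writing the tfrecord. It is an illustration under that assumption: `to_three_channels` is a hypothetical helper, and the 64 × 128 size is made up so that the value counts match the error message.

```python
import numpy as np


def to_three_channels(image):
    """Return an (H, W, 3) array from an (H, W), (H, W, 1) or (H, W, 3) image."""
    if image.ndim == 2:
        image = image[:, :, np.newaxis]
    if image.shape[2] == 1:
        # Copy the single gray channel into all three channels.
        image = np.repeat(image, 3, axis=2)
    if image.shape[2] != 3:
        raise ValueError("expected 1 or 3 channels, got %d" % image.shape[2])
    return image


# Hypothetical single-channel image: 64 * 128 = 8192 values.
gray = np.zeros((64, 128), dtype=np.uint8)
rgb = to_three_channels(gray)
print(gray.size, rgb.size)  # prints: 8192 24576
```

Checking the decoded array's shape right before serialization is a cheap way to confirm whether a "three-channel grayscale" file is actually stored with three channels.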