
When running multigpu_train.py, I hit some problems that I am unable to solve

Open ll-image opened this issue 7 years ago • 29 comments

Python 3.6.2 |Continuum Analytics, Inc.| (default, Jul 20 2017, 12:30:02) [MSC v.1900 64 bit (AMD64)]
Type 'copyright', 'credits' or 'license' for more information
IPython 6.4.0 -- An enhanced Interactive Python. Type '?' for help.
resnet_v1_50/block1 (?, ?, ?, 256)
resnet_v1_50/block2 (?, ?, ?, 512)
resnet_v1_50/block3 (?, ?, ?, 1024)
resnet_v1_50/block4 (?, ?, ?, 2048)
Shape of f_0 (?, ?, ?, 2048)
Shape of f_1 (?, ?, ?, 512)
Shape of f_2 (?, ?, ?, 256)
Shape of f_3 (?, ?, ?, 64)
Shape of h_0 (?, ?, ?, 2048), g_0 (?, ?, ?, 2048)
Shape of h_1 (?, ?, ?, 128), g_1 (?, ?, ?, 128)
Shape of h_2 (?, ?, ?, 64), g_2 (?, ?, ?, 64)
Shape of h_3 (?, ?, ?, 32), g_3 (?, ?, ?, 32)
WARNING:tensorflow:Variable feature_fusion/Conv/weights missing in checkpoint C:/Users/64929/Desktop/EAST/resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv/BatchNorm/gamma missing in checkpoint C:/Users/64929/Desktop/EAST/resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv/BatchNorm/beta missing in checkpoint C:/Users/64929/Desktop/EAST/resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_1/weights missing in checkpoint C:/Users/64929/Desktop/EAST/resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_1/BatchNorm/gamma missing in checkpoint C:/Users/64929/Desktop/EAST/resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_1/BatchNorm/beta missing in checkpoint C:/Users/64929/Desktop/EAST/resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_2/weights missing in checkpoint C:/Users/64929/Desktop/EAST/resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_2/BatchNorm/gamma missing in checkpoint C:/Users/64929/Desktop/EAST/resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_2/BatchNorm/beta missing in checkpoint C:/Users/64929/Desktop/EAST/resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_3/weights missing in checkpoint C:/Users/64929/Desktop/EAST/resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_3/BatchNorm/gamma missing in checkpoint C:/Users/64929/Desktop/EAST/resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_3/BatchNorm/beta missing in checkpoint C:/Users/64929/Desktop/EAST/resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_4/weights missing in checkpoint C:/Users/64929/Desktop/EAST/resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_4/BatchNorm/gamma missing in checkpoint C:/Users/64929/Desktop/EAST/resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_4/BatchNorm/beta missing in checkpoint C:/Users/64929/Desktop/EAST/resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_5/weights missing in checkpoint C:/Users/64929/Desktop/EAST/resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_5/BatchNorm/gamma missing in checkpoint C:/Users/64929/Desktop/EAST/resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_5/BatchNorm/beta missing in checkpoint C:/Users/64929/Desktop/EAST/resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_6/weights missing in checkpoint C:/Users/64929/Desktop/EAST/resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_6/BatchNorm/gamma missing in checkpoint C:/Users/64929/Desktop/EAST/resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_6/BatchNorm/beta missing in checkpoint C:/Users/64929/Desktop/EAST/resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_7/weights missing in checkpoint C:/Users/64929/Desktop/EAST/resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_7/biases missing in checkpoint C:/Users/64929/Desktop/EAST/resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_8/weights missing in checkpoint C:/Users/64929/Desktop/EAST/resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_8/biases missing in checkpoint C:/Users/64929/Desktop/EAST/resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_9/weights missing in checkpoint C:/Users/64929/Desktop/EAST/resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_9/biases missing in checkpoint C:/Users/64929/Desktop/EAST/resnet_v1_50.ckpt
INFO:tensorflow:Restoring parameters from C:/Users/64929/Desktop/EAST/resnet_v1_50.ckpt
Generator use 10 batches for buffering, this may take a while, you can tune this yourself.
Traceback (most recent call last):
  File "C:\pycharm\PyCharm 2018.1.4\helpers\pydev\pydev_run_in_console.py", line 52, in run_file
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "C:\pycharm\PyCharm 2018.1.4\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "C:/Text Detection/Top/EAST An Efficient and Accurate Scene Text Detector/EAST-master/multigpu_train.py", line 182, in <module>
    tf.app.run()
  File "C:\Anaconda\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\platform\app.py", line 126, in run
    _sys.exit(main(argv))
  File "C:/Text Detection/Top/EAST An Efficient and Accurate Scene Text Detector/EAST-master/multigpu_train.py", line 155, in main
    data = next(data_generator)
  File "C:/Text Detection/Top/EAST An Efficient and Accurate Scene Text Detector/EAST-master\icdar.py", line 726, in get_batch
    enqueuer.start(max_queue_size=10, workers=num_workers)
  File "C:/Text Detection/Top/EAST An Efficient and Accurate Scene Text Detector/EAST-master\data_util.py", line 81, in start
    thread.start()
  File "C:\Anaconda\envs\tensorflow-gpu\lib\multiprocessing\process.py", line 105, in start
    self._popen = self._Popen(self)
  File "C:\Anaconda\envs\tensorflow-gpu\lib\multiprocessing\context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "C:\Anaconda\envs\tensorflow-gpu\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "C:\Anaconda\envs\tensorflow-gpu\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
    reduction.dump(process_obj, to_child)
  File "C:\Anaconda\envs\tensorflow-gpu\lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'GeneratorEnqueuer.start.<locals>.data_generator_task'
PyDev console: using IPython 6.4.0
Python 3.6.2 |Continuum Analytics, Inc.| (default, Jul 20 2017, 12:30:02) [MSC v.1900 64 bit (AMD64)] on win32
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Anaconda\envs\tensorflow-gpu\lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "C:\Anaconda\envs\tensorflow-gpu\lib\multiprocessing\spawn.py", line 115, in _main
    self = reduction.pickle.load(from_parent)
EOFError: Ran out of input

But I have downloaded the .ckpt file and everything is prepared.

ll-image avatar Jul 01 '18 10:07 ll-image

I also got the same error; looking forward to a solution!

NaxAlpha avatar Jul 04 '18 10:07 NaxAlpha

It is a problem on Windows; it works pretty well on Linux!

NaxAlpha avatar Jul 09 '18 10:07 NaxAlpha

Yes, you are right. I just tried running it on Linux and it runs all right.

ll-image avatar Jul 10 '18 05:07 ll-image

It seems to be a multiprocessing problem on Windows: I got the same error on Windows, and it runs fine on Linux. Setting use_multiprocessing=False around line 726 of icdar.py to turn multiprocessing off might work, but I didn't try it.

YangZeyu95 avatar Aug 13 '18 13:08 YangZeyu95
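
For context on the traceback in the original post: on Windows, multiprocessing starts workers with the spawn method, which pickles the process object (the `reduction.dump(process_obj, to_child)` frame), and pickle cannot serialize a function defined inside another function. The sketch below reproduces only that pickling failure, independent of EAST. The names `start` and `data_generator_task` are stand-ins borrowed from the traceback, not the repository's code.

```python
import pickle


def start():
    """Stand-in for GeneratorEnqueuer.start: defines its worker as a nested function."""
    def data_generator_task():
        return "batch"
    return data_generator_task


def try_pickle(obj):
    """Return None if obj pickles, otherwise the error message."""
    try:
        pickle.dumps(obj)
        return None
    except (AttributeError, pickle.PicklingError) as exc:
        # Nested functions carry '<locals>' in their qualified name and cannot
        # be looked up by name when unpickling, so pickle refuses them.
        return str(exc)


local_error = try_pickle(start())
print(local_error)
```

With the fork start method that was the default on Linux at the time, the process object is not pickled, which would be consistent with the reports that the same script runs there.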

Setting use_multiprocessing=False as suggested above doesn't work for me... :(

daming98 avatar Oct 23 '18 13:10 daming98

@ll-image Have you solved this problem?

daming98 avatar Oct 23 '18 13:10 daming98

What does that WARNING mean?

Antonio-hi avatar Nov 02 '18 02:11 Antonio-hi

I got the following error while running multi_gpu.py. Configuration: Ubuntu 18.04, CPU, 8GB, Python 3.6.

resnet_v1_50/block1 (?, ?, ?, 256)
resnet_v1_50/block2 (?, ?, ?, 512)
resnet_v1_50/block3 (?, ?, ?, 1024)
resnet_v1_50/block4 (?, ?, ?, 2048)
Shape of f_0 (?, ?, ?, 2048)
Shape of f_1 (?, ?, ?, 512)
Shape of f_2 (?, ?, ?, 256)
Shape of f_3 (?, ?, ?, 64)
Shape of h_0 (?, ?, ?, 2048), g_0 (?, ?, ?, 2048)
Shape of h_1 (?, ?, ?, 128), g_1 (?, ?, ?, 128)
Shape of h_2 (?, ?, ?, 64), g_2 (?, ?, ?, 64)
Shape of h_3 (?, ?, ?, 32), g_3 (?, ?, ?, 32)
WARNING:tensorflow:Variable feature_fusion/Conv/weights missing in checkpoint resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv/BatchNorm/beta missing in checkpoint resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv/BatchNorm/gamma missing in checkpoint resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_1/weights missing in checkpoint resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_1/BatchNorm/beta missing in checkpoint resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_1/BatchNorm/gamma missing in checkpoint resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_2/weights missing in checkpoint resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_2/BatchNorm/beta missing in checkpoint resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_2/BatchNorm/gamma missing in checkpoint resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_3/weights missing in checkpoint resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_3/BatchNorm/beta missing in checkpoint resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_3/BatchNorm/gamma missing in checkpoint resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_4/weights missing in checkpoint resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_4/BatchNorm/beta missing in checkpoint resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_4/BatchNorm/gamma missing in checkpoint resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_5/weights missing in checkpoint resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_5/BatchNorm/beta missing in checkpoint resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_5/BatchNorm/gamma missing in checkpoint resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_6/weights missing in checkpoint resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_6/BatchNorm/beta missing in checkpoint resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_6/BatchNorm/gamma missing in checkpoint resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_7/weights missing in checkpoint resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_7/biases missing in checkpoint resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_8/weights missing in checkpoint resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_8/biases missing in checkpoint resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_9/weights missing in checkpoint resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_9/biases missing in checkpoint resnet_v1_50.ckpt
2018-11-03 10:58:48.238692: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2018-11-03 10:58:48.238825: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2018-11-03 10:58:48.238932: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2018-11-03 10:58:48.239009: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2018-11-03 10:58:48.239099: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
Killed

harshalcse avatar Nov 03 '18 11:11 harshalcse

Setting use_multiprocessing=False in icdar.py, as suggested above, does not work on Ubuntu 18.04 either.

harshalcse avatar Nov 05 '18 08:11 harshalcse

In fact, the WARNING is not an error. It means that the parameters of the corresponding layers are not imported from the checkpoint, which is exactly what we want.

Antonio-hi avatar Nov 06 '18 09:11 Antonio-hi

@Antonio-hi so what is the workaround for it?

harshalcse avatar Nov 13 '18 07:11 harshalcse

@harshalcse I just don't know what your error means. At first I was confused by the warnings in the middle of the log, but I later found that they are simple to explain: we load the parameters of some layers from a pretrained model whose layers do not all match ours, and the warning just notes that some layers' parameters were not loaded. When I finished training I saved the new model. After that, when I loaded parameters from my new model, the warnings disappeared.

BTW, the error you encountered is puzzling. Why did "Killed" suddenly appear? Everything looks fine before that.

Antonio-hi avatar Nov 18 '18 05:11 Antonio-hi
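
The warning discussed above can be pictured with a small sketch. Assuming, as the logs suggest, that the pretrained resnet_v1_50.ckpt holds only the ResNet backbone while the model also defines feature_fusion layers, a restore that tolerates missing variables loads what matches and warns about the rest. The function name and the shortened variable lists below are hypothetical, and the code does not call TensorFlow; it only mirrors the bookkeeping.

```python
def split_restorable(model_vars, checkpoint_vars):
    """Partition model variable names into those present in the checkpoint and those missing."""
    available = set(checkpoint_vars)
    restorable = [name for name in model_vars if name in available]
    missing = [name for name in model_vars if name not in available]
    return restorable, missing


# Hypothetical, shortened variable lists for illustration only.
model_vars = [
    "resnet_v1_50/conv1/weights",
    "feature_fusion/Conv/weights",
    "feature_fusion/Conv_7/biases",
]
checkpoint_vars = ["resnet_v1_50/conv1/weights", "resnet_v1_50/logits/weights"]

restorable, missing = split_restorable(model_vars, checkpoint_vars)
for name in missing:
    print("WARNING: Variable %s missing in checkpoint" % name)
```

The missing layers keep their fresh initial values and are learned during training, which matches the observation that the warnings vanish once a checkpoint saved from the full model is loaded.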

How can I solve this problem on Windows? Do I have to run it on Linux?

cdsss avatar Nov 22 '18 11:11 cdsss

Dear community, this change in icdar.py works for me on Windows 10:

    #enqueuer = GeneratorEnqueuer(generator(**kwargs), use_multiprocessing=True)
    #enqueuer.start(max_queue_size=10, workers=num_workers)
    enqueuer = GeneratorEnqueuer(generator(**kwargs), use_multiprocessing=False)
    enqueuer.start(max_queue_size=1, workers=1)

nihao88 avatar Jan 07 '19 13:01 nihao88
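
A likely reason the change above avoids the crash: with use_multiprocessing=False the enqueuer presumably runs its worker in a thread, and threads share the parent's memory, so nothing has to be pickled. The sketch below is a simplified, hypothetical thread-based prefetcher, not the GeneratorEnqueuer from data_util.py. It shows a nested worker function feeding a bounded queue without any pickling step.

```python
import queue
import threading


def prefetch(generator, max_queue_size=1):
    """Yield items from `generator`, produced ahead of time by a background thread."""
    buffer = queue.Queue(maxsize=max_queue_size)
    done = object()  # sentinel marking the end of the generator

    def data_generator_task():
        # A nested function is fine here: starting a thread needs no pickling.
        for item in generator:
            buffer.put(item)
        buffer.put(done)

    worker = threading.Thread(target=data_generator_task, daemon=True)
    worker.start()
    while True:
        item = buffer.get()
        if item is done:
            break
        yield item


batches = list(prefetch(iter(range(5)), max_queue_size=1))
print(batches)
```

The trade-off is throughput: one thread preparing one batch at a time may feed the model more slowly than several worker processes would.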

Thanks a lot. It works!!!!!!!!!!!!

ShunYaoITMO avatar Feb 27 '19 13:02 ShunYaoITMO

thanks

zhanghongruiupup avatar May 09 '19 15:05 zhanghongruiupup

Why does the program hang at "Shape of h_3 (?, ?, ?, 32), g_3 (?, ?, ?, 32)" after I changed icdar.py?

20180821 avatar May 22 '19 01:05 20180821

What did you change?

ll-image avatar May 22 '19 01:05 ll-image

@ll-image I changed this part, following the fix posted above:

    #enqueuer = GeneratorEnqueuer(generator(**kwargs), use_multiprocessing=True)
    #enqueuer.start(max_queue_size=10, workers=num_workers)
    enqueuer = GeneratorEnqueuer(generator(**kwargs), use_multiprocessing=False)
    enqueuer.start(max_queue_size=1, workers=1)

20180821 avatar May 22 '19 01:05 20180821

If that is all you changed, it should be fine. Check your settings elsewhere.

ll-image avatar May 22 '19 01:05 ll-image

image

20180821 avatar May 22 '19 02:05 20180821

It now runs to this point and then hangs. What should I do?

20180821 avatar May 22 '19 02:05 20180821

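The output means that the dataset was not read. Check whether your /data/ocr/icdar2015/ directory actually contains images and whether their format is correct.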

ll-image avatar May 22 '19 02:05 ll-image
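
A quick way to act on the advice above about /data/ocr/icdar2015/ is to count the image files the training directory actually contains. This is a minimal sketch: the helper name and the extension list are assumptions for illustration, so adjust them to whatever icdar.py looks for in your copy.

```python
import os


def list_training_images(data_dir, extensions=(".jpg", ".jpeg", ".png")):
    """Return sorted paths of files in data_dir whose extension looks like an image."""
    if not os.path.isdir(data_dir):
        return []
    return sorted(
        os.path.join(data_dir, name)
        for name in os.listdir(data_dir)
        if name.lower().endswith(extensions)
    )


images = list_training_images("/data/ocr/icdar2015/")
print("found %d training images" % len(images))
```

If this reports zero images, the generator has nothing to yield, which would be consistent with training appearing to hang right after the model shapes are printed.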

thanks

BreadCanFly avatar May 29 '19 02:05 BreadCanFly

When I apply the use_multiprocessing=False change in icdar.py, the error disappears, but the model no longer trains on the GPU and runs really slowly. Do you know of any fix that still uses the GPU?

Faribakh avatar Aug 09 '19 14:08 Faribakh

When I run the training command I get this error. How do I solve it?

WARNING:tensorflow:From C:\Users\Malika Chathushka\Anaconda3\lib\site-packages\tensorflow\python\framework\op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
resnet_v1_50/block1 (?, ?, ?, 256)
resnet_v1_50/block2 (?, ?, ?, 512)
resnet_v1_50/block3 (?, ?, ?, 1024)
resnet_v1_50/block4 (?, ?, ?, 2048)
Shape of f_0 (?, ?, ?, 2048)
Shape of f_1 (?, ?, ?, 512)
Shape of f_2 (?, ?, ?, 256)
Shape of f_3 (?, ?, ?, 64)
Shape of h_0 (?, ?, ?, 2048), g_0 (?, ?, ?, 2048)
Shape of h_1 (?, ?, ?, 128), g_1 (?, ?, ?, 128)
Shape of h_2 (?, ?, ?, 64), g_2 (?, ?, ?, 64)
Shape of h_3 (?, ?, ?, 32), g_3 (?, ?, ?, 32)
2019-10-10 13:55:06.408205: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2
2019-10-10 13:55:06.418700: I tensorflow/core/common_runtime/process_util.cc:71] Creating new thread pool with default inter op setting: 4. Tune using inter_op_parallelism_threads for best performance.
Generator use 10 batches for buffering, this may take a while, you can tune this yourself.
Traceback (most recent call last):
  File "d:/Embla/Sprint 12/Git/EAST/multigpu_train.py", line 180, in <module>
    tf.app.run()
  File "C:\Users\Malika Chathushka\Anaconda3\lib\site-packages\tensorflow\python\platform\app.py", line 125, in run
    _sys.exit(main(argv))
  File "d:/Embla/Sprint 12/Git/EAST/multigpu_train.py", line 153, in main
    data = next(data_generator)
  File "d:\Embla\Sprint 12\Git\EAST\icdar.py", line 727, in get_batch
    enqueuer.start(max_queue_size=10, workers=num_workers)
  File "d:\Embla\Sprint 12\Git\EAST\data_util.py", line 81, in start
    thread.start()
  File "C:\Users\Malika Chathushka\Anaconda3\lib\multiprocessing\process.py", line 112, in start
    self._popen = self._Popen(self)
  File "C:\Users\Malika Chathushka\Anaconda3\lib\multiprocessing\context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "C:\Users\Malika Chathushka\Anaconda3\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "C:\Users\Malika Chathushka\Anaconda3\lib\multiprocessing\popen_spawn_win32.py", line 89, in __init__
    reduction.dump(process_obj, to_child)
  File "C:\Users\Malika Chathushka\Anaconda3\lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'GeneratorEnqueuer.start.<locals>.data_generator_task'
PS D:\Embla\Sprint 12\Git\EAST> Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Users\Malika Chathushka\Anaconda3\lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "C:\Users\Malika Chathushka\Anaconda3\lib\multiprocessing\spawn.py", line 115, in _main
    self = reduction.pickle.load(from_parent)
EOFError: Ran out of input

@argman

MalikaChathushka avatar Oct 10 '19 08:10 MalikaChathushka

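Yeah, I also tried the use_multiprocessing=False change and hit the same issue as @Faribakh: the error disappears, but training no longer uses the GPU and runs really slowly.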

MalikaChathushka avatar Oct 10 '19 08:10 MalikaChathushka

@harshalcse Can you run mul_train.py on Ubuntu 18.04 now? I'm using 16.04, and I hit the same problem.

18810908122 avatar Feb 16 '20 16:02 18810908122

#264

hoozh avatar Mar 05 '21 06:03 hoozh