Semantic-Segmentation-PSPNet

How should I train the model?

Open Amos6666 opened this issue 6 years ago • 4 comments

When using the Cityscapes dataset, what kind of train list should I prepare? Thanks.

Amos6666 avatar Apr 01 '18 07:04 Amos6666

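Just put each image and its corresponding annotation on the same line. For example, in train_list.txt:

    image\0_0_0.png label\0_0_0.png
    image\0_0_1000.png label\0_0_1000.png
    image\0_0_2000.png label\0_0_2000.png

Here image and label are folders in the same directory as train_list.txt. For details, see the read_labeled_image_list function in ImageReader.py.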
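The list format described above could be generated with a small script along these lines. This is only a sketch under assumptions: the function name build_train_list is hypothetical, each image is assumed to have a label file with the same file name (as in the example), and the two paths are assumed to be separated by a single space. Check read_labeled_image_list in ImageReader.py for the separator it actually expects. Raw Cityscapes files use different names for images and labels, so they would need to be arranged this way first.

```python
import os


def build_train_list(root_dir, image_dir="image", label_dir="label",
                     list_name="train_list.txt"):
    """Write one 'image_path label_path' pair per line, relative to root_dir.

    Assumption: an image and its label share the same file name.
    """
    image_names = sorted(os.listdir(os.path.join(root_dir, image_dir)))
    label_names = set(os.listdir(os.path.join(root_dir, label_dir)))

    lines = []
    for name in image_names:
        # Skip images that have no matching label file.
        if name not in label_names:
            continue
        lines.append("{} {}".format(os.path.join(image_dir, name),
                                    os.path.join(label_dir, name)))

    # The list file sits next to the image and label folders.
    with open(os.path.join(root_dir, list_name), "w") as f:
        for line in lines:
            f.write(line + "\n")
    return lines
```

Calling build_train_list on the directory that contains the image and label folders writes train_list.txt next to them. os.path.join is used so the separator matches the platform; the backslashes in the example above are what it produces on Windows.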

ALISURE avatar Apr 01 '18 14:04 ALISURE

Hello, I have been reading your code recently, and I cannot get the training part (RunnerTrain.py) to run successfully. The error output:

Traceback (most recent call last):
  File "RunnerTrain.py", line 165, in <module>
    train_list="train_list.txt").train(save_pred_freq=1000)
  File "RunnerTrain.py", line 137, in train
    Tools.restore_if_y(self.sess, self.log_dir)
  File "/Users/rhc/Downloads/pspnet/Tools.py", line 102, in restore_if_y
    tf.train.Saver(var_list=tf.global_variables()).restore(sess, ckpt.model_checkpoint_path)
  File "/Users/rhc/tensorflow/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1560, in restore
    {self.saver_def.filename_tensor_name: save_path})
  File "/Users/rhc/tensorflow/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 895, in run
    run_metadata_ptr)
  File "/Users/rhc/tensorflow/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1124, in _run
    feed_dict_tensor, options, run_metadata)
  File "/Users/rhc/tensorflow/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1321, in _do_run
    options, run_metadata)
  File "/Users/rhc/tensorflow/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1340, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.NotFoundError: Key conv6/weights/Momentum not found in checkpoint
  [[Node: save_1/RestoreV2_891 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_arg_save_1/Const_0_0, save_1/RestoreV2_891/tensor_names, save_1/RestoreV2_891/shape_and_slices)]]

Caused by op u'save_1/RestoreV2_891', defined at:
  File "RunnerTrain.py", line 165, in <module>
    train_list="train_list.txt").train(save_pred_freq=1000)
  File "RunnerTrain.py", line 137, in train
    Tools.restore_if_y(self.sess, self.log_dir)
  File "/Users/rhc/Downloads/pspnet/Tools.py", line 102, in restore_if_y
    tf.train.Saver(var_list=tf.global_variables()).restore(sess, ckpt.model_checkpoint_path)
  File "/Users/rhc/tensorflow/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1140, in __init__
    self.build()
  File "/Users/rhc/tensorflow/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1172, in build
    filename=self._filename)
  File "/Users/rhc/tensorflow/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 688, in build
    restore_sequentially, reshape)
  File "/Users/rhc/tensorflow/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 407, in _AddRestoreOps
    tensors = self.restore_op(filename_tensor, saveable, preferred_shard)
  File "/Users/rhc/tensorflow/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 247, in restore_op
    [spec.tensor.dtype])[0])
  File "/Users/rhc/tensorflow/lib/python2.7/site-packages/tensorflow/python/ops/gen_io_ops.py", line 663, in restore_v2
    dtypes=dtypes, name=name)
  File "/Users/rhc/tensorflow/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
    op_def=op_def)
  File "/Users/rhc/tensorflow/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2630, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/Users/rhc/tensorflow/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1204, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

NotFoundError (see above for traceback): Key conv6/weights/Momentum not found in checkpoint
  [[Node: save_1/RestoreV2_891 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_arg_save_1/Const_0_0, save_1/RestoreV2_891/tensor_names, save_1/RestoreV2_891/shape_and_slices)]]

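We tested RunnerOne.py and it runs successfully, which suggests that the model can be loaded correctly. However, RunnerTrain.py does not run, failing with the error above. We also noticed that RunnerOne builds net2 while RunnerTrain does not, so the two network graphs are not identical. We are not sure whether this is the cause.

Could the author help analyze this?

Wishing you all the best in your work and studies!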

饶浩承 2018-04-27

ZhiboRao avatar Apr 27 '18 02:04 ZhiboRao

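This error has nothing to do with net2. Judging from the output, something went wrong while loading your model parameters. Try removing the statement Tools.restore_if_y(self.sess, self.log_dir) (File "RunnerTrain.py", line 137, in train) and see whether the error still occurs.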
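As a side note on the message itself: the traceback shows the Saver being built over tf.global_variables(), and the missing key ends in /Momentum, which is how TensorFlow 1.x typically names the slot variables of a momentum optimizer. One possible reading, not confirmed in this thread, is that the checkpoint contains the model weights but not the optimizer state. The sketch below only illustrates the idea of comparing the two sets of names. The helper split_restorable and the name conv6/biases are hypothetical; only conv6/weights/Momentum comes from the error output.

```python
def split_restorable(graph_var_names, checkpoint_var_names):
    """Split graph variable names into those present in the checkpoint
    and those that would raise 'Key ... not found in checkpoint'."""
    checkpoint_names = set(checkpoint_var_names)
    restorable = [n for n in graph_var_names if n in checkpoint_names]
    missing = [n for n in graph_var_names if n not in checkpoint_names]
    return restorable, missing


# Illustrative names only: 'conv6/biases' is made up for this example.
graph_vars = ["conv6/weights", "conv6/biases", "conv6/weights/Momentum"]
ckpt_vars = ["conv6/weights", "conv6/biases"]

restorable, missing = split_restorable(graph_vars, ckpt_vars)
print("restorable:", restorable)
print("missing:", missing)
```

In TensorFlow 1.x the checkpoint names could be read with tf.train.NewCheckpointReader(path).get_variable_to_shape_map(), and a Saver could then be given only the restorable variables through its var_list argument. Whether that is the right fix for this repository is an assumption; it has not been tested against Tools.restore_if_y.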

ALISURE avatar Apr 27 '18 04:04 ALISURE

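Thank you for your answer. We tried deleting all the model files under the model folder. The program then regenerated the model files automatically (4 in total) and training ran normally. RunnerOne also works normally. I still do not know the exact cause.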

ZhiboRao avatar Apr 27 '18 05:04 ZhiboRao