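What kind of train list should I prepare in order to use the Cityscapes dataset? Thanks.
Just put each image and its annotation on the same line. For example, in train_list.txt: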
image\0_0_0.png label\0_0_0.png
image\0_0_1000.png label\0_0_1000.png
image\0_0_2000.png label\0_0_2000.png
Here image and label are folders in the same directory as train_list.txt.
For details, see the read_labeled_image_list function in ImageReader.py.
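For anyone who wants to generate the list rather than write it by hand, here is a minimal sketch. It assumes the layout described above (image and label folders next to train_list.txt) and that each label file has the same file name as its image. The function name build_train_list and its parameters are made up for illustration and are not part of the repository. Raw Cityscapes files use different suffixes for images and labels, so they would have to be renamed or the matching rule adapted.

```python
import os


def build_train_list(root, image_dir="image", label_dir="label",
                     list_name="train_list.txt"):
    """Write one 'image_path label_path' pair per line and return the pairs.

    Assumption: an image and its label share the same file name.
    """
    image_names = sorted(os.listdir(os.path.join(root, image_dir)))
    label_names = set(os.listdir(os.path.join(root, label_dir)))

    pairs = []
    for name in image_names:
        # Skip images that have no same-named label file.
        if name in label_names:
            pairs.append((os.path.join(image_dir, name),
                          os.path.join(label_dir, name)))

    # Paths are written relative to the folder that holds train_list.txt.
    with open(os.path.join(root, list_name), "w") as f:
        for image_path, label_path in pairs:
            f.write("{} {}\n".format(image_path, label_path))
    return pairs
```

Calling build_train_list("path/to/data") writes train_list.txt into that folder, using the path separator of the current platform.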
Hello,
I have been studying your code recently, and I cannot get the training part (RunnerTrain.py) to run successfully.
Error output:
Traceback (most recent call last):
File "RunnerTrain.py", line 165, in <module>
train_list="train_list.txt").train(save_pred_freq=1000)
File "RunnerTrain.py", line 137, in train
Tools.restore_if_y(self.sess, self.log_dir)
File "/Users/rhc/Downloads/pspnet/Tools.py", line 102, in restore_if_y
tf.train.Saver(var_list=tf.global_variables()).restore(sess, ckpt.model_checkpoint_path)
File "/Users/rhc/tensorflow/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1560, in restore
{self.saver_def.filename_tensor_name: save_path})
File "/Users/rhc/tensorflow/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 895, in run
run_metadata_ptr)
File "/Users/rhc/tensorflow/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1124, in _run
feed_dict_tensor, options, run_metadata)
File "/Users/rhc/tensorflow/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1321, in _do_run
options, run_metadata)
File "/Users/rhc/tensorflow/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1340, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.NotFoundError: Key conv6/weights/Momentum not found in checkpoint
[[Node: save_1/RestoreV2_891 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_arg_save_1/Const_0_0, save_1/RestoreV2_891/tensor_names, save_1/RestoreV2_891/shape_and_slices)]]
Caused by op u'save_1/RestoreV2_891', defined at:
File "RunnerTrain.py", line 165, in <module>
train_list="train_list.txt").train(save_pred_freq=1000)
File "RunnerTrain.py", line 137, in train
Tools.restore_if_y(self.sess, self.log_dir)
File "/Users/rhc/Downloads/pspnet/Tools.py", line 102, in restore_if_y
tf.train.Saver(var_list=tf.global_variables()).restore(sess, ckpt.model_checkpoint_path)
File "/Users/rhc/tensorflow/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1140, in __init__
self.build()
File "/Users/rhc/tensorflow/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1172, in build
filename=self._filename)
File "/Users/rhc/tensorflow/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 688, in build
restore_sequentially, reshape)
File "/Users/rhc/tensorflow/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 407, in _AddRestoreOps
tensors = self.restore_op(filename_tensor, saveable, preferred_shard)
File "/Users/rhc/tensorflow/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 247, in restore_op
[spec.tensor.dtype])[0])
File "/Users/rhc/tensorflow/lib/python2.7/site-packages/tensorflow/python/ops/gen_io_ops.py", line 663, in restore_v2
dtypes=dtypes, name=name)
File "/Users/rhc/tensorflow/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
op_def=op_def)
File "/Users/rhc/tensorflow/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2630, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/Users/rhc/tensorflow/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1204, in __init__
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access
NotFoundError (see above for traceback): Key conv6/weights/Momentum not found in checkpoint
[[Node: save_1/RestoreV2_891 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_arg_save_1/Const_0_0, save_1/RestoreV2_891/tensor_names, save_1/RestoreV2_891/shape_and_slices)]]
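We tested RunnerOne.py and it runs successfully, which suggests that the model can be loaded correctly.
But RunnerTrain.py does not run, failing with the error above.
We also noticed that RunnerOne builds net2 while RunnerTrain does not, so the two graphs are not the same. We are not sure whether this is the cause.
Could you help us analyze this?
Wishing you all the best in your work and studies!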
饶浩承
2018-04-27
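This error has nothing to do with net2. Judging from the output, loading your model parameters failed. Try removing the line Tools.restore_if_y(self.sess, self.log_dir) (File "RunnerTrain.py", line 137, in train) and see whether the error still occurs.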
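Thank you for your answer. We tried deleting all the model files in the model folder; the program then regenerated the model files automatically (4 in total) and training ran normally. RunnerOne also works normally.
I am still not sure what the exact cause was.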
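A possible explanation, offered as a guess rather than a confirmed diagnosis: the missing key conv6/weights/Momentum looks like a slot variable created by a momentum optimizer, and a checkpoint saved without the optimizer would not contain it. Because the traceback shows the saver built over tf.global_variables(), every graph variable must exist in the checkpoint. The sketch below, assuming TensorFlow 1.x, restores only the variables the checkpoint actually has. The helper names split_by_checkpoint and restore_existing_only are hypothetical and are not part of Tools.py.

```python
def split_by_checkpoint(variable_names, checkpoint_keys):
    """Partition graph variable names into (present in checkpoint, missing)."""
    keys = set(checkpoint_keys)
    restorable = [n for n in variable_names if n in keys]
    missing = [n for n in variable_names if n not in keys]
    return restorable, missing


def restore_existing_only(sess, ckpt_path):
    """Restore only the variables found in the checkpoint (TensorFlow 1.x).

    Assumption: the caller has already run tf.global_variables_initializer(),
    so variables that are not restored (e.g. optimizer slots) keep their
    freshly initialized values.
    """
    import tensorflow as tf  # imported lazily; the helper above needs no TF

    # tf.train.list_variables yields (name, shape) pairs stored in the checkpoint.
    ckpt_keys = [name for name, _ in tf.train.list_variables(ckpt_path)]
    by_name = {v.op.name: v for v in tf.global_variables()}

    restorable, missing = split_by_checkpoint(sorted(by_name), ckpt_keys)
    if restorable:
        saver = tf.train.Saver(var_list=[by_name[n] for n in restorable])
        saver.restore(sess, ckpt_path)
    return missing  # names left at their initial values
```

The returned list of missing names can be printed to check whether only optimizer slots were skipped, or whether real model weights are absent from the checkpoint as well.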