SqueezeSeg

Problem with training

Open omartriki1 opened this issue 6 years ago • 4 comments

Hi :),

I wanted to train the model and followed all the steps described in the readme.md. The demo script ran perfectly, but training always hits the same issue: it runs only the first step and then stops.

./scripts/train.sh -gpu 0 -image_set train -log_dir ./log/
Shape of the pretrained parameter of conv1 does not match, use randomly initialized parameter
Cannot find conv1_skip in the pretrained model. Use randomly initialized parameters
Cannot find fire_deconv10/squeeze1x1 in the pretrained model. Use randomly initialized parameters
Cannot find fire_deconv10/expand1x1 in the pretrained model. Use randomly initialized parameters
Cannot find fire_deconv10/expand1x1 in the pretrained model. Use randomly initialized parameters
Cannot find fire_deconv10/expand3x3 in the pretrained model. Use randomly initialized parameters
Cannot find fire_deconv11/squeeze1x1 in the pretrained model. Use randomly initialized parameters
Cannot find fire_deconv11/expand1x1 in the pretrained model. Use randomly initialized parameters
Cannot find fire_deconv11/expand3x3 in the pretrained model. Use randomly initialized parameters
Cannot find fire_deconv12/squeeze1x1 in the pretrained model. Use randomly initialized parameters
Cannot find fire_deconv12/expand1x1 in the pretrained model. Use randomly initialized parameters
Cannot find fire_deconv12/expand3x3 in the pretrained model. Use randomly initialized parameters
Cannot find fire_deconv13/squeeze1x1 in the pretrained model. Use randomly initialized parameters
Cannot find fire_deconv13/expand1x1 in the pretrained model. Use randomly initialized parameters
Cannot find fire_deconv13/expand3x3 in the pretrained model. Use randomly initialized parameters
Cannot find conv14_prob in the pretrained model. Use randomly initialized parameters
WARNING:tensorflow:From /home/rahul/$SQSG_ROOT/src/nn_skeleton.py:736: calling softmax (from tensorflow.python.ops.nn_ops) with dim is deprecated and will be removed in a future version.
Instructions for updating:
dim is deprecated, use axis instead
Model statistics saved to ./log///train/model_metrics.txt.
WARNING:tensorflow:From ./src/train.py:109: all_variables (from tensorflow.python.ops.variables) is deprecated and will be removed after 2017-03-02.
Instructions for updating:
Please use tf.global_variables instead.
WARNING:tensorflow:From /home/rahul/anaconda2/lib/python2.7/site-packages/tensorflow/python/util/tf_should_use.py:189: initialize_all_variables (from tensorflow.python.ops.variables) is deprecated and will be removed after 2017-03-02.
Instructions for updating:
Use tf.global_variables_initializer instead.
2019-01-07 11:09:21.153978: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-01-07 11:09:34.158950: step 0, loss = 2.98 (2.6 images/sec; 12.406 sec/batch)
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc
./scripts/train.sh: line 73: 9311 Aborted (core dumped) python ./src/train.py --dataset=KITTI --pretrained_model_path=./data/SqueezeNet/squeezenet_v1.1.pkl --data_path=./data/ --image_set=$IMAGE_SET --train_dir="$logdir/train" --net=$NET --max_steps=$STEPS --summary_step=100 --checkpoint_step=1000 --gpu=$GPUID

Does anyone have any idea how to solve this issue?

Thank you :)

omartriki1 avatar Jan 07 '19 10:01 omartriki1

I'm also encountering the same error. Does anyone have any idea? @BichenWuUCB

praveenkurabarkumar avatar Jan 25 '19 12:01 praveenkurabarkumar

I found a solution: give rwx permission to the SqueezeSeg-master folder.

chmod -R 777 SqueezeSeg-master
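The thread does not confirm why changing permissions helps. One possible reading, which is only an assumption, is that the training script cannot write to some path under the project folder (for example the log directory). Since `chmod -R 777` gives every user write access, a narrower first step is to list which paths are actually not writable. A minimal sketch in Python, where `find_unwritable` is a hypothetical helper and not part of SqueezeSeg:

```python
import os


def find_unwritable(root):
    """Return paths under root (root included) that the current user cannot write to.

    Hypothetical helper for diagnosis only; it is not part of SqueezeSeg.
    """
    blocked = []
    # A missing or read-only root is reported as well.
    if not os.access(root, os.W_OK):
        blocked.append(root)
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if not os.access(path, os.W_OK):
                blocked.append(path)
    return blocked
```

Calling `find_unwritable("SqueezeSeg-master")` from the parent directory returns an empty list when everything is writable. If it does, permissions are probably not the cause of the `std::bad_alloc`, which is a failed memory allocation.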

praveenkurabarkumar avatar Jan 30 '19 07:01 praveenkurabarkumar

> I found a solution: give rwx permission to the SqueezeSeg-master folder.
>
> chmod -R 777 SqueezeSeg-master

I tried your solution, but it did not work :( Can you help me? Thanks

chaoxingxi avatar Jul 29 '19 01:07 chaoxingxi

Hi, if you were able to run demo.py, can you please help me out with this error: #50

jashshah999 avatar Jul 02 '20 14:07 jashshah999