C3D-tensorflow
After some steps of training, the weights and loss values become NaN.
Hello everyone,
I have a question about training this model on a different dataset.
When I fine-tune the C3D model on UCF101 data, there is no problem. But when I change the dataset, the loss becomes NaN.
I tried a few ways to handle this problem, none of which solved it:
- changed the learning rate
- changed the batch size
- tried a small train and test split (with #num : 3)
For instance, these are the training steps of the same model with a different dataset. The learning rate and batch size are the same as in the original model. Only the dataset is changed.
('Step : ', 31)
------------------------------------------------------------------
TRAIN DATA READING ...
Training Data Eval:
accuracy: 0.00000
(' Loss : ', array([ 2.60935736, 2.62928033, 2.65052104, 2.6719377 , 2.69320059,
2.7358048 , 2.73551226, 2.73502755, 3.95449066, 3.61877584,
2.61726952, 2.60790229, 2.60790229, 2.60790229, 2.60790229,
2.60790229, 2.60790229, 2.60790229, 2.60790229, 2.60790229,
2.60790229, 2.60790229], dtype=float32))
TEST DATA READING ...
Validation Data Eval:
accuracy: 0.00000
('Step : ', 32)
------------------------------------------------------------------
TRAIN DATA READING ...
Training Data Eval:
accuracy: 0.10000
(' Loss : ', array([ 3.21029162, 3.2302146 , 3.25145531, 3.27287173, 3.29413438,
3.33673644, 3.33644032, 3.33594394, 4.55498886, 4.21940422,
3.21820426, 3.20883656, 3.20883656, 3.20883656, 3.20883656,
3.20883656, 3.20883656, 3.20883656, 3.20883656, 3.20883656,
3.20883656, 3.20883656], dtype=float32))
TEST DATA READING ...
Validation Data Eval:
accuracy: 0.20000
('Step : ', 33)
------------------------------------------------------------------
TRAIN DATA READING ...
Training Data Eval:
accuracy: 0.30000
(' Loss : ', array([ 2.50483251, 2.52475548, 2.54599619, 2.56741214, 2.58867455,
2.63127494, 2.63097525, 2.63046718, 3.84910154, 3.51364231,
2.51274562, 2.50337744, 2.50337744, 2.50337744, 2.50337744,
2.50337744, 2.50337744, 2.50337744, 2.50337744, 2.50337744,
2.50337744, 2.50337744], dtype=float32))
TEST DATA READING ...
Validation Data Eval:
accuracy: 0.00000
('Step : ', 34)
------------------------------------------------------------------
TRAIN DATA READING ...
Training Data Eval:
accuracy: 0.00000
(' Loss : ', array([ nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan], dtype=float32))
TEST DATA READING ...
Validation Data Eval:
accuracy: 0.00000
Where could the problem be? Why does this model not work with a different dataset? Any suggestions? Thank you.
Try using
labels = tf.one_hot(labels_placeholder, c3d_model.NUM_CLASSES)
loss = -tf.reduce_sum(labels * tf.log(tf.clip_by_value(tf.nn.softmax(logit), 1e-10, 1.0)), 1)
instead.
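To illustrate why the clipping in this suggestion avoids NaN or infinite values, here is a minimal pure-Python sketch of the same idea for a single sample. The helper names `softmax` and `clipped_cross_entropy` and the example logits are made up for illustration and are not part of the repository. Note that clipping only stops the logarithm from blowing up; it would not by itself correct a dataset problem such as wrong labels.

```python
import math

def softmax(logits):
    # Subtract the max logit first so exp() cannot overflow.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def clipped_cross_entropy(logits, label, eps=1e-10):
    # Clip the predicted probability of the true class into [eps, 1.0]
    # before taking the log, mirroring tf.clip_by_value(..., 1e-10, 1.0).
    p = softmax(logits)[label]
    p = min(max(p, eps), 1.0)
    return -math.log(p)

# Hypothetical logits: the true class (index 1) gets a probability that
# underflows to exactly 0.0, so an unclipped log would not be finite.
loss = clipped_cross_entropy([1000.0, 0.0], label=1)
print(math.isfinite(loss))  # the clipped loss stays finite
```

With the clip in place, the worst-case per-sample loss is bounded by -log(1e-10), roughly 23.03, instead of diverging.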
@zeynepgokce Hi, could you tell me how to print the loss? Where should I add the code? Thank you.
@491506870 Hi, I simply added "loss" to the sess.run call, as follows:
summary, acc, l = sess.run(
    [merged, accuracy, loss],
    feed_dict={images_placeholder: train_images,
               labels_placeholder: train_labels})
print("accuracy: " + "{:.5f}".format(acc))
print(" Loss : ", l)
@zeynepgokce Thank you so much! That helps me a lot. I think I ran into the same problem as you: my loss became NaN after several steps. What did you do to solve it?
@491506870 The problem was related to the labelling of my own dataset. I changed my labels to run from 0 to 2, since I have 3 classes.
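For anyone hitting the same issue, a quick check of the label range before training may save time. The sketch below assumes the labels are plain integer class ids read from the train/test list files; the helper names `check_labels` and `remap_labels` and the sample labels are hypothetical and not part of the repository. As far as I know, TensorFlow's sparse softmax cross-entropy op expects labels in [0, num_classes) and can return NaN for out-of-range labels when run on GPU, which would match the behaviour seen here if the loss is computed that way.

```python
def check_labels(labels, num_classes):
    """Return the sorted distinct labels that fall outside [0, num_classes)."""
    return sorted({int(l) for l in labels if not 0 <= int(l) < num_classes})

def remap_labels(labels):
    """Map arbitrary integer labels to contiguous ids 0..K-1, in sorted order."""
    mapping = {old: new for new, old in enumerate(sorted(set(labels)))}
    return [mapping[l] for l in labels], mapping

# Hypothetical dataset with 3 classes that were numbered starting from 1.
raw_labels = [1, 2, 3, 3, 1]
print(check_labels(raw_labels, num_classes=3))    # label 3 is out of range

fixed_labels, mapping = remap_labels(raw_labels)  # now in the range 0..2
print(check_labels(fixed_labels, num_classes=3))  # nothing out of range
```

If `check_labels` reports anything, either renumber the labels (as above) or make sure the model's number of classes matches the dataset.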