STN_CNN_LSTM_CTC_TensorFlow

About the STN loss

Open Fangction opened this issue 6 years ago • 6 comments

Is there any STN loss to constrain the transformation angle? I didn't find one in your code. If there is none, how does the network know how to transform a picture?

Fangction avatar Oct 15 '18 02:10 Fangction

@Fangction There is no STN loss; the STN is supervised by the CTC loss. However, if you have more supervision information, you can add a constraint, for example by supervising the coordinates regressed by the STN.

wushilian avatar Oct 15 '18 04:10 wushilian
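
As an illustration of the suggestion above, here is a minimal sketch of adding a coordinate constraint to the recognition loss. It is written in plain Python rather than TensorFlow, and it assumes that ground-truth control points are available for each image. The function names and the `weight` parameter are hypothetical and do not come from the repository.

```python
def point_regression_loss(pred_points, target_points):
    """Mean squared error over (x, y) control points.

    Both arguments are lists of (x, y) pairs of equal length.
    """
    total = 0.0
    count = 0
    for (px, py), (tx, ty) in zip(pred_points, target_points):
        total += (px - tx) ** 2 + (py - ty) ** 2
        count += 2  # two coordinates per point
    return total / count


def total_loss(ctc_loss, pred_points, target_points, weight=0.1):
    """CTC loss plus a weighted penalty on the regressed STN coordinates.

    `weight` is an assumed hyperparameter that balances the two terms.
    """
    return ctc_loss + weight * point_regression_loss(pred_points, target_points)
```

Without such labels, the STN parameters receive gradients only through the CTC loss, as described in the reply.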

@wushilian Thank you. I have another question. In your code:

    W = tf.Variable(tf.zeros([128, 20]))
    b = tf.Variable(initial_value=[-1, -0.2, -0.5, -0.35, 0, -0.5, 0.5, -0.67, 1, -0.8, -1, 0.8, -0.5, 0.65, 0, 0.5, 0.5, 0.33, 1, 0.2], dtype=tf.float32)
    # fc3_loc=tf.layers.dense(fc2_loc,20,activation=tf.nn.tanh,kernel_initializer=tf.zeros_initializer)
    # fc3_loc = slim.fully_connected(fc2_loc, 8, activation_fn=tf.nn.tanh, scope='fc3_loc')
    # spatial transformer
    fc3_loc = tf.nn.tanh(tf.matmul(fc2_loc, W) + b)  # activation output
    loc = tf.reshape(fc3_loc, [-1, 10, 2])  # reshape fc3_loc to shape [-1, 10, 2]
    # spatial transformer
    s = np.array([[-0.95, -0.95], [-0.5, -0.95], [0, -0.95], [0.5, -0.95], [0.95, -0.95], [-0.95, 0.95], [-0.5, 0.95], [0, 0.95], [0.5, 0.95], [0.95, 0.95]] * 256)
    s = tf.constant(s.reshape([256, 10, 2]), dtype=tf.float32)

How did you decide the values of the variables b and s?

Fangction avatar Oct 15 '18 11:10 Fangction

@Fangction b is a specific initialization. For details, you need to read the paper: Robust Scene Text Recognition with Automatic Rectification.

wushilian avatar Oct 15 '18 12:10 wushilian
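
To make the role of b concrete: because W is initialized to zeros, `tf.matmul(fc2_loc, W)` is zero for every input, so the localization output at the start of training is simply `tanh(b)`. The sketch below uses plain Python instead of TensorFlow and groups b into 10 pairs in the same order that `tf.reshape(fc3_loc, [-1, 10, 2])` would. Reading each pair as an (x, y) control point, with five points along the top and five along the bottom, is an assumption based on the layout of s and is not stated in the thread.

```python
import math

# Bias values copied from the quoted code.
b = [-1, -0.2, -0.5, -0.35, 0, -0.5, 0.5, -0.67, 1, -0.8,
     -1, 0.8, -0.5, 0.65, 0, 0.5, 0.5, 0.33, 1, 0.2]

# With W == 0 the pre-activation equals b for any input,
# so the initial localization output is tanh(b).
initial_output = [math.tanh(v) for v in b]

# Group consecutive values into pairs, matching a reshape to [10, 2].
initial_points = [
    (initial_output[i], initial_output[i + 1])
    for i in range(0, len(initial_output), 2)
]

# Assumed interpretation: first five pairs are the top row, last five the bottom row.
top_row = initial_points[:5]
bottom_row = initial_points[5:]

for name, row in (("top", top_row), ("bottom", bottom_row)):
    print(name, [(round(x, 2), round(y, 2)) for x, y in row])
```

Under that reading, the second coordinates of the first five pairs are all negative and those of the last five are all positive, so training starts from a fixed, non-degenerate arrangement of points rather than from all zeros.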

@wushilian Thank you again. I have one more question. Does the STN's output image have to be a fixed size? I see you define the output as image_width=120 and image_height=32. Can I keep the output image the same size as the input image?

Fangction avatar Oct 18 '18 14:10 Fangction

@Fangction Yes, you can change the size.

wushilian avatar Oct 20 '18 02:10 wushilian
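
In the general spatial transformer formulation, the output size is set by the sampling grid rather than by the input image, which is why it can be changed. The sketch below builds a regular grid of normalized target coordinates in [-1, 1] for a chosen output size. `normalized_grid` is a hypothetical helper written for illustration and is not the repository's code.

```python
def linspace_unit(n):
    """Return n evenly spaced values from -1.0 to 1.0 (a single 0.0 when n == 1)."""
    if n == 1:
        return [0.0]
    return [-1.0 + 2.0 * i / (n - 1) for i in range(n)]


def normalized_grid(out_height, out_width):
    """Grid of (x, y) target coordinates, one per output pixel.

    The transformer maps each of these to a source location and samples the
    input there, so the output has out_height x out_width pixels regardless
    of the input size.
    """
    xs = linspace_unit(out_width)
    ys = linspace_unit(out_height)
    return [[(x, y) for x in xs] for y in ys]


# The output size mentioned in the thread: image_height=32, image_width=120.
grid = normalized_grid(32, 120)
print(len(grid), len(grid[0]))
```

Changing the output size also changes the shape of everything downstream of the STN, so the layers that follow may need matching adjustments.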

> @Fangction Yes, you can change the size.

I changed the size, and it caused an error like "InvalidArgumentError (see above for traceback): len(seq_lens) != input.dims(0), (256 vs. 1536)".

What can I do to solve this error?

daming98 avatar Mar 27 '19 12:03 daming98
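
One observation about the numbers in that error: 1536 = 6 × 256, so the tensor reaching the CTC or RNN stage has six times as many rows as the 256 sequence lengths supplied. A possible cause, stated here as an assumption and not confirmed in the thread, is that the feature map after the CNN no longer has height 1 once the image size changes, and a reshape with a `-1` dimension folds the extra height into the batch dimension. The sketch below only reproduces that arithmetic; the function name and the feature sizes are hypothetical.

```python
def rows_after_reshape(batch, feat_height, feat_width, channels, features_per_step):
    """Leading dimension produced by reshaping a [batch, H, W, C] feature map
    to [-1, W, features_per_step], as a reshape with -1 would compute it."""
    total_elements = batch * feat_height * feat_width * channels
    return total_elements // (feat_width * features_per_step)


# Feature height 1: the leading dimension equals the batch size.
print(rows_after_reshape(256, 1, 30, 512, 512))

# Feature height 6 (assumed): the extra height is folded into the batch.
print(rows_after_reshape(256, 6, 30, 512, 512))
```

If this is what happens, printing the shape of the tensor just before the reshape should show whether its height is greater than 1 for the new image size.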