fudan_mtl_reviews

questions of code

Open marvelousgirl opened this issue 5 years ago • 21 comments

Hi, I am confused by pad_id = -1 and word_embed[pad_id] = np.zeros([word_dim]). Please tell me why pad_id is set to -1? I think these two lines should be removed.

Note: this piece of code is in util.py, in the trim_embeddings function.

marvelousgirl avatar Jul 19 '18 12:07 marvelousgirl

I pad sentences to a fixed length manually. Of course you can use the tf.data API to automatically pad zeros.
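For example, a rough sketch of the tf.data alternative (the sentence data and sizes here are just illustrative, not from this repo):

import tensorflow as tf

# Hypothetical variable-length sentences as word-id lists.
sentences = [[4, 7, 2], [9, 1], [3, 5, 8, 6]]

dataset = tf.data.Dataset.from_generator(
    lambda: iter(sentences), output_types=tf.int64, output_shapes=[None])
# padded_batch pads every sequence with 0 up to the longest length in its batch.
dataset = dataset.padded_batch(batch_size=2, padded_shapes=[None])

iterator = dataset.make_one_shot_iterator()
batch = iterator.get_next()
with tf.Session() as sess:
    print(sess.run(batch))  # first batch: [[4 7 2], [9 1 0]]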

FrankWork avatar Jul 20 '18 02:07 FrankWork

I know that, but in your code you have noted "make sure the pad id is 0", and you place the PAD_WORD in the first line of the vocab_file.

Therefore, I think there is a contradiction between "pad_id = -1" and "make sure the pad id is 0".

marvelousgirl avatar Jul 20 '18 02:07 marvelousgirl

Besides, I notice that in your code the datasets do not include "dvd" and "MR". Could you please tell me why? When I add those two, there is an encoding problem. I solved it with encoding="ISO-8859-1" for these two datasets.
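For reference, roughly what I did (the file path here is just an example):

# Hypothetical: read a raw review file from the dvd/MR domains with Latin-1 decoding.
with open("data/dvd/train.txt", encoding="ISO-8859-1") as f:
    lines = [line.strip() for line in f]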

marvelousgirl avatar Jul 20 '18 02:07 marvelousgirl

I ignored those two datasets because of the encoding problem. I thought the data was broken.

FrankWork avatar Jul 20 '18 05:07 FrankWork

Please, please answer this comment: I know that, but in your code you have noted "make sure the pad id is 0", and you place the PAD_WORD in the first line of the vocab_file.

Therefore, I think there is a contradiction between "pad_id = -1" and "make sure the pad id is 0".

Could you please explain?

marvelousgirl avatar Jul 20 '18 05:07 marvelousgirl

pad_id = -1 and word_embed[pad_id] = np.zeros([word_dim]) are unnecessary. PAD_WORD is in vocab but not in pretrain_words2id, so its vector will be randomly initialized:

for w in vocab:
    if w in pretrain_words2id:
      # word is covered by the pretrained embeddings: copy its vector
      id = pretrain_words2id[w]
      word_embed.append(pretrain_embed[id])
    else:
      # out-of-pretrain words (including PAD_WORD) get a random vector
      vec = np.random.normal(0, 0.1, [word_dim])
      word_embed.append(vec)
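If you really do want an all-zero padding vector, something like this after the loop should work (assuming PAD_WORD is the first word in vocab, so its id is 0):

# Assumption: vocab[0] == PAD_WORD, so the pad id is 0.
pad_id = 0
word_embed[pad_id] = np.zeros([word_dim])  # zero out the padding vector instead of using index -1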

FrankWork avatar Jul 20 '18 05:07 FrankWork

Thank you very much.

marvelousgirl avatar Jul 20 '18 05:07 marvelousgirl

Hi, I notice you implement the class ConvLayer in your code. Is that a choice or a must? I want to use an LSTM and plan to modify the code. Could you please tell me whether I can use the predefined convolution or RNN operations in TensorFlow?

marvelousgirl avatar Aug 07 '18 14:08 marvelousgirl

@marvelousgirl The CNN just encodes a sentence into a vector. You can try to use an LSTM.

FrankWork avatar Aug 08 '18 01:08 FrankWork

Hello, about the read_tfrecord call in fudan.py: test_data = util.read_tfrecord(test_record_file, epoch, 400, _parse_tfexample, shuffle=False). Why is epoch passed in for the test data, repeating it epoch times? I think it does not need to be repeated; only the training data should be repeated epoch times.

marvelousgirl avatar Sep 07 '18 07:09 marvelousgirl

@marvelousgirl In read_tfrecord I use dataset.make_one_shot_iterator, which can only be consumed once; if you want to run evaluation more than once, you have to use dataset.repeat. dataset.repeat does not copy the data, it just sets the number of passes over it.
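Roughly what read_tfrecord does, as a simplified sketch (not the exact code in util.py):

import tensorflow as tf

def read_tfrecord_sketch(record_file, epoch, batch_size, parse_fn, shuffle=True):
  # Simplified sketch, not the exact util.read_tfrecord implementation.
  dataset = tf.data.TFRecordDataset([record_file]).map(parse_fn)
  if shuffle:
    dataset = dataset.shuffle(buffer_size=1000)
  dataset = dataset.repeat(epoch)   # `epoch` passes over the records; nothing is duplicated
  dataset = dataset.batch(batch_size)
  return dataset.make_one_shot_iterator().get_next()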

FrankWork avatar Sep 07 '18 09:09 FrankWork

Hello, I have two questions for you.

Question 1: your code does not seem to use batch_norm, so can these two lines be removed?

update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(update_ops):  # for batch_norm

Question 2: about parameter regularization, I notice that you only regularize the linear layers. Should the convolution kernels in the CNN also be included?

marvelousgirl avatar Sep 09 '18 11:09 marvelousgirl

@marvelousgirl 1. Yes, they can be removed. 2. You can try regularizing the convolution kernels and see how it performs; whether to add regularization was decided by hyperparameter tuning.
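If you want to try it, a minimal sketch of putting L2 regularization on a conv kernel (the names and sizes are illustrative, not the repo's exact code):

import tensorflow as tf

regularizer = tf.contrib.layers.l2_regularizer(scale=1e-4)  # illustrative scale
kernel = tf.get_variable(
    'conv_kernel', shape=[3, 300, 100],  # [filter_width, word_dim, num_filters], illustrative
    regularizer=regularizer)
# The penalty is collected here and can be added to the task loss:
reg_loss = tf.add_n(tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES))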

FrankWork avatar Sep 09 '18 15:09 FrankWork

@FrankWork Thank you, I have learned a lot from this. Sincerely appreciated!

marvelousgirl avatar Sep 10 '18 00:09 marvelousgirl

Hello, I want to replace the CNN part with an RNN. I see that the build function of ConvLayer in your code calls add_variable. Question 1: do the model variables have to be created in build? Question 2: if they do, how should the code be written for the RNN's parameters? I am not sure about this and could not find any material on it.

marvelousgirl avatar Sep 11 '18 00:09 marvelousgirl

Hello, here is the RnnLayer I wrote myself. I am not sure whether the build function can be written this way.

class RnnLayer(tf.layers.Layer):
  '''inherit tf.layers.Layer to cache trainable variables'''
  def __init__(self, layer_name, hidden_size, **kwargs):
    self.layer_name = layer_name
    self.hidden_size = hidden_size
    super(RnnLayer, self).__init__(**kwargs)

  def build(self, input_shape):
    self.batch_size = input_shape[0]
    with tf.variable_scope(self.layer_name):
      self.rnn_cell = tf.nn.rnn_cell.BasicLSTMCell(
          self.hidden_size, forget_bias=1.0, state_is_tuple=True)
    super(RnnLayer, self).build(input_shape)

  def call(self, x, sequence_length):
    init_state = self.rnn_cell.zero_state(self.batch_size, dtype=tf.float32)
    output, final_state = tf.nn.dynamic_rnn(
        self.rnn_cell, x, sequence_length=sequence_length, initial_state=init_state)
    return output, final_state
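If this is acceptable, roughly how I would call it in place of ConvLayer (the names here are just illustrative, not the repo's variables), using the LSTM's final hidden state as the sentence vector:

# Hypothetical usage: inputs is [batch, max_len, word_dim], lengths is [batch].
rnn = RnnLayer('shared_rnn', hidden_size=100)
outputs, final_state = rnn(inputs, sequence_length=lengths)
sentence_vec = final_state.h  # BasicLSTMCell's state is a (c, h) tuple; h is the last hidden state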

marvelousgirl avatar Sep 11 '18 01:09 marvelousgirl

@marvelousgirl Just test it and check whether the results are correct.

FrankWork avatar Sep 11 '18 01:09 FrankWork

@marvelousgirl Hello, you mentioned earlier that you wanted to switch to an RNN. Have you finished it?

Haiming94 avatar Oct 28 '18 02:10 Haiming94

@Prime-Number Yes, it is done and running, but the code is pretty rough.

marvelousgirl avatar Nov 04 '18 09:11 marvelousgirl

@marvelousgirl Oh, is it the code you posted above? Could you share what you wrote?

Haiming94 avatar Nov 05 '18 06:11 Haiming94

@Prime-Number

class BiRnnLayer(tf.layers.Layer):
  def __init__(self, layer_name, hidden_size, **kwargs):
    self.layer_name = layer_name
    self.hidden_size = hidden_size
    with tf.variable_scope(self.layer_name):
      with tf.variable_scope("fw"):
        self.fw_rnn_cell = tf.nn.rnn_cell.LSTMCell(
            self.hidden_size, use_peepholes=True, state_is_tuple=True, reuse=False)
      with tf.variable_scope("bw"):
        self.bw_rnn_cell = tf.nn.rnn_cell.LSTMCell(
            self.hidden_size, use_peepholes=True, state_is_tuple=True, reuse=False)
    super(BiRnnLayer, self).__init__(**kwargs)

  def call(self, x, sequence_length):
    (outputs_fw, outputs_bw), _ = tf.nn.bidirectional_dynamic_rnn(
        self.fw_rnn_cell, self.bw_rnn_cell, x,
        sequence_length=sequence_length, dtype=tf.float32, time_major=False)
    return outputs_fw, outputs_bw
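The caller can then, for example, concatenate the two directions into one representation per time step (illustrative usage, not part of the class):

birnn = BiRnnLayer('shared_birnn', hidden_size=100)
outputs_fw, outputs_bw = birnn(x, sequence_length=lengths)
outputs = tf.concat([outputs_fw, outputs_bw], axis=-1)  # [batch, time, 2 * hidden_size]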

marvelousgirl avatar Nov 05 '18 13:11 marvelousgirl