
tnews.py doesn't seem to use R-Drop; only the sentiment script uses it, right? I'd like to confirm!

tianke0711 opened this issue 4 years ago · 6 comments

tianke0711 commented Jul 31 '21

Hi @bojone, what does `unlabeled_data = [(t, 0) for t, l in train_data[num_labeled:]]` do? Could you give an example? I can't inspect the data. I also see that the labeled training data is only 0.01 of the total; isn't that very little? Why set it up that way? I don't quite understand, please explain.

Simulating labeled and unlabeled data:

```python
num_labeled = int(len(train_data) * train_frac)
unlabeled_data = [(t, 0) for t, l in train_data[num_labeled:]]
train_data = train_data[:num_labeled]
```
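The split above can be sketched end to end with hypothetical toy data (the texts and the `train_frac = 0.01` value here are illustrative, matching the fraction mentioned in the question):

```python
# Toy stand-in for the script's train_data: (text, label) pairs.
train_data = [(f"sentence {i}", i % 2) for i in range(1000)]
train_frac = 0.01  # only 1% of the data keeps its labels

num_labeled = int(len(train_data) * train_frac)
# Unlabeled pool: the texts, each paired with a dummy label of 0.
unlabeled_data = [(t, 0) for t, l in train_data[num_labeled:]]
train_data = train_data[:num_labeled]

print(len(train_data), len(unlabeled_data))  # 10 990
```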

Also, how should the unlabeled data be constructed for a three-class task? This is binary. Do I still set every label to 0 in the same way?

tianke0711 commented Jul 31 '21

1. Every script uses R-Drop. I wouldn't carelessly publish a few files just to fool everyone, would I?

2. The step `unlabeled_data = [(t, 0) for t, l in train_data[num_labeled:]]` is purely for peace of mind: it sets the label of every unlabeled example to 0. Changing it to `unlabeled_data = train_data[num_labeled:]` is completely equivalent, because the labels are never actually used.
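A minimal check of that equivalence, with hypothetical toy data: downstream code only ever consumes the texts, and those are identical under either form.

```python
# Toy (text, label) pairs; values are illustrative.
train_data = [("a", 1), ("b", 2), ("c", 0)]
num_labeled = 1

v1 = [(t, 0) for t, l in train_data[num_labeled:]]  # labels zeroed out
v2 = train_data[num_labeled:]                       # labels kept (but unused)

# The texts, which are all that matters downstream, are the same.
assert [t for t, _ in v1] == [t for t, _ in v2]
```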

bojone commented Aug 02 '21

@bojone Thanks for the reply. I probably only noticed the `unlabeled_data = [(t, 0) for t, l in train_data[num_labeled:]]` part in sentiment and missed the rest, hence the misunderstanding. Just a suggestion: if possible, the README could describe how R-Drop is applied in each Python file. I'd also like to ask: why does the unlabeled data need to take such a large proportion? Couldn't it be smaller, say 30% labeled? Please advise!

tianke0711 commented Aug 02 '21

1. Thanks for the suggestion, but I think anyone familiar with R-Drop and Keras will find the reference code I provide easy to read;

2. Semi-supervised learning is by definition the "small amount of labeled data + large amount of unlabeled data" setting. If you had 30% labeled data, you probably wouldn't need semi-supervised learning at all.

bojone commented Aug 02 '21

@bojone Thanks for the reply. What does the code below mean? Why is there a `for i in range(2)`?

```python
for i in range(2):
    batch_token_ids.append(token_ids)
    batch_segment_ids.append(segment_ids)
    batch_labels.append(label)
if len(batch_token_ids) == self.batch_size * 2 or is_end:
    batch_token_ids = sequence_padding(batch_token_ids)
    batch_segment_ids = sequence_padding(batch_segment_ids)
    batch_labels = to_categorical(batch_labels, num_classes)
    yield [batch_token_ids, batch_segment_ids], batch_labels
    batch_token_ids, batch_segment_ids, batch_labels = [], [], []
```
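A standalone sketch of the batching behaviour that loop produces (function name and data are hypothetical): each example is appended twice, back to back, so a "batch" of `batch_size` examples actually holds `batch_size * 2` rows. Each copy then goes through its own dropout-perturbed forward pass, which is what R-Drop's KL term compares.

```python
def duplicate_batch(samples, batch_size):
    """Mimic the generator's `for i in range(2)`: each sample appears
    twice in a row, and a batch is emitted once it holds
    batch_size * 2 items."""
    batch = []
    for x in samples:
        for _ in range(2):  # two copies -> two dropout passes per example
            batch.append(x)
        if len(batch) == batch_size * 2:
            yield batch
            batch = []

batches = list(duplicate_batch(["a", "b", "c", "d"], batch_size=2))
print(batches)  # [['a', 'a', 'b', 'b'], ['c', 'c', 'd', 'd']]
```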

tianke0711 commented Aug 05 '21

Hi, I used your sentiment R-Drop code for a three-class similarity prediction task, and the R-Drop version turned out quite a bit worse: accuracy and F1 on the validation data are poor (69%) compared to the version without R-Drop (80%). I'd like to know why; please take a look. The code is as follows:


```python
class data_generator(DataGenerator):
    """Data generator"""

    def __iter__(self, random=False):
        batch_token_ids, batch_segment_ids, batch_labels = [], [], []
        for is_end, (text1, text2, label) in self.sample(random):
            token_ids, segment_ids = tokenizer.encode(
                text1, text2, maxlen=maxlen
            )
            batch_token_ids.append(token_ids)
            batch_segment_ids.append(segment_ids)
            batch_labels.append([label])
            if len(batch_token_ids) == self.batch_size or is_end:
                batch_token_ids = sequence_padding(batch_token_ids)
                batch_segment_ids = sequence_padding(batch_segment_ids)
                batch_labels = sequence_padding(batch_labels)
                yield [batch_token_ids, batch_segment_ids], batch_labels
                batch_token_ids, batch_segment_ids, batch_labels = [], [], []


class data_generator_rdrop(DataGenerator):
    """Data generator (each example duplicated for R-Drop)"""

    def __iter__(self, random=False):
        batch_token_ids, batch_segment_ids, batch_labels = [], [], []
        for is_end, (text1, text2, label) in self.sample(random):
            token_ids, segment_ids = tokenizer.encode(text1, text2, maxlen=maxlen)
            for i in range(2):
                batch_token_ids.append(token_ids)
                batch_segment_ids.append(segment_ids)
                batch_labels.append([label])
            if len(batch_token_ids) == self.batch_size * 2 or is_end:
                batch_token_ids = sequence_padding(batch_token_ids)
                batch_segment_ids = sequence_padding(batch_segment_ids)
                batch_labels = sequence_padding(batch_labels)
                yield [batch_token_ids, batch_segment_ids], batch_labels
                batch_token_ids, batch_segment_ids, batch_labels = [], [], []


def kld_rdrop(y_true, y_pred):
    """The unsupervised part only trains the KL-divergence term"""
    loss = kld(y_pred[::2], y_pred[1::2]) + kld(y_pred[1::2], y_pred[::2])
    return K.mean(loss)


bert = build_transformer_model(
    config_path=config_path,
    checkpoint_path=checkpoint_path,
    with_pool=True,
    model="bert",
    return_keras_model=False,
)

output = Dropout(rate=0.1)(bert.model.output)
output = Dense(
    units=len(labels), activation='softmax', kernel_initializer=bert.initializer
)(output)

model = keras.models.Model(bert.model.input, output)
model.summary()

model.compile(
    loss='sparse_categorical_crossentropy',
    optimizer=Adam(2e-5),
    # optimizer=PiecewiseLinearLearningRate(Adam(5e-5), {10000: 1, 30000: 0.1}),
    metrics=['accuracy'],
)

# Model used for R-Drop training
model_rdrop = keras.models.Model(bert.model.input, output)
model_rdrop.compile(
    loss=kld_rdrop,
    optimizer=Adam(1e-5),
)
```
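For reference, the `kld_rdrop` loss above pairs up the two copies of each example by interleaved slicing: `y_pred[::2]` holds the first forward pass of every example and `y_pred[1::2]` the second. A NumPy sketch of just the slicing and the symmetric KL (the probabilities below are made up for illustration):

```python
import numpy as np

# Predictions for a duplicated batch of 2 examples (4 rows):
# rows 0/1 are the two passes of example A, rows 2/3 of example B.
y_pred = np.array([
    [0.7, 0.2, 0.1],  # A, pass 1
    [0.6, 0.3, 0.1],  # A, pass 2
    [0.1, 0.8, 0.1],  # B, pass 1
    [0.2, 0.7, 0.1],  # B, pass 2
])

p, q = y_pred[::2], y_pred[1::2]  # align pass 1 with pass 2 per example

# Symmetric KL, as in kld_rdrop: KL(p||q) + KL(q||p), then the batch mean.
kl = np.sum(p * np.log(p / q), axis=-1) + np.sum(q * np.log(q / p), axis=-1)
loss = kl.mean()
```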

tianke0711 commented Aug 05 '21