albert_pytorch issues

This assertion could be changed to a check inside the preceding inner loop: if len(masked_token_labels) >= num_to_mask:

2

moonblue333
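The check suggested in the issue title above can be sketched as follows. This is only an illustration of stopping the inner loop once enough tokens have been selected; `pick_mask_positions`, its arguments, and the special-token list are hypothetical and are not taken from the repository's code.

```python
import random


def pick_mask_positions(tokens, num_to_mask, seed=0):
    """Select at most num_to_mask (position, token) pairs, skipping special tokens."""
    rng = random.Random(seed)
    # Hypothetical special-token list; the real code may use a different one.
    candidates = [i for i, tok in enumerate(tokens) if tok not in ("[CLS]", "[SEP]")]
    rng.shuffle(candidates)

    masked_token_labels = []
    for index in candidates:
        # Stop inside the loop instead of relying only on a later assertion.
        if len(masked_token_labels) >= num_to_mask:
            break
        masked_token_labels.append((index, tokens[index]))

    # With the in-loop check, this assertion can never fail.
    assert len(masked_token_labels) <= num_to_mask
    return sorted(masked_token_labels)
```

With the in-loop check, the final assertion becomes a safety net rather than the only guard.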

sentence-order prediction

4

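BERT's NextSentencePr task is too easy. To keep only the coherence task and remove the influence of topic prediction, ALBERT proposes a new task, sentence-order prediction (SOP). Question: which part of your code implements this task?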

qiunlp
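For readers unfamiliar with SOP, the idea is that a positive example is two consecutive segments in their original order, and a negative example is the same two segments swapped. A minimal sketch follows; the function name and the label convention (1 = swapped) are assumptions for illustration and do not describe where or how this repository implements the task.

```python
import random


def make_sop_example(segment_a, segment_b, rng):
    """Build one sentence-order prediction example from two consecutive segments.

    Returns (first, second, label). The label convention here is an assumption:
    0 means original order, 1 means the segments were swapped.
    """
    if rng.random() < 0.5:
        return segment_b, segment_a, 1
    return segment_a, segment_b, 0


rng = random.Random(0)
examples = [make_sop_example("The cat sat.", "Then it slept.", rng) for _ in range(4)]
```

Because both segments always come from the same document, the model cannot solve the task by topic alone.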

AttributeError: 'AlbertForSequenceClassification' object has no attribute 'keys'

Hello, I am trying to predict sentence similarity, I have trained using STS-B data set. ### modeling_utils.py I changed the code from `torch.save(model_to_save.state_dict(), output_model_file)` to `torch.save(model_to_save, output_model_file)` because I'm looking...

chiragsanghvi10

When fine-tuning with albert.base (English), setting --gradient_accumulation_steps greater than 1 goes straight to evaluating without training

1

YuxiangLu
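As background for the issue above, the sketch below shows how gradient accumulation usually determines the number of optimizer updates. It is a generic illustration, not the repository's training loop, and it does not claim to identify the cause of the reported behavior; `count_optimizer_steps` is a hypothetical helper.

```python
def count_optimizer_steps(num_batches, gradient_accumulation_steps):
    """Count optimizer updates when gradients are accumulated over several batches."""
    optimizer_steps = 0
    for step in range(num_batches):
        # A real loop would run forward and backward on every batch here.
        if (step + 1) % gradient_accumulation_steps == 0:
            # A real loop would call optimizer.step() and zero the gradients here.
            optimizer_steps += 1
    return optimizer_steps
```

One consequence worth checking when debugging: if an epoch has fewer batches than `gradient_accumulation_steps`, this style of loop performs no updates at all.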

How do I use a fine-tuned model for prediction?

1

zhu1090093659

tf_path in the English model

1

Hello, I wanted to ask about the purpose of adding `tf_path = tf_path + "/variables/variables"` in the English model. There's no analogous `tf_path` modification in the Chinese version of the...

Porcupine96

fix: remove appending variables/variables to the tfpath

This PR removes appending `variables/variables` to the `tf_path` that was described in #20

Porcupine96

Batch size and learning rate for the xlarge model

2

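Hello, I have recently been fine-tuning xlarge-albert on my own task. Initially I set batch_size to 16 and the learning rate to 2e-5, and the loss oscillated badly during training, with very poor results on the validation set. I then lowered the learning rate to 2e-6, which helped somewhat, but validation accuracy still lagged behind the original BERT. Finally I lowered it further to 2e-7, which improved things again but still fell short of the original BERT. The results are also worse than with albert-base, so I suspect something is wrong with the training. What learning rate and batch_size are appropriate when fine-tuning xlarge-albert? I have heard that batch_size should not be too small or accuracy may suffer; is my batch_size of 16 too small?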

DunZhang

Model collapsed during training

3

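Hello, and thank you for this excellent work. Building on it, I am trying to pre-train our own model.

1. Because our corpus (English) is different, I replaced the vocabulary with our own (in fact, the vocabulary of another BERT).
2. Because of hardware limits (8 P100 GPUs), I set the batch size to 32; almost nothing else changed.
3. The first attempt used only about 1.4M examples, and 10k steps took about 1h54m.
4. Training went smoothly at first, but at around 140k steps the model suddenly collapsed: the loss jumped and the accuracy dropped sharply.

That is the basic situation. My preliminary analysis:

a. The warmup rate is 0.1, so 140k is almost exactly 0.1 of 1.4M. I then checked the learning rate code, and it looks completely correct.
b. Some say the learning rate is too large, but if it were, I would expect anomalies in the second half of warmup.
c. Even if the learning rate were too large, it should only cause oscillation; a sudden collapse like this seems unlikely.

I have checked for a long time without finding the problem, so I am asking for advice. Thank you. ![微信图片_20191016093600](https://user-images.githubusercontent.com/22723154/66881150-724baa00-eff8-11e9-9265-d641e4d8df43.jpg)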

vpegasus
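The arithmetic in the issue above (a warmup rate of 0.1 over roughly 1.4M steps ends at about 140k) can be checked with a small sketch. It assumes a BERT-style schedule with linear warmup followed by linear decay; whether the repository uses exactly this schedule is not confirmed here, and `linear_warmup_lr` and the peak learning rate of 1e-4 are hypothetical.

```python
def linear_warmup_lr(step, total_steps, peak_lr, warmup_proportion=0.1):
    """Learning rate under linear warmup followed by linear decay to zero."""
    warmup_steps = int(total_steps * warmup_proportion)
    if step < warmup_steps:
        # Ramp up linearly from 0 to peak_lr.
        return peak_lr * step / max(1, warmup_steps)
    # Decay linearly from peak_lr back to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


# Hypothetical numbers mirroring the issue: 1.4M total steps, warmup rate 0.1.
total = 1_400_000
for step in (70_000, 140_000, 700_000):
    print(step, linear_warmup_lr(step, total, peak_lr=1e-4))
```

Under these assumptions the learning rate reaches its maximum at step 140,000, which is the point where the reported collapse occurred.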

Add scripts, command options for large, xlarge model and enhance README.md

1. Add convert.sh, train.sh, and test.sh for end-user reference. 2. Add command options for the large and xlarge model configs. 3. Clarify the guide in the README.

delldu
