Weitang Liu comments

Results: 66 comments of Weitang Liu

Modify this code for better accuracy

@1033020837 Thanks a lot, I will try.

No GPU Detected

If `torch.cuda.is_available()` returns `True`, modify the `--n_gpu` argument: `parser.add_argument("--n_gpu", type=str, default='0', help='"0,1,.." or "0" or "" ')`
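A minimal sketch of how a string-valued `--n_gpu` argument like the one above might be interpreted. The helper `parse_gpu_ids` is hypothetical and not part of the repository; it only shows how "0,1", "0", and "" could map to device ids.

```python
import argparse


def parse_gpu_ids(n_gpu: str) -> list:
    """Turn a string such as "0,1" into [0, 1]; an empty string gives []."""
    return [int(part) for part in n_gpu.split(",") if part.strip()]


parser = argparse.ArgumentParser()
parser.add_argument("--n_gpu", type=str, default="0", help='"0,1,.." or "0" or "" ')

# Parse an explicit argument list so the sketch runs without a command line.
args = parser.parse_args(["--n_gpu", "0,1"])
gpu_ids = parse_gpu_ids(args.n_gpu)
print(gpu_ids)
```

An empty list would presumably mean CPU only; with ids present, the training code would still need to confirm `torch.cuda.is_available()` before using them.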

Invalid file: training_args.bin

@chmille3 My Python version is 3.6 and my torch version is 1.0.0, which supports `pathlib.Path`. Maybe you should use `torch.save(args, str(config['checkpoint_dir'] / 'training_args.bin'))`
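A small sketch of the path conversion suggested above, assuming `config['checkpoint_dir']` is a `pathlib.Path`. The directory value is a placeholder, and the `torch.save` call is left as a comment so the snippet runs without torch installed.

```python
from pathlib import Path

# Placeholder config; the real project defines its own checkpoint directory.
config = {"checkpoint_dir": Path("output") / "checkpoints"}

# The / operator joins path components; str() yields a plain string path,
# in case the installed torch.save does not accept Path objects.
args_path = str(config["checkpoint_dir"] / "training_args.bin")
print(args_path)

# torch.save(args, args_path)
```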

Hi @raymondklutse, I tested 200 samples (using only 2000 samples for training). Maybe you can try again. Printed result: [0.01232055 0.00306275 0.00431524 0.00309635 0.00360053 0.00390799] [0.00900013 0.00339839 0.00440324 0.00366084 0.00335302 0.00430566]...

Can you provide links to the ALBERT model and the sentence word model?

@w279805299 https://github.com/lonePatient/albert_pytorch

GPU MEMORY

bert-base + sequence length 256 + batch size 16 ~> 8 GB GPU memory

FileNotFoundError: [Errno 2] No such file or directory: 'pybert/output/checkpoints/bert'

@Sajjadahmed668 Hi, maybe you can refer to https://stackoverflow.com/questions/51870056/colab-no-such-file-or-directory
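One possible cause of this error is that the directory was never created, or that the working directory differs from what the relative path expects. A hedged sketch for checking and creating it follows; the path comes from the error message, the helper `ensure_dir` is hypothetical, and creating an empty directory will not help if the code expects model files to already be there.

```python
from pathlib import Path


def ensure_dir(path) -> Path:
    """Create the directory (and any missing parents) and return its absolute path."""
    directory = Path(path)
    directory.mkdir(parents=True, exist_ok=True)
    return directory.resolve()


# Relative paths resolve against the current working directory.
print("cwd:", Path.cwd())

checkpoint_dir = ensure_dir("pybert/output/checkpoints/bert")
print(checkpoint_dir)
```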

CUDA out of memory

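2 GB is too small. BERT itself is fairly large, and memory use also depends on your seq_length. Switch to another card; I find even 8 GB unfriendly to play with.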

ner_seq.py", line 146, in convert_examples_to_features assert len(label_ids) == max_seq_length AssertionError

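> > > Some characters, like ‘’, for example x  x x 0 0 x, cannot be tokenized. How should this be handled?
> > >
> > > That case is manageable; comments made of things like emoji can simply be deleted. But in my case all English characters come out as [unk]. What should I do?
> >
> > Hello, have you solved this problem yet? It may be because this model is aimed at Chinese, but I am not sure where to fix the problem of training on English characters.

Hello! This is mainly aimed at Chinese. For English you can modify the tokenizer yourself, or refer to the code in another repository, torchblocks. As I recall, it should be supported.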
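To illustrate the alignment problem discussed above, here is a character-level sketch in which characters missing from the vocabulary map to `[UNK]` instead of being dropped, so the token count always matches the label count. The vocabulary and the helper name `tokenize_chars` are made up for illustration; this is not the repository's tokenizer.

```python
def tokenize_chars(text, vocab, unk_token="[UNK]"):
    """Return one token per character so tokens stay aligned with per-character labels."""
    tokens = []
    for ch in text:
        lowered = ch.lower()
        tokens.append(lowered if lowered in vocab else unk_token)
    return tokens


# Toy vocabulary (hypothetical): a few Chinese characters and lowercase letters.
vocab = {"我", "爱", "a", "b", "c"}
text = "我爱Abc😀"
labels = ["O"] * len(text)

tokens = tokenize_chars(text, vocab)
print(tokens)

# One token per character keeps the length check from failing.
assert len(tokens) == len(labels)
```

If English letters all come out as [unk], it may be worth checking whether the vocabulary contains them in the same case the tokenizer produces.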

Weitang Liu

Pre-trained models