Chandler-Bing comments

Results: 12 comments of Chandler-Bing

Is BERT powerful enough to learn sentence embedding and word embedding?

Hi, thank you for this work; I think it's great. However, I have trouble getting sentence embeddings. My texts are always very long (20k chars), and I notice that max_seq_len 512 is...

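F1 result is 0

> > Solved, thanks.
> > Hello, I ran into the same problem. Could you tell me how you solved it?

In my case I used the wrong label. The original author used B-GOODS; I didn't read carefully and used B_GOODS. The difference is an underscore versus a hyphen.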

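F1 result is 0

> Hello, how did you solve this? I ran into this problem: `processed 0 tokens with 0 phrases; found: 0 phrases; correct: 0. ` and there is no data at all in label_test.txt

In my case I used the wrong label. The original author used B-GOODS; I didn't read carefully and used B_GOODS. The difference is an underscore versus a hyphen.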
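A label mismatch like this one (underscore instead of hyphen) can be caught before training by comparing the tags in the data against the expected label set. The following is only a minimal sketch: the function name `find_unknown_labels` is hypothetical, and the label set (including `I-GOODS`) is an assumption for illustration, not taken from the original project.

```python
def find_unknown_labels(tags, expected):
    """Map each tag that is not in `expected` to a likely intended label.

    The only repair attempted is replacing underscores with hyphens;
    tags that still do not match are mapped to None.
    """
    expected = set(expected)
    suggestions = {}
    for tag in set(tags):
        if tag in expected:
            continue
        candidate = tag.replace("_", "-")
        suggestions[tag] = candidate if candidate in expected else None
    return suggestions


# Assumed label set; only B-GOODS is mentioned in the comment above.
expected_labels = ["O", "B-GOODS", "I-GOODS"]
# Hypothetical tags as they might appear in a mislabeled training file.
tags_in_my_data = ["O", "B_GOODS", "I_GOODS", "O"]

print(find_unknown_labels(tags_in_my_data, expected_labels))
```

An empty result means every tag in the data is known to the label set; any entry mapped to `None` needs a manual look.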

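Hello, I have a small question about the project. Suppose I use the 20ng dataset, with 20 classes and 500 documents per class, which is less than 40 MB of raw data, and suppose each class has 5 seed words and each document contains 300 distinct words. Then the training set has the form 20 * 500 * (19*500): there are 20 choices of seed-word category, and each seed-word category corresponds to 500 pos documents and 19*500 neg documents. Multiplying that again by the 300 word IDs in each document makes the training set very large. Turning a 40 MB text dataset into a training set of more than ten GB seems to carry too much redundant information; it feels like a meaningless expansion... (PS: is the way I am processing the data correct? Thanks for your answer (*^_^*))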
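The count described in the comment above can be checked with a few lines of Python. This is only a sketch of the arithmetic as the comment states it (one entry per seed-word category, pos document, and neg document, each multiplied by 300 word IDs); the variable names are hypothetical, and the project's real data format may differ.

```python
# Figures taken from the comment.
num_classes = 20      # seed-word categories, one per class
docs_per_class = 500
words_per_doc = 300   # distinct words per document

pos_docs = docs_per_class                      # documents of the same class
neg_docs = (num_classes - 1) * docs_per_class  # documents of the other 19 classes

# One entry per (category, pos document, neg document) combination.
triples = num_classes * pos_docs * neg_docs

# Assumption: each entry is multiplied by the 300 word IDs of a document,
# following the comment's wording literally.
word_ids = triples * words_per_doc

print(f"triples:  {triples:,}")
print(f"word IDs: {word_ids:,}")
```

Under these assumptions the script reports 95,000,000 combinations and 28,500,000,000 word IDs, which illustrates why enumerating every pos/neg pairing grows so much faster than the raw text.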

[BUG] grad_norm and loss is nan when deepspeed==0.13.5 but ok with deepspeed==0.10.2

@loadams Sure, I tested different versions of deepspeed.
- deepspeed==0.11.2 Y
- deepspeed==0.12.1 Y
- deepspeed==0.12.2 Y
- deepspeed==0.12.3 Y
- deepspeed==0.12.4 **N** (grad_norm is always 1.0, and loss is 0)...

[BUG] grad_norm and loss is nan when deepspeed==0.13.5 but ok with deepspeed==0.10.2

@loadams sorry for the late... I think @renhouxing is right. I tried every commit `pip install git+https://github.com/microsoft/deepspeed.git@ [commit from 0.12.3 to 0.12.4](https://github.com/microsoft/DeepSpeed/compare/v0.12.3...v0.12.4)` ``` {'loss': 2.6719, 'grad_norm': 1.0, 'learning_rate': 1.25e-06, 'timestamp':...

The Data

Thank you for the code. However, I can't understand the data format in 'data preparation'. Could you provide some training samples? Thank you very much!

Chandler-Bing

Is BERT powerful enough to learn sentence embedding and word embedding?

F1 result is 0

F1 result is 0

F1 result is 0

Error when converting to ONNX format

About how to process documents

About how to process documents

[BUG] grad_norm and loss is nan when deepspeed==0.13.5 but ok with deepspeed==0.10.2

[BUG] grad_norm and loss is nan when deepspeed==0.13.5 but ok with deepspeed==0.10.2

The Data