Yiming Cui

http://ymcui.github.io

Joint Laboratory of HIT and iFLYTEK Research (HFL), Beijing, China. NLP researcher, mainly interested in pre-trained language models, machine reading comprehension, question answering, etc.

Results: 165 comments of Yiming Cui

create_pretraining_data.py kept killed..

This looks like an out-of-memory (OOM) problem. Have you tried feeding a small text file (~1M) into the script?
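One way to produce such a small test input is sketched below. It assumes that "~1M" means roughly 1 MB of text and that the corpus is a UTF-8 file with one sentence per line; the function name `write_head_sample` and the file paths are hypothetical and not part of any BERT script.

```python
def write_head_sample(src_path, dst_path, max_bytes=1_000_000):
    """Copy whole lines from src_path to dst_path until about max_bytes are written.

    Returns the number of bytes written. Lines are never split, so blank
    lines used as document separators are preserved as they are.
    """
    written = 0
    # newline="" keeps line endings exactly as they appear in the source file.
    with open(src_path, "r", encoding="utf-8", newline="") as src, \
         open(dst_path, "w", encoding="utf-8", newline="") as dst:
        for line in src:
            size = len(line.encode("utf-8"))
            if written + size > max_bytes:
                break  # stop before exceeding the budget
            dst.write(line)
            written += size
    return written
```

If the script finishes on the small sample, memory is the likely cause, and splitting the full corpus into several shards that are processed separately is a common workaround.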

What tool was used for data annotation?

We used our company's internal annotation platform. We have no other tools to recommend for now, sorry.

new() missing 2 required positional arguments: 'start_index' and 'end_index'

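Fixed. The error occurred because no answer in the nbest list met the conditions. The corresponding code now includes the start_index and end_index fields (default 0). https://github.com/ymcui/cmrc2018/blob/master/baseline/run_cmrc2018_drcd_baseline.py#L900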

new() missing 2 required positional arguments: 'start_index' and 'end_index'

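1. If you only want to run SQuAD 2.0, we recommend using the original BERT code: https://github.com/google-research/bert/blob/master/run_squad.py 2. The Chinese BERT vocabulary contains some common English words, so the code here can handle mixed Chinese-English data.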

new() missing 2 required positional arguments: 'start_index' and 'end_index'

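A small amount of mixed Chinese and English is not a problem, because the Chinese pre-training corpus itself contains some English expressions. If English makes up only a small share of the text you need to process, it does not matter.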

new() missing 2 required positional arguments: 'start_index' and 'end_index'

Which line raises the error?

new() missing 2 required positional arguments: 'start_index' and 'end_index'

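That is odd. Does the definition of _NbestPrediction include these two arguments? The original Google run_squad.py does not have them. Try debugging around these two parameters, or skip this part entirely: if there is no answer in the end, write an empty string and see whether the error still occurs.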

new() missing 2 required positional arguments: 'start_index' and 'end_index'

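You do not need these two index fields. They were kept at the time so that the index information could be written to a file. If you do not use them, you can delete all the later code that involves these two indices.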

new() missing 2 required positional arguments: 'start_index' and 'end_index'

Look closely at the institution listed on the leaderboard; that is not our submission.

Hello, why does the calc_f1_score function return max(f1_scores)? Could you explain this part? If it returns the maximum, wouldn't that just be 1? I don't understand this.

Why would it be 1?
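A small sketch may make the point concrete. It assumes that f1_scores holds one F1 value per reference answer, and that each value is computed from the longest common substring between that reference and the prediction at character level. All names below are illustrative and are not the evaluation script's own code.

```python
def longest_common_substring_len(a, b):
    """Length of the longest contiguous run of characters shared by a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0] * (len(b) + 1)
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def f1_against_one(reference, prediction):
    """Character-level F1 between a single reference answer and the prediction."""
    overlap = longest_common_substring_len(reference, prediction)
    if overlap == 0:
        return 0.0
    precision = overlap / len(prediction)
    recall = overlap / len(reference)
    return 2 * precision * recall / (precision + recall)


def calc_f1_score_sketch(references, prediction):
    """Best F1 over all reference answers for one question."""
    return max(f1_against_one(ref, prediction) for ref in references)
```

With references "北京大学" and "北大" and the prediction "北京", the two per-reference scores are about 0.67 and 0.5, so the maximum is about 0.67 rather than 1. The maximum reaches 1 only when the prediction exactly matches one of the references.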
