ChineseBert

About training

Open · helloword12345678 opened this issue 6 years ago · 1 comment

Thanks for sharing. In the README you say: "Data: 200m chinese internet question answering pairs. Vocab: 52777, jieba CWS enhanced with forward maximum matching." So your training data consists only of <question, answer> pairs. This differs from SQuAD, whose examples are <passage, question, answer> triples: since you have no passage, this is not a span-extraction problem. If that's right, is your training task a two-sentence similarity problem? Can you give some explanation?
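To make the contrast in the question concrete, here is a hypothetical sketch of the two data formats. The field names (`question`, `answer`, `passage`, `answer_start`, etc.) are assumptions for illustration, not this repo's actual schema.

```python
# Sentence-pair format: what pre-training on QA pairs looks like.
# There is no passage, so there is no span to extract.
pair_example = {
    "question": "What is the population of Beijing?",
    "answer": "Beijing has over 21 million residents.",
}

# SQuAD-style format: the answer is a literal span inside a passage,
# located by a character offset.
passage = "Beijing, the capital of China, has a population of over 21 million."
answer_text = "over 21 million"
squad_example = {
    "passage": passage,
    "question": "What is the population of Beijing?",
    "answer_text": answer_text,
    "answer_start": passage.index(answer_text),  # span offset in the passage
}

# The span can always be recovered from the passage by its offset.
start = squad_example["answer_start"]
assert passage[start:start + len(answer_text)] == answer_text
```

A model trained on the first format can only learn a sentence-level relation between the two strings; the second format additionally supervises where in the passage the answer lies.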

helloword12345678 · Dec 15 '18

Exactly! Since the main focus of BERT is pre-training, the model is essentially sentence-level. However, you can build your own model on top of BERT for task-specific applications such as machine comprehension, question answering, and sentence classification.
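The answer above says the pre-trained model is sentence-level and a task head goes on top. A minimal sketch of the pair-similarity reading, with a toy `encode` function standing in for a pooled BERT sentence embedding (an assumption for illustration, not this repo's API):

```python
import math
import random

def encode(sentence: str, dim: int = 16) -> list:
    """Toy sentence encoder: mean of deterministic per-token random vectors.
    A stand-in for pooled BERT hidden states, NOT the repo's actual encoder."""
    token_vectors = []
    for token in sentence.split():
        rng = random.Random(token)  # seed by token -> same vector every time
        token_vectors.append([rng.gauss(0, 1) for _ in range(dim)])
    # mean-pool token vectors into one fixed-size sentence vector
    return [sum(col) / len(token_vectors) for col in zip(*token_vectors)]

def pair_score(question: str, answer: str) -> float:
    """Cosine similarity between pooled question and answer vectors."""
    q, a = encode(question), encode(answer)
    dot = sum(x * y for x, y in zip(q, a))
    norm_q = math.sqrt(sum(x * x for x in q))
    norm_a = math.sqrt(sum(x * x for x in a))
    return dot / (norm_q * norm_a)

score = pair_score("what is bert", "bert is a pretrained language model")
assert -1.0 <= score <= 1.0  # cosine similarity is bounded
```

In a real setup the random token vectors would be replaced by the pre-trained encoder's output, and the similarity score (or a small classification head over the two pooled vectors) would be fine-tuned on the <question, answer> pairs.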

benywon · Dec 16 '18