bert Processing book corpus

Open ngoanpv opened this issue 6 years ago • 1 comment

Hi team et al,

I'd like to know how to process BookCorpus for pre-training. I am confused about how to prepare this data. Should I treat one book as a document containing all of its sentences, or one chapter as a document?

Thanks.

Nov 10 '19 05:11 ngoanpv

Same question!

Dec 23 '21 03:12 sunyilgdx
