Processing book corpus

Open • ngoanpv opened this issue 4 years ago • 1 comment

Hi team,

I'd like to know how to process BookCorpus for pre-training. I am confused about how to prepare this data. Should I treat one book as a document containing all of its sentences, or one chapter as a document?

Thanks.

ngoanpv, Nov 10 '19 05:11

Same question!

sunyilgdx, Dec 23 '21 03:12