style-transfer-paraphrase

COHA datasets - request example of end-to-end training

Open GenTxt opened this issue 4 years ago • 0 comments

I have read the information under 'Custom Datasets', but I'm unclear how it applies to COHA files, which are single-line texts in the following format:

@@10133

" But , oh , I do n't like those people . They do n't like us . They 're dead , they do n't care , they do n't even feel foolish , " Albany said . I felt mad enough @ @ @ @ @ @ @ @ @ @ hotly as she met his eyes . " etc.

I would like to train another COHA model. Is it possible to provide an end-to-end example that explains how to do this?

Should I group the COHA text files and merge each group into train.txt, dev.txt, and test.txt?

I assume the train.label, dev.label, and test.label files in the example COHA folders can be reused with the labels changed, e.g. 1890s-1900s --> 1940s-1950s.
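A minimal sketch of the preparation step being asked about, under assumptions that are not confirmed by the repository: that a custom dataset is a folder containing train.txt, dev.txt, and test.txt with one sentence per line, plus train.label, dev.label, and test.label with one label per line in the same order. The function names, the split fractions, and the choice to drop any sentence touching a masked `@ @ @` run are illustrative only.

```python
import random
import re
from pathlib import Path

HEADER = re.compile(r"^\s*@@\d+\s*")       # leading "@@10133" text ID
SENT_BOUNDARY = re.compile(r"(?<=[.!?])\s+")  # naive sentence splitter


def coha_sentences(raw, min_tokens=3):
    """Split one raw COHA text into sentences, skipping masked ones."""
    text = HEADER.sub("", raw).strip()
    sentences = []
    for piece in SENT_BOUNDARY.split(text):
        piece = piece.strip()
        # COHA masks token runs with "@ @ @ ..."; such sentences are
        # incomplete, so this sketch simply drops them.
        if "@" in piece or len(piece.split()) < min_tokens:
            continue
        sentences.append(piece)
    return sentences


def split_train_dev_test(items, dev_frac=0.05, test_frac=0.05, seed=0):
    """Shuffle deterministically and cut into (train, dev, test) lists."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_dev = int(len(items) * dev_frac)
    n_test = int(len(items) * test_frac)
    dev = items[:n_dev]
    test = items[n_dev:n_dev + n_test]
    train = items[n_dev + n_test:]
    return train, dev, test


def build_dataset(files_by_label, out_dir):
    """files_by_label maps a label such as "1940s-1950s" (hypothetical)
    to a list of COHA text file paths. Writes <split>.txt and
    <split>.label files into out_dir (assumed layout)."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    splits = {"train": [], "dev": [], "test": []}
    for label, paths in files_by_label.items():
        sentences = []
        for path in paths:
            raw = Path(path).read_text(encoding="utf-8", errors="replace")
            sentences.extend(coha_sentences(raw))
        train, dev, test = split_train_dev_test(sentences)
        for name, part in (("train", train), ("dev", dev), ("test", test)):
            splits[name].extend((sent, label) for sent in part)
    for name, pairs in splits.items():
        text_lines = [sent for sent, _ in pairs]
        label_lines = [label for _, label in pairs]
        (out_dir / f"{name}.txt").write_text(
            "\n".join(text_lines) + "\n", encoding="utf-8")
        (out_dir / f"{name}.label").write_text(
            "\n".join(label_lines) + "\n", encoding="utf-8")
```

Something like `build_dataset({"1940s-1950s": list(Path("coha/1940s").glob("*.txt"))}, "datasets/coha_custom")` would then produce the six files; both paths here are placeholders. Whether the training scripts need further preprocessing of this folder is exactly what the question above is asking.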

Thanks

GenTxt, Dec 14 '20 01:12