
Text Chunking Option

Open jarmoza opened this issue 8 years ago • 0 comments

By default, TWiC now chunks texts longer than 5000 words, attempting to break them at logical, syntactic endpoints where possible. The chunk size should be configurable via the configuration file and a command-line argument. The reasoning is partly to follow suggested methodology (see Jockers' Macroanalysis) and partly that large texts noticeably slow down TWiC when the Text View panel is opened. The trade-off is that, in the current layouts of the CorpusCluster and TextCluster panels, a larger number of texts crowds the space.
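A rough sketch of what this could look like, not TWiC's actual implementation: the function names, the `--chunk-size` flag, and the use of sentence-ending punctuation as the "logical endpoint" are all assumptions made for illustration. Sentences are packed into chunks of at most the configured number of words, and a single sentence longer than the limit falls back to a hard split.

```python
import argparse
import re

DEFAULT_CHUNK_WORDS = 5000  # default threshold mentioned in this issue


def chunk_text(text, max_words=DEFAULT_CHUNK_WORDS):
    """Split text into chunks of at most max_words words.

    Assumption: a "logical endpoint" is approximated by sentence-ending
    punctuation (., !, ?) followed by whitespace. Whitespace inside each
    chunk is normalized to single spaces.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current, count = [], [], 0
    for sentence in sentences:
        words = sentence.split()
        if not words:
            continue
        # Close the current chunk if this sentence would push it over the limit.
        if current and count + len(words) > max_words:
            chunks.append(" ".join(current))
            current, count = [], 0
        # A single sentence longer than the limit is split at the word level.
        while len(words) > max_words:
            chunks.append(" ".join(words[:max_words]))
            words = words[max_words:]
        current.extend(words)
        count += len(words)
    if current:
        chunks.append(" ".join(current))
    return chunks


def build_parser():
    """Hypothetical command-line interface; the flag name is illustrative."""
    parser = argparse.ArgumentParser(description="Text chunking sketch")
    parser.add_argument("--chunk-size", type=int, default=None,
                        help="maximum words per chunk")
    return parser


def resolve_chunk_size(cli_value=None, config_value=None):
    """Prefer the command-line value, then the config file value, then the default."""
    if cli_value is not None:
        return cli_value
    if config_value is not None:
        return config_value
    return DEFAULT_CHUNK_WORDS
```

Letting the command-line value override the configuration file value is one possible precedence; the issue itself does not specify which should win.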

jarmoza, Oct 31 '15 14:10