langid.py

If wordn is set in tokenize.py, is max_order in DFfeatureselect.py counted in words or bytes?

Open RyanPeking opened this issue 4 years ago • 0 comments

If wordn is set to use word 3-grams in tokenize.py, is the unit of max_order in DFfeatureselect.py a word or a byte? I ask because in some languages a single character takes up several bytes, so the two interpretations give very different features.
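To illustrate the distinction I am asking about, here is a minimal sketch in plain Python. It is not langid.py code: `byte_ngrams` and `word_ngrams` are hypothetical helpers I wrote only for this example, and the sketch assumes UTF-8 encoding and whitespace-separated words.

```python
def byte_ngrams(text, n):
    """Return all n-grams over the UTF-8 bytes of text (hypothetical helper)."""
    data = text.encode("utf-8")
    return [data[i:i + n] for i in range(len(data) - n + 1)]


def word_ngrams(text, n):
    """Return all n-grams over whitespace-separated words (hypothetical helper)."""
    words = text.split()
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


sample = "我 爱 北京"  # 6 characters, but each CJK character is 3 bytes in UTF-8

print(len(sample))                   # number of characters
print(len(sample.encode("utf-8")))   # number of bytes
print(byte_ngrams(sample, 3)[0])     # first byte 3-gram: the bytes of one character
print(word_ngrams(sample, 3))        # the single word 3-gram
```

With an order of 3, the byte interpretation covers only a single CJK character per n-gram, while the word interpretation covers three whole words, which is why the unit of max_order matters here.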

RyanPeking, May 14 '20 06:05