
Bayesian classifier can't distinguish "SQL" vs "PLpgSQL"

Open bzz opened this issue 7 years ago • 0 comments

Part of the #155.

After the update to the latest samples in #189, the Bayesian classifier test fails to distinguish "SQL" from "PLpgSQL" based only on content. The classifier weights differ between enry and linguist for the same document: https://github.com/src-d/enry/pull/189#issuecomment-457559748

This most probably has to do with the difference in tokenization between the two projects, which is going to be addressed in #193.
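
To illustrate why tokenization alone can shift the weights, here is a toy sketch in Python. It is not enry's or linguist's actual classifier or tokenizer: the training snippets, both tokenizer functions, and the Laplace-smoothed naive Bayes scoring are assumptions made up for this example. The same document is scored against the same samples, and only the tokenizer changes.

```python
import math
import re
from collections import Counter

# Hypothetical training snippets, NOT taken from the linguist samples.
SAMPLES = {
    "SQL": ["SELECT id FROM users;", "CREATE TABLE users (id INT);"],
    "PLpgSQL": [
        "BEGIN RETURN NEW; END;",
        "CREATE FUNCTION f() RETURNS INT AS $$ BEGIN RETURN 1; END; $$",
    ],
}


def tokenize_whitespace(text):
    """Hypothetical tokenizer A: split on whitespace only."""
    return text.split()


def tokenize_words(text):
    """Hypothetical tokenizer B: words and single punctuation characters."""
    return re.findall(r"\w+|[^\w\s]", text)


def train(tokenize):
    """Count token occurrences per language using the given tokenizer."""
    counts = {lang: Counter() for lang in SAMPLES}
    for lang, docs in SAMPLES.items():
        for doc in docs:
            counts[lang].update(tokenize(doc))
    return counts


def log_scores(document, tokenize):
    """Naive Bayes log-scores with add-one smoothing and a uniform prior."""
    counts = train(tokenize)
    vocab = set().union(*counts.values())
    scores = {}
    for lang, token_counts in counts.items():
        total = sum(token_counts.values())
        score = math.log(1 / len(counts))
        for tok in tokenize(document):
            score += math.log((token_counts[tok] + 1) / (total + len(vocab)))
        scores[lang] = score
    return scores


document = "CREATE TABLE t (id INT); BEGIN;"
by_whitespace = log_scores(document, tokenize_whitespace)
by_words = log_scores(document, tokenize_words)
print("whitespace tokenizer:", by_whitespace)
print("word/punct tokenizer:", by_words)
```

The two runs see different token sets for the same input, so the per-language scores differ even though the samples and the scoring formula are identical.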

bzz, Jan 28 '19 10:01