enry
Bayesian classifier can't distinguish "SQL" vs "PLpgSQL"
Part of #155.
After updating to the latest samples in #189, the Bayesian classifier test fails to distinguish "SQL" from "PLpgSQL" based on content alone. The classifier weights differ between enry and linguist for the same document: https://github.com/src-d/enry/pull/189#issuecomment-457559748
This most probably has to do with the difference in tokenization between the two projects, which is going to be addressed in #193.