
Predictions differ from those of the official C++ fastText

Open ericxsun opened this issue 5 years ago • 2 comments

I tested this repo and the official implementation on the same set of samples, using the same model (trained with the official code, in .ftz format).

C++

fasttext predict-prob test-example.txt

Java (this repo), API call

Here "equal" means the predicted label is the same; the probability is ignored.

all samples: 21513
equal: 19236
not-equal: 2219
null (in this repo): 58
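For reference, the comparison described above can be sketched as follows. This is only an illustration, not the script actually used for the numbers in this issue: the class name `LabelComparison` and its helpers are hypothetical, and it assumes each prediction line has the form `label probability`, with a `null` or empty entry standing for a missing prediction.

```java
import java.util.Arrays;
import java.util.List;

public class LabelComparison {

    /** Keeps only the label (first whitespace-separated token) of a prediction line. */
    static String firstToken(String line) {
        return line.trim().split("\\s+")[0];
    }

    /**
     * Compares two aligned lists of prediction lines by label only.
     * Returns {equal, notEqual, missing}, where "missing" counts null or
     * blank entries in the candidate list.
     */
    static int[] compare(List<String> reference, List<String> candidate) {
        if (reference.size() != candidate.size()) {
            throw new IllegalArgumentException("prediction lists must be aligned");
        }
        int equal = 0, notEqual = 0, missing = 0;
        for (int i = 0; i < reference.size(); i++) {
            String cand = candidate.get(i);
            if (cand == null || cand.trim().isEmpty()) {
                missing++;
                continue;
            }
            // The probability is discarded; only the labels are compared.
            if (firstToken(reference.get(i)).equals(firstToken(cand))) {
                equal++;
            } else {
                notEqual++;
            }
        }
        return new int[] {equal, notEqual, missing};
    }

    public static void main(String[] args) {
        // Made-up sample lines, not taken from the real test set.
        List<String> cpp = Arrays.asList("__label__a 0.91", "__label__b 0.55", "__label__a 0.70");
        List<String> java = Arrays.asList("__label__a 0.88", "__label__c 0.51", null);
        int[] c = compare(cpp, java);
        System.out.println("equal: " + c[0] + " not-equal: " + c[1] + " null: " + c[2]);
    }
}
```

With the three made-up lines in `main`, this counts one equal, one not-equal and one null prediction.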

Java (this repo), command line

java -jar jfasttext-0.4-jar-with-dependencies.jar predict-prob test-example.txt

all samples: 21513
equal: 18825
not-equal: 2688

So, what's wrong?

Another thing: the predictions from the Java command line are unstable and change on every run.

ericxsun avatar Mar 17 '19 09:03 ericxsun

Found this one: https://github.com/linkfluence/fastText4j. Its predictions are almost the same.

ericxsun avatar Mar 18 '19 04:03 ericxsun

Based on what @carschno mentioned in https://github.com/vinhkhuc/JFastText/issues/49, I used this to get the right results:

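// StringUtils and CollectionUtils are assumed to come from Apache Commons
// (commons-lang3 and commons-collections4); `model` is a loaded JFastText instance.
public Map<String, Double> predictTopLabel(String text, int k) {
    Map<String, Double> scoreMap = new LinkedHashMap<>();
    // Normalize the input so that it ends with exactly one newline.
    text = StringUtils.trimToEmpty(text) + "\n";
    final List<JFastText.ProbLabel> pl = model.predictProba(text, k);
    for (JFastText.ProbLabel i : CollectionUtils.emptyIfNull(pl)) {
        // ProbLabel carries a log-probability; convert it back to a probability.
        final double prob = Math.exp(i.logProb);
        // Round to 8 decimal places. The divisor must be a double:
        // Math.round returns a long, and long / int would truncate to 0 or 1.
        final double score = Math.round(prob * 100000000) / 100000000.0;
        scoreMap.put(i.label, score);
    }
    return scoreMap;
}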
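
One detail in the rounding step is easy to get wrong: `Math.round(double)` returns a `long`, so dividing its result by an integer literal is integer division. The standalone sketch below (the class and method names are hypothetical and not part of JFastText) shows the difference between the two divisors.

```java
public class ProbRounding {

    /** Converts a log-probability to a probability rounded to 8 decimal places. */
    static double toRoundedProb(double logProb) {
        double prob = Math.exp(logProb);
        // Floating-point divisor keeps the fractional part.
        return Math.round(prob * 1e8) / 1e8;
    }

    /** Same computation with an integer divisor, shown only to illustrate the pitfall. */
    static double withIntegerDivision(double logProb) {
        double prob = Math.exp(logProb);
        // long / int is integer division, so any probability below 1 becomes 0.
        return Math.round(prob * 100000000) / 100000000;
    }

    public static void main(String[] args) {
        double logProb = Math.log(0.25);
        System.out.println(toRoundedProb(logProb));       // prints 0.25
        System.out.println(withIntegerDivision(logProb)); // prints 0.0
    }
}
```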

kun368 avatar Feb 25 '22 09:02 kun368