tofu The implementation of Truth Ratio and Probability is different from the definition in the paper

Open wzunknown opened this issue 10 months ago • 13 comments

Truth Ratio

The paper defines both the truth ratio and its normalization. The code implementation is here: https://github.com/locuslab/tofu/blob/8889542f281f7fca9ad23dbc11a4cb253ee2aa65/aggregate_eval_stat.py#L60-L72

I have two questions about the code:

L69: the normalization for the "forget" branch takes the minimum of a normalized probability and its reciprocal, which doesn't make sense to me and differs from the paper.
L64: the mean is taken over the log probs, but the average in the paper is taken over the probs.
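
To make the L64 point concrete, here is a minimal sketch (not the repository's code) of why the two averaging orders disagree. It assumes each input is a per-token average negative log-likelihood, so that `exp(-nll)` is a length-normalized probability, and it assumes the ratio puts the perturbed answers in the numerator. The function names and the numbers are hypothetical.

```python
import math
from statistics import fmean


def truth_ratio_mean_of_probs(perturbed_nll, paraphrased_nll):
    """Average the normalized probabilities, then divide (arithmetic mean)."""
    perturbed_probs = [math.exp(-nll) for nll in perturbed_nll]
    return fmean(perturbed_probs) / math.exp(-paraphrased_nll)


def truth_ratio_mean_of_log_probs(perturbed_nll, paraphrased_nll):
    """Average the log probs first, then exponentiate (geometric mean)."""
    mean_nll = fmean(perturbed_nll)
    return math.exp(-mean_nll) / math.exp(-paraphrased_nll)


# Hypothetical per-token average NLLs for two perturbed answers
# and one paraphrased answer.
perturbed_nll = [1.0, 3.0]
paraphrased_nll = 2.0

arithmetic = truth_ratio_mean_of_probs(perturbed_nll, paraphrased_nll)
geometric = truth_ratio_mean_of_log_probs(perturbed_nll, paraphrased_nll)
print(arithmetic, geometric)
```

Averaging in log space yields a geometric mean, which is never larger than the arithmetic mean, so the two versions only agree when all perturbed answers have the same normalized probability.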

Probability

The paper defines the probability score for Real Authors and World Facts as a ratio of the original probabilities, but the code (L50-L53) computes it as a ratio of normalized probabilities. https://github.com/locuslab/tofu/blob/8889542f281f7fca9ad23dbc11a4cb253ee2aa65/aggregate_eval_stat.py#L45-L54
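
A small sketch of how the two choices can diverge, under stated assumptions: the score is taken to be the correct answer's probability divided by the sum over all candidate answers, and "normalized" is taken to mean raising the sequence probability to the power 1/length. Both readings, the helper names, and the token log probs are hypothetical and are not taken from the repository.

```python
import math


def sequence_prob(token_logprobs):
    """Original (unnormalized) probability of the whole answer."""
    return math.exp(sum(token_logprobs))


def normalized_prob(token_logprobs):
    """Length-normalized probability: P(answer) ** (1 / num_tokens)."""
    return math.exp(sum(token_logprobs) / len(token_logprobs))


def share_of_correct(score_fn, correct, wrong_answers):
    """Correct answer's score divided by the total score of all candidates."""
    correct_score = score_fn(correct)
    total = correct_score + sum(score_fn(w) for w in wrong_answers)
    return correct_score / total


# Hypothetical token log probs: a two-token correct answer
# and a single one-token wrong answer.
correct = [-0.5, -0.5]
wrong_answers = [[-1.0]]

raw_share = share_of_correct(sequence_prob, correct, wrong_answers)
normalized_share = share_of_correct(normalized_prob, correct, wrong_answers)
print(raw_share, normalized_share)
```

With answers of different lengths the two scores differ: here both answers have the same sequence probability, yet length normalization favors the longer correct answer.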

Any help is appreciated!

Apr 03 '24 05:04 wzunknown
