
[Bug]: Swapped `alpha` and `beta` in `tversky` loss?

Open Saransh-cpp opened this issue 3 years ago • 0 comments

The Tversky loss has two parameters, $\alpha$ and $\beta$, and Flux internally computes $\alpha$ as $1 - \beta$. The loss is defined as 1 minus the Tversky index -

$$1 - \frac{\mathrm{TP}}{\mathrm{TP} + \alpha \cdot \mathrm{FP} + \beta \cdot \mathrm{FN}}$$

The original paper defines the Tversky index as

$$S(P, G; \alpha, \beta) = \frac{|P \cap G|}{|P \cap G| + \alpha |P \setminus G| + \beta |G \setminus P|}$$

where $\alpha$ and $\beta$ control the magnitude of penalties for FPs and FNs, respectively.
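For reference, a direct translation of that definition into Julia (a sketch written from the formula above, not Flux's code; `tversky_index` is an illustrative name) looks like this:

```julia
# Tversky index for binary targets, written directly from the definition.
# α weights false positives, β weights false negatives (α = 1 - β, as in Flux).
function tversky_index(ŷ, y; β = 0.7)
    α  = 1 - β
    tp = sum(y .* ŷ)          # |P ∩ G|: true positives
    fp = sum((1 .- y) .* ŷ)   # |P \ G|: false positives
    fn = sum(y .* (1 .- ŷ))   # |G \ P|: false negatives
    return tp / (tp + α * fp + β * fn)
end
```

With the example vectors used later in this issue, `ŷ_fnp` has TP = 3, FP = 1, FN = 1, giving an index of $3 / (3 + 0.3 + 0.7) = 0.75$.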

Flux implements it as -

1 - (sum(y .* ŷ) + 1) / (sum(y .* ŷ + β*(1 .- y) .* ŷ + (1 - β)*y .* (1 .- ŷ)) + 1)

Code -

```julia
num = sum(y .* ŷ) + 1
den = sum(y .* ŷ + β * (1 .- y) .* ŷ + (1 - β) * y .* (1 .- ŷ)) + 1
1 - num / den
```

Notice how the term `(1 .- y) .* ŷ` (the false positives, I hope I am not wrong) is multiplied by $\beta$, whereas it should be multiplied by $\alpha$ (which is $1 - \beta$). Similarly, the term `y .* (1 .- ŷ)` (the false negatives) is multiplied by $\alpha$ (that is, $1 - \beta$), whereas it should be multiplied by $\beta$.
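A minimal sketch of the fix is to swap the two coefficients in Flux's expression (the function name `tversky_loss_fixed` is illustrative, not Flux's internals):

```julia
# Tversky loss with the coefficients swapped so that β weights the
# false-negative term y .* (1 .- ŷ), matching the documentation.
function tversky_loss_fixed(ŷ, y; β = 0.7)
    num = sum(y .* ŷ) + 1
    den = sum(y .* ŷ .+ (1 - β) .* (1 .- y) .* ŷ .+ β .* y .* (1 .- ŷ)) + 1
    return 1 - num / den
end
```

This is the version whose outputs are shown in the second REPL session below.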

This makes the loss function behave in a manner opposite to its documentation. For example -


```julia
julia> y = [0, 1, 0, 1, 1, 1];

julia> ŷ_fp = [1, 1, 1, 1, 1, 1];  # 2 false positives -> 2 wrong predictions

julia> ŷ_fnp = [1, 1, 0, 1, 1, 0];  # 1 false negative, 1 false positive -> 2 wrong predictions

julia> Flux.tversky_loss(ŷ_fnp, y)
0.19999999999999996

julia> Flux.tversky_loss(ŷ_fp, y)  # should be smaller than tversky_loss(ŷ_fnp, y), as FN is given more weight
0.21875
```

Here the loss for `(ŷ_fnp, y)` should have been larger than the loss for `(ŷ_fp, y)`: with the default $\beta = 0.7$, the loss should give more weight to (i.e. penalize more heavily) the false negatives, but the exact opposite happens.
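The numbers above can be reproduced by hand from the current (swapped) expression; the sketch below just re-implements what the code snippet quoted earlier computes, as a check:

```julia
# Reproduce the current behaviour: β ends up on the false-positive term.
function tversky_loss_current(ŷ, y; β = 0.7)
    num = sum(y .* ŷ) + 1
    den = sum(y .* ŷ .+ β .* (1 .- y) .* ŷ .+ (1 - β) .* y .* (1 .- ŷ)) + 1
    return 1 - num / den
end

y     = [0, 1, 0, 1, 1, 1]
ŷ_fp  = [1, 1, 1, 1, 1, 1]   # TP = 4, FP = 2, FN = 0
ŷ_fnp = [1, 1, 0, 1, 1, 0]   # TP = 3, FP = 1, FN = 1

# ŷ_fp : 1 - 5 / (4 + 0.7*2 + 0.3*0 + 1) = 1 - 5/6.4 = 0.21875
# ŷ_fnp: 1 - 4 / (3 + 0.7*1 + 0.3*1 + 1) = 1 - 4/5   = 0.2
```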

After changing the implementation of the loss (swapping the two coefficients) -

```julia
julia> y = [0, 1, 0, 1, 1, 1];

julia> ŷ_fp = [1, 1, 1, 1, 1, 1];  # 2 false positives -> 2 wrong predictions

julia> ŷ_fnp = [1, 1, 0, 1, 1, 0];  # 1 false negative, 1 false positive -> 2 wrong predictions

julia> Flux.tversky_loss(ŷ_fnp, y)
0.19999999999999996

julia> Flux.tversky_loss(ŷ_fp, y)  # now smaller than tversky_loss(ŷ_fnp, y), as FN is given more weight
0.1071428571428571
```

which looks right.

Is this a bug, or am I missing something? Would be happy to create a PR if it is a bug!

Saransh-cpp · Jun 08 '22 16:06