[Bug]: Swapped `alpha` and `beta` in `tversky` loss?
The Tversky loss has two parameters, $\alpha$ and $\beta$, and Flux internally computes $\alpha$ as $1 - \beta$. The loss is defined as 1 minus the Tversky index:
$$1 - \frac{\text{TP}}{\text{TP} + \alpha \cdot \text{FP} + \beta \cdot \text{FN}}$$
The paper defines the Tversky index as $S(P, G; \alpha, \beta) = \frac{|P \cap G|}{|P \cap G| + \alpha|P \setminus G| + \beta|G \setminus P|}$, where $\alpha$ and $\beta$ control the magnitude of penalties for FPs and FNs, respectively.
Flux implements it as

```julia
1 - (sum(y .* ŷ) + 1) / (sum(y .* ŷ + β * (1 .- y) .* ŷ + (1 - β) * y .* (1 .- ŷ)) + 1)
```
The relevant code:

```julia
num = sum(y .* ŷ) + 1
den = sum(y .* ŷ + β * (1 .- y) .* ŷ + (1 - β) * y .* (1 .- ŷ)) + 1
1 - num / den
```
Notice how the term `(1 .- y) .* ŷ` (the False Positives, if I am not mistaken) is multiplied by $\beta$, whereas it should be multiplied by $\alpha$ (which is $1 - \beta$). Similarly, the term `y .* (1 .- ŷ)` (the False Negatives) is multiplied by $\alpha$ (that is, $1 - \beta$), whereas it should be multiplied by $\beta$.
This makes the loss function behave opposite to its documentation. For example:
```julia
julia> y = [0, 1, 0, 1, 1, 1];

julia> ŷ_fp = [1, 1, 1, 1, 1, 1]; # 2 false positives -> 2 wrong predictions

julia> ŷ_fnp = [1, 1, 0, 1, 1, 0]; # 1 false negative, 1 false positive -> 2 wrong predictions

julia> Flux.tversky_loss(ŷ_fnp, y)
0.19999999999999996

julia> Flux.tversky_loss(ŷ_fp, y) # should be smaller than tversky_loss(ŷ_fnp, y), as FN is given more weight
0.21875
```
Here the loss for `(ŷ_fnp, y)` should have been larger than the loss for `(ŷ_fp, y)`: the default $\beta$ is 0.7, so the loss should penalize False Negatives more heavily, but the exact opposite happens.
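The arithmetic is easy to cross-check outside Julia. Below is a small Python re-implementation (not the Flux code itself; the function name is mine) of the formula exactly as Flux currently computes it, i.e. with FP weighted by $\beta$ and FN by $1 - \beta$, reproducing the two numbers above:

```python
def tversky_as_implemented(y_hat, y, beta=0.7):
    # Mirrors Flux's current code: FP is weighted by beta, FN by (1 - beta).
    tp = sum(t * p for t, p in zip(y, y_hat))        # y .* ŷ
    fp = sum((1 - t) * p for t, p in zip(y, y_hat))  # (1 .- y) .* ŷ
    fn = sum(t * (1 - p) for t, p in zip(y, y_hat))  # y .* (1 .- ŷ)
    return 1 - (tp + 1) / (tp + beta * fp + (1 - beta) * fn + 1)

y = [0, 1, 0, 1, 1, 1]
print(tversky_as_implemented([1, 1, 0, 1, 1, 0], y))  # 1 FN, 1 FP -> ~0.2
print(tversky_as_implemented([1, 1, 1, 1, 1, 1], y))  # 2 FP       -> ~0.21875
```

With these (swapped) weights, the all-FP prediction is penalized more than the prediction containing a False Negative.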
After swapping the weights in the implementation, the same example gives:
```julia
julia> y = [0, 1, 0, 1, 1, 1];

julia> ŷ_fp = [1, 1, 1, 1, 1, 1]; # 2 false positives -> 2 wrong predictions

julia> ŷ_fnp = [1, 1, 0, 1, 1, 0]; # 1 false negative, 1 false positive -> 2 wrong predictions

julia> Flux.tversky_loss(ŷ_fnp, y)
0.19999999999999996

julia> Flux.tversky_loss(ŷ_fp, y) # now smaller than tversky_loss(ŷ_fnp, y), as FN is given more weight
0.1071428571428571
```
which looks right.
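The same Python cross-check (again, my own sketch, not Flux code) with the weights in the paper's order, FP weighted by $\alpha = 1 - \beta$ and FN by $\beta$, reproduces these corrected values:

```python
def tversky_swapped(y_hat, y, beta=0.7):
    # Proposed fix: FP weighted by (1 - beta), FN by beta, matching the paper.
    tp = sum(t * p for t, p in zip(y, y_hat))
    fp = sum((1 - t) * p for t, p in zip(y, y_hat))
    fn = sum(t * (1 - p) for t, p in zip(y, y_hat))
    return 1 - (tp + 1) / (tp + (1 - beta) * fp + beta * fn + 1)

y = [0, 1, 0, 1, 1, 1]
print(tversky_swapped([1, 1, 0, 1, 1, 0], y))  # 1 FN, 1 FP -> ~0.2
print(tversky_swapped([1, 1, 1, 1, 1, 1], y))  # 2 FP       -> ~0.1071, now the smaller loss
```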
Is this a bug, or am I missing something? I'd be happy to create a PR if it is a bug!