Add sample uncertainty to score decompose

Open lorentzenchr opened this issue 2 years ago • 6 comments

lorentzenchr avatar Apr 14 '23 12:04 lorentzenchr

@lorentzenchr do you have a reference for implementing this? This feature sounds very useful, and I would be happy to contribute.

m-maggi avatar Dec 04 '23 17:12 m-maggi

I'm thinking of a new function compute_score, analogous to compute_bias. It's just simple t-tests; see the code of compute_bias.
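
For readers unfamiliar with the test being referred to, here is a minimal sketch of a two-sided one-sample t-test of a zero mean. It is not the library's code: the function name `t_test_zero_mean` and the example data are assumptions made for illustration, and weights are ignored.

```python
import numpy as np
from scipy import special


def t_test_zero_mean(values):
    """Two-sided one-sample t-test of H0: mean(values) == 0 (hypothetical helper)."""
    values = np.asarray(values, dtype=float)
    n = values.size
    mean = values.mean()
    # Standard error of the mean, based on the unbiased sample variance.
    stderr = values.std(ddof=1) / np.sqrt(n)
    t_stat = mean / stderr
    # stdtr is the CDF of Student's t; twice the lower tail is the two-sided p-value.
    p_value = 2.0 * special.stdtr(n - 1, -np.abs(t_stat))
    return mean, stderr, p_value


# Hypothetical data: per-observation bias of a mean forecast, y_pred - y_obs.
rng = np.random.default_rng(0)
y_obs = rng.normal(loc=1.0, scale=1.0, size=200)
y_pred = np.full_like(y_obs, 1.0)
mean, stderr, p_value = t_test_zero_mean(y_pred - y_obs)
print(mean, stderr, p_value)
```

A small p-value would suggest that the mean of the values differs from zero; for bias, zero is the natural reference value.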

lorentzenchr avatar Dec 04 '23 18:12 lorentzenchr

@lorentzenchr thanks for clarifying. To put it in pseudo code, compute_score should take as argument at least score_per_obs and at some point call

score_per_obs_de_meaned = score_per_obs - np.mean(score_per_obs)
# stderr: placeholder for a standard-error helper, not defined here
scipy.special.stdtr(len(score_per_obs) - 1,
                    -np.abs(score_per_obs_de_meaned / stderr(score_per_obs)))

I ignored the weights for the time being. What do you think?

m-maggi avatar Dec 08 '23 20:12 m-maggi

I would use the model predictions instead of the score per obs, pretty much a blend of decompose and compute_bias:

def compute_score(
    y_obs,
    y_pred,
    feature,
    weights,
    scoring_function,
    functional,
    level,
    n_bins,
):
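
To make the idea of such a blend concrete, here is a rough sketch that bins a numeric feature and reports the mean score per bin with a Student's t confidence interval. Everything in it is an assumption for illustration: the name `score_per_bin`, the quantile binning, the NumPy-only return type, and the omission of `weights` and `functional`. It is not the proposed implementation.

```python
import numpy as np
from scipy import stats


def score_per_bin(y_obs, y_pred, feature, scoring_function, level=0.95, n_bins=10):
    """Mean score and t-based confidence interval per quantile bin (hypothetical sketch)."""
    y_obs, y_pred, feature = map(np.asarray, (y_obs, y_pred, feature))
    scores = scoring_function(y_obs, y_pred)  # one score per observation
    # Quantile bin edges; np.unique drops duplicate edges caused by ties.
    edges = np.unique(np.quantile(feature, np.linspace(0, 1, n_bins + 1)))
    # Use interior edges only, so bin ids run from 0 to len(edges) - 2.
    bin_ids = np.digitize(feature, edges[1:-1])
    rows = []
    for b in np.unique(bin_ids):
        s = scores[bin_ids == b]
        n = s.size
        mean = s.mean()
        if n > 1:
            stderr = s.std(ddof=1) / np.sqrt(n)
            half_width = stats.t.ppf(0.5 + level / 2, df=n - 1) * stderr
        else:
            half_width = np.nan  # no interval from a single observation
        rows.append((int(b), n, mean, mean - half_width, mean + half_width))
    return rows


# Hypothetical data: squared error of a forecast, binned by one numeric feature.
rng = np.random.default_rng(0)
feature = rng.uniform(size=1000)
y_obs = feature + rng.normal(size=1000)
y_pred = feature
for row in score_per_bin(y_obs, y_pred, feature, lambda y, z: (y - z) ** 2, n_bins=5):
    print(row)
```

Each row holds the bin id, the number of observations, the mean score, and the lower and upper interval bounds.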

lorentzenchr avatar Dec 09 '23 10:12 lorentzenchr

> I'm thinking of a new function compute_score, analogous to compute_bias. It's just simple t-tests; see the code of compute_bias.

The t-test in compute_bias tests whether the bias per observation has zero mean, right? What would the null hypothesis be in the compute_score case? Alternatively, to give the user a sense of the uncertainty, one could return a confidence interval for the statistical risk, which would use, among other things, the empirical risk and the Student's t quantile at the desired confidence level.
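
A minimal sketch of such an interval, assuming unweighted, independent per-observation scores. The function name `mean_score_confidence_interval` and the example data are hypothetical.

```python
import numpy as np
from scipy import stats


def mean_score_confidence_interval(score_per_obs, level=0.95):
    """Student's t confidence interval for the expected score (weights ignored)."""
    scores = np.asarray(score_per_obs, dtype=float)
    n = scores.size
    mean = scores.mean()  # empirical risk
    stderr = scores.std(ddof=1) / np.sqrt(n)  # standard error of the mean
    # Two-sided quantile of Student's t with n - 1 degrees of freedom.
    t_quantile = stats.t.ppf(0.5 + level / 2.0, df=n - 1)
    return mean, mean - t_quantile * stderr, mean + t_quantile * stderr


# Hypothetical example: squared error of a constant forecast.
rng = np.random.default_rng(42)
y_obs = rng.normal(size=500)
y_pred = np.zeros_like(y_obs)
mean, lower, upper = mean_score_confidence_interval((y_obs - y_pred) ** 2)
print(f"mean score {mean:.3f}, 95% CI [{lower:.3f}, {upper:.3f}]")
```

Unlike the bias test, no null hypothesis is needed here: the interval only describes how precisely the sample pins down the expected score.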

m-maggi avatar Dec 25 '23 16:12 m-maggi

I guess uncertainty / confidence intervals would be enough. As you say, for bias there is a universal reference, i.e. zero; for scores, all pairwise comparisons are options, and that's way too many.

lorentzenchr avatar Feb 11 '24 09:02 lorentzenchr