themis-ml Implement "Reweighting" fairness-aware preprocessing


Open cosmicBboy opened this issue 8 years ago • 0 comments

Reweighting takes a dataset D and assigns a weight to each observation using conditional probabilities based on target labels and protected class membership.

Notation:
- s1: disadvantaged group
- s0: advantaged group
- +: positive label
- -: negative label

Large weights are assigned to X_s1_y+ and X_s0_y-:
- weight for s1 and +: (p(s1) * p(+)) / p(s1 and +)
- weight for s0 and -: (p(s0) * p(-)) / p(s0 and -)

Small weights are assigned to X_s1_y- and X_s0_y+:
- weight for s1 and -: (p(s1) * p(-)) / p(s1 and -)
- weight for s0 and +: (p(s0) * p(+)) / p(s0 and +)

The weights are then used as input to model types that support weighted observations.
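A minimal sketch of how the four weights could be computed per observation, assuming NumPy and binary arrays `s` (1 = disadvantaged group, 0 = advantaged group) and `y` (1 = positive label, 0 = negative label). The function name `reweighting_weights` and the toy data are hypothetical and are not part of the themis-ml API.

```python
import numpy as np


def reweighting_weights(s, y):
    """Return one weight per observation: p(s) * p(y) / p(s and y)."""
    s = np.asarray(s)
    y = np.asarray(y)
    weights = np.empty(len(y), dtype=float)
    for s_val in np.unique(s):
        for y_val in np.unique(y):
            cell = (s == s_val) & (y == y_val)
            if not cell.any():
                continue  # no observations fall in this (group, label) cell
            # Probability expected under independence vs. observed probability.
            expected = np.mean(s == s_val) * np.mean(y == y_val)
            observed = np.mean(cell)
            weights[cell] = expected / observed
    return weights


# Toy data (hypothetical): the disadvantaged group rarely gets the positive label.
s = np.array([1, 1, 1, 1, 0, 0, 0, 0])
y = np.array([1, 0, 0, 0, 1, 1, 1, 0])
w = reweighting_weights(s, y)
```

The resulting array can be passed to any learner that accepts per-observation weights; for example, many scikit-learn estimators take a `sample_weight` argument in `fit`.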

NOTE: The above weighting scheme works because e.g. the numerator p(s1) * p(+) denotes the expected probability of an observation being disadvantaged and positively labelled if the two variables are independent, and the denominator p(s1 and +) denotes the actual probability. Therefore, in a discriminatory dataset the term (p(s1) * p(+)) / p(s1 and +) will evaluate to > 1 since the actual probability of being s1 and + is less than the expected probability under the independence assumption.

Conversely, (p(s1) * p(-)) / p(s1 and -) will evaluate to < 1 since the actual probability of being s1 and - is greater than the expected probability under the independence assumption.
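To make the argument concrete, here is a small worked check on made-up counts (100 observations, invented for illustration only). It uses exact fractions to show which weights land above and below 1, and that the weighted positive rate becomes the same in both groups.

```python
from fractions import Fraction

# Hypothetical counts for a discriminatory dataset of 100 observations.
counts = {
    ("s1", "+"): 10, ("s1", "-"): 40,
    ("s0", "+"): 30, ("s0", "-"): 20,
}
n = sum(counts.values())


def marginal(position, value):
    """Marginal probability of a group (position 0) or a label (position 1)."""
    total = sum(c for key, c in counts.items() if key[position] == value)
    return Fraction(total, n)


# weight = expected probability under independence / actual probability
weights = {
    (s, y): marginal(0, s) * marginal(1, y) / Fraction(c, n)
    for (s, y), c in counts.items()
}


def weighted_positive_rate(s):
    """Share of weighted observations in group s that carry the positive label."""
    pos = counts[(s, "+")] * weights[(s, "+")]
    neg = counts[(s, "-")] * weights[(s, "-")]
    return pos / (pos + neg)


for key, weight in weights.items():
    print(key, weight)
print(weighted_positive_rate("s1"), weighted_positive_rate("s0"))
```

In this example the weights for (s1, +) and (s0, -) exceed 1, the weights for (s1, -) and (s0, +) fall below 1, and both groups end up with the same weighted positive rate.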

Aug 30 '17 04:08 cosmicBboy
