Definitions of Fairness
Models are trained on real-world examples so that they can reproduce past outcomes on unseen data. If the training data is biased, the model will perpetuate those biases in the decisions it makes. While there is no universally agreed-upon definition of fairness, we define a ‘fair’ ML model as one that does not favor any group of people based on their characteristics.
Ensuring fairness is key before deploying a model in production. For example, in the US, the government prohibits discrimination in credit and real-estate transactions through fair lending laws such as the Equal Credit Opportunity Act (ECOA) and the Fair Housing Act (FHAct). The Equal Employment Opportunity Commission (EEOC) acknowledges 12 factors of discrimination:[1] age, disability, equal pay/compensation, genetic information, harassment, national origin, pregnancy, race/color, religion, retaliation, sex, sexual harassment. These are what we call protected attributes.
Fairness Metrics
Among others, some popular metrics are:
- Disparate Impact
- Group Benefit
- Equal Opportunity
- Demographic Parity
Disparate Impact
Disparate impact is a form of indirect and unintentional discrimination, in which certain decisions disproportionately affect members of a protected group. Mathematically, disparate impact compares the pass rate of one group to that of another. The pass rate is the rate of positive outcomes for a given group. It is defined as follows:
pass rate = (number of people in the group who passed) / (number of people in the group)
Disparate impact is calculated by:
DI = (pass rate of group 1) / (pass rate of group 2)
Groups 1 and 2 are interchangeable. Therefore, the following formula can be used to calculate disparate impact:
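DI = min{pr_1, pr_2} / max{pr_1, pr_2},
where pr_1 and pr_2 are the pass rates of groups 1 and 2.
With this formula, the disparate impact value is always between 0 and 1. The Four-Fifths rule states that the disparate impact should be at least 80% (0.8); a lower value is generally regarded as evidence of adverse impact.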
For example:
pass-rate_1 = 0.3, pass-rate_2 = 0.4, DI = 0.3/0.4 = 0.75
pass-rate_1 = 0.4, pass-rate_2 = 0.3, DI = 0.3/0.4 = 0.75
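The calculation above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function names `pass_rate` and `disparate_impact`, the sample decision lists, and the choice to treat a value of exactly 0.8 as passing the Four-Fifths rule are all assumptions made for this example.

```python
def pass_rate(outcomes):
    """Fraction of positive (1) outcomes in a group's list of 0/1 decisions."""
    return sum(outcomes) / len(outcomes)


def disparate_impact(pr_1, pr_2):
    """Symmetric disparate impact: smaller pass rate over larger pass rate."""
    return min(pr_1, pr_2) / max(pr_1, pr_2)


# Hypothetical decisions (1 = passed, 0 = did not pass) for two groups.
group_1 = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]  # 3 of 10 passed
group_2 = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]  # 4 of 10 passed

di = disparate_impact(pass_rate(group_1), pass_rate(group_2))

# Assumption: a value of exactly 0.8 is treated as satisfying the rule.
passes_four_fifths = di >= 0.8

print(round(di, 2), passes_four_fifths)
```

Because the function takes the minimum over the maximum, swapping the two groups gives the same result. Note that the sketch does not guard against the case where both pass rates are zero.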
📘 Info Disparate impact is the only one of these metrics that is codified in US law. The other metrics are not yet codified.
Demographic Parity
Demographic parity states that each segment of a protected class should receive the positive outcome at equal rates. Mathematically, demographic parity compares the pass rates of two groups. The pass rate is the rate of positive outcomes for a given group. It is defined as follows:
pass rate = (number of people in the group who passed) / (number of people in the group)
If the decisions are fair, the pass rates should be the same.
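As a rough sketch of how this comparison might be done, the snippet below computes a pass rate per group from a list of decisions and reports the gap between the highest and lowest rate. The helper `pass_rates_by_group`, the group labels "A" and "B", and the sample records are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def pass_rates_by_group(records):
    """Map each group label to its pass rate, given (group, decision) pairs."""
    passed = defaultdict(int)
    total = defaultdict(int)
    for group, decision in records:
        total[group] += 1
        passed[group] += decision  # decision is 1 for a positive outcome, else 0
    return {group: passed[group] / total[group] for group in total}


# Hypothetical model decisions: (group label, 1 = positive outcome).
records = [
    ("A", 1), ("A", 0), ("A", 1), ("A", 0),
    ("B", 1), ("B", 0), ("B", 0), ("B", 0),
]

rates = pass_rates_by_group(records)

# Under perfect demographic parity the gap between pass rates would be 0.
parity_gap = max(rates.values()) - min(rates.values())

print(rates, parity_gap)
```

Reporting the gap as a difference is one possible choice; the ratio of the two pass rates is another way to express the same comparison.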
Group Benefit
Group benefit aims to measure the rate at which a particular event is predicted to occur within a subgroup compared to the rate at which it actually occurs. Mathematically, group benefit for a given group is defined as follows:
Group Benefit = (TP + FP) / (TP + FN)
Group benefit equality compares the group benefit between two groups. If the two groups are treated equally, the group benefit should be the same.
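The following sketch shows one way to compute group benefit from confusion-matrix counts and compare it across groups. The function name `group_benefit`, the group labels, and the counts are made up for illustration.

```python
def group_benefit(tp, fp, fn):
    """Predicted positives (TP + FP) divided by actual positives (TP + FN)."""
    return (tp + fp) / (tp + fn)


# Hypothetical confusion-matrix counts for two groups.
counts = {
    "A": {"tp": 40, "fp": 10, "fn": 10},  # 50 predicted vs. 50 actual positives
    "B": {"tp": 30, "fp": 5, "fn": 20},   # 35 predicted vs. 50 actual positives
}

benefit = {group: group_benefit(**c) for group, c in counts.items()}

# If the two groups were treated equally, this gap would be 0.
benefit_gap = abs(benefit["A"] - benefit["B"])

print(benefit, benefit_gap)
```

A value of 1 means positives are predicted for the group as often as they actually occur; values below 1 mean the model predicts the positive outcome less often than it occurs.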
Equal Opportunity
Equal opportunity means that all people will be treated equally or similarly and not disadvantaged by prejudices or bias. Mathematically, equal opportunity compares the true positive rate (TPR) between two groups. TPR is the probability that an actual positive will test positive. The true positive rate is defined as follows:
TPR = TP / (TP + FN)
If the two groups are treated equally, the TPR should be the same.
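A minimal sketch of this comparison, assuming binary labels and predictions encoded as 0 and 1, is shown below. The function `true_positive_rate` and the sample data for groups "a" and "b" are hypothetical.

```python
def true_positive_rate(y_true, y_pred):
    """TP / (TP + FN), computed over the actual positives only."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn)


# Hypothetical ground-truth labels and model predictions for two groups.
y_true_a = [1, 1, 1, 1, 0, 0]
y_pred_a = [1, 1, 1, 0, 0, 1]  # 3 true positives, 1 false negative

y_true_b = [1, 1, 1, 1, 0, 0]
y_pred_b = [1, 1, 0, 0, 0, 0]  # 2 true positives, 2 false negatives

tpr_a = true_positive_rate(y_true_a, y_pred_a)
tpr_b = true_positive_rate(y_true_b, y_pred_b)

# Under equal opportunity the two TPR values would match (gap of 0).
tpr_gap = abs(tpr_a - tpr_b)

print(tpr_a, tpr_b, tpr_gap)
```

False positives (such as the last prediction for group "a") do not affect TPR, which is why equal opportunity only looks at how actual positives are treated.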