Abstract
A number of findings in the field of machine learning have given rise
to questions about what it means for automated scoring or decision-making
systems to be fair. One center of gravity in this discussion
is whether such systems ought to satisfy classification parity (which
requires parity in accuracy across groups, defined by protected attributes)
or calibration (which requires similar predictions to have similar
meanings across groups, defined by protected attributes). Central
to this discussion are impossibility results, due to Kleinberg et al.
(2016), Chouldechova (2017), and Corbett-Davies et al. (2017), which
show that classification parity and calibration are often incompatible.
This paper argues that classification parity, calibration, and a
newer measure called counterfactual fairness are each unsatisfactory
measures of fairness, offers a general diagnosis of their shared
failure, and sketches an alternative approach to understanding
fairness in machine learning.