- Has the system been tested for disparate impact across demographic groups?
- Which fairness metrics have been applied (demographic parity, equalized odds, calibration; see Chapter 15)?
- What trade-offs between fairness definitions have been accepted, and why?
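
To make the first two questions concrete, here is a minimal sketch of how a demographic parity gap, a disparate impact ratio, and an equalized odds gap might be computed for a binary classifier. The function names (`group_rates`, `gap`), the toy labels, and the group names `a` and `b` are all invented for illustration; calibration is not covered, and a real audit would likely rely on an established fairness library and on the definitions in Chapter 15.

```python
from collections import defaultdict


def rate(values):
    """Mean of a list of 0/1 values, or None if the list is empty."""
    return sum(values) / len(values) if values else None


def group_rates(y_true, y_pred, groups):
    """Per-group selection rate, true positive rate, and false positive rate."""
    buckets = defaultdict(lambda: {"all": [], "pos": [], "neg": []})
    for t, p, g in zip(y_true, y_pred, groups):
        buckets[g]["all"].append(p)
        # Split predictions by the true label to get TPR and FPR later.
        buckets[g]["pos" if t == 1 else "neg"].append(p)
    return {
        g: {
            "selection_rate": rate(b["all"]),
            "tpr": rate(b["pos"]),
            "fpr": rate(b["neg"]),
        }
        for g, b in buckets.items()
    }


def gap(rates, key):
    """Largest between-group difference for one rate, skipping undefined values."""
    vals = [r[key] for r in rates.values() if r[key] is not None]
    return max(vals) - min(vals)


# Toy data, hypothetical and for illustration only.
y_true = [1, 0, 1, 0, 1, 0, 1, 0]
y_pred = [1, 0, 1, 1, 1, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

rates = group_rates(y_true, y_pred, groups)

# Demographic parity: compare selection rates across groups.
dp_gap = gap(rates, "selection_rate")

# Disparate impact ratio: lowest selection rate over highest (guarding against zero).
sel = [r["selection_rate"] for r in rates.values()]
di_ratio = min(sel) / max(sel) if max(sel) > 0 else None

# Equalized odds: compare both TPR and FPR across groups; report the worse gap.
eo_gap = max(gap(rates, "tpr"), gap(rates, "fpr"))

print(f"demographic parity gap: {dp_gap:.2f}")
print(f"disparate impact ratio: {di_ratio:.2f}")
print(f"equalized odds gap:     {eo_gap:.2f}")
```

Reporting these numbers side by side is one way to document the third question: when the gaps cannot all be driven to zero at once, the record shows which definition was prioritized and by how much the others were relaxed.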