Chapter 39: Key Takeaways

Bias and Fairness

  1. Bias enters ML systems at every stage. Data collection (selection, historical, measurement bias), algorithm design (representation, optimization bias), and deployment (feedback loops, automation bias) can each introduce or amplify discrimination.

  2. Multiple fairness definitions exist, and they conflict. Demographic parity (equal positive rates), equalized odds (equal TPR and FPR), equal opportunity (equal TPR), and calibration (equal meaning of scores) cannot all be satisfied simultaneously when base rates differ. Choosing a fairness criterion requires context-specific value judgments.
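
To make the divergence concrete, here is a minimal sketch (a toy example, with invented data) that computes the per-group quantities behind each criterion. The toy predictions satisfy demographic parity (equal positive rates) while violating equalized odds (unequal TPRs):

```python
import numpy as np

def fairness_metrics(y_true, y_pred, group):
    """Per-group positive rate, TPR, and FPR for a binary classifier."""
    out = {}
    for g in np.unique(group):
        m = group == g
        yt, yp = y_true[m], y_pred[m]
        out[g] = {
            "positive_rate": yp.mean(),   # compare across groups: demographic parity
            "tpr": yp[yt == 1].mean(),    # compare across groups: equal opportunity
            "fpr": yp[yt == 0].mean(),    # TPR and FPR together: equalized odds
        }
    return out

y_true = np.array([1, 1, 0, 0, 1, 1, 0, 0])
y_pred = np.array([1, 0, 1, 0, 1, 1, 0, 0])
group  = np.array([0, 0, 0, 0, 1, 1, 1, 1])
m = fairness_metrics(y_true, y_pred, group)
# both groups have positive_rate 0.5 (demographic parity holds),
# but group 0 has TPR 0.5 vs group 1's TPR 1.0 (equal opportunity fails)
```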

  3. Bias mitigation operates at three levels: pre-processing (resampling, reweighting), in-processing (adversarial debiasing, constrained optimization), and post-processing (threshold adjustment, calibration). Each involves different trade-offs among effectiveness, complexity, and impact on accuracy.
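
As one post-processing illustration, group-specific thresholds can be chosen so that each group's true positive rate hits a common target. The sketch below is illustrative: the function name, the quantile heuristic, and the synthetic data are assumptions, not a library API.

```python
import numpy as np

def equal_opportunity_thresholds(scores, y_true, group, target_tpr=0.8):
    """Choose a per-group score threshold so each group's TPR is ~target_tpr."""
    thresholds = {}
    for g in np.unique(group):
        pos = scores[(group == g) & (y_true == 1)]
        # the (1 - target_tpr) quantile of positive scores passes ~target_tpr
        # of that group's positives
        thresholds[g] = np.quantile(pos, 1 - target_tpr)
    return thresholds

rng = np.random.default_rng(0)
n = 4000
group = rng.integers(0, 2, n)
y_true = rng.integers(0, 2, n)
# group 1's scores are shifted upward, so a single global threshold
# would yield unequal TPRs across groups
scores = rng.normal(0, 1, n) + y_true + 0.5 * group

thr = equal_opportunity_thresholds(scores, y_true, group)
thr_arr = np.array([thr[g] for g in group])
y_pred = (scores >= thr_arr).astype(int)  # per-group thresholded decisions
```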

  4. Proxy discrimination bypasses explicit protections. Even when protected attributes are excluded from the model, correlated features (zip code, name, employment history) can serve as indirect proxies for discrimination.
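
A tiny simulation makes proxy leakage concrete. Here `zip_proxy` is a hypothetical "neutral" numeric feature constructed to track group membership (the construction and parameters are invented for illustration): dropping the protected attribute does not remove the signal, because the proxy still carries most of it.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
# protected attribute, excluded from the model's feature set
protected = rng.integers(0, 2, n).astype(float)
# hypothetical "neutral" feature (e.g. a zip-code encoding) that happens
# to correlate strongly with group membership
zip_proxy = protected + rng.normal(0, 0.5, n)

# the excluded attribute remains recoverable from the proxy
r = np.corrcoef(protected, zip_proxy)[0, 1]
```

Auditing for this kind of leakage (e.g. measuring how well the protected attribute can be predicted from the remaining features) is a standard step in a bias audit.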

Regulation and Governance

  1. The EU AI Act establishes a risk-based regulatory framework. AI systems are classified as unacceptable (banned), high-risk (strict requirements), limited risk (transparency obligations), or minimal risk. Employment and credit scoring are high-risk categories requiring documentation, risk management, and human oversight.

  2. Documentation is a professional obligation. Model Cards document model capabilities, limitations, and fairness assessments. Datasheets for Datasets document data provenance, composition, and intended use. Both are increasingly expected by regulators and industry.
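
A Model Card can start as a simple structured document. The skeleton below is a sketch whose field names loosely follow the Model Cards literature; the exact schema is an assumption and should be adapted to your organization's template.

```python
# Minimal Model Card skeleton (field names are illustrative; values are
# placeholders to be filled in for a real model)
model_card = {
    "model_details": {"name": "...", "version": "...", "owners": "..."},
    "intended_use": {"primary_uses": "...", "out_of_scope_uses": "..."},
    "training_data": {"source": "...", "preprocessing": "..."},
    "evaluation": {"datasets": "...", "metrics": "..."},
    "fairness_assessment": {"groups_evaluated": "...", "disparities_found": "..."},
    "limitations_and_risks": "...",
}
```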

Privacy

  1. ML models can leak private training data. Membership inference attacks, model inversion, and training data extraction demonstrate that trained models can reveal sensitive information about individuals in their training data.
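
The simplest membership inference attack exploits overfitting: a model's loss is typically lower on examples it was trained on, so thresholding the loss distinguishes members from non-members. The sketch below uses invented loss distributions purely for illustration.

```python
import numpy as np

def loss_threshold_attack(losses, threshold):
    """Guess 'member' whenever the model's loss on an example is below threshold."""
    return losses < threshold

rng = np.random.default_rng(0)
# simulate an overfit model: much lower loss on training members
member_losses = rng.exponential(0.1, 1000)      # invented distribution
nonmember_losses = rng.exponential(1.0, 1000)   # invented distribution

guesses = loss_threshold_attack(
    np.concatenate([member_losses, nonmember_losses]), threshold=0.3)
truth = np.concatenate([np.ones(1000), np.zeros(1000)]).astype(bool)
attack_accuracy = (guesses == truth).mean()  # well above the 0.5 chance level
```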

  2. Differential privacy provides mathematical privacy guarantees. DP-SGD (per-sample gradient clipping + Gaussian noise) ensures that the trained model's behavior does not depend significantly on any single training example. The privacy-accuracy trade-off is real but can be mitigated with larger datasets and pre-training.
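
The per-batch computation at the heart of DP-SGD can be sketched in a few lines of NumPy. This is a simplified illustration of one common noise-scaling convention, not a full implementation; production libraries also handle efficient per-sample gradient computation and privacy accounting.

```python
import numpy as np

def dp_sgd_step(per_sample_grads, clip_norm=1.0, noise_multiplier=1.1, rng=None):
    """One DP-SGD gradient estimate: clip each per-sample gradient to clip_norm,
    average, then add Gaussian noise scaled to the clipping bound."""
    rng = rng if rng is not None else np.random.default_rng()
    # clip each sample's gradient so its L2 norm is at most clip_norm
    clipped = [g * min(1.0, clip_norm / max(np.linalg.norm(g), 1e-12))
               for g in per_sample_grads]
    mean = np.mean(clipped, axis=0)
    # noise standard deviation for the averaged gradient
    sigma = noise_multiplier * clip_norm / len(per_sample_grads)
    return mean + rng.normal(0.0, sigma, size=mean.shape)
```

Clipping bounds any single example's influence on the update; the Gaussian noise then masks whatever influence remains, which is what yields the differential privacy guarantee.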

Generative AI and Deepfakes

  1. Deepfakes create new categories of harm. Political disinformation, non-consensual imagery, fraud, and evidence fabrication are enabled by realistic AI-generated media. Detection, provenance tracking (C2PA), and watermarking are active defense strategies.

  2. Responsible generative AI requires proactive measures. Content moderation, usage policies, red teaming, staged deployment, and watermarking are necessary practices for deploying generative AI systems.

Environmental and Social Impact

  1. Training large models has significant environmental costs. Energy consumption scales with model size and training duration. Mitigation strategies include efficient architectures (MoE, distillation), carbon-aware computing, and model reuse through fine-tuning.
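
Order-of-magnitude estimates of training emissions follow directly from hardware power, run time, data-center overhead (PUE), and grid carbon intensity. The sketch below uses illustrative default values; real PUE and grid intensity vary widely by data center and region.

```python
def training_emissions_kg(num_gpus, gpu_power_kw, hours, pue=1.2,
                          grid_kg_co2_per_kwh=0.4):
    """Back-of-envelope training carbon estimate.
    All default values here are illustrative assumptions."""
    energy_kwh = num_gpus * gpu_power_kw * hours * pue
    return energy_kwh * grid_kg_co2_per_kwh

# e.g. 8 GPUs drawing 0.4 kW each for 100 hours
footprint = training_emissions_kg(8, 0.4, 100)
```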

  2. The open vs. closed model debate involves genuine trade-offs. Open models democratize access and enable safety research through transparency, but powerful open models cannot be recalled after release. Intermediate approaches (staged release, responsible use licenses) are emerging.

AI Safety

  1. AI safety is a set of concrete engineering practices. Define intended use, conduct pre-deployment evaluations, implement guardrails, design for human oversight, plan for incidents, and monitor in production. Safety is not separate from engineering---it is part of it.
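
In the same concrete spirit, a guardrail can start as a simple input filter run before a prompt reaches the model. This is a toy sketch: the patterns are placeholders, and production guardrails layer classifiers, policies, and human review on top of anything this simple.

```python
import re

# illustrative block-list patterns only; a real deployment would use a
# maintained policy and classifier-based checks
BLOCKED_PATTERNS = [r"\bssn\b", r"\bcredit card number\b"]

def input_guardrail(text):
    """Return (allowed, reason) for a user prompt before it reaches the model."""
    for pat in BLOCKED_PATTERNS:
        if re.search(pat, text, flags=re.IGNORECASE):
            return False, f"blocked: matched {pat!r}"
    return True, "ok"
```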

  2. Scalable oversight is an open challenge. As AI systems become more capable, developing methods to evaluate and correct them (debate, recursive reward modeling, iterated amplification) becomes increasingly important.

  3. Every AI engineer has an ethical responsibility. Building systems that are fair, transparent, private, safe, and accountable is not optional. It is as fundamental to professional competence as technical skill. The tools in this chapter---fairness metrics, bias auditing, differential privacy, model cards---are essential components of responsible AI engineering.