Chapter 37 Further Reading: Emerging AI Technologies
Agentic AI and Multi-Agent Systems
1. Yao, S., Zhao, J., Yu, D., et al. (2023). "ReAct: Synergizing Reasoning and Acting in Language Models." ICLR 2023. The foundational paper on the ReAct framework, which enables language models to interleave reasoning (thinking about what to do) with acting (using tools and observing results). ReAct is the conceptual basis for most agentic AI systems. The paper demonstrates that combining chain-of-thought reasoning with tool use significantly outperforms either approach alone. Essential reading for understanding the technical architecture behind the agents described in this chapter.
2. Wang, L., Ma, C., Feng, X., et al. (2024). "A Survey on Large Language Model Based Autonomous Agents." Frontiers of Computer Science, 18(6). The most comprehensive survey of the LLM-based agent landscape, covering agent architectures, planning mechanisms, memory systems, tool use, and multi-agent collaboration. The paper taxonomizes over 100 agent systems and identifies common design patterns. Particularly valuable for its systematic comparison of agent frameworks and its discussion of open challenges including reliability, safety, and evaluation.
3. Wu, Q., Bansal, G., Zhang, J., et al. (2023). "AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation." arXiv preprint arXiv:2308.08155. The technical paper behind Microsoft's AutoGen framework for multi-agent systems. Introduces the concept of "conversable agents" that interact through structured conversations to solve complex tasks. Includes case studies in code generation, mathematics, and collaborative decision-making that illustrate both the potential and the limitations of multi-agent collaboration.
4. Chan, C. M., Chen, W., Su, Y., et al. (2024). "ChatEval: Towards Better LLM-based Evaluators through Multi-Agent Debate." ICLR 2024. Demonstrates that multi-agent debate --- where multiple AI agents evaluate each other's outputs --- can produce more reliable evaluations than single-agent assessment. The implications for quality assurance in agentic workflows are significant: rather than relying on a single agent's judgment, organizations can use adversarial multi-agent processes to improve output quality.
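The reason-act-observe loop that entry 1 describes can be sketched in a few lines. Everything below (the `llm` callable, the tool registry, the transcript format) is an illustrative assumption, not code from the ReAct paper itself:

```python
def react_loop(question, llm, tools, max_steps=5):
    """Interleave reasoning ('Thought') with tool use ('Action'/'Observation').

    `llm` is any callable mapping the transcript so far to a
    (thought, action, argument) triple; `tools` maps action names to functions.
    """
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        thought, action, arg = llm(transcript)           # model picks the next step
        transcript += f"Thought: {thought}\nAction: {action}[{arg}]\n"
        if action == "Finish":
            return arg                                   # final answer
        observation = tools[action](arg)                 # execute the chosen tool
        transcript += f"Observation: {observation}\n"    # feed the result back in
    return None                                          # step budget exhausted


# A stubbed run: a fake "model" that searches once, then answers.
def fake_llm(transcript):
    if "Observation" not in transcript:
        return ("I should look this up", "Search", "capital of France")
    return ("The observation answers it", "Finish", "Paris")

answer = react_loop("What is the capital of France?", fake_llm,
                    {"Search": lambda q: "Paris is the capital of France."})
```

The key design point, which the ReAct paper demonstrates empirically, is that the observation is appended to the transcript before the next reasoning step, so each thought can condition on real tool output rather than on the model's guess about it.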
Edge AI and Small Language Models
5. Banbury, C., Reddi, V. J., Lam, M., et al. (2020). "Benchmarking TinyML Systems: Challenges and Direction." arXiv preprint arXiv:2003.04821. The definitive overview of the TinyML field --- machine learning on microcontrollers with kilobytes of memory and milliwatts of power. Covers the hardware landscape, model optimization techniques, and benchmarking challenges. Useful for understanding the extreme end of edge AI and the techniques (quantization, pruning, architecture search) that enable AI on the smallest devices.
6. Gunter, T., Wang, Z., Wang, C., et al. (2024). "Apple Intelligence Foundation Language Models." Apple Machine Learning Research. Apple's technical report on building small language models optimized for on-device deployment on iPhones and Macs. Demonstrates how a combination of careful architecture design, training data curation, and quantization can produce models that run efficiently on consumer hardware while maintaining useful capability. A practical example of the small model philosophy described in this chapter.
7. Abdin, M., Jacobs, S. A., Awan, A. A., et al. (2024). "Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone." arXiv preprint arXiv:2404.14219. The technical report for Microsoft's Phi-3 family of small language models, which achieved benchmark scores rivaling much larger models through careful data curation and training methodology. The paper demonstrates that training data quality can partially compensate for model scale --- a finding with significant implications for organizations building specialized AI applications.
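The quantization step that entries 6 and 7 lean on for on-device deployment can be illustrated with a toy symmetric int8 scheme. This is a sketch of the general idea only, not Apple's or Microsoft's actual pipeline:

```python
def quantize_int8(weights):
    """Map float weights to int8 values plus one shared scale factor."""
    scale = max(abs(w) for w in weights) / 127.0    # assumes not all-zero weights
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the int8 representation."""
    return [v * scale for v in q]

weights = [0.42, -1.3, 0.07, 0.9]                   # toy "tensor"
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
# Each restored value is within half a quantization step of the original,
# while storage drops from 32 bits to 8 bits per weight.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

Production systems refine this idea (per-channel scales, asymmetric ranges, quantization-aware training), but the 4x memory reduction at bounded accuracy cost is the core trade that makes phone-scale models practical.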
Quantum Computing and AI
8. Preskill, J. (2018). "Quantum Computing in the NISQ Era and Beyond." Quantum, 2, 79. John Preskill's landmark paper that coined the term "NISQ" (Noisy Intermediate-Scale Quantum) to describe the current era of quantum computing. Preskill, one of the field's most respected researchers, provides a balanced assessment of what near-term quantum computers can and cannot do. Essential reading for anyone seeking to separate quantum hype from quantum reality. His realistic timelines align with this chapter's cautious assessment.
9. Cerezo, M., Verdon, G., Huang, H.-Y., et al. (2022). "Challenges and Opportunities in Quantum Machine Learning." Nature Computational Science, 2, 567-576. A comprehensive review of quantum machine learning by leading researchers, including an honest assessment of the field's progress and its open problems. The paper distinguishes between near-term possibilities (quantum kernel methods, variational algorithms) and longer-term aspirations (quantum speedups for practical ML tasks), providing the nuanced perspective that business leaders need.
10. National Institute of Standards and Technology. (2024). Post-Quantum Cryptography Standards (FIPS 203, 204, 205). US Department of Commerce. NIST's finalized post-quantum cryptography standards, providing concrete guidance for organizations beginning the transition to quantum-resistant encryption. While technical, the executive summary and implementation guidance sections are accessible to business leaders. The most actionable quantum computing resource for most organizations, as the post-quantum cryptography transition is relevant today regardless of when practical quantum computers arrive.
Hardware Economics and AI Chips
11. Knight, W. (2024). "The GPU Shortage Is Changing the AI Industry." Wired. A thorough investigation of the GPU shortage's impact on the AI industry, including its effects on startups, research labs, and national AI strategies. Knight documents how hardware access has become a competitive determinant and explores the strategic responses of cloud providers, AI startups, and governments. Connects directly to the chapter's discussion of hardware as a binding constraint on AI ambitions.
12. Hooker, S. (2021). "The Hardware Lottery." Communications of the ACM, 64(12), 58-65. Sara Hooker's influential essay arguing that the AI research community's directions are heavily shaped by the available hardware --- and that potentially superior AI approaches are never explored because they are not well-suited to GPUs. The paper challenges the assumption that current AI architectures are optimal and suggests that alternative hardware (neuromorphic chips, analog computing) could enable fundamentally different AI approaches. Thought-provoking reading for anyone considering the future of AI hardware.
13. Khan, S., & Mann, A. (2024). AI Chips: Why They Matter for US National Security. Center for a New American Security. An analysis of the geopolitical dimensions of AI chip manufacturing, including the US-China competition for AI hardware dominance, the strategic importance of TSMC's semiconductor manufacturing, and the implications of US export controls on advanced chips. Provides essential context for understanding why hardware economics is not just a business issue but a national security concern.
Open-Source vs. Closed Models
14. Touvron, H., Martin, L., Stone, K., et al. (2023). "Llama 2: Open Foundation and Fine-Tuned Chat Models." arXiv preprint arXiv:2307.09288. Meta's technical paper accompanying the Llama 2 release --- one of the most consequential decisions in AI industry history. The paper details the model architecture, training methodology, and safety evaluation, but the real significance is strategic: Meta's decision to release powerful models as open-weight fundamentally changed the competitive dynamics of the AI industry. Read alongside the chapter's analysis of open-source vs. closed-model tradeoffs.
15. Bommasani, R., Hudson, D. A., Adeli, E., et al. (2021). "On the Opportunities and Risks of Foundation Models." arXiv preprint arXiv:2108.07258. The Stanford Center for Research on Foundation Models' comprehensive analysis of foundation models --- their capabilities, risks, and societal implications. At over 200 pages, this is the most thorough academic treatment of the foundation model phenomenon. The sections on centralization risks and ecosystem dynamics are particularly relevant to the open-source vs. closed-model debate discussed in this chapter.
16. Kapoor, S., & Narayanan, A. (2023). "Leakage and the Reproducibility Crisis in Machine-Learning-Based Science." Patterns, 4(9). A rigorous analysis of reproducibility problems in machine learning, including issues specific to both open and closed models. The paper demonstrates that many claimed AI results cannot be reproduced, even with open-source code, due to data leakage, unreported preprocessing steps, and environmental dependencies. Essential context for evaluating AI benchmark claims from any source.
AI and Robotics
17. Brohan, A., Brown, N., Carbajal, J., et al. (2023). "RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control." arXiv preprint arXiv:2307.15818. Google DeepMind's paper on RT-2, a robot control model that transfers knowledge from large vision-language models to physical manipulation tasks. The paper demonstrates that the same scaling and transfer learning approaches that powered the language model revolution can be applied to robotics --- a significant step toward general-purpose embodied AI. Temper the excitement with the chapter's realistic assessment of robotics timelines.
18. Fitzgerald, M. (2024). "Inside Amazon's Robot Revolution." MIT Technology Review. A detailed examination of Amazon's robotics deployment --- the most extensive commercial deployment of AI-powered robots in the world. Covers the evolution from simple conveyor systems to autonomous mobile robots, the integration challenges, the workforce implications, and the economic model. Provides practical evidence for the chapter's distinction between robotics in structured environments (proven) and unstructured environments (aspirational).
Synthetic Data
19. Shumailov, I., Shumaylov, Z., Zhao, Y., et al. (2024). "AI Models Collapse When Trained on Recursively Generated Data." Nature, 631, 755-759. The landmark study on "model collapse" --- the finding that training language models on outputs from other language models causes progressive quality degradation over successive generations. As AI-generated content increasingly populates the internet, this finding has profound implications for future model training and the value of curated, human-generated data. One of the most important AI research findings of 2024.
20. Jordon, J., Szpruch, L., Houssiau, F., et al. (2022). "Synthetic Data --- What, Why, and How?" The Alan Turing Institute & Royal Statistical Society. A comprehensive guide to synthetic data from the Alan Turing Institute, covering techniques, quality metrics, privacy guarantees, and practical applications. Written for a mixed audience of researchers and practitioners, it provides both the technical foundations and the practical guidance that business leaders need to evaluate synthetic data solutions.
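The degradation mechanism behind entry 19 can be made concrete with a toy simulation: repeatedly fit a Gaussian to samples drawn from the previous generation's fit, and the estimated spread tends to drift toward zero, losing the distribution's tails. This is an illustrative caricature of the recursive-training dynamic, not the paper's actual experimental setup:

```python
import random
import statistics

def fit_next_generation(mu, sigma, rng, n=20):
    """'Train' the next model: sample from the current fit, then refit."""
    samples = [rng.gauss(mu, sigma) for _ in range(n)]
    return statistics.fmean(samples), statistics.stdev(samples)

rng = random.Random(7)
mu, sigma = 0.0, 1.0             # generation 0: the "real" data distribution
spread_history = [sigma]
for _ in range(2000):            # many generations of recursive training
    mu, sigma = fit_next_generation(mu, sigma, rng)
    spread_history.append(sigma)
# With small per-generation samples, estimation error compounds across
# generations and the fitted spread tends to shrink: later "models" see
# an ever-narrower slice of the original distribution.
```

The same intuition scales up: each model trained on a predecessor's outputs inherits that predecessor's estimation errors, and rare events (the tails) are the first casualties.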
Technology Strategy and Emerging Technology Adoption
21. Ford, N., Richards, M., Sadalage, P., & Dehghani, Z. (2021). Software Architecture: The Hard Parts. O'Reilly Media. While focused on software architecture, this book's treatment of technology decision-making under uncertainty --- including the "fitness function" approach to evaluating architectural choices --- is directly applicable to the AI Technology Radar concept described in this chapter. The framework for making reversible vs. irreversible technology decisions is particularly relevant for emerging AI adoption.
22. ThoughtWorks Technology Radar. Published twice annually at thoughtworks.com/radar. The original Technology Radar, published by ThoughtWorks since 2010, categorizes technologies, tools, platforms, and techniques into the same four rings (Hold, Assess, Trial, Adopt) described in this chapter. The AI Technology Radar concept presented in the chapter is directly inspired by this publication. Reviewing past editions reveals how quickly the technology landscape shifts and how frequently confident predictions prove wrong --- a healthy corrective for anyone tempted to treat current trends as permanent.
23. Christensen, C. M., McDonald, R., Altman, E. J., & Palmer, J. E. (2018). "Disruptive Innovation: An Intellectual History and Directions for Future Research." Journal of Management Studies, 55(7), 1043-1078. Clayton Christensen's updated treatment of disruption theory, including how it applies to AI. The paper distinguishes between sustaining innovations (which improve existing products along established dimensions) and disruptive innovations (which initially underperform on traditional metrics but create new value propositions). This framework is useful for evaluating which emerging AI technologies are sustaining (better AI for existing tasks) and which are disruptive (AI that enables entirely new business models).
24. Satell, G. (2024). "How to Create a Robust Technology Strategy." Harvard Business Review. A practical guide to building organizational capability for technology adoption, emphasizing the importance of strategic patience, portfolio approaches, and alignment between technology investments and business objectives. Complements the chapter's discussion of the Technology Radar with a broader strategic perspective on managing technology uncertainty.
25. Mollick, E. (2024). "What Just Happened? Looking Back at the AI Year." Substack: One Useful Thing. Ethan Mollick's year-in-review analysis of AI developments provides a model for the kind of ongoing assessment that business leaders should engage in. Mollick's approach --- grounded in personal experimentation, research evidence, and honest uncertainty about timelines --- exemplifies the balanced perspective the chapter advocates. His Substack is one of the best ongoing sources for business-relevant AI analysis.
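The four-ring radar described in entry 22 maps naturally onto a small data structure. The rings are the ThoughtWorks categories; the field names and example placements below are purely illustrative assumptions:

```python
from dataclasses import dataclass
from enum import Enum

class Ring(Enum):
    # The four ThoughtWorks-style rings, ordered from least to most commitment.
    HOLD = "hold"        # deliberately pause; proceed with caution
    ASSESS = "assess"    # worth understanding; no investment yet
    TRIAL = "trial"      # pilot on a project that can tolerate the risk
    ADOPT = "adopt"      # proven; use by default where it fits

@dataclass
class RadarEntry:
    name: str
    ring: Ring
    rationale: str       # why the technology sits in this ring today

# Illustrative placements only; each organization's radar will differ.
radar = [
    RadarEntry("small on-device language models", Ring.TRIAL,
               "proven for narrow tasks; validate quality on our own data"),
    RadarEntry("quantum machine learning", Ring.HOLD,
               "no near-term practical speedup; revisit periodically"),
]
```

The value of the exercise is less in the data structure than in the discipline it enforces: every placement carries a recorded rationale, which makes revisiting (and reversing) decisions routine as the landscape shifts.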
For comprehensive coverage of AI governance frameworks relevant to emerging technology deployment, see the further reading for Chapters 27-30. For the strategic frameworks that contextualize technology adoption decisions, see the further reading for Chapter 31. For the AI maturity assessment that determines an organization's readiness for emerging technologies, see the further reading for Chapter 1.