Chapter 40: Further Reading
Test-Time Compute and Inference Scaling
- Snell, C., Lee, J., Xu, K., & Kumar, A. (2024). "Scaling LLM Test-Time Compute Optimally Can Be More Effective Than Scaling Model Parameters." arXiv preprint arXiv:2408.03314. A rigorous analysis demonstrating that allocating compute at inference time can be more cost-effective than training larger models.
- Wei, J., Wang, X., Schuurmans, D., et al. (2022). "Chain-of-Thought Prompting Elicits Reasoning in Large Language Models." NeurIPS 2022. The foundational paper on chain-of-thought prompting, showing that generating intermediate reasoning steps improves performance on multi-step problems.
- Brown, B., Juravsky, J., Ehrlich, R., et al. (2024). "Large Language Monkeys: Scaling Inference Compute with Repeated Sampling." arXiv preprint arXiv:2407.21787. Demonstrates that repeated sampling with verification can dramatically improve performance on coding and math benchmarks; a minimal sketch of this best-of-n pattern follows the list.
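To make the repeated-sampling idea concrete, here is a minimal best-of-n sketch: draw several candidates from a model and keep the one a verifier scores highest. The generate and verify callables are hypothetical placeholders, not APIs from the papers above.

    import random

    def best_of_n(problem, generate, verify, n=16):
        """Repeated sampling: draw n candidates, return the highest-scoring one.

        generate(problem) -> str          (hypothetical: samples one candidate answer)
        verify(problem, answer) -> float  (hypothetical: higher is better)
        """
        candidates = [generate(problem) for _ in range(n)]
        return max(candidates, key=lambda ans: verify(problem, ans))

    # Toy usage with stand-in functions: "generation" guesses integers and
    # "verification" scores closeness to a known answer.
    truth = 42
    guess = lambda _problem: str(random.randint(0, 100))
    score = lambda _problem, ans: -abs(int(ans) - truth)
    print(best_of_n("toy problem", guess, score, n=32))

With a reliable verifier, the probability that at least one of the n samples is correct grows quickly with n, which is the effect the Large Language Monkeys paper measures at scale.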
World Models
- Ha, D. & Schmidhuber, J. (2018). "World Models." arXiv preprint arXiv:1803.10122. A seminal paper introducing the modern concept of learned world models for reinforcement learning agents; the sketch after this list shows the basic imagined-rollout loop.
- Bruce, J., Dennis, M., Edwards, A., et al. (2024). "Genie: Generative Interactive Environments." arXiv preprint arXiv:2402.15391. Demonstrates learning interactive world models from unlabeled video data.
- NVIDIA. (2025). "Cosmos: World Foundation Models." Technical Report. NVIDIA's approach to building world foundation models for physical AI and robotics.
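At its core, a world model learns an environment's dynamics so an agent can plan or train "inside the dream" without touching the real environment. The sketch below loosely follows the encoder/dynamics/controller decomposition of Ha & Schmidhuber, with hypothetical stand-in functions in place of the papers' learned networks.

    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical stand-ins for the three learned components:
    encode = lambda obs: obs[:4]                   # V: observation -> latent state z
    dynamics = lambda z, a: z + 0.1 * a + 0.01 * rng.normal(size=z.shape)  # M: (z, a) -> z'
    policy = lambda z: -z                          # C: latent state -> action

    def imagine_rollout(obs, horizon=10):
        """Roll the learned dynamics forward entirely inside the model."""
        z = encode(obs)
        trajectory = [z]
        for _ in range(horizon):
            a = policy(z)
            z = dynamics(z, a)
            trajectory.append(z)
        return trajectory

    traj = imagine_rollout(rng.normal(size=8))
    print(len(traj), traj[-1])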
Neurosymbolic AI
- Garcez, A. d'A. & Lamb, L. C. (2023). "Neurosymbolic AI: The 3rd Wave." Artificial Intelligence Review, 56, 12387--12406. A comprehensive survey of neurosymbolic integration approaches and their historical context.
- Li, Z., Huang, J., & Naik, M. (2023). "Scallop: A Language for Neurosymbolic Programming." PLDI 2023. Introduces Scallop, a probabilistic programming language for neurosymbolic applications.
- Gao, L., Madaan, A., Zhou, S., et al. (2023). "PAL: Program-Aided Language Models." ICML 2023. Demonstrates LLMs generating code to offload computation to symbolic executors; a toy version of the pattern appears after this list.
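A toy illustration of the program-aided pattern PAL describes: the language model emits code for the parts it is bad at (exact arithmetic), and a symbolic executor, here the Python interpreter, computes the answer. The llm function is a hypothetical placeholder that hard-codes the kind of program a model might return.

    def llm(prompt: str) -> str:
        """Hypothetical model call: in the PAL setting, the LLM writes a program."""
        return (
            "apples = 23\n"
            "eaten_per_day = 3\n"
            "days = 5\n"
            "answer = apples - eaten_per_day * days\n"
        )

    def program_aided_answer(question: str):
        # The model reasons in code; the interpreter does the actual computation.
        program = llm(f"Write Python that solves: {question}")
        scope: dict = {}
        exec(program, scope)  # offload execution to the symbolic engine
        return scope["answer"]

    print(program_aided_answer("23 apples, eat 3 a day for 5 days; how many left?"))  # -> 8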
Continual Learning
- Kirkpatrick, J., Pascanu, R., Rabinowitz, N., et al. (2017). "Overcoming Catastrophic Forgetting in Neural Networks." PNAS, 114(13), 3521--3526. The original EWC paper, introducing Fisher-information regularization for continual learning; the loss is sketched after this list.
- De Lange, M., Aljundi, R., Masana, M., et al. (2022). "A Continual Learning Survey: Defying Forgetting in Classification Tasks." IEEE TPAMI, 44(7), 3366--3385. A comprehensive survey covering regularization-, replay-, and architecture-based continual learning methods.
- Wang, L., Zhang, X., Su, H., & Zhu, J. (2024). "A Comprehensive Survey of Continual Learning: Theory, Method and Application." IEEE TPAMI. An updated survey covering continual learning in the era of foundation models.
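For orientation, the core idea of elastic weight consolidation (EWC) from the Kirkpatrick et al. paper fits in a few lines: when training on a new task B, add a quadratic penalty anchoring each parameter to its value after task A, weighted by an estimate of its Fisher information (how much task A's performance depends on it). A minimal sketch, assuming the Fisher diagonal and old parameters are already computed:

    import numpy as np

    def ewc_loss(task_b_loss, theta, theta_star, fisher_diag, lam=1.0):
        """EWC objective: new-task loss plus a Fisher-weighted quadratic anchor.

        theta        -- current parameters (flat vector)
        theta_star   -- parameters learned on the old task
        fisher_diag  -- diagonal Fisher information estimated on the old task
        """
        penalty = 0.5 * lam * np.sum(fisher_diag * (theta - theta_star) ** 2)
        return task_b_loss + penalty

    # Toy numbers: parameters with high Fisher values pay more for moving.
    theta = np.array([0.9, 0.1])
    theta_star = np.array([1.0, 0.0])
    fisher = np.array([10.0, 0.1])  # the first parameter mattered a lot on task A
    print(ewc_loss(0.5, theta, theta_star, fisher, lam=1.0))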
AI for Science
- Jumper, J., Evans, R., Pritzel, A., et al. (2021). "Highly Accurate Protein Structure Prediction with AlphaFold." Nature, 596, 583--589. The landmark paper showing that deep learning can predict protein structures at near-experimental accuracy, a decades-old grand challenge.
- Merchant, A., Batzner, S., Schoenholz, S. S., et al. (2023). "Scaling Deep Learning for Materials Discovery." Nature, 624, 80--85. GNoME's use of graph neural networks to predict millions of new crystal structures, hundreds of thousands of them stable.
- Lu, C., Lu, C., Lange, R. T., et al. (2024). "The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery." arXiv preprint arXiv:2408.06292. An early prototype of an autonomous AI system for scientific research.
Autonomous AI Systems
- Yao, S., Zhao, J., Yu, D., et al. (2023). "ReAct: Synergizing Reasoning and Acting in Language Models." ICLR 2023. A foundational framework for language-model agents that interleave reasoning and tool use; a minimal agent loop in this style follows the list.
- Jimenez, C. E., Yang, J., Wettig, A., et al. (2024). "SWE-bench: Can Language Models Resolve Real-World GitHub Issues?" ICLR 2024. A benchmark for evaluating coding agents on real software engineering tasks.
- Zhou, S., Xu, F. F., Zhu, H., et al. (2024). "WebArena: A Realistic Web Environment for Building Autonomous Agents." ICLR 2024. A benchmark for evaluating web-browsing AI agents.
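To illustrate the ReAct-style loop these benchmarks evaluate: the model alternates between emitting a thought, choosing an action (a tool call), and reading the resulting observation until it decides to answer. The llm_step function and the single-entry tool registry below are hypothetical stand-ins; a real agent would parse thought/action/input from model text.

    def llm_step(history: str) -> tuple[str, str, str]:
        """Hypothetical model call returning (thought, action, action_input)."""
        if "Observation:" not in history:
            return ("I should look this up.", "search", "height of Mount Everest")
        return ("I have the fact I need.", "finish", "8,849 meters")

    TOOLS = {"search": lambda q: "Mount Everest is 8,849 meters tall."}  # stand-in tool

    def react_agent(question: str, max_steps: int = 5) -> str:
        history = f"Question: {question}"
        for _ in range(max_steps):
            thought, action, arg = llm_step(history)
            history += f"\nThought: {thought}\nAction: {action}[{arg}]"
            if action == "finish":
                return arg                        # the agent's final answer
            observation = TOOLS[action](arg)      # execute the tool
            history += f"\nObservation: {observation}"  # feed the result back
        return "no answer within step budget"

    print(react_agent("How tall is Mount Everest?"))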
The Path to AGI
- Morris, M. R., Sohl-Dickstein, J., Fiedel, N., et al. (2024). "Levels of AGI: Operationalizing Progress on the Path to AGI." arXiv preprint arXiv:2311.02462. A framework for measuring AGI progress using a graduated capability scale.
- Sutskever, I. (2023). "An Observation on Generalization." Talk at the Simons Institute, UC Berkeley. A compression-based perspective on why unsupervised learning generalizes, often cited in discussions of the scaling hypothesis and its limits.
- Chollet, F. (2019). "On the Measure of Intelligence." arXiv preprint arXiv:1911.01547. A thoughtful analysis of what intelligence measurement should capture, beyond narrow benchmarks.
Quantum Machine Learning
- Cerezo, M., Arrasmith, A., Babbush, R., et al. (2021). "Variational Quantum Algorithms." Nature Reviews Physics, 3, 625--644. A comprehensive review of variational quantum algorithms, including quantum ML approaches.
- Schuld, M. & Petruccione, F. (2021). Machine Learning with Quantum Computers, 2nd ed. Springer. A textbook covering quantum ML from foundations to applications.
- McClean, J. R., Boixo, S., Smelyanskiy, V. N., et al. (2018). "Barren Plateaus in Quantum Neural Network Training Landscapes." Nature Communications, 9, 4812. The paper identifying the barren plateau problem in parameterized quantum circuits; the result is paraphrased after this list.
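For orientation, the barren-plateau result says roughly the following: for a cost function C(\theta) = \langle 0 | U(\theta)^{\dagger} H U(\theta) | 0 \rangle built from a sufficiently random (2-design) parameterized circuit on n qubits, the gradient vanishes in expectation and its variance concentrates exponentially. Paraphrased in symbols:

    \mathbb{E}\left[\frac{\partial C}{\partial \theta_k}\right] = 0,
    \qquad
    \mathrm{Var}\left[\frac{\partial C}{\partial \theta_k}\right] \in O\left(b^{-n}\right)
    \quad \text{for some } b > 1,

so a randomly initialized circuit lands, with overwhelming probability, in a region where every gradient component is exponentially small in the qubit count, making naive gradient-based training impractical at scale.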
Career Development
- Ng, A. (2024). "How to Build Your Career in AI." DeepLearning.AI e-book. Practical advice on building an AI career from one of the field's most influential educators.
- Huyen, C. (2022). Designing Machine Learning Systems. O'Reilly Media. An excellent guide to the practical challenges of building production ML systems.
AI Governance and Ethics
- European Union. (2024). "Artificial Intelligence Act." Regulation (EU) 2024/1689, Official Journal of the European Union. The first comprehensive AI regulation, establishing risk-based requirements for AI systems.
- Weidinger, L., Mellor, J., Rauh, M., et al. (2021). "Ethical and Social Risks of Harm from Language Models." arXiv preprint arXiv:2112.04359. A taxonomy of risks from large language models, with mitigation strategies.
- Bender, E. M., Gebru, T., McMillan-Major, A., & Shmitchell, S. (2021). "On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?" FAccT 2021. An influential paper on the environmental and social costs of large language models.