Description: 5.85 billion image-text pairs scraped from the web. Used to train open-source vision-language models like Stable Diffusion.
- License: CC BY 4.0 (metadata); images are linked, not redistributed.
- Chapters: 22, 23.