systems that can process and generate content across multiple modalities simultaneously. A multimodal system can read a document, look at an image, listen to audio, watch a video, and reason about all of them together.