Glossary

MMLU (Massive Multitask Language Understanding)

- URL: https://github.com/hendrycks/test
- Description: 57 multiple-choice tasks spanning STEM, the humanities, the social sciences, and other areas. A standard benchmark for evaluating the breadth of an LLM's knowledge.
- Size: Approximately 14,000 test questions (about 15,900 in total across the development, validation, and test splits).
- License: MIT.
- Chapters: 15, 16, 31.

Related Terms