MMLU Paraphrases

rbelanec's Collections

updated Mar 18, 2025

Paraphrases of the max-500-token subset of the MMLU dataset. We train models on both the paraphrased and the original (non-paraphrased) examples to increase robustness.