Paper: SSA-COMET: Do LLMs Outperform Learned Metrics in Evaluating MT for Under-Resourced African Languages? (arXiv:2506.04557)
AfroXLMR-large-114L was created through masked language model (MLM) adaptation of an expanded XLM-R-large model on 114 languages widely spoken in Africa, including 4 high-resource languages. The adaptation data is a mix of mC4, Wikipedia, and OPUS corpora.
There are 76 languages available:
We would like to thank Google Cloud for providing access to TPU v4-8 through free cloud credits. The model was trained with Flax and later converted to PyTorch.
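The card gives no usage snippet; below is a minimal fill-mask sketch using the Hugging Face `transformers` pipeline. The repo id `Davlan/afro-xlmr-large-114L` and the Swahili example sentence are assumptions for illustration, not confirmed by the card.

```python
# Minimal fill-mask sketch for AfroXLMR-large-114L with Hugging Face transformers.
# The repo id below is an assumption inferred from the model name; check the Hub.
from typing import Dict, List

MODEL_ID = "Davlan/afro-xlmr-large-114L"  # assumed repo id

def top_tokens(predictions: List[Dict], k: int = 3) -> List[str]:
    """Return the k highest-scoring predicted token strings from a fill-mask output."""
    ranked = sorted(predictions, key=lambda p: p["score"], reverse=True)
    return [p["token_str"].strip() for p in ranked[:k]]

if __name__ == "__main__":
    from transformers import pipeline  # deferred import: downloads the model on first use
    # The checkpoint was converted from Flax, so the PyTorch weights load directly;
    # from_pretrained(..., from_flax=True) is how such a conversion can be done.
    unmasker = pipeline("fill-mask", model=MODEL_ID)
    # XLM-R-family tokenizers use "<mask>" as the mask token.
    print(top_tokens(unmasker("Abuja ni mji mkuu wa <mask>.")))  # Swahili example
```

Each pipeline prediction is a dict containing (among other fields) a `score` and a `token_str`, which is all the small helper above relies on.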
@article{li2025ssa,
title={SSA-COMET: Do LLMs Outperform Learned Metrics in Evaluating MT for Under-Resourced African Languages?},
author={Li, Senyu and Wang, Jiayi and Ali, Felermino DMA and Cherry, Colin and Deutsch, Daniel and Briakou, Eleftheria and Sousa-Silva, Rui and Cardoso, Henrique Lopes and Stenetorp, Pontus and Adelani, David Ifeoluwa},
journal={arXiv preprint arXiv:2506.04557},
year={2025}
}