DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
Paper: arXiv:2402.03300
Description:
A GRPO-fine-tuned version of Qwen2.5-1.5B trained on the MATH dataset.
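For readers unfamiliar with GRPO, the sketch below illustrates the group-relative advantage idea from the DeepSeekMath paper cited here: several completions are sampled for the same prompt, and each completion's reward is normalized against the mean and standard deviation of its group. It is not this model's training code. The function name `group_relative_advantages`, the `eps` stabilizer, the use of the population standard deviation, and the example rewards are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize one group's rewards to zero mean and (roughly) unit scale.

    `rewards` holds one scalar reward per completion sampled for the same
    prompt. `eps` is an assumed stabilizer that avoids division by zero
    when every completion in the group gets the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption for this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four completions of a single MATH problem:
# 1.0 for a correct final answer, 0.0 otherwise.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Completions that score above their group's average get a positive advantage and those below get a negative one, so no separate value model is needed to form a baseline.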
@article{sha2024deepseekmath,
title = {DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models},
author = {Shao, Zhihong and Wang, Peiyi and Zhu, Qihao and Xu, Runxin and Song, Junxiao and Bi, Xiao and … Guo, Daya},
journal = {arXiv preprint arXiv:2402.03300},
year = {2024},
}
Base model
Qwen/Qwen2.5-1.5B