Commit 14b1a85 (verified) · committed by lewtun (HF Staff) · Parent(s): 89bc25b

Update README.md

Files changed (1):
  1. README.md (+19 −19)
README.md CHANGED

@@ -23,11 +23,11 @@ base_model:
 
 ## Model Summary
 
-QED-Nano is a 4B parameter model explicitly post-trained to strengthen capabilities on Olympiad-level math proof problems. Despite its small size, QED-Nano achieves an impressive 40% score on the challenging IMO-ProofBench benchmark (+20% over the Qwen3 base model), matching the performance of [GPT-OSS-120B](https://huggingface.co/openai/gpt-oss-120b) from OpenAI. With an agent scaffold that scales inference-time compute to over 1M tokens per problem, QED-Nano approaches the performance of Gemini-3-Pro:
+QED-Nano is a 4B parameter model explicitly post-trained to strengthen its proof-writing capabilities. Despite its small size, QED-Nano achieves an impressive 40% score on the challenging IMO-ProofBench benchmark (+20% over the Qwen3 base model), matching the performance of [GPT-OSS-120B](https://huggingface.co/openai/gpt-oss-120b) from OpenAI. With an agent scaffold that scales inference-time compute to over 1M tokens per problem, QED-Nano approaches the performance of Gemini-3-Pro:
 
 ![imoproofbench.png](https://huggingface.co/lm-provers/QED-Nano/resolve/main/imoproofbench.png)
 
-QED-Nano is built by post-training [Qwen/Qwen3-4B-Thinking-2507](https://huggingface.co/Qwen/Qwen3-4B-Thinking-2507), via a combination of supervised fine-tuning on curated high-quality proofs, and reinforcement learning with a [reasoning cache](https://huggingface.co/papers/2602.03773) meant to improve the trained model's ability to scale test-time compute effectively with agentic scaffolds. We train on a mixture of Olympiads proof problems from various public sources that we also open source, along with our SFT dataset.
+QED-Nano is based on [Qwen/Qwen3-4B-Thinking-2507](https://huggingface.co/Qwen/Qwen3-4B-Thinking-2507), and was post-trained via a combination of supervised fine-tuning and [reinforcement learning with a reasoning cache](https://huggingface.co/papers/2602.03773) on a mixture of Olympiads proof problems from various public sources.
 
 ## How to use
 
@@ -90,13 +90,6 @@ In this section, we report the evaluation results of QED-Nano. All evaluations a
 
 [ADD TABLE]
 
-## Training
-
-### Model
-
-- **Architecture:** Transformer decoder
-- **Precision:** bfloat16
-
 
 ## Limitations
 
@@ -105,15 +98,22 @@ QED-Nano is a domain-specific model that is designed for one thing and one thing
 ## License
 [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0)
 
-## Citation
-
-If you find this model useful in your research, please consider citing it as follows:
-
-```bibtex
-@misc{qwed_nano2026,
-  title={{QED-Nano: Solving Olympiad Math at Gemini Level with a 4B Model}},
-  author={Setlur, Amrith and Dekoninck, Jasper and Qu, Yuxiao and Wu, Ian and Li, Jia and Beeching, Edward and Tunstall, Lewis and Kumar, Aviral},
-  year={2026},
-  howpublished={\url{https://huggingface.co/blog/smollm3}}
-}
-```
+## Acknowledgements
+
+QED-Nano is a joint collaboration between the research teams at CMU, ETH Zürich, Numina, and Hugging Face. Below is a list of the individual contributors and their affiliations:
+
+### CMU
+
+Amrith Setlur, Yuxiao Qu, Ian Wu, and Aviral Kumar
+
+### ETH Zürich
+
+Jasper Dekoninck
+
+### Numina
+
+Jia Li
+
+### Hugging Face
+
+Edward Beeching and Lewis Tunstall
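
The README's "How to use" section is unchanged by this commit, so its body does not appear in the diff. For orientation, a minimal, hypothetical sketch of loading the model with 🤗 Transformers might look like the following; the repo id `lm-provers/QED-Nano` is inferred from the image URL above, and the chat-style prompting and generation budget are assumptions, not taken from the actual (elided) usage section:

```python
# Hypothetical usage sketch for QED-Nano; repo id inferred from the README's
# asset URL, prompt format and token budget are assumptions.
MODEL_ID = "lm-provers/QED-Nano"


def build_messages(problem: str) -> list:
    """Wrap a proof problem as a single-turn chat message list."""
    return [{"role": "user", "content": problem}]


def generate_proof(problem: str, max_new_tokens: int = 32768) -> str:
    """Load the model and generate a proof attempt for one problem."""
    # Imported here so the lightweight helpers above work without torch.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="auto", device_map="auto"
    )
    inputs = tokenizer.apply_chat_template(
        build_messages(problem), add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Strip the prompt tokens, keeping only the generated continuation.
    return tokenizer.decode(
        output[0][inputs.shape[-1]:], skip_special_tokens=True
    )
```

A call such as `generate_proof("Prove that the sum of two even integers is even.")` would then return the model's written proof; the large `max_new_tokens` default reflects the README's emphasis on scaling test-time compute.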