IAAR-Shanghai/xVerify-7B-I

@@ -1,61 +1,101 @@
----
-inference: false
-language:
-- en
-- zh
-tags:
-- instruction-finetuning
-task_categories:
-- text-generation
-base_model:
-- Qwen/Qwen2.5-7B-Instruct
-license: cc-by-nc-nd-4.0
----
-<h1 align="center">
-🔍 xVerify-7B-I
-</h1>
-<p align="center">
-  <div style="display: flex; justify-content: center; gap: 10px;">
-    <a href="https://github.com/IAAR-Shanghai/xVerify">
-      <img src="https://img.shields.io/badge/GitHub-Repository-blue?logo=github" alt="GitHub"/>
-    </a>
-    <a href="https://huggingface.co/IAAR-Shanghai/xVerify-7B-I">
-      <img src="https://img.shields.io/badge/🤗%20Hugging%20Face-xVerify--7B--I-yellow" alt="Hugging Face"/>
-    </a>
-  </div>
-</p>
-xVerify is an evaluation tool fine-tuned from a pre-trained large language model, designed specifically for objective questions with a single correct answer. It accurately extracts the final answer from lengthy reasoning processes and efficiently identifies equivalence across different forms of expressions.
----
-## ✨ Key Features
-### 📊 Broad Applicability
-Suitable for various objective question evaluation scenarios including math problems, multiple-choice questions, classification tasks, and short-answer questions.
-### ⛓️ Handles Long Reasoning Chains
-Effectively processes answers with extensive reasoning steps to extract the final answer, regardless of complexity.
-### 🌐 Multilingual Support
-Primarily handles Chinese and English responses while remaining compatible with other languages.
-### 🔄 Powerful Equivalence Judgment
-- ✓ Recognizes basic transformations like letter case changes and Greek letter conversions
-- ✓ Identifies equivalent mathematical expressions across formats (LaTeX, fractions, scientific notation)
-- ✓ Determines semantic equivalence in natural language answers
-- ✓ Matches multiple-choice responses by content rather than just option identifiers
----
-## 📚 Citation
-```bibtex
-@article{xVerify,
-      title={xVerify: Efficient Answer Verifier for Reasoning Model Evaluations},
-      author={Ding Chen and Qingchen Yu and Pengyuan Wang and Wentao Zhang and Bo Tang and Feiyu Xiong and Xinchi Li and Minchuan Yang and Zhiyu Li},
-      journal={arXiv preprint arXiv:2504.10481},
-      year={2025},
-}
 ```

+---
+base_model:
+- Qwen/Qwen2.5-7B-Instruct
+language:
+- en
+- zh
+license: cc-by-nc-nd-4.0
+tags:
+- instruction-finetuning
+library_name: transformers
+pipeline_tag: text-generation
+inference: false
+---
+<h1 align="center">
+🔍 xVerify-7B-I
+</h1>
+<p align="center">
+  <div style="display: flex; justify-content: center; gap: 10px;">
+    <a href="https://github.com/IAAR-Shanghai/xVerify">
+      <img src="https://img.shields.io/badge/GitHub-Repository-blue?logo=github" alt="GitHub"/>
+    </a>
+    <a href="https://huggingface.co/IAAR-Shanghai/xVerify-7B-I">
+      <img src="https://img.shields.io/badge/🤗%20Hugging%20Face-xVerify--7B--I-yellow" alt="Hugging Face"/>
+    </a>
+  </div>
+</p>
+xVerify is an evaluation tool fine-tuned from a pre-trained large language model, designed specifically for objective questions with a single correct answer. It is presented in the paper [xVerify: Efficient Answer Verifier for Reasoning Model Evaluations](https://huggingface.co/papers/2504.10481).
+It accurately extracts the final answer from lengthy reasoning processes and efficiently identifies equivalence across different forms of expressions.
+---
+## ✨ Key Features
+### 📊 Broad Applicability
+Suitable for various objective question evaluation scenarios including math problems, multiple-choice questions, classification tasks, and short-answer questions.
+### ⛓️ Handles Long Reasoning Chains
+Effectively processes answers with extensive reasoning steps to extract the final answer, regardless of complexity.
+### 🌐 Multilingual Support
+Primarily handles Chinese and English responses while remaining compatible with other languages.
+### 🔄 Powerful Equivalence Judgment
+- ✓ Recognizes basic transformations like letter case changes and Greek letter conversions
+- ✓ Identifies equivalent mathematical expressions across formats (LaTeX, fractions, scientific notation)
+- ✓ Determines semantic equivalence in natural language answers
+- ✓ Matches multiple-choice responses by content rather than just option identifiers
+---
+## 🚀 Sample Usage
+This snippet demonstrates single-sample evaluation using the `Evaluator` logic provided in the [official repository](https://github.com/IAAR-Shanghai/xVerify).
+```python
+from src.xVerify.model import Model
+from src.xVerify.eval import Evaluator
+# initialization
+model_name = 'xVerify-7B-I'
+model_path = 'IAAR-Shanghai/xVerify-7B-I'
+inference_mode = 'local'
+model = Model(
+    model_name=model_name,
+    model_path_or_url=model_path,
+    inference_mode=inference_mode,
+)
+evaluator = Evaluator(model=model)
+# input evaluation information
+question = "New steel giant includes Lackawanna site A major change is coming to the global steel industry and a galvanized mill in Lackawanna that formerly belonged to Bethlehem Steel Corp.\nClassify the topic of the above sentence as World, Sports, Business, or Sci/Tech."
+llm_output = "The answer is Business."
+correct_answer = "Business"
+# evaluation
+result = evaluator.single_evaluate(
+    question=question,
+    llm_output=llm_output,
+    correct_answer=correct_answer
+)
+print(result)
+```
+---
+## 📚 Citation
+```bibtex
+@article{xVerify,
+      title={xVerify: Efficient Answer Verifier for Reasoning Model Evaluations},
+      author={Ding Chen and Qingchen Yu and Pengyuan Wang and Wentao Zhang and Bo Tang and Feiyu Xiong and Xinchi Li and Minchuan Yang and Zhiyu Li},
+      journal={arXiv preprint arXiv:2504.10481},
+      year={2025},
+}
 ```
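
The Sample Usage snippet in this diff evaluates one sample with `evaluator.single_evaluate(question=..., llm_output=..., correct_answer=...)`. As a rough sketch of how that call might be looped over several samples, the block below wraps it in a small helper. Everything except the `single_evaluate` keyword arguments is an assumption made for illustration: `Sample`, `evaluate_samples`, and `StubEvaluator` are hypothetical names, and the `"Correct"` / `"Incorrect"` return labels are assumed rather than taken from the model card. The stub only does a case-insensitive substring check so that the sketch runs without model weights; it is not xVerify's judgment logic.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Sample:
    """One evaluation item (hypothetical container, not part of xVerify)."""
    question: str
    llm_output: str
    correct_answer: str


class StubEvaluator:
    """Hypothetical stand-in for the repository's Evaluator.

    It is NOT xVerify's logic: it only checks whether the reference
    answer appears in the model output, ignoring case.
    """

    def single_evaluate(self, question: str, llm_output: str, correct_answer: str) -> str:
        matched = correct_answer.lower() in llm_output.lower()
        # Assumed label format; the real evaluator's return value may differ.
        return "Correct" if matched else "Incorrect"


def evaluate_samples(evaluator, samples: List[Sample]) -> List[str]:
    """Call single_evaluate once per sample and collect the results in order."""
    return [
        evaluator.single_evaluate(
            question=s.question,
            llm_output=s.llm_output,
            correct_answer=s.correct_answer,
        )
        for s in samples
    ]


samples = [
    Sample(
        question="Classify the topic of the sentence as World, Sports, Business, or Sci/Tech.",
        llm_output="The answer is Business.",
        correct_answer="Business",
    ),
    Sample(
        question="What is 1/2 written as a decimal?",
        llm_output="It equals 0.5.",
        correct_answer="0.25",
    ),
]

results = evaluate_samples(StubEvaluator(), samples)
print(results)
```

To try this against the real model, `StubEvaluator()` would be replaced by the `Evaluator(model=model)` instance built in the Sample Usage snippet, assuming the repository's `src` package is importable.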