Add library_name, pipeline_tag and link to paper

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +100 -60
README.md CHANGED
@@ -1,61 +1,101 @@
- ---
- inference: false
- language:
- - en
- - zh
- tags:
- - instruction-finetuning
- task_categories:
- - text-generation
- base_model:
- - Qwen/Qwen2.5-7B-Instruct
- license: cc-by-nc-nd-4.0
- ---
- <h1 align="center">
- 🔍 xVerify-7B-I
- </h1>
-
- <p align="center">
- <div style="display: flex; justify-content: center; gap: 10px;">
- <a href="https://github.com/IAAR-Shanghai/xVerify">
- <img src="https://img.shields.io/badge/GitHub-Repository-blue?logo=github" alt="GitHub"/>
- </a>
- <a href="https://huggingface.co/IAAR-Shanghai/xVerify-7B-I">
- <img src="https://img.shields.io/badge/🤗%20Hugging%20Face-xVerify--7B--I-yellow" alt="Hugging Face"/>
- </a>
- </div>
- </p>
- xVerify is an evaluation tool fine-tuned from a pre-trained large language model, designed specifically for objective questions with a single correct answer. It accurately extracts the final answer from lengthy reasoning processes and efficiently identifies equivalence across different forms of expressions.
-
- ---
-
- ## ✨ Key Features
-
- ### 📊 Broad Applicability
- Suitable for various objective question evaluation scenarios including math problems, multiple-choice questions, classification tasks, and short-answer questions.
-
- ### ⛓️ Handles Long Reasoning Chains
- Effectively processes answers with extensive reasoning steps to extract the final answer, regardless of complexity.
-
- ### 🌐 Multilingual Support
- Primarily handles Chinese and English responses while remaining compatible with other languages.
-
- ### 🔄 Powerful Equivalence Judgment
- - ✓ Recognizes basic transformations like letter case changes and Greek letter conversions
- - ✓ Identifies equivalent mathematical expressions across formats (LaTeX, fractions, scientific notation)
- - ✓ Determines semantic equivalence in natural language answers
- - ✓ Matches multiple-choice responses by content rather than just option identifiers
-
- ---
-
-
- ## 📚 Citation
-
- ```bibtex
- @article{xVerify,
- title={xVerify: Efficient Answer Verifier for Reasoning Model Evaluations},
- author={Ding Chen and Qingchen Yu and Pengyuan Wang and Wentao Zhang and Bo Tang and Feiyu Xiong and Xinchi Li and Minchuan Yang and Zhiyu Li},
- journal={arXiv preprint arXiv:2504.10481},
- year={2025},
- }
  ```
 
+ ---
+ base_model:
+ - Qwen/Qwen2.5-7B-Instruct
+ language:
+ - en
+ - zh
+ license: cc-by-nc-nd-4.0
+ tags:
+ - instruction-finetuning
+ library_name: transformers
+ pipeline_tag: text-generation
+ inference: false
+ ---
+
+ <h1 align="center">
+ 🔍 xVerify-7B-I
+ </h1>
+
+ <p align="center">
+ <div style="display: flex; justify-content: center; gap: 10px;">
+ <a href="https://github.com/IAAR-Shanghai/xVerify">
+ <img src="https://img.shields.io/badge/GitHub-Repository-blue?logo=github" alt="GitHub"/>
+ </a>
+ <a href="https://huggingface.co/IAAR-Shanghai/xVerify-7B-I">
+ <img src="https://img.shields.io/badge/🤗%20Hugging%20Face-xVerify--7B--I-yellow" alt="Hugging Face"/>
+ </a>
+ </div>
+ </p>
+
+ xVerify is an evaluation tool fine-tuned from a pre-trained large language model, designed specifically for objective questions with a single correct answer. It is presented in the paper [xVerify: Efficient Answer Verifier for Reasoning Model Evaluations](https://huggingface.co/papers/2504.10481).
+
+ It accurately extracts the final answer from lengthy reasoning processes and efficiently identifies equivalence across different forms of expressions.
+
+ ---
+
+ ## ✨ Key Features
+
+ ### 📊 Broad Applicability
+ Suitable for various objective question evaluation scenarios including math problems, multiple-choice questions, classification tasks, and short-answer questions.
+
+ ### ⛓️ Handles Long Reasoning Chains
+ Effectively processes answers with extensive reasoning steps to extract the final answer, regardless of complexity.
+
+ ### 🌐 Multilingual Support
+ Primarily handles Chinese and English responses while remaining compatible with other languages.
+
+ ### 🔄 Powerful Equivalence Judgment
+ - ✓ Recognizes basic transformations like letter case changes and Greek letter conversions
+ - ✓ Identifies equivalent mathematical expressions across formats (LaTeX, fractions, scientific notation)
+ - ✓ Determines semantic equivalence in natural language answers
+ - ✓ Matches multiple-choice responses by content rather than just option identifiers
+
+ ---
+
+ ## 🚀 Sample Usage
+
+ This snippet demonstrates single-sample evaluation using the `Evaluator` logic provided in the [official repository](https://github.com/IAAR-Shanghai/xVerify). The `src.xVerify` imports assume that the repository's `src` package is importable, for example when the script is run from the repository root.
+
+ ```python
+ from src.xVerify.model import Model
+ from src.xVerify.eval import Evaluator
+
+ # initialization
+ model_name = 'xVerify-7B-I'
+ model_path = 'IAAR-Shanghai/xVerify-7B-I'
+ inference_mode = 'local'
+
+ model = Model(
+     model_name=model_name,
+     model_path_or_url=model_path,
+     inference_mode=inference_mode,
+ )
+ evaluator = Evaluator(model=model)
+
+ # input evaluation information (triple quotes let the question span two lines)
+ question = """New steel giant includes Lackawanna site A major change is coming to the global steel industry and a galvanized mill in Lackawanna that formerly belonged to Bethlehem Steel Corp.
+ Classify the topic of the above sentence as World, Sports, Business, or Sci/Tech."""
+ llm_output = "The answer is Business."
+ correct_answer = "Business"
+
+ # evaluation
+ result = evaluator.single_evaluate(
+     question=question,
+     llm_output=llm_output,
+     correct_answer=correct_answer
+ )
+ print(result)
+ ```
+
+ ---
+
+ ## 📚 Citation
+
+ ```bibtex
+ @article{xVerify,
+ title={xVerify: Efficient Answer Verifier for Reasoning Model Evaluations},
+ author={Ding Chen and Qingchen Yu and Pengyuan Wang and Wentao Zhang and Bo Tang and Feiyu Xiong and Xinchi Li and Minchuan Yang and Zhiyu Li},
+ journal={arXiv preprint arXiv:2504.10481},
+ year={2025},
+ }
  ```
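
The metadata change in this diff (adding `library_name` and `pipeline_tag` to the README front matter) can be sanity-checked locally before merging. The following is a minimal sketch using only the Python standard library. The helper names `front_matter_keys` and `missing_keys` are hypothetical and not part of the xVerify repository or any Hub tooling, and the parser only recognizes top-level `key:` lines rather than full YAML.

```python
# Hypothetical helpers for illustration; not part of xVerify or the Hub tooling.
REQUIRED_KEYS = ("library_name", "pipeline_tag")


def front_matter_keys(readme_text):
    """Return the top-level keys of the YAML front matter, or an empty set."""
    lines = readme_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return set()  # the file does not start with a front matter block
    keys = set()
    for line in lines[1:]:
        if line.strip() == "---":  # closing delimiter of the front matter
            return keys
        # Top-level keys start in column 0; list items such as "- en" are skipped.
        if line and not line[0].isspace() and not line.startswith("-") and ":" in line:
            keys.add(line.split(":", 1)[0].strip())
    return set()  # no closing delimiter found: treat the block as malformed


def missing_keys(readme_text, required=REQUIRED_KEYS):
    """List the required keys that the front matter does not define."""
    present = front_matter_keys(readme_text)
    return [key for key in required if key not in present]


# Front matter as it appears on the "+" side of the diff above.
NEW_FRONT_MATTER = """---
base_model:
- Qwen/Qwen2.5-7B-Instruct
language:
- en
- zh
license: cc-by-nc-nd-4.0
tags:
- instruction-finetuning
library_name: transformers
pipeline_tag: text-generation
inference: false
---
"""

print(missing_keys(NEW_FRONT_MATTER))  # prints []
```

Run against the old front matter (the "-" side of the diff), the same check would report both `library_name` and `pipeline_tag` as missing.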