zhangzhifang/verl-agent
Safetensors
arxiv:
12 papers
Branch: main
Path: verl-agent/examples/data_preprocess (42.5 kB)
100 contributors
History: 18 commits
Latest commit: 0eca069 by zhangzf01, "hf run successfully", about 1 month ago
File                             Size     Last updated       Last commit
aime2024_multiturn_w_tool.py     2.17 kB  11 months ago      Major Update: merge latest verl (#54)
dapo_multiturn_w_tool.py         2.17 kB  11 months ago      Major Update: merge latest verl (#54)
full_hh_rlhf.py                  4.85 kB  11 months ago      Major Update: merge latest verl (#54)
geo3k.py                         2.87 kB  11 months ago      Major Update: merge latest verl (#54)
gsm8k.py                         2.96 kB  11 months ago      Major Update: merge latest verl (#54)
gsm8k_multiturn_w_tool.py        4.12 kB  11 months ago      Major Update: merge latest verl (#54)
hellaswag.py                     3.24 kB  11 months ago      Major Update: merge latest verl (#54)
math_dataset.py                  2.83 kB  11 months ago      Major Update: merge latest verl (#54)
multiturn.py                     4.45 kB  11 months ago      Major Update: merge latest verl (#54)
prepare.py                       3.67 kB  7 months ago       Update README and Add FAQ (#173)
prepare_vwa.py                   2.73 kB  about 1 month ago  hf run successfully
preprocess_search_r1_dataset.py  6.41 kB  8 months ago       Add search-r1 experiments (tool-calling) & the results of GiGPO on search-r1 experiments & similarity-based GiGPO (#159)