zhangzhifang/verl-agent
Safetensors
arxiv: 12 papers
main
verl-agent/examples
214 kB
100 contributors
History: 119 commits
zhangzf01
hf run successfully
0eca069
about 1 month ago
dapo_trainer
add 'resources_per_worker' config for easily managing cpus/gpus of each env worker (#148)
8 months ago
data_preprocess
hf run successfully
about 1 month ago
env_server
remove appworld folder and adjust the appworld worker (#80)
10 months ago
generation
Merge VeRL
about 1 year ago
gigpo_dynamic_trainer
add 'resources_per_worker' config for easily managing cpus/gpus of each env worker (#148)
8 months ago
gigpo_trainer
add Qwen3-VL (#196)
4 months ago
grpo_trainer
add Qwen3-VL (#196)
4 months ago
gspo_trainer
Add GSPO to verl-agent (#179)
6 months ago
ppo_trainer
Update README and Add FAQ (#173)
7 months ago
prompt_agent
code adjustment & fix prompt agent bug (#183)
6 months ago
ray
Major Update: merge latest verl (#54)
11 months ago
rloo_trainer
support RLOO
11 months ago
search
Add search-r1 experiments (tool-calling) & the results of GiGPO on search-r1 experiments & similarity-based GiGPO (#159)
8 months ago
sft
Major Update: merge latest verl (#54)
11 months ago
slurm
Major Update: merge latest verl (#54)
11 months ago
split_placement
Major Update: merge latest verl (#54)
11 months ago