Xuandong Zhao

Xuandong

https://xuandongzhao.github.io/

Recent Activity

upvoted a paper 2 days ago

Terminal-Bench: Benchmarking Agents on Hard, Realistic Tasks in Command Line Interfaces

Paper • 2601.11868 • Published 9 days ago • 27

authored a paper 20 days ago

InfoSynth: Information-Guided Benchmark Synthesis for LLMs

Paper • 2601.00575 • Published 24 days ago • 3

upvoted a paper 21 days ago

InfoSynth: Information-Guided Benchmark Synthesis for LLMs

Paper • 2601.00575 • Published 24 days ago • 3

submitted a paper to Daily Papers 21 days ago

InfoSynth: Information-Guided Benchmark Synthesis for LLMs

Paper • 2601.00575 • Published 24 days ago • 3

New activity in harborframework/parity-experiments 24 days ago

Delete gpqa-diamond files

#14 opened 24 days ago by

gpqa-diamond

#13 opened 24 days ago by

published a model 28 days ago

Xuandong/Llama-2-7b-chat_bad100_2e-5

Text Generation • Updated Nov 30, 2023 • 12

updated a dataset about 2 months ago

Xuandong/CUA-Synth-Sample

Updated Nov 30, 2025

published a dataset about 2 months ago

Xuandong/CUA-Synth-Sample

Updated Nov 30, 2025

updated a model 2 months ago

Xuandong/Qwen2.5-3B-Quiet-STaR

Text Generation • 3B • Updated Nov 20, 2025

published a model 2 months ago

Xuandong/Qwen2.5-3B-Quiet-STaR

Text Generation • 3B • Updated Nov 20, 2025

updated a Space 2 months ago

Unigram-Watermark

updated a model 3 months ago

Xuandong/Qwen2.5-VL-3B-CUA-SFT

4B • Updated Nov 11, 2025

published a model 3 months ago

Xuandong/Qwen2.5-VL-3B-CUA-SFT

4B • Updated Nov 11, 2025

replied to Kseniase's post 4 months ago

Please also check Reinforcement Learning from Internal Feedback (RLIF) https://arxiv.org/abs/2505.19590

New activity in sunblaze-ucb/Qwen2.5-1.5B-Intuitor-MATH-1EPOCH 6 months ago

Improve model card: Add transformers library, expand description, links, and usage

#1 opened 6 months ago by

New activity in sunblaze-ucb/OLMo-2-7B-SFT-GRPO-MATH-1EPOCH 6 months ago

Improve model card: Add library, links, and usage example

#1 opened 6 months ago by

New activity in sunblaze-ucb/OLMo-2-7B-SFT-Intuitor-MATH-1EPOCH 6 months ago

Improve model card: Add library, update pipeline tag, link to code

#1 opened 6 months ago by

New activity in sunblaze-ucb/Qwen3-14B-Intuitor-MATH-1EPOCH 6 months ago

Improve model card: Add library_name, paper/code links, and usage example

#1 opened 6 months ago by

New activity in sunblaze-ucb/Qwen2.5-1.5B-GRPO-MATH-1EPOCH 6 months ago

Improve model card: Add library, GitHub link, paper details, and usage example

#1 opened 6 months ago by