PAN: A World Model for General, Interactable, and Long-Horizon World Simulation Paper • 2511.09057 • Published 27 days ago • 75
SwS: Self-aware Weakness-driven Problem Synthesis in Reinforcement Learning for LLM Reasoning Paper • 2506.08989 • Published Jun 10 • 14
Audio Jailbreak: An Open Comprehensive Benchmark for Jailbreaking Large Audio-Language Models Paper • 2505.15406 • Published May 21 • 5
Evaluate Bias without Manual Test Sets: A Concept Representation Perspective for LLMs Paper • 2505.15524 • Published May 21 • 7