Qihan Ren

jasonrqh

https://nebularaid2000.github.io/

AI & ML interests

explainable AI, LLM

Recent Activity

upvoted 2 papers 1 day ago

ConceptMoE: Adaptive Token-to-Concept Compression for Implicit Compute Allocation

Paper • 2601.21420 • Published 2 days ago • 25

MMFineReason: Closing the Multimodal Reasoning Gap via Open Data-Centric Methods

Paper • 2601.21821 • Published 2 days ago • 44

upvoted 2 papers 3 days ago

DeepSeek-OCR 2: Visual Causal Flow

Paper • 2601.20552 • Published 3 days ago • 42

ToolSafe: Enhancing Tool Invocation Safety of LLM-based agents via Proactive Step-level Guardrail and Feedback

Paper • 2601.10156 • Published 16 days ago • 26

upvoted a paper 4 days ago

AgentDoG: A Diagnostic Guardrail Framework for AI Agent Safety and Security

Paper • 2601.18491 • Published 5 days ago • 120

upvoted a collection 5 days ago

AgentDoG

A Diagnostic Guardrail Framework for AI Agent Safety and Security • 11 items • Updated 3 days ago • 82

upvoted 2 papers 2 months ago

Geometrically-Constrained Agent for Spatial Reasoning

Paper • 2511.22659 • Published Nov 27, 2025 • 41

iMontage: Unified, Versatile, Highly Dynamic Many-to-many Image Generation

Paper • 2511.20635 • Published Nov 25, 2025 • 32

upvoted 10 papers 4 months ago

Attention Illuminates LLM Reasoning: The Preplan-and-Anchor Rhythm Enables Fine-Grained Policy Optimization

Paper • 2510.13554 • Published Oct 15, 2025 • 58

AutoPR: Let's Automate Your Academic Promotion!

Paper • 2510.09558 • Published Oct 10, 2025 • 53

Conditional Advantage Estimation for Reinforcement Learning in Large Reasoning Models

Paper • 2509.23962 • Published Sep 28, 2025 • 5

LLMs Learn to Deceive Unintentionally: Emergent Misalignment in Dishonesty from Misaligned Samples to Biased Human-AI Interactions

Paper • 2510.08211 • Published Oct 9, 2025 • 22

CoMAS: Co-Evolving Multi-Agent Systems via Interaction Rewards

Paper • 2510.08529 • Published Oct 9, 2025 • 19

Who's Your Judge? On the Detectability of LLM-Generated Judgments

Paper • 2509.25154 • Published Sep 29, 2025 • 30

Socratic-Zero : Bootstrapping Reasoning via Data-Free Agent Co-evolution

Paper • 2509.24726 • Published Sep 29, 2025 • 20

Your Agent May Misevolve: Emergent Risks in Self-evolving LLM Agents

Paper • 2509.26354 • Published Sep 30, 2025 • 18

ExGRPO: Learning to Reason from Experience

Paper • 2510.02245 • Published Oct 2, 2025 • 80

Taming Masked Diffusion Language Models via Consistency Trajectory Reinforcement Learning with Fewer Decoding Step

Paper • 2509.23924 • Published Sep 28, 2025 • 9

upvoted a paper 6 months ago

A Survey of Self-Evolving Agents: On Path to Artificial Super Intelligence

Paper • 2507.21046 • Published Jul 28, 2025 • 83

upvoted a paper 8 months ago

Alita: Generalist Agent Enabling Scalable Agentic Reasoning with Minimal Predefinition and Maximal Self-Evolution

Paper • 2505.20286 • Published May 26, 2025 • 8