Weihao Yu

whyu

https://scholar.google.com/citations?user=LYxjt1QAAAAJ

AI & ML interests

Computer Vision, NLP and AI

Recent Activity

upvoted a paper 12 days ago

In-Video Instructions: Visual Signals as Generative Control

Paper • 2511.19401 • Published 12 days ago • 29

upvoted 2 papers 18 days ago

MiroThinker: Pushing the Performance Boundaries of Open-Source Research Agents via Model, Context, and Interactive Scaling

Paper • 2511.11793 • Published 22 days ago • 158

WEAVE: Unleashing and Benchmarking the In-context Interleaved Comprehension and Generation

Paper • 2511.11434 • Published 22 days ago • 44

upvoted a paper 26 days ago

Visual Spatial Tuning

Paper • 2511.05491 • Published 29 days ago • 49

upvoted 4 papers about 1 month ago

Parallel Loop Transformer for Efficient Test-Time Computation Scaling

Paper • 2510.24824 • Published Oct 28 • 15

Scaling Latent Reasoning via Looped Language Models

Paper • 2510.25741 • Published Oct 29 • 219

LightBagel: A Light-weighted, Double Fusion Framework for Unified Multimodal Understanding and Generation

Paper • 2510.22946 • Published Oct 27 • 16

Seed3D 1.0: From Images to High-Fidelity Simulation-Ready 3D Assets

Paper • 2510.19944 • Published Oct 22 • 19

upvoted 3 papers about 2 months ago

Trace Anything: Representing Any Video in 4D via Trajectory Fields

Paper • 2510.13802 • Published Oct 15 • 30

Generative Universal Verifier as Multimodal Meta-Reasoner

Paper • 2510.13804 • Published Oct 15 • 25

Artificial Hippocampus Networks for Efficient Long-Context Modeling

Paper • 2510.07318 • Published Oct 8 • 30

upvoted 3 papers 6 months ago

Image Editing As Programs with Diffusion Models

Paper • 2506.04158 • Published Jun 4 • 24

Discrete Diffusion in Large Language and Multimodal Models: A Survey

Paper • 2506.13759 • Published Jun 16 • 43

VeriThinker: Learning to Verify Makes Reasoning Model Efficient

Paper • 2505.17941 • Published May 23 • 25

upvoted 3 papers 7 months ago

Dimple: Discrete Diffusion Multimodal Large Language Model with Parallel Decoding

Paper • 2505.16990 • Published May 22 • 22

Emerging Properties in Unified Multimodal Pretraining

Paper • 2505.14683 • Published May 20 • 134

Thinkless: LLM Learns When to Think

Paper • 2505.13379 • Published May 19 • 50

upvoted 2 papers 9 months ago

Long-Context Autoregressive Video Modeling with Next-Frame Prediction

Paper • 2503.19325 • Published Mar 25 • 73

Painting with Words: Elevating Detailed Image Captioning with Benchmark and Alignment Learning

Paper • 2503.07906 • Published Mar 10 • 4

upvoted a paper about 1 year ago

ROICtrl: Boosting Instance Control for Visual Generation

Paper • 2411.17949 • Published Nov 27, 2024 • 87