arxiv:2512.16924

The World is Your Canvas: Painting Promptable Events with Reference Images, Trajectories, and Text

Published on Dec 18 · Submitted by Yihao Meng on Dec 19

Abstract

AI-generated summary: WorldCanvas generates coherent, controllable world events using a multimodal framework that integrates text, trajectories, and reference images.

We present WorldCanvas, a framework for promptable world events that enables rich, user-directed simulation by combining text, trajectories, and reference images. Unlike text-only approaches and existing trajectory-controlled image-to-video methods, our multimodal approach combines trajectories -- encoding motion, timing, and visibility -- with natural language for semantic intent and reference images for visual grounding of object identity, enabling the generation of coherent, controllable events that include multi-agent interactions, object entry/exit, reference-guided appearance and counterintuitive events. The resulting videos demonstrate not only temporal coherence but also emergent consistency, preserving object identity and scene despite temporary disappearance. By supporting expressive world events generation, WorldCanvas advances world models from passive predictors to interactive, user-shaped simulators. Our project page is available at: https://worldcanvas.github.io/.
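
As a rough illustration of how the three inputs named in the abstract (a trajectory carrying motion, timing, and visibility; a text description; a reference image) could be bundled into one prompt, here is a minimal Python sketch. The class names `TrajectoryPoint` and `EventPrompt`, the normalized coordinates, the time unit, and the file name `dog.png` are all assumptions made for illustration; none of them is taken from the WorldCanvas paper or code.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class TrajectoryPoint:
    """One trajectory sample: when, where, and whether the object is visible."""
    t: float              # time in seconds (assumed unit)
    x: float              # normalized horizontal position in [0, 1] (assumption)
    y: float              # normalized vertical position in [0, 1] (assumption)
    visible: bool = True  # False while the object is hidden or off-screen


@dataclass
class EventPrompt:
    """Hypothetical container: one trajectory plus text and an optional reference image."""
    text: str
    points: List[TrajectoryPoint]
    reference_image: Optional[str] = None  # image path; None means no visual grounding

    def visible_spans(self) -> List[Tuple[float, float]]:
        """Return (start, end) times of each run of consecutive visible samples."""
        spans: List[Tuple[float, float]] = []
        start: Optional[float] = None
        last_t: Optional[float] = None
        for p in sorted(self.points, key=lambda q: q.t):
            if p.visible and start is None:
                start = p.t                    # a visible run begins
            elif not p.visible and start is not None:
                spans.append((start, last_t))  # run ended at the previous sample
                start = None
            last_t = p.t
        if start is not None:
            spans.append((start, last_t))      # close a run that reaches the end
        return spans


# An object that disappears mid-trajectory and then reappears.
dog = EventPrompt(
    text="a dog runs behind the sofa and reappears on the right",
    points=[
        TrajectoryPoint(0.0, 0.1, 0.5),
        TrajectoryPoint(1.0, 0.3, 0.5),
        TrajectoryPoint(2.0, 0.5, 0.5, visible=False),
        TrajectoryPoint(3.0, 0.7, 0.5),
        TrajectoryPoint(4.0, 0.9, 0.5),
    ],
    reference_image="dog.png",
)
print(dog.visible_spans())  # [(0.0, 1.0), (3.0, 4.0)]
```

The gap between the two spans corresponds to the "temporary disappearance" case the abstract mentions. How the actual model consumes such inputs is not described on this page.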

Community

Paper submitter

arXiv lens breakdown of this paper 👉 https://arxivlens.com/PaperView/Details/the-world-is-your-canvas-painting-promptable-events-with-reference-images-trajectories-and-text-9084-11219c98

  • Key Findings
  • Executive Summary
  • Detailed Breakdown
  • Practical Applications
