Ouroboros3D
Ouroboros3D: Image-to-3D Generation via 3D-aware Recursive Diffusion
π Project Page | Paper
TL;DR. Ouroboros3D, a unified 3D generation framework, which integrates diffusion-based multi-view image generation and 3D reconstruction into a recursive diffusion process. During the multi-view denoising process, the multi-view diffusion model uses the 3D-aware maps rendered by the reconstruction module at the previous timestep as additional conditions.
π¨ Method Overview
Updata
- 2025-02-27: We are excited to announce that our paper has been accepted to CVPR 2025! π
- 2024-09-02: We released the baseline model for
svd & lgmin checkpoint - 2024-08-30: We released the training code, inference code, and model checkpoint of our baseline, Ouroboros3D.
π§ Installation
Just one command to prepare training and test environments:
pip install -r requirements.txt
Preparation for training
We render 2 16-frame RGBA orbits at 512 Γ 512. For each orbit, the cameras are positioned at a randomly sampled elevation between [-5, 30] degrees. You can rendering any
|
|-- 0a0c7e40a66d4fd090f549599f2f2c9d # object id
| |-- render_0000.webp
| |-- ...
| |-- meta.json # meta info
|-- train.txt
|-- ...
After preparing the data, you can to modify the config file configs/mv/svd_lgm_rgb_ccm_lvis.yaml to meet your needs.
To train on one node, please use:
torchrun --nnodes=1 --nproc_per_node=8 train.py --config configs/mv/svd_lgm_rgb_ccm_lvis.yaml
or use
./run.sh train
You can modify the run.sh to meets your requirements.
During training, you will see the validation results in both outputs and logger. You can choose wandb or tensorboard.
Inference
You need first download the checkpoint by python download_weights.py to download the model in checkpoint dir.
python inference.py \
--input testset/3d_arena_a_black_t-shirt_with_the_peace_sign_on_it.png \
--checkpoint checkpoint/Ouroboros3D-SVD-LGM \
--output workspace \
--seed 42 \
--config configs/mv/infer.yaml
or use
./run.sh inference
The inference results will be stored in the workspace folder, which will contain three files: xxx.ply, xxx.png, and xxx.mp4.
xxx.plyis the Gaussian splatting result obtained byLGM.xxx.mp4is the 360-degree rendering generated fromxxx.ply.xxx.pngis the multiview image generated by `SVD``.
π€ Acknowledgement
We appreciate the open source of the following projects:
π Citation
If you find this repository useful, please consider citing:
@article{wen2024ouroboros3d,
title={Ouroboros3D: Image-to-3D Generation via 3D-aware Recursive Diffusion},
author={Wen, Hao and Huang, Zehuan and Wang, Yaohui and Chen, Xinyuan and Qiao, Yu and Sheng, Lu},
journal={arXiv preprint arXiv:2406.03184},
year={2024}
}
- Downloads last month
- -