---
title: Awesome Depth Anything 3
emoji: 🌊
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.50.0
app_file: app.py
pinned: false
license: apache-2.0
short_description: Metric 3D reconstruction from images/video
---
# Awesome Depth Anything 3
**Optimized fork of Depth Anything 3 with production-ready features**
[PyPI](https://pypi.org/project/awesome-depth-anything-3/) · [Python](https://www.python.org/) · [License](LICENSE) · [CI](https://github.com/Aedelon/awesome-depth-anything-3/actions) · [Open in Colab](https://colab.research.google.com/github/Aedelon/awesome-depth-anything-3/blob/main/notebooks/da3_tutorial.ipynb) · [HF Space](https://huggingface.co/spaces/Aedelon/awesome-depth-anything-3)
[Demo](https://huggingface.co/spaces/Aedelon/awesome-depth-anything-3) · [Tutorial](notebooks/da3_tutorial.ipynb) · [Benchmarks](BENCHMARKS.md) · [Original Paper](https://arxiv.org/abs/2511.10647)
---
> **This is an optimized fork** of [Depth Anything 3](https://github.com/ByteDance-Seed/Depth-Anything-3) by ByteDance.
> All credit for the model architecture, training, and research goes to the original authors (see [Credits](#-credits) below).
> This fork focuses on **production optimization, developer experience, and ease of deployment**.
## 🚀 What's New in This Fork
| Feature | Description |
|---------|-------------|
| **Model Caching** | ~200x faster model loading after first use |
| **Adaptive Batching** | Automatic batch size optimization based on GPU memory |
| **PyPI Package** | `pip install awesome-depth-anything-3` (quickstart sketch below) |
| **CLI Improvements** | Batch processing options, better error handling |
| **Apple Silicon Optimized** | Smart CPU/GPU preprocessing for best MPS performance |
| **Comprehensive Benchmarks** | Detailed performance analysis across devices |
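
The caching, adaptive batching, and MPS features above might be exercised roughly as follows. This is a minimal sketch: the `AwesomeDA3` class, its `from_pretrained`/`infer` methods, and the model id are assumptions, not the package's confirmed API; see the [tutorial notebook](notebooks/da3_tutorial.ipynb) for the canonical interface. Only the device checks are standard PyTorch.

```python
import torch

# Hypothetical entry point -- the real import path and class name may differ.
from awesome_depth_anything_3 import AwesomeDA3

# Prefer Apple-Silicon MPS, then CUDA, then CPU (standard PyTorch checks).
if torch.backends.mps.is_available():
    device = "mps"
elif torch.cuda.is_available():
    device = "cuda"
else:
    device = "cpu"

# First call downloads and deserializes weights; subsequent calls hit the cache.
model = AwesomeDA3.from_pretrained("depth-anything/da3-base").to(device)  # model id illustrative

# Adaptive batching is assumed to pick a batch size from available GPU memory.
depth_maps = model.infer(["frame_000.png", "frame_001.png"])
```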
### Performance Improvements
| Metric | Upstream | This Fork | Improvement |
|--------|----------|-----------|-------------|
| Cached model load | ~1s | ~5ms | **200x faster** |
| Batch 4 inference (MPS) | 3.32 img/s | 3.78 img/s | **1.14x faster** |
| Cold model load | 1.28s | 0.77s | **1.7x faster** |
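
The cached vs. cold load numbers could be reproduced with a timing harness along these lines, again assuming the hypothetical `AwesomeDA3.from_pretrained` loader from the sketch above; [BENCHMARKS.md](BENCHMARKS.md) documents the actual methodology.

```python
import time

from awesome_depth_anything_3 import AwesomeDA3  # hypothetical entry point

def timed(fn, *args, **kwargs):
    """Run fn once and return its wall-clock duration in seconds."""
    t0 = time.perf_counter()
    fn(*args, **kwargs)
    return time.perf_counter() - t0

# First load is cold (download + deserialize); the second should hit the cache.
cold = timed(AwesomeDA3.from_pretrained, "depth-anything/da3-base")
warm = timed(AwesomeDA3.from_pretrained, "depth-anything/da3-base")
print(f"cold load: {cold:.2f} s, cached load: {warm * 1e3:.1f} ms")
```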
---
## Original Depth Anything 3
Recovering the Visual Space from Any Views
[**Haotong Lin**](https://haotongl.github.io/)\* · [**Sili Chen**](https://github.com/SiliChen321)\* · [**Jun Hao Liew**](https://liewjunhao.github.io/)\* · [**Donny Y. Chen**](https://donydchen.github.io)\* · [**Zhenyu Li**](https://zhyever.github.io/) · [**Guang Shi**](https://scholar.google.com/citations?user=MjXxWbUAAAAJ&hl=en) · [**Jiashi Feng**](https://scholar.google.com.sg/citations?user=Q8iay0gAAAAJ&hl=en) · [**Bingyi Kang**](https://bingykang.github.io/)\*†

†Project lead · \*Equal contribution
This work presents **Depth Anything 3 (DA3)**, a model that predicts spatially consistent geometry from
arbitrary visual inputs, with or without known camera poses.
In pursuit of minimal modeling, DA3 demonstrates two key insights:
- 💎 A **single plain transformer** (e.g., a vanilla DINO encoder) is sufficient as a backbone, with no architectural specialization;
- ✨ A singular **depth-ray representation** obviates the need for complex multi-task learning (sketched below).
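
To make the depth-ray idea concrete: if the model predicts a per-pixel depth map together with a per-pixel ray map (origins and unit directions in a shared frame), a consistent 3D point map falls out of a single multiply-add. The tensor names and shapes below are illustrative, not DA3's actual output format:

```python
import numpy as np

# Illustrative shapes only -- not DA3's actual output keys or resolution.
H, W = 480, 640
depth = np.random.rand(H, W).astype(np.float32)             # per-pixel depth along each ray
ray_dir = np.random.randn(H, W, 3).astype(np.float32)       # per-pixel ray directions
ray_dir /= np.linalg.norm(ray_dir, axis=-1, keepdims=True)  # normalize to unit vectors
ray_origin = np.zeros((H, W, 3), dtype=np.float32)          # ray origins (camera center)

# One multiply-add turns depth + rays into a world-space point map:
points = ray_origin + depth[..., None] * ray_dir            # shape (H, W, 3)
```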
🏆 DA3 significantly outperforms
[DA2](https://github.com/DepthAnything/Depth-Anything-V2) in monocular depth estimation
and [VGGT](https://github.com/facebookresearch/vggt) in multi-view depth and camera pose estimation.
All models are trained exclusively on **public academic datasets**.