Depth Estimation

Depth is the missing dimension. Cameras project 3D scenes onto 2D images, losing depth information. This chapter explores how to recover it—from stereo cameras, multiple views, or even single images using deep learning.

What is this chapter about? We study the mathematics and algorithms for recovering depth from images, from classical stereo matching to modern neural networks.

Why does this matter? Depth enables:

3D photography: Portrait mode, 3D photos on social media
Robotics: Obstacle avoidance, manipulation planning
AR/VR: Occlusion handling, interaction with real objects
Autonomous vehicles: Understanding scene geometry

How the topics connect: We start with epipolar geometry, the mathematical constraints that relate two views of the same scene. Stereo matching uses these constraints to find correspondences between the views. Multi-view stereo extends the approach from two images to many. Finally, deep depth estimation learns to predict depth from a single image.
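To make the link between correspondences and depth concrete, here is a minimal sketch for the simplest case: a rectified stereo pair, where matching points lie on the same image row and differ only by a horizontal shift (the disparity). Depth then follows from Z = f * B / d. The function name, the parameter names, and the rig values (700 px focal length, 0.125 m baseline) are assumptions chosen for illustration, not taken from the chapter.

```python
def depth_from_disparity(x_left, x_right, focal_px, baseline_m):
    """Depth in metres of a point seen at column x_left in the left image
    and column x_right in the right image of a rectified stereo pair.

    focal_px is the focal length in pixels; baseline_m is the distance
    between the two camera centres in metres.
    """
    # Disparity in pixels: how far the point shifts between the two views.
    disparity = x_left - x_right
    if disparity <= 0:
        # Zero disparity corresponds to a point at infinity; negative
        # values indicate a bad match or swapped images.
        raise ValueError("disparity must be positive for a finite depth")
    return focal_px * baseline_m / disparity


# Hypothetical rig: 700 px focal length, 0.125 m baseline.
near = depth_from_disparity(400.0, 365.0, focal_px=700.0, baseline_m=0.125)
far = depth_from_disparity(400.0, 393.0, focal_px=700.0, baseline_m=0.125)
print(near, far)  # a larger disparity means a closer point
```

Because depth is inversely proportional to disparity, a one-pixel matching error matters far more for distant points than for nearby ones, which is one reason accurate correspondence matters so much in stereo.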

Chapter 12: Depth Estimation

Chapter Roadmap

Epipolar Geometry

Stereo Matching

Multi-View Stereo

Deep Depth Estimation
