Tracking 2D Points

To add objects to a video that appear 3D, we need to establish a correspondence between 2D points in the video and their 3D points in world coordinates. To begin, we initialize a tracker for every reference point that we mark in the first frame of the video.

For every marked point, we also store its corresponding 3D point in a parallel list, so that the tracker at index i always pairs with the 3D point at index i.
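The setup step can be sketched as follows. This is a minimal illustration, not the project's actual code: `init_trackers`, `make_tracker`, and `box_size` are hypothetical names, and the source does not say which tracker was used. The sketch only assumes that each tracker object has an `init(frame, bbox)` method, as OpenCV tracker objects (for example those created by `cv2.TrackerMIL_create`) do.

```python
def init_trackers(first_frame, points_2d, points_3d, make_tracker, box_size=16):
    """Create one tracker per marked point and keep the 3D points in step.

    make_tracker is a hypothetical zero-argument factory returning an object
    with an init(frame, bbox) method; box_size is an assumed patch size.
    """
    if len(points_2d) != len(points_3d):
        raise ValueError("each 2D point needs exactly one 3D point")

    half = box_size // 2
    trackers, world_points = [], []
    for (u, v), xyz in zip(points_2d, points_3d):
        # Trackers follow an image patch, so wrap the point in a small
        # (x, y, width, height) box centered on it.
        bbox = (int(round(u)) - half, int(round(v)) - half, box_size, box_size)
        tracker = make_tracker()
        tracker.init(first_frame, bbox)
        trackers.append(tracker)
        world_points.append(tuple(xyz))
    return trackers, world_points
```

Keeping the two lists index-aligned is the design choice that matters here: any later filtering can be applied to both with the same mask.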

Updating Tracked Points

For every frame after the first, we update each tracker and check whether its position has changed significantly from the previous frame. A large jump usually means the tracker has lost its point, so in that case we mark the tracker as invalid.

This means that for every frame we have a list of 2D points and their corresponding 3D points. We index into both with the same boolean array, which records whether each point is still valid.
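A minimal sketch of the per-frame update, assuming NumPy and trackers whose `update(frame)` returns `(ok, (x, y, w, h))` as OpenCV trackers do. The function name `update_trackers` and the threshold `max_jump` (in pixels) are assumptions made for illustration; the source does not give the threshold it used.

```python
import numpy as np

def update_trackers(frame, trackers, prev_points, valid, max_jump=20.0):
    """Advance every valid tracker by one frame.

    prev_points: (N, 2) float array of last known positions.
    valid:       (N,) boolean array; invalid trackers are skipped for good.
    max_jump:    assumed pixel threshold for a "significant" change.
    Returns updated copies of both arrays.
    """
    points = prev_points.copy()
    valid = valid.copy()
    for i, tracker in enumerate(trackers):
        if not valid[i]:
            continue
        ok, (x, y, w, h) = tracker.update(frame)
        center = np.array([x + w / 2.0, y + h / 2.0])
        # Invalidate on tracker failure or on an implausibly large jump.
        if not ok or np.linalg.norm(center - prev_points[i]) > max_jump:
            valid[i] = False
        else:
            points[i] = center
    return points, valid
```

With `world_points` stored as an `(N, 3)` array, the surviving correspondences for a frame are then `points[valid]` and `world_points[valid]`.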

Camera Calibration

The mapping from 3D coordinates to points in the image is composed of two matrices: one accounts for the rotation and translation of points in 3D space, and the other encodes the camera intrinsics.

Rather than recovering the two matrices separately, we solve for a single $3 \times 4$ matrix $M$ that combines them and minimizes the error of: \begin{equation*} \begin{bmatrix} mu \\ mv \\ m \end{bmatrix} = M \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix} \end{equation*} for every point $\begin{bmatrix} x & y & z \end{bmatrix}$ and its corresponding point $\begin{bmatrix} u & v \end{bmatrix}$ in the image, where $m$ is the homogeneous scale factor. Each correspondence contributes two linear equations, so the system is over-determined once we have $6$ or more points. We solve it by least squares, much like the homography matrix calculation.
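One way to set up this least-squares problem is sketched below with NumPy. It assumes the bottom-right entry of $M$ is fixed to 1 to remove the scale ambiguity, which leaves 11 unknowns; the source does not say how the scale was fixed, so treat that choice and the function name `calibrate` as assumptions.

```python
import numpy as np

def calibrate(world_pts, image_pts):
    """Estimate the 3x4 matrix M from N >= 6 correspondences.

    Assumes M[2, 3] = 1. Eliminating m from the three projection equations
    gives two linear equations per point in the 11 remaining entries.
    """
    world_pts = np.asarray(world_pts, dtype=float)
    image_pts = np.asarray(image_pts, dtype=float)
    n = len(world_pts)
    if n < 6:
        raise ValueError("need at least 6 point correspondences")

    A = np.zeros((2 * n, 11))
    b = np.zeros(2 * n)
    for i, ((x, y, z), (u, v)) in enumerate(zip(world_pts, image_pts)):
        # Row for u: M[0] . X - u * (M[2, :3] . xyz) = u
        A[2 * i] = [x, y, z, 1, 0, 0, 0, 0, -u * x, -u * y, -u * z]
        # Row for v: M[1] . X - v * (M[2, :3] . xyz) = v
        A[2 * i + 1] = [0, 0, 0, 0, x, y, z, 1, -v * x, -v * y, -v * z]
        b[2 * i] = u
        b[2 * i + 1] = v

    m, *_ = np.linalg.lstsq(A, b, rcond=None)
    return np.append(m, 1.0).reshape(3, 4)
```

The 3D points should not all lie in one plane; otherwise the system does not determine $M$ uniquely.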

Adding To the Scene

Now that we have the matrix sending 3D points to 2D points in the image, we can define structures such as a cube in world coordinates and project them into each frame.
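As an illustration of this step, the sketch below projects the eight corners of an axis-aligned cube through a given $M$ and lists the corner pairs that form its edges. The helper names (`project`, `cube_corners`, `CUBE_EDGES`) and the cube's placement and size are hypothetical; the source does not specify where its cube sits.

```python
import numpy as np

def project(M, points_3d):
    """Map (N, 3) world points to (N, 2) image points with a 3x4 matrix M."""
    pts = np.asarray(points_3d, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])  # rows of [x, y, z, 1]
    mu_mv_m = homog @ np.asarray(M, dtype=float).T    # rows of [m*u, m*v, m]
    return mu_mv_m[:, :2] / mu_mv_m[:, 2:3]           # divide out m

def cube_corners(origin=(0.0, 0.0, 0.0), side=1.0):
    """Return the 8 corners; corner index is 4*dx + 2*dy + dz."""
    ox, oy, oz = origin
    return np.array([[ox + side * dx, oy + side * dy, oz + side * dz]
                     for dx in (0, 1) for dy in (0, 1) for dz in (0, 1)])

# Two corners share an edge exactly when their indices differ in one bit.
CUBE_EDGES = [(i, j) for i in range(8) for j in range(i + 1, 8)
              if bin(i ^ j).count("1") == 1]
```

Given `uv = project(M, cube_corners())`, each pair `(i, j)` in `CUBE_EDGES` can be drawn as a line segment from `uv[i]` to `uv[j]`, for instance with OpenCV's `cv2.line` after rounding the coordinates to integers.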

Project 6: Poor Man's Augmented Reality