This paper explores articulated-pose estimation under the assumption that video-rate depth information is available, from either stereo cameras or other sensors. We use these depth measurements in the traditional linear brightness constraint equation, as well as in an analogous constraint equation on depth, which we call the depth constraint equation. We introduce "shifted" constraint equations that allow for larger motions without requiring iterative estimation. The resulting constraint equations are linear in a modified parameter set. After solving these linear constraints, a single closed-form non-linear transformation returns the updates to the original pose parameters.
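As a rough illustration of how a brightness constraint and a depth constraint can be stacked into one linear system, the sketch below solves for a single translational velocity by least squares. It is not the articulated-pose parameterization or the "shifted" formulation used in this paper: it assumes orthographic projection, a single rigid patch, and the constraint forms I_x v_x + I_y v_y + I_t = 0 and Z_x v_x + Z_y v_y + Z_t = v_z. The function name `solve_translation` and its arguments are hypothetical.

```python
import numpy as np

def solve_translation(Ix, Iy, It, Zx, Zy, Zt):
    """Least-squares estimate of (vx, vy, vz) for one rigid patch.

    Assumed (illustrative) constraints, one pair per pixel:
      brightness: Ix*vx + Iy*vy        = -It
      depth:      Zx*vx + Zy*vy - vz   = -Zt
    """
    Ix, Iy, It, Zx, Zy, Zt = (
        np.asarray(a, dtype=float).ravel() for a in (Ix, Iy, It, Zx, Zy, Zt)
    )
    n = Ix.size
    # Brightness rows carry no information about motion in depth.
    A_brightness = np.column_stack([Ix, Iy, np.zeros(n)])
    # Depth rows couple the image-plane motion with the depth velocity.
    A_depth = np.column_stack([Zx, Zy, -np.ones(n)])
    A = np.vstack([A_brightness, A_depth])
    b = -np.concatenate([It, Zt])
    v, *_ = np.linalg.lstsq(A, b, rcond=None)
    return v
```

Dropping the depth rows leaves the depth velocity unconstrained in this toy model, which hints at why measured depth helps stabilize the estimate.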

Our tracking results, on both synthetic and real data (see below), argue strongly for using video-rate depth information, from either stereo cameras or other sensors. When we estimated depths instead of using the true depth information, the tracking became unstable and failed catastrophically within the duration of our test sequences.

B wo/Z = BCCE without true depth; B w/Z = BCCE with true depth; Z = ZCCE only; B+Z = BCCE and ZCCE

