Computational studies of human motion: part 1, tracking and motion synthesis
We review methods for kinematic tracking of the human body in video.
The review is part of a projected book that is intended to cross-fertilize
ideas about motion representation between the animation and computer
vision communities. The review confines itself to the earlier stages
of motion, focusing on tracking and motion synthesis; future material
will cover activity representation and motion generation.
In general, we take the position that tracking does not necessarily
involve (as is usually thought) complex multimodal inference problems.
Instead, there are two key problems, both easy to state.
The first is lifting, where one must infer the configuration of the
body in three dimensions from image data. Ambiguities in lifting can
result in multimodal inference problems, and we review what little is
known about the extent to which a lift is ambiguous. The second is
data association, where one must determine which pixels in an image
come from the body. We see a tracking-by-detection approach as the
most productive, and review various human detection methods.
Lifting, and a variety of other problems, can be simplified by observing
temporal structure in motion, and we review the literature on data-driven
human animation to expose what is known about this structure.
Accurate generative models of human motion would be extremely useful
in both animation and tracking, and we discuss the profound difficulties
encountered in building such models. Discriminative methods, which
should be able to tell whether an observed motion is human or not,
do not work well yet, and we discuss why.
There is an extensive discussion of open issues. In particular, we
discuss the nature and extent of lifting ambiguities, which appear to
be significant at short timescales and insignificant at longer timescales.
This discussion suggests that the best tracking strategy is to track a 2D
representation, and then lift it. We point out some puzzling phenomena
associated with the choice of human motion representation (joint
angles vs. joint positions). Finally, we give a quick guide to resources.
Text Reference
David A. Forsyth, Okan Arikan, Leslie Ikemoto, James O'Brien, and Deva Ramanan. Computational studies of human motion: part 1, tracking and motion synthesis. Found. Trends. Comput. Graph. Vis., 1(2-3):77–254, 2005. doi:http://dx.doi.org/10.1561/0600000005.
BibTeX Reference
@article{ForsythEtAl_FTCG_2007,
author = "Forsyth, David A. and Arikan, Okan and Ikemoto, Leslie and O'Brien, James and Ramanan, Deva",
tag = "people",
title = "Computational studies of human motion: part 1, tracking and motion synthesis",
journal = "Found. Trends. Comput. Graph. Vis.",
volume = "1",
number = "2-3",
year = "2005",
issn = "1572-2740",
pages = "77--254",
doi = "http://dx.doi.org/10.1561/0600000005",
publisher = "Now Publishers Inc.",
address = "Hanover, MA, USA"
}