Unsupervised Bayesian Detection of Independent Motion in Crowds
by Gabriel J. Brostow and Roberto Cipolla
University of Cambridge
to appear in IEEE CVPR, June 2006, New York, NY

While crowds of various subjects may offer application-specific cues for detecting individuals, we demonstrate that, in the general case, motion itself contains more information than has previously been exploited. This paper describes an unsupervised, data-driven Bayesian clustering algorithm whose primary goal is the detection of individual entities.

We track simple image features and probabilistically group them into clusters representing independently moving entities. The number of clusters and the grouping of constituent features are determined without supervised learning or any subject-specific model. Instead, the new approach uses space-time proximity and trajectory coherence through image space as the only probabilistic criteria for clustering. An important contribution of this work is how these criteria are used to perform a one-shot data association without iterating through combinatorial hypotheses of cluster assignments. Our proposed general detection algorithm can be augmented with subject-specific filtering, but is shown to already be effective at detecting individual entities in crowds of people, insects, and animals. This paper and the associated video examine the implementation and experiments of our motion clustering framework.
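As a rough illustration of grouping tracked features by proximity and trajectory coherence, the following minimal Python sketch may help. It is not the Bayesian formulation from the paper: it assumes every feature is tracked over the same frames, replaces the probabilistic criteria with two hard thresholds, and links features with a simple union-find pass. The names `pair_stats`, `cluster_tracks`, `max_mean_dist`, and `max_dist_std` are hypothetical and chosen only for this example.

```python
from itertools import combinations
from math import hypot
from statistics import mean, pstdev


def pair_stats(track_a, track_b):
    """Mean and standard deviation of the per-frame distance between two tracks."""
    dists = [hypot(ax - bx, ay - by)
             for (ax, ay), (bx, by) in zip(track_a, track_b)]
    return mean(dists), pstdev(dists)


def cluster_tracks(tracks, max_mean_dist=30.0, max_dist_std=2.0):
    """Group feature tracks that stay close together and move coherently.

    tracks: list of tracks, each a list of (x, y) positions, one per frame.
    Two tracks are linked when their average separation is small (proximity)
    and that separation barely changes over time (coherence). Both thresholds
    are illustrative assumptions, not values from the paper.
    Returns one cluster label per track; the number of clusters is not
    specified in advance but falls out of the linking step.
    """
    parent = list(range(len(tracks)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Single pass over all pairs: no iteration over assignment hypotheses.
    for i, j in combinations(range(len(tracks)), 2):
        mean_d, std_d = pair_stats(tracks[i], tracks[j])
        if mean_d <= max_mean_dist and std_d <= max_dist_std:
            parent[find(i)] = find(j)

    # Relabel roots as consecutive integers in order of first appearance.
    relabel = {}
    return [relabel.setdefault(find(i), len(relabel))
            for i in range(len(tracks))]


# Two features moving right together, two moving left together,
# with the two pairs passing close to each other.
tracks = [
    [(t, 0) for t in range(10)],
    [(t, 5) for t in range(10)],
    [(20 - t, 2) for t in range(10)],
    [(20 - t, 8) for t in range(10)],
]
labels = cluster_tracks(tracks)
print(labels)  # → [0, 0, 1, 1]
```

In this toy example the two pairs come close to each other, so proximity alone would merge them; it is the changing separation between oppositely moving features that keeps them in separate clusters.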

A number of people were kind enough to provide us with video footage from past/present/future research, including Tao Zhao who is now at Sarnoff Corporation, Peter Tu at GE Research, Niccolò Caderni of Legion International Ltd, and Alan Lerner at Tel-Aviv University. Ant and bee sequences are from Frank Dellaert's group (thanks to Zia Khan) at Georgia Tech. We have also benefited greatly from early discussions with Tom Drummond and David MacKay, and other members of the vision group at Cambridge University, as well as feedback on the text from Neil Dodgson. See Edward Rosten's great feature detection algorithm (called FAST), which we used for some of the sequences since it finds a superset of corners found by other techniques.

CVPR 2006 Acrobat PDF

Video (all together)

Video in mpg format (360x288 - 21 Mb)
or the same sequence full-size, but still compressed:
Video in divx format (720x576 - 88 Mb).
The free player for DivX-encoded video is available here for Mac and Windows.

Video (split into separate clips)

(Recommended for Better Quality)
DivX compressed files (see my playback tips or get the free Mac / Windows player)
(Ok Quality, should play anywhere)
MPEG 1 format. May require saving locally and activating "Play all frames" to prevent lurching during playback.
tunnel-A125 2.6 Mb
3.1 Mb
19.7 Mb
11.4 Mb
6.1 Mb
2.8 Mb
1.1 Mb
2.9 Mb
ZiaAnts (from ECCV04)
1.5 Mb
3.3 Mb
ZiaBees (from CVPR04)
8.6 Mb
2.5 Mb
Side-by-side with Zhao&Nevatia's CVPR'04 "Commons01"
7 Mb
9.1 Mb
Side-by-side with Rittscher et al.'s CVPR'05
1.9 Mb
6.6 Mb
escalator-A128 and sample application for architectural
visualization of gaze-directions in crowds

35.4 Mb
13 Mb

"Why can't my computer play some .avi (or .mpg, .mov, or .qt, etc.) files?"
i.e. my summary of CoDecs with respect to digital video.