|
G.R. Bradski, Computer video face tracking for use in a perceptual user interface, Intel Technology Journal, Q2 1998. pdf Camshift stands for "Continuously Adaptive Mean Shift." This is the basis for the face-tracking algorithm in OpenCV. It combines the basic Mean Shift algorithm with an adaptive region-sizing step. The kernel is a simple step function applied to a skin-probability map. The skin probability of each image pixel is based on color using a method called histogram backprojection. Color is represented as Hue from the HSV color model. Since the kernel is a step function, the mean shift at each iteration is simply the average x and y of skin-probability contributions within the current region. This is determined by dividing the first moments of the region by its zeroth moment at each iteration and shifting the region to the probability centroid. After Mean Shift converges to an (x,y) location, scale is updated based on the current value for the zeroth moment. This update is heuristic. Linear scale is assumed to be proportional to the square root of the summed probability contributions within the current region (i.e., the zeroth moment). Height and width are assumed to have a predetermined, consistent ratio. Since Hue is unstable at low saturation, the color histograms don't include pixels with saturation below some threshold. Similarly, minimum and maximum intensity values can be applied to skip over pixels that are very bright or very dark.
Pro:
This method is fast and appears on initial testing to be moderately accurate.
It may be possible to improve accuracy by using a different color representation.
|