Deepu Rajan


Associate Professor

School of Computer Engineering, Nanyang Technological University



   



Multimedia Signal Processing

ZGPCA algorithm for data clustering
We propose a new algorithm, called the ZGPCA algorithm, for subspace estimation based on the GPCA (Generalized Principal Component Analysis) algorithm. It is formulated within an FIR filter framework so that the normal vectors of the subspaces correspond to filter coefficients. We show that this approach is more accurate and computationally efficient than GPCA. We extend the ZGPCA algorithm to a recursive form so that subspaces of possibly different dimensions can be obtained. We also propose a new distance measure for k-means clustering of sample points within a subspace. Experimental results on synthetic data and applications to face clustering and sports video clustering demonstrate the good performance of the proposed algorithm.
FIR filter
FIR filter formulation for subspace estimation.
H. Yi, D. Rajan and L. T. Chia, A ZGPCA algorithm for subspace estimation, ICME, Beijing, 2007.
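
The FIR filter view is closely related to the polynomial-fitting step of standard GPCA: for data drawn from a union of hyperplanes, the coefficients of a vanishing polynomial play the role of filter taps, and the polynomial's gradient at a sample is parallel to the normal of the subspace containing that sample. The Python/NumPy sketch below illustrates only this generic GPCA hyperplane case, not the ZGPCA formulation itself; the function names, the restriction to hyperplanes through the origin, and the per-sample normal estimates are simplifying assumptions for illustration.

import numpy as np
from itertools import combinations_with_replacement

def veronese(X, n):
    # Degree-n polynomial (Veronese) embedding of the columns of X (D x N).
    D, N = X.shape
    monomials = list(combinations_with_replacement(range(D), n))
    V = np.empty((len(monomials), N))
    for i, idx in enumerate(monomials):
        V[i] = np.prod(X[list(idx), :], axis=0)
    return V, monomials

def gpca_hyperplane_normals(X, n):
    # Assuming the columns of X (D x N) lie on a union of n hyperplanes
    # through the origin, estimate for each sample the normal of the
    # hyperplane containing it.  This mirrors plain GPCA, not ZGPCA.
    D, N = X.shape
    V, monomials = veronese(X, n)
    # The vanishing polynomial's coefficients c satisfy V^T c ~ 0; take the
    # right singular vector associated with the smallest singular value.
    _, _, Vt = np.linalg.svd(V.T)
    c = Vt[-1]
    normals = np.zeros((N, D))
    for j in range(N):
        x, grad = X[:, j], np.zeros(D)
        # Gradient of p(x) = sum_k c_k * x^(monomial k) at the j-th sample.
        for ck, idx in zip(c, monomials):
            idx = list(idx)
            for d in set(idx):
                rest = list(idx)
                rest.remove(d)
                grad[d] += ck * idx.count(d) * np.prod(x[rest])
        normals[j] = grad / (np.linalg.norm(grad) + 1e-12)
    # Clustering the rows of `normals` by direction (e.g. with k-means)
    # then groups the samples by subspace.
    return normals
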
Motion-based scene tree
A fully automatic content-based approach for browsing and retrieval of MPEG-2 compressed video is developed. The first step is the detection of shot boundaries based on motion vectors available from the compressed video stream. The next step is the construction of a scene tree from these shots. The scene tree is shown to capture some semantic information as well as to provide a structure for hierarchical browsing of compressed videos. Finally, we build a new model for video similarity based on the global and local motion associated with each node in the scene tree; to this end, we propose new approaches to camera motion and object motion estimation. The experimental results demonstrate that the integration of these techniques yields an efficient framework for browsing and searching large video databases.
scene tree
Video annotation tool interface. The user selects a video to be annotated. The algorithm performs shot boundary detection and key frame extraction, and builds the scene tree shown in the bottom-left part of the interface. The key frames from each shot are shown as thumbnail images in the right panel. The user can navigate the scene tree and select a scene node of interest for annotation.
H. Yi, D. Rajan and L. T. Chia, A motion-based scene tree for browsing and retrieval of compressed videos, Information Systems, vol. 31, no. 7, pp. 638-658, 2006.
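
As a rough illustration of how shots could be organized into a scene hierarchy for browsing, the sketch below builds a tree bottom-up by repeatedly merging the most similar pair of temporally adjacent nodes. The SceneNode class, the cosine similarity on per-shot feature vectors and the feature averaging at each merge are illustrative assumptions; the paper's actual construction is motion-based and may group shots differently.

import numpy as np

class SceneNode:
    # Node of a scene tree: leaves are shots, internal nodes group adjacent shots.
    def __init__(self, feature, children=None, shot_id=None):
        self.feature = feature      # e.g. a key-frame or motion descriptor
        self.children = children or []
        self.shot_id = shot_id      # set only for leaf (shot) nodes

def build_scene_tree(shot_features):
    # Bottom-up scene tree from an ordered list of per-shot feature vectors:
    # repeatedly merge the most similar pair of temporally adjacent nodes
    # until a single root remains.
    nodes = [SceneNode(np.asarray(f, dtype=float), shot_id=i)
             for i, f in enumerate(shot_features)]
    while len(nodes) > 1:
        # Similarity between adjacent nodes: cosine similarity of features.
        sims = [float(np.dot(a.feature, b.feature) /
                      (np.linalg.norm(a.feature) * np.linalg.norm(b.feature) + 1e-12))
                for a, b in zip(nodes, nodes[1:])]
        k = int(np.argmax(sims))                # most similar adjacent pair
        merged = SceneNode((nodes[k].feature + nodes[k + 1].feature) / 2,
                           children=[nodes[k], nodes[k + 1]])
        nodes[k:k + 2] = [merged]
    return nodes[0]
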

Motion histogram
A new motion feature for video indexing is proposed. The motion content of the video at the pixel level is represented as a Pixel Change Ratio Map (PCRM). The PCRM captures the intensity of motion in a video sequence and also indicates the spatial location and size of the moving object. The proposed motion feature, the motion histogram, is a non-uniformly quantized histogram of the PCRM. We demonstrate the usefulness of the motion histogram with three applications, viz., video retrieval, video clustering and video classification.
video clustering
Cluster 1. Irregular camera motion, large object motion with small moving object size (e.g., soccer (long shot), basketball (long shot), etc.)
Cluster 2. Still camera and little object motion (e.g., talking head, dialogue, interview, etc.)
Cluster 3. Smooth camera motion with little or no object motion (e.g., scenery, etc.)
Cluster 4. Little camera motion and little object motion (e.g., outdoor interview, outdoor news shots, etc.)
Cluster 5. Little camera motion and large object motion (e.g., marathon, racing cars, animal shows, etc.)
Cluster 6. Irregular camera motion and large moving object (e.g., soccer (close-up), basketball (close-up), cycling, etc.)
Cluster 7. Little or no motion (e.g., scenery, etc.).
video classification
Examples of video sequences (key frames) from two classes: high motion (top row) and low motion (bottom row).
H. Yi, D. Rajan and L. T. Chia, A new motion histogram to index motion content in video segments, Pattern Recognition Letters, vol. 26, pp. 1221-1231, 2005.
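
A minimal sketch of the PCRM idea, assuming the frames of a shot are already decoded to grey-level images: each pixel's value in the map is the fraction of frame transitions in which its intensity changes noticeably, and the motion histogram is a non-uniformly quantized histogram of those values. The change threshold and bin edges below are illustrative choices rather than the parameters used in the paper.

import numpy as np

def pixel_change_ratio_map(frames, diff_thresh=20.0):
    # PCRM: for each pixel, the fraction of frame-to-frame transitions in
    # which its intensity changes by more than diff_thresh.  `frames` is a
    # (T, H, W) array of grey-level frames from one shot, with T >= 2.
    frames = np.asarray(frames, dtype=np.float64)
    changed = np.abs(np.diff(frames, axis=0)) > diff_thresh   # (T-1, H, W)
    return changed.mean(axis=0)                               # values in [0, 1]

def motion_histogram(pcrm, bin_edges=(0.0, 0.05, 0.1, 0.2, 0.4, 0.7, 1.0)):
    # Non-uniformly quantized histogram of PCRM values, normalised to sum to 1.
    hist, _ = np.histogram(pcrm, bins=np.asarray(bin_edges))
    return hist / max(hist.sum(), 1)
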

Near-duplicate Image Retrieval