Manifold learning in computer vision

by 1969- Park, JinHyeong

Abstract (Summary)
Appearance-based learning has become very popular in the field of computer vision. In such a system, a visual datum such as an image is usually treated as a vector formed by concatenating its rows or columns. The dimension of the image vector is very high, equal to the number of pixels in the image. A sequence of images, such as a video sequence or a set of images capturing an object from different viewpoints, typically lies on a non-linear manifold whose dimension is much lower than that of the original data. Knowing the structure of this non-linear manifold can be very helpful in computer vision applications such as dimensionality reduction and noise handling. In the first part of this thesis, we propose a method for outlier handling and noise reduction using weighted local linear smoothing for a set of noisy points sampled from a non-linear manifold. This method can be used as a preprocessing step in conjunction with various manifold learning methods such as Isomap (Isometric Feature Mapping), LLE (Locally Linear Embedding) and LTSA (Local Tangent Space Alignment) to obtain a more accurate reconstruction of the underlying non-linear manifold. Using Weighted PCA (Principal Component Analysis) as a foundation, we suggest an iterative weight selection scheme for robust local linear fitting, together with an outlier detection method based on minimal spanning trees to further improve robustness. We also develop an efficient and effective bias-reduction method to deal with the “trim the peak and fill the valley” phenomenon in local linear smoothing. Synthetic examples along with several real image data sets are presented to show that combining manifold learning methods with weighted local linear smoothing produces more accurate results. The proposed local smoothing method has been applied to the image occlusion handling problem and to the noise reduction problem for point-based rendering.
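To make the idea of robust local linear fitting with Weighted PCA concrete, the following is a minimal sketch and not the thesis's actual scheme. The function name `robust_local_fit`, the exponential weight function and the median-based scale are all assumptions chosen for illustration; the abstract does not specify how the weights are selected.

```python
import numpy as np

def robust_local_fit(points, dim, n_iter=10, eps=1e-12):
    """Fit a dim-dimensional affine subspace to points of shape (n, D)
    using iteratively reweighted PCA (illustrative sketch only)."""
    n = points.shape[0]
    weights = np.full(n, 1.0 / n)  # start from uniform weights
    for _ in range(n_iter):
        # Weighted mean and weighted covariance of the neighborhood.
        mean = weights @ points
        centered = points - mean
        cov = (centered * weights[:, None]).T @ centered
        # eigh returns eigenvalues in ascending order, so the last
        # `dim` eigenvectors span the dominant subspace.
        _, eigvecs = np.linalg.eigh(cov)
        basis = eigvecs[:, -dim:]
        # Squared distance of each point to the fitted subspace.
        residual = centered - centered @ basis @ basis.T
        dist2 = np.sum(residual**2, axis=1)
        # Assumed weight rule: points far from the fit are down-weighted.
        scale = np.median(dist2) + eps
        weights = np.exp(-dist2 / scale)
        weights /= weights.sum()
    return mean, basis, weights
```

Applied to a neighborhood that contains one gross outlier, the outlier's weight should shrink toward zero over the iterations, so the fitted subspace follows the remaining points.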
The second part of this thesis focuses on image occlusion handling utilizing manifold learning. We propose an algorithm to handle the problem of image occlusion using the Least Angle Regression (LARS) algorithm. LARS, which was proposed recently in the area of statistics, is known as a less greedy version of the traditional forward model selection algorithm. In this setting, the LARS algorithm provides a family of image denoising results, ranging from one updated pixel to all pixels updated. Using image thresholding and the Akaike Information Criterion (AIC), a statistical model selection criterion, we propose a method for selecting an optimal solution among the family of solutions that the LARS algorithm provides. Three sets of experiments were performed. The first measured the stability of the optimal solution estimation method. The second showed the effects of subblock computation on performance. The last applied the occlusion handling algorithm to the noisy data cleaning problem and compared it to two other methods: orthogonal projection with Weighted PCA, and Robust PCA. Experimental results showed that the proposed method yields better performance than the other methods.
In the third part of this thesis, we propose a robust motion segmentation method using the techniques of matrix factorization, subspace separation and spectral graph partitioning. We first show that the shape interaction matrix can be derived using QR decomposition rather than Singular Value Decomposition (SVD), which also leads to a simple proof of the shape subspace separation theorem. Using the shape interaction matrix, we solve the motion segmentation problem using spectral clustering techniques. We exploit the multi-way Min-Max cut clustering method and provide a novel approach for cluster membership assignment. We further show that the graph clustering method can be combined with a cluster refinement method based on subspace separation, which improves its robustness in the presence of noise. The proposed method yields very good performance for both synthetic and real image sequences.
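The claim that the shape interaction matrix can be obtained from a QR decomposition instead of an SVD can be illustrated with a short sketch. It relies on the fact that the shape interaction matrix is the orthogonal projector onto the row space of the measurement matrix, so any orthonormal basis of that row space gives the same matrix. The function names, the measurement matrix `W` (rows are image coordinates per frame, columns are tracked points) and the known rank are assumptions for illustration. The unpivoted QR used here also assumes the first `rank` rows of `W` are linearly independent; this is not necessarily the construction used in the thesis.

```python
import numpy as np

def shape_interaction_svd(W, rank):
    """Shape interaction matrix from the top right singular vectors of W."""
    _, _, vt = np.linalg.svd(W, full_matrices=False)
    v = vt[:rank].T              # (num_points, rank) basis of the row space
    return v @ v.T

def shape_interaction_qr(W, rank):
    """Same matrix from a QR factorization of W transposed.

    Assumes the first `rank` columns of W.T are linearly independent,
    so the leading columns of Q span the row space of W.
    """
    q, _ = np.linalg.qr(W.T)     # reduced QR; columns of q are orthonormal
    q = q[:, :rank]
    return q @ q.T
```

For noise-free tracks of independently moving objects, both functions should return the same matrix, and its entries linking points on different objects should vanish. That block structure is what a spectral clustering step would then exploit.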
Bibliographical Information:


School: Pennsylvania State University

School Location: USA - Pennsylvania

Source Type: Master's Thesis



Date of Publication:

© 2009 All Rights Reserved.