Manifold learning in computer vision
Abstract (Summary)
Appearance-based learning has become very popular in the field of computer
vision. In such a system, a visual datum such as an image is usually treated as
a vector formed by concatenating its rows or columns. The dimension of the image vector is
very high, equal to the number of pixels in the image. A sequence of
images, such as a video sequence or images capturing an object from different viewpoints,
typically lies on a nonlinear manifold whose dimension is much lower
than that of the original data. Knowing the structure of this nonlinear manifold
is helpful for various computer vision applications such as
dimensionality reduction and noise handling.
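As a small illustration of the vectorization step, the Python sketch below stacks a sequence of images into a data matrix with one row per image. The synthetic frames (a blob sliding horizontally) and the function names are assumptions made for this example and are not data or code from the thesis.

```python
import numpy as np

def images_to_vectors(images):
    """Turn equally sized 2-D images into an (n_images, n_pixels) matrix.

    Each image is flattened by concatenating its rows.
    """
    return np.stack([np.asarray(img, dtype=float).reshape(-1) for img in images])

def make_sequence(n_frames=20, size=16):
    """Hypothetical sequence: a Gaussian blob translated along the x axis."""
    ys, xs = np.mgrid[0:size, 0:size]
    centers = np.linspace(4.0, 11.0, n_frames)
    return [np.exp(-((xs - c) ** 2 + (ys - size / 2) ** 2) / 8.0) for c in centers]

frames = make_sequence()
X = images_to_vectors(frames)
print(X.shape)  # (20, 256)
```

Each frame becomes a point in a 256-dimensional space, although the sequence is generated by a single parameter (the blob position).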
In the first part of this thesis, we propose a method for outlier handling and noise
reduction using weighted local linear smoothing for a set of noisy points sampled from
a nonlinear manifold. This method can be used in conjunction with various manifold
learning methods such as Isomap (Isometric Feature Mapping), LLE (Locally Linear Embedding)
and LTSA (Local Tangent Space Alignment) as a preprocessing step to obtain a
more accurate reconstruction of the underlying nonlinear manifolds. Using Weighted
PCA (Principal Component Analysis) as a foundation, we suggest an iterative weight
selection scheme for robust local linear fitting together with an outlier detection method
based on minimal spanning trees to further improve robustness. We also develop an
efficient and effective bias-reduction method to deal with the “trim the peak and fill the
valley” phenomenon in local linear smoothing. Synthetic examples along with several real
image data sets are presented to show that we can combine manifold learning methods
with weighted local linear smoothing to produce more accurate results. The proposed
local smoothing method has been applied to the image occlusion handling problem and
to the noise reduction problem for point-based rendering.
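As a rough illustration of the local smoothing idea described above, the following Python sketch fits a weighted PCA subspace to the neighborhood of each point and projects the point onto it. The exponential weight function, the median-based scale, the neighborhood size `k`, and all function names are assumptions made for this sketch. The thesis's own weight selection scheme, the minimal-spanning-tree outlier detection, and the bias-reduction step are not reproduced here.

```python
import numpy as np

def weighted_pca(P, w, d):
    """Weighted mean and top-d principal directions of the rows of P."""
    w = w / w.sum()
    mu = w @ P
    centered = P - mu
    C = centered.T @ (centered * w[:, None])   # weighted covariance
    _, vecs = np.linalg.eigh(C)                # eigenvalues in ascending order
    U = vecs[:, ::-1][:, :d]                   # keep the d largest
    return mu, U

def local_linear_smooth(X, k=10, d=1, n_iter=5):
    """Project each point onto a robustly fitted local d-dimensional affine subspace."""
    X = np.asarray(X, dtype=float)
    out = np.empty_like(X)
    for i, x in enumerate(X):
        dist = np.linalg.norm(X - x, axis=1)
        nbrs = X[np.argsort(dist)[:k]]         # k nearest neighbors (includes x)
        w = np.ones(len(nbrs))
        for _ in range(n_iter):
            mu, U = weighted_pca(nbrs, w, d)
            centered = nbrs - mu
            resid = centered - centered @ U @ U.T
            r2 = np.sum(resid ** 2, axis=1)
            scale = np.median(r2) + 1e-12
            w = np.exp(-r2 / (2.0 * scale))    # assumed weight: down-weight large residuals
        mu, U = weighted_pca(nbrs, w, d)
        out[i] = mu + (x - mu) @ U @ U.T
    return out

# Noisy samples from a unit circle (a 1-D manifold embedded in 2-D).
rng = np.random.default_rng(0)
theta = rng.uniform(0.0, 2.0 * np.pi, 300)
clean = np.c_[np.cos(theta), np.sin(theta)]
noisy = clean + 0.05 * rng.standard_normal(clean.shape)
smoothed = local_linear_smooth(noisy, k=15, d=1)

err_before = np.mean(np.abs(np.linalg.norm(noisy, axis=1) - 1.0))
err_after = np.mean(np.abs(np.linalg.norm(smoothed, axis=1) - 1.0))
print(err_before, err_after)
```

On a curved manifold a purely linear local fit tends to pull points toward the inside of the curve, which is the kind of bias the thesis's bias-reduction step is meant to address.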
The second part of this thesis focuses on image occlusion handling utilizing manifold
learning. We propose an algorithm to handle the problem of image occlusion using
the Least Angle Regression (LARS) algorithm. LARS, which was proposed recently
in the area of statistics, is known as a less greedy version of the traditional forward
model selection algorithm. In this setting, the LARS algorithm provides a family of
image denoising results, ranging from a single updated pixel to all pixels updated. Using image
thresholding and the Akaike Information Criterion (AIC) for statistical model selection,
we propose a method for selecting an optimal solution among the family of
solutions that the LARS algorithm provides. Three sets of experiments were performed.
The first measured the stability of the optimal solution estimation method. The second
set showed the effects of subblock computation on performance. The last set applied
the occlusion handling algorithm to the noisy data cleaning problem, and compared it
to two other methods: orthogonal projection with Weighted PCA, and Robust PCA.
Experimental results showed that the proposed method yields better performance than
the other methods.
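To make the idea of a LARS solution family and AIC-based selection concrete, here is a minimal Python sketch on a generic sparse regression problem. It is not the thesis's image formulation: the synthetic data, the plain LAR variant (no lasso modification), the Gaussian form of AIC with the number of active variables as the degrees of freedom, and all function names are assumptions made for this example. It also assumes more samples than variables.

```python
import numpy as np

def lar_path(X, y):
    """Plain Least Angle Regression.

    Assumes centered, unit-norm columns of X, centered y, and n > p.
    Returns the entry order of the variables and the coefficients
    at the end of each step (step k has k active variables).
    """
    n, p = X.shape
    beta = np.zeros(p)
    active, path = [], []
    for _ in range(p):
        c = X.T @ (y - X @ beta)               # current correlations
        if not active:
            active.append(int(np.argmax(np.abs(c))))
        C = np.max(np.abs(c[active]))
        s = np.sign(c[active])
        XA = X[:, active] * s
        g = np.linalg.solve(XA.T @ XA, np.ones(len(active)))
        AA = 1.0 / np.sqrt(g.sum())
        w = AA * g                             # equiangular direction weights
        a = X.T @ (XA @ w)
        inactive = [j for j in range(p) if j not in active]
        if inactive:
            cand = np.array([[(C - c[j]) / (AA - a[j]),
                              (C + c[j]) / (AA + a[j])] for j in inactive])
            cand[cand <= 1e-12] = np.inf       # only positive step lengths count
            gamma = cand.min()
            j_new = inactive[int(np.argmin(cand.min(axis=1)))]
        else:
            gamma, j_new = C / AA, None        # final step reaches least squares
        beta[active] += gamma * s * w
        path.append(beta.copy())
        if j_new is None:
            break
        active.append(j_new)
    return active, np.array(path)

def aic_along_path(X, y, path):
    """Gaussian AIC (up to a constant), with df = number of active variables."""
    n = len(y)
    rss = np.array([np.sum((y - X @ b) ** 2) for b in path])
    return n * np.log(rss / n) + 2 * np.arange(1, len(path) + 1)

# Synthetic problem: only the first three of ten variables matter.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))
X -= X.mean(axis=0)
X /= np.linalg.norm(X, axis=0)
beta_true = np.zeros(10)
beta_true[:3] = [5.0, -4.0, 3.0]
y = X @ beta_true + 0.1 * rng.standard_normal(200)
y -= y.mean()

order, path = lar_path(X, y)
aic = aic_along_path(X, y, path)
best = int(np.argmin(aic))
selected = order[:best + 1]
print("entry order:", order)
print("AIC-selected variables:", sorted(selected))
```

The path plays the role of the family of solutions, from one active variable to all of them, and the AIC minimum picks one member of that family.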
In the third part of this thesis, we propose a robust motion segmentation method
using the techniques of matrix factorization, subspace separation and spectral graph
partitioning. We first show that the shape interaction matrix can be derived using QR
decomposition rather than Singular Value Decomposition (SVD), which also leads to a
simple proof of the shape subspace separation theorem. Using the shape interaction
matrix, we solve the motion segmentation problem using spectral clustering techniques.
We exploit the multi-way Min-Max cut clustering method and provide a novel approach
for cluster membership assignment. We further show that we can combine a cluster
refinement method based on subspace separation with the graph clustering method, which
improves its robustness in the presence of noise. The proposed method yields very good
performance for both synthetic and real image sequences.
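The QR-based construction of the shape interaction matrix can be sketched in Python as follows. This is an illustrative sketch, not the thesis's implementation: the synthetic two-motion trajectory matrix, the known rank `r`, and the function names are assumptions. Because NumPy's QR does not pivot, the sketch also assumes that the first `r` columns of the transposed trajectory matrix are linearly independent; a column-pivoted QR would remove that assumption. The spectral clustering and cluster refinement steps are not shown.

```python
import numpy as np

def shape_interaction_qr(W, r):
    """Shape interaction matrix from a QR factorization of W^T.

    W is the (2F, P) trajectory matrix of P points over F frames and
    r is its rank. The first r columns of Q span the row space of W.
    """
    Q, _ = np.linalg.qr(W.T)
    Q1 = Q[:, :r]
    return Q1 @ Q1.T

def shape_interaction_svd(W, r):
    """Same matrix from the top-r right singular vectors, for comparison."""
    _, _, Vt = np.linalg.svd(W, full_matrices=False)
    V = Vt[:r].T
    return V @ V.T

# Synthetic, noise-free data: two independently moving objects, each
# contributing a rank-4 block (random motion matrix times homogeneous shape).
rng = np.random.default_rng(0)
F, sizes = 10, (12, 12)
blocks = []
for n_pts in sizes:
    S = np.vstack([rng.standard_normal((3, n_pts)), np.ones((1, n_pts))])
    M = rng.standard_normal((2 * F, 4))
    blocks.append(M @ S)
W = np.hstack(blocks)                      # shape (20, 24), rank 8

Q_qr = shape_interaction_qr(W, r=8)
Q_svd = shape_interaction_svd(W, r=8)
cross = np.abs(Q_qr[:12, 12:]).max()       # entries linking the two objects
print(np.allclose(Q_qr, Q_svd, atol=1e-8), cross)
```

In this noise-free setting the entries that link points from different objects vanish up to rounding error, which is the block structure that the clustering step exploits. With noisy trajectories the off-block entries are only approximately zero.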
Bibliographical Information:
Advisor:
School: Pennsylvania State University
School Location: USA - Pennsylvania
Source Type: Master's Thesis
Keywords:
ISBN:
Date of Publication: