The critiques & summaries of "Nonlinear dimensionality reduction by locally linear embedding"

"Nonlinear dimensionality reduction by locally linear embedding"

By Sam T. Roweis and Lawrence K. Saul

--

The high-dimensional data cause problems in data analysis and visualization in many areas of science. We need an efficient and effective approach to reduce dimensionality for a compact representation of high-dimensional data.

In this paper, locally linear embedding (LLE) has been proposed as an unsupervised learning algorithm that computes low-dimensional, neighborhood-preserving embeddings of high-dimensional inputs.

Compared to local dimensionality reduction based on clustering methods, LLE maps inputs into a single global coordinate system of lower dimensionality. By the feature to exploit the local symmetries of linear reconstructions, LLE is able to learn the global structure of nonlinear manifolds, such as those by images or documents.

Previous approaches to this problem try to compute embeddings that preserve pairwise distances between data points. LLE eliminates the need to estimate these distances and recovers global nonlinear structure from locally linear fits.

The LLE algorithm is based on a simple intuition. Provided there is sufficient data, we expect each data point and its neighbors to lie on or close to a locally linear patch of the nonlinear manifold.

Then we characterize the local geometry of these patches by linear coefficients that reconstruct each data point from its neighbors. To measure the error introduced by reconstruction, the cost function, which adds up the squared distances between all the data points and their reconstructions, is proposed.

After we obtain the weight matrix W, LLE constructs a neighborhood-preserving mapping based on the idea that the same weight reconstructing the data points in high-dimensionality should also reconstruct its embedded manifold coordinates in low-dimensionality.

In the final step of the algorithm, each high-dimensional data point is mapped to low-dimensional vector. This is done by choosing a coordinate to minimize the embedding cost function. The problem is minimized to solve the eigenvector problem of a sparse matrix.

The LLE algorithm is based on strong intuition and simple to implement. In rough look, Because LLE preserves the neighborhood of data points in high-dimensionality, it is more useful for the applications of data analysis and visualization.