Sunday, September 19, 2010

PCA: Principal Component Analysis

http://ordination.okstate.edu/PCA.htm
However, with more than three dimensions, we usually need a little help. What PCA does is take your cloud of data points and rotate it so that the maximum variability is visible. Another way of saying this is that it identifies your most important gradients.
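A minimal NumPy sketch of that idea (the data X here is made up purely for illustration): center the cloud, eigendecompose its covariance matrix, and the eigenvectors give the rotation that lines the axes up with the directions of greatest variability.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 200 points in 3-d with one dominant direction of spread.
X = rng.normal(size=(200, 3)) @ np.diag([5.0, 2.0, 0.5])

# Center the cloud, then eigendecompose its covariance matrix.
Xc = X - X.mean(axis=0)
eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))

# eigh returns ascending order; sort descending so the first
# eigenvector is the strongest gradient (first principal axis).
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# "Rotate" the cloud onto its principal axes; keep the top two.
scores = Xc @ eigvecs[:, :2]

print("fraction of variance per axis:", eigvals / eigvals.sum())
```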

http://en.wikipedia.org/wiki/Mahalanobis_distance
In statistics, Mahalanobis distance is a distance measure introduced by P. C. Mahalanobis in 1936.[1] It is based on correlations between variables by which different patterns can be identified and analyzed. It is a useful way of determining the similarity of an unknown sample set to a known one.
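As a quick illustration (a small sketch with made-up data, using SciPy's scipy.spatial.distance.mahalanobis helper), the distance weights differences by the inverse covariance of the known set, so it accounts for the correlation between variables:

```python
import numpy as np
from scipy.spatial.distance import mahalanobis

rng = np.random.default_rng(1)

# Known sample set: correlated 2-d data.
known = rng.multivariate_normal([0, 0], [[2.0, 1.5], [1.5, 2.0]], size=500)

center = known.mean(axis=0)
VI = np.linalg.inv(np.cov(known, rowvar=False))  # inverse covariance

# Distance of an unknown point from the center of the known cloud.
unknown = np.array([2.0, 2.0])
print(mahalanobis(unknown, center, VI))

# Equivalent by hand: sqrt((x - mu)^T S^{-1} (x - mu))
d = unknown - center
print(np.sqrt(d @ VI @ d))
```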

http://onlinelibrary.wiley.com/doi/10.1002/wics.101/abstract
Principal component analysis
Hervé Abdi, Lynne J. Williams

http://www.cs.otago.ac.nz/cosc453/student_tutorials/principal_components.pdf

http://stats.stackexchange.com/questions/2691/making-sense-of-principal-component-analysis-eigenvectors-eigenvalues
I would never try to explain this to my grandmother, but if I had to talk generally about dimension reduction techniques, I'd point to this trivial projection example (not PCA). Suppose you have a Calder mobile that is very complex. Some points in 3-d space are close to each other; others aren't. If we hung this mobile from the ceiling and shined light on it from one angle, we'd get a projection onto a lower-dimensional plane (a 2-d wall). Now, if this mobile is mainly wide in one direction but skinny in the other, we can rotate it to get projections that differ in usefulness. Intuitively, a shape that is skinny in one dimension projected onto a wall is less useful: all the shadows overlap and don't give us much information. However, if we rotate it so the light shines on the wide side, we get a better picture of the reduced-dimension data: the points are more spread out. This is often what we want. I think my grandmother could understand that :-)
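A small numerical version of the shadow analogy (the names pts and shadow_spread are just illustrative): shining the light down the wide axis collapses most of the variance, while shining it down a skinny axis keeps the shadow spread out.

```python
import numpy as np

rng = np.random.default_rng(2)

# A "mobile": a cloud that is wide along x but skinny along y and z.
pts = rng.normal(size=(300, 3)) * np.array([10.0, 1.0, 1.0])

def shadow_spread(points, light_axis):
    # Project onto the wall perpendicular to the light direction,
    # i.e. drop the coordinate along light_axis, then measure how
    # spread out the 2-d shadow is (total variance).
    keep = [i for i in range(3) if i != light_axis]
    return points[:, keep].var(axis=0).sum()

for axis, name in [(0, "x (wide)"), (1, "y (skinny)"), (2, "z (skinny)")]:
    print(f"light along {name}: shadow variance = {shadow_spread(pts, axis):.1f}")
```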
