principal component analysis
problem statement
- Map $x \in \mathbb{R}^d$ to $z \in \mathbb{R}^q$ with $q < d$
- A $q \times d$ matrix $A$ can represent a linear mapping:
$$z = Ax$$
- Assume that $AA^T = I$ (the rows of $A$ are orthonormal)
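As a small illustration of the setup above, the sketch below builds a $q \times d$ matrix with orthonormal rows and applies it to one point. The construction via a QR factorisation of a random matrix, and the sizes `d = 5`, `q = 2`, are assumptions made for the example only.

```python
import numpy as np

rng = np.random.default_rng(0)
d, q = 5, 2  # illustrative sizes, not from the notes

# QR of a random d x q matrix gives orthonormal columns;
# transposing yields a q x d matrix with orthonormal rows.
Q, _ = np.linalg.qr(rng.standard_normal((d, q)))
A = Q.T

x = rng.standard_normal(d)  # a point in R^d
z = A @ x                   # its q-dimensional representation

print(A.shape, z.shape)
print(np.allclose(A @ A.T, np.eye(q)))  # checks A A^T = I_q
```

Note that $A^TA$ is a $d \times d$ projection matrix and is not the identity when $q < d$; only $AA^T = I$ holds.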
minimising reconstruction error
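- Given $X \in \mathbb{R}^{d \times n}$ with columns $x_1, \cdots, x_n$, find $A \in \mathbb{R}^{q \times d}$ and $B \in \mathbb{R}^{d \times q}$ that minimise the reconstruction error:
$$\min_{A,B} \sum_i \|x_i - BAx_i\|_2^2$$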
If $q = d$, the error is zero (for example, with $A = B = I$).
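The objective can be evaluated numerically. The sketch below is illustrative only: the helper `reconstruction_error` is a hypothetical name, the data is random, and it uses $B = A^T$, the choice the notes arrive at in the solution.

```python
import numpy as np

def reconstruction_error(X, A, B):
    """Sum over columns x_i of ||x_i - B A x_i||_2^2, for X of shape (d, n)."""
    residual = X - B @ (A @ X)
    return float(np.sum(residual ** 2))

rng = np.random.default_rng(0)
d, n = 4, 50  # illustrative sizes
X = rng.standard_normal((d, n))

# q = d: a square orthogonal A with B = A^T reconstructs every point.
A_full, _ = np.linalg.qr(rng.standard_normal((d, d)))
err_full = reconstruction_error(X, A_full, A_full.T)

# q < d: keeping only the first 2 rows discards information.
A_small = A_full[:2]
err_small = reconstruction_error(X, A_small, A_small.T)

print(err_full, err_small)
```

The first error is zero up to floating-point rounding, while the second is strictly positive for generic data.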
Solution:
- $B = A^T$
- $\min_A \sum_i \|x_i - A^TAx_i\|^2$ subject to $AA^T = I_{q \times q}$
- assuming the data is centered, i.e. $\frac{1}{n}\sum_i x_i = [0 \cdots 0]^T$
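A short derivation, offered as a sketch, of why this constrained problem turns into an eigenvalue problem. It uses only the constraint $AA^T = I$; the symbol $S = \sum_i x_i x_i^T$ is introduced here for illustration and does not appear in the notes.

$$
\begin{aligned}
\|x_i - A^TAx_i\|^2 &= x_i^Tx_i - 2x_i^TA^TAx_i + x_i^TA^T(AA^T)Ax_i \\
&= \|x_i\|^2 - \|Ax_i\|^2 \\
\min_A \sum_i \|x_i - A^TAx_i\|^2 &\iff \max_A \sum_i \|Ax_i\|^2 = \max_A \operatorname{tr}(A S A^T)
\end{aligned}
$$

Since $\sum_i \|x_i\|^2$ does not depend on $A$, minimising the reconstruction error is the same as maximising the energy retained by the projection, which is where the eigenvectors of $S$ enter.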
eigenvalue decomposition
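$$
\begin{aligned}
X^TXu &= \lambda u \\
X^TX &= U^T \Lambda U \\
\because \Lambda &= \text{diag}(\lambda_1, \lambda_2, \cdots, \lambda_d) \\
&= \begin{bmatrix}
\lambda_1 & 0 & \cdots & 0 \\
0 & \lambda_2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \lambda_d
\end{bmatrix}
\end{aligned}
$$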
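A minimal numerical check of this decomposition, assuming NumPy. The matrix `S` below is a random symmetric positive semi-definite stand-in, not data from the notes. `np.linalg.eigh` returns eigenvectors as columns, so its output is transposed to match the row convention $U^T \Lambda U$.

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((6, 3))
S = M.T @ M  # symmetric positive semi-definite 3 x 3 matrix (stand-in)

# eigh is for symmetric matrices: eigenvalues ascending, eigenvectors in columns of V.
eigvals, V = np.linalg.eigh(S)
U = V.T                # rows of U are eigenvectors
Lam = np.diag(eigvals)

print(np.allclose(S, U.T @ Lam @ U))                   # S = U^T Λ U
print(np.allclose(S @ V[:, 0], eigvals[0] * V[:, 0]))  # S u = λ u
```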
pca
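Idea: given input $x_1, \cdots, x_n \in \mathbb{R}^d$, compute the mean $\mu = \frac{1}{n}\sum_i x_i$
Then form the (unnormalised) covariance matrix of the centered data:
$$C = \sum_i (x_i - \mu)(x_i - \mu)^T$$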
Find the eigenvectors/values of $C$:
$$C = U^T \Lambda U$$
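Optimal $A$ is:
$$
A = \begin{bmatrix} u_1^T \\ u_2^T \\ \vdots \\ u_q^T \end{bmatrix}
$$
where $u_1, \cdots, u_q$ are the eigenvectors of $C$ corresponding to the $q$ largest eigenvalues.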
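The whole procedure can be sketched in a few lines of NumPy. This is one possible reading of the steps, not a reference implementation: the function name `pca_projection` is hypothetical, samples are stored as columns of `X` (shape `(d, n)`), and the synthetic data is made up for the example.

```python
import numpy as np

def pca_projection(X, q):
    """Return (A, mu) for data X of shape (d, n), one sample per column."""
    mu = X.mean(axis=1, keepdims=True)       # mean of the samples, shape (d, 1)
    Xc = X - mu                              # centered data
    C = Xc @ Xc.T                            # sum_i (x_i - mu)(x_i - mu)^T
    eigvals, V = np.linalg.eigh(C)           # ascending eigenvalues, eigenvectors in columns
    order = np.argsort(eigvals)[::-1][:q]    # indices of the q largest eigenvalues
    A = V[:, order].T                        # rows u_1^T, ..., u_q^T
    return A, mu

# Synthetic data in R^5 that varies mostly along two directions (assumed for the demo).
rng = np.random.default_rng(0)
d, n, q = 5, 200, 2
latent = rng.standard_normal((2, n)) * np.array([[5.0], [2.0]])
mix, _ = np.linalg.qr(rng.standard_normal((d, 2)))
X = mix @ latent + 0.01 * rng.standard_normal((d, n)) + 3.0

A, mu = pca_projection(X, q)
Z = A @ (X - mu)        # q-dimensional codes
X_hat = A.T @ Z + mu    # reconstruction with B = A^T
rel_err = np.sum((X - X_hat) ** 2) / np.sum((X - mu) ** 2)
print(A.shape, Z.shape, rel_err)
```

Because the synthetic data lies close to a two-dimensional subspace, the relative reconstruction error with `q = 2` should be small; on real data it depends on how quickly the eigenvalues of $C$ decay.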