PCA And SVD


What can PCA do?

How to use it?

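Code:

```python
import numpy as np
from sklearn.decomposition import PCA

# X is assumed to be an existing data matrix of shape (n_samples, n_features)

# PCA with scikit-learn: keep the first 310 principal components
pca = PCA(n_components=310)
X_pca = pca.fit_transform(X)

# The same idea with SVD of the covariance matrix
X_centered = X - X.mean(axis=0)
cov = X_centered.T @ X_centered / X_centered.shape[0]
U, S, Vt = np.linalg.svd(cov)
Xrot = X_centered @ U[:, :100]  # decorrelate the data, keep the top 100 directions
```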
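To make the "decorrelate the data" comment concrete, here is a minimal sketch on toy data. The random data and the `mixing` matrix are made up for illustration only. After rotating the centered data into the basis given by the columns of `U`, the covariance matrix of the result should be (numerically) diagonal, with the values of `S` on the diagonal.

```python
import numpy as np

# Hypothetical toy data: 500 samples, 3 correlated features
rng = np.random.default_rng(0)
mixing = np.array([[2.0, 0.0, 0.0],
                   [1.0, 1.0, 0.0],
                   [0.5, 0.3, 0.2]])
X = rng.normal(size=(500, 3)) @ mixing

# Center the data, then take the SVD of its covariance matrix
X_centered = X - X.mean(axis=0)
cov = X_centered.T @ X_centered / X_centered.shape[0]
U, S, Vt = np.linalg.svd(cov)

# Rotate the data into the basis formed by the columns of U
Xrot = X_centered @ U

# Covariance of the rotated data: off-diagonal entries should be ~0,
# and the diagonal should match S
cov_rot = Xrot.T @ Xrot / Xrot.shape[0]
print(np.round(cov_rot, 6))
print(np.round(S, 6))
```

Keeping only the first few columns of `U` (for example `U[:, :2]`) gives the reduced representation instead of the full rotation.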


### SVD
#### How to use SVD to reconstruct a matrix?
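The original matrix A can be factored as A = U * Sigma * V^T, where U and V are orthogonal matrices and Sigma is a rectangular diagonal matrix that holds the singular values.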
```python
import numpy as np

A = np.array([[1, 2], [3, 4], [5, 6]])
U, s, Vt = np.linalg.svd(A)  # NumPy returns V^T (here Vt), not V

# Build the m x n Sigma matrix with the singular values on its diagonal
Sigma = np.zeros(A.shape)
k = min(A.shape)
Sigma[:k, :k] = np.diag(s)

B = U @ Sigma @ Vt  # same as U.dot(Sigma.dot(Vt))
print(B)
```
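The zero-padded Sigma above is only needed because the full SVD returns a square U. As a sketch of an alternative, assuming the reduced form is acceptable for your use case, `np.linalg.svd` accepts `full_matrices=False`, which returns a thinner U so that the singular values fit into a plain square diagonal matrix.

```python
import numpy as np

A = np.array([[1, 2], [3, 4], [5, 6]], dtype=float)

# Reduced SVD: U has shape (3, 2), Vt has shape (2, 2)
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# No padding needed: np.diag(s) is already 2 x 2
B = U @ np.diag(s) @ Vt

print(U.shape, s.shape, Vt.shape)
print(np.allclose(A, B))
```

The reconstruction is the same matrix either way. The reduced form just skips the columns of U that would be multiplied by zeros.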

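#### Pseudoinverse (Generalized Inverse)
A^+ denotes the pseudoinverse of A:
A^+ = V * D^+ * U^T
Here D^+ is the pseudoinverse of Sigma and U^T is the transpose of U. D^+ is obtained by taking the reciprocal of every nonzero singular value in Sigma and transposing the result.
We can compute U, Sigma and V from the SVD A = U * Sigma * V^T and then use them to build the pseudoinverse.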
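One way to check that a matrix really is the pseudoinverse of A is to test the four Moore-Penrose conditions. The sketch below uses `np.linalg.pinv` as the reference implementation. The variable names `cond1` to `cond4` are only for illustration.

```python
import numpy as np

A = np.array([[1, 2], [3, 4], [5, 6]], dtype=float)
A_pinv = np.linalg.pinv(A)  # reference pseudoinverse, shape (2, 3)

# The four Moore-Penrose conditions
cond1 = np.allclose(A @ A_pinv @ A, A)            # A A^+ A = A
cond2 = np.allclose(A_pinv @ A @ A_pinv, A_pinv)  # A^+ A A^+ = A^+
cond3 = np.allclose((A @ A_pinv).T, A @ A_pinv)   # A A^+ is symmetric
cond4 = np.allclose((A_pinv @ A).T, A_pinv @ A)   # A^+ A is symmetric

print(A_pinv.shape)
print(cond1, cond2, cond3, cond4)
```

Any pseudoinverse built by hand from U, Sigma and V should pass the same four checks.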

```python
import numpy as np

A = np.array([[1, 2], [3, 4], [5, 6]])
U, s, Vt = np.linalg.svd(A)  # NumPy returns V^T, so V is Vt.T

# D^+ is n x m, with the reciprocals of the singular values on its diagonal
k = min(A.shape)
Dpse = np.zeros((A.shape[1], A.shape[0]))
Dpse[:k, :k] = np.diag(1 / s)  # assumes every singular value is nonzero

Apse = Vt.T @ Dpse @ U.T
print(Apse)

# or

Apse = np.linalg.pinv(A)
print(Apse)
```

#### What can it do?

SVD can be used for dimensionality reduction.

For example, when the number of features (columns) is larger than the number of samples (rows), we can keep the k largest singular values in Sigma and the first k rows of V^T, and then reconstruct an approximation of the matrix: B = U * SigmaNew * VNew

A new dataset (the projection of A onto k dimensions): T = U * SigmaNew, or equivalently T = A * VNew^T

```python
import numpy as np

A = np.array([list(range(1, 11)), list(range(11, 21)), list(range(21, 31))])
U, s, Vt = np.linalg.svd(A)  # Vt is V^T

# Build the m x n Sigma matrix
Sigma = np.zeros(A.shape)
k = min(A.shape)
Sigma[:k, :k] = np.diag(s)

# select the n_elements largest singular values and matching rows of V^T
n_elements = 2
SigmaNew = Sigma[:, :n_elements]
VNew = Vt[:n_elements, :]

# reconstruct
B = U @ SigmaNew @ VNew

# project A onto n_elements dimensions (both forms give the same T)
T = A @ VNew.T  # or
T = U @ SigmaNew
```
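As a rough check of the two claims above, the sketch below uses a random full-rank matrix (a made-up example, not the matrix from the post) to confirm that both projection formulas agree, and that the Frobenius error of the rank-k reconstruction equals the square root of the sum of the squared singular values that were dropped.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(3, 10))  # hypothetical example: 3 samples, 10 features

U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2

# Keep the k largest singular values and the matching vectors
U_k, s_k, Vt_k = U[:, :k], s[:k], Vt[:k, :]

# Rank-k reconstruction
B = U_k @ np.diag(s_k) @ Vt_k

# Two equivalent ways to get the projected dataset
T1 = U_k @ np.diag(s_k)
T2 = A @ Vt_k.T

# Reconstruction error (Frobenius norm) vs. the dropped singular values
err = np.linalg.norm(A - B)
expected = np.sqrt(np.sum(s[k:] ** 2))

print(np.allclose(T1, T2))
print(np.isclose(err, expected))
```

The matrix used in the post has rank 2, so with two components its reconstruction error is essentially zero. A random matrix makes the error visible.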

The same reduction with scikit-learn's TruncatedSVD:

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD

A = np.array([list(range(1, 11)), list(range(11, 21)), list(range(21, 31))])

svd = TruncatedSVD(n_components=2)  # keep 2 components
svd.fit(A)
result = svd.transform(A)
print(result)
```
