PCA : A Tale of Brainrot Dimensions


Introduction :
→ PCA stands for Principal Component Analysis. The idea of PCA is to reduce dimensionality by introducing Principal Components, which are not correlated with each other.
→ The technique retains as much of the data's variance as possible.
→ Consider points drawn from a Gaussian distribution. Each point carries information along every dimension. After reducing the dimensions, the data becomes much simpler. Let's look at the data in the lower dimension.
→ Reducing the data to a lower dimension makes it easier to work with.
→ The number of axes in the data defines the number of Principal Components.
→ Each Principal Component has a priority (the share of variance it explains), and the principal components are arranged in descending order. For example, consider three principal components PC1, PC2 and PC3.
→ They could be PC1 (50%), PC2 (30%) and PC3 (20%). In this way the principal components are represented. For the 1D case given previously, there is only 1 Principal Component.
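To see these explained-variance percentages in practice, here is a minimal sketch using scikit-learn's PCA. The dataset and the exact percentages are made-up assumptions chosen only for illustration, not values from this post:

```python
import numpy as np
from sklearn.decomposition import PCA

# Made-up 3-dimensional dataset (200 samples), for illustration only.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 3)) @ np.array([[3.0, 0.0, 0.0],
                                          [1.0, 1.5, 0.0],
                                          [0.5, 0.5, 0.7]])

pca = PCA(n_components=3)
pca.fit(X)

# Share of total variance explained by PC1, PC2, PC3,
# already sorted in descending order.
print(pca.explained_variance_ratio_)
```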
Algorithm :
There are two major approaches to PCA :
1. Geometric Perspective ( minimizing the distance )
2. Linear Algebra Perspective ( maximizing the variance )
→ Maximizing the variance achieves the minimum projection distance.
→ The efficient way to perform PCA is via the covariance matrix ( we need to understand covariance and correlation first ).
→ Covariance defines the directional relationship between two variables.
→ Covariance only measures the direction, but correlation measures both the direction and the intensity. Here are the formulas:
cov(X, Y) = Σ (xᵢ − x̄)(yᵢ − ȳ) / (n − 1)
corr(X, Y) = cov(X, Y) / (σ_X · σ_Y)
→ Correlation is also known as normalized covariance.
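As a quick illustration of the two formulas above, here is a minimal NumPy sketch on two made-up variables x and y (the numbers are assumptions chosen only for demonstration):

```python
import numpy as np

# Two hypothetical variables, for illustration only.
x = np.array([2.1, 2.5, 3.6, 4.0, 5.2])
y = np.array([8.0, 10.0, 12.0, 14.5, 16.0])

# Covariance: direction of the relationship (sign), scale-dependent.
cov_xy = np.cov(x, y)[0, 1]

# Correlation: "normalized covariance", direction + intensity, in [-1, 1].
corr_xy = np.corrcoef(x, y)[0, 1]
# Equivalent by hand: cov(x, y) / (std(x) * std(y)), using the same ddof.
corr_manual = cov_xy / (np.std(x, ddof=1) * np.std(y, ddof=1))

print(cov_xy, corr_xy, corr_manual)
```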
→ Let's talk about eigenvalues and eigenvectors.
( i ) The eigenvalue captures the intensity (the scaling factor) of the transformation.
( ii ) The eigenvector defines a direction that the transformation does not change; it is only stretched by the eigenvalue.
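To make this concrete, here is a small NumPy sketch (the 2×2 matrix A is a made-up example) verifying that A·v = λ·v for each eigenpair:

```python
import numpy as np

# A small 2x2 matrix, chosen only for illustration.
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

eigenvalues, eigenvectors = np.linalg.eig(A)

# For each pair, A @ v equals lambda * v: the direction of v is unchanged,
# it is only stretched by the eigenvalue (the "intensity").
for lam, v in zip(eigenvalues, eigenvectors.T):
    print(np.allclose(A @ v, lam * v))   # True, True
```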
Spectral theorem :
→ The set of eigenvalues of a matrix is known as its spectrum. This is where the spectral theorem comes into the picture.
→ Every covariance matrix has eigenvalues and eigenvectors. Since a covariance matrix is symmetric, the spectral theorem guarantees that its eigenvalues are real and its eigenvectors can be chosen orthogonal.
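A minimal sketch of this fact on a made-up random dataset: the covariance matrix is symmetric, so np.linalg.eigh returns real eigenvalues and orthonormal eigenvectors.

```python
import numpy as np

# Hypothetical data; its covariance matrix is symmetric by construction.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
C = np.cov(X, rowvar=False)          # 3x3 covariance matrix

# eigh is intended for symmetric matrices: real eigenvalues, orthogonal eigenvectors.
eigenvalues, Q = np.linalg.eigh(C)

print(np.all(eigenvalues >= -1e-12))      # covariance matrices are positive semi-definite
print(np.allclose(Q.T @ Q, np.eye(3)))    # eigenvectors are orthonormal
```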
The steps involved in the PCA operation are (a complete code sketch follows after step 4):
1. Center the data :
→ The first step is to center the data: the mean of each feature is subtracted so the data sits around zero. ( returns → centered data )
2. Covariance matrix :
→ In this step we find the covariance matrix of the centered data; the formula was already mentioned above.
3. Decomposition :
→ This step is the game changer: here the decomposition is performed.
→ But there is a catch: we have two methods by which we can perform the decomposition
i. EVD (Eigenvalue Decomposition)
ii. SVD (Singular Value Decomposition)
Let’s understand the difference between EVD and SVD.
→ Both EVD and SVD are matrix factorization methods in linear algebra.
→ Decomposition means breaking a given matrix into a product of several matrices.
→ For example: there are multiple ways to factorize (decompose / break down) a matrix, just as we can factorize the number 16 into 2 × 8, 4 × 4, 2 × 2 × 4, or 2 × 2 × 2 × 2. Not all factorization methods are equally important; it depends on the use case.
Singular Value Decomposition is the process of decomposing a matrix A into the following 3 matrices, as in the equation :
A = U Σ Vᵀ
Here,
A → the (m × n) matrix on which we perform SVD
U → an orthogonal matrix; it rotates the data. Its dimension is (m × m).
Σ → a diagonal matrix; it stretches the data. Its dimension is (m × n).
Vᵀ → the transpose of the orthogonal matrix V; it rotates the data again. Its dimension is (n × n).
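Here is a minimal NumPy sketch of SVD on a made-up non-square matrix, checking the shapes of the three factors and the reconstruction A = U Σ Vᵀ:

```python
import numpy as np

# A non-square matrix (m = 4, n = 2), for illustration only.
A = np.array([[2.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0],
              [0.0, 3.0]])

U, S, Vt = np.linalg.svd(A, full_matrices=True)
print(U.shape, S.shape, Vt.shape)   # (4, 4) (2,) (2, 2)

# Rebuild the (m x n) diagonal matrix Sigma and check A = U @ Sigma @ V^T.
Sigma = np.zeros(A.shape)
Sigma[:2, :2] = np.diag(S)
print(np.allclose(A, U @ Sigma @ Vt))   # True
```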
Eigen Value Decomposition is the process of decomposing a matrix A into the following 3 matrices, as in the equation :
A = Q Λ Q⁻¹
Here,
A → a square matrix
Q → a matrix whose columns are the eigenvectors of A
Λ → a diagonal matrix of the eigenvalues
Q⁻¹ → the inverse of the eigenvector matrix Q
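And a matching NumPy sketch of EVD on a made-up square matrix, checking the reconstruction A = Q Λ Q⁻¹:

```python
import numpy as np

# A square matrix, for illustration only (EVD works only on square matrices).
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

eigenvalues, Q = np.linalg.eig(A)
Lambda = np.diag(eigenvalues)

# Reconstruction: A = Q @ Lambda @ Q^-1
print(np.allclose(A, Q @ Lambda @ np.linalg.inv(Q)))   # True
```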
Difference between them :
1. Applicability: Eigen Decomposition can only be applied to square matrices, while SVD can be applied to any matrix.
2. In EVD the eigenvectors are not necessarily orthogonal; in SVD the singular vectors are always orthogonal.
3. Eigenvalues can be negative, but singular values are always non-negative.
4. EVD has complexity O(n³), while SVD has complexity O(mn²).
Now, what to use →
Use SVD. From an implementation perspective, SVD is generally preferred due to its better time complexity, numerical stability, and applicability to non-square matrices, whereas EVD requires a square matrix.
4. Projection :
In the final step we return the transformed data by multiplying the centered data matrix with the top-k eigenvectors (the principal components).
[Centered Data Matrix] × [Top-k Eigenvectors] → Transformed Data.
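Putting the four steps together, here is a minimal end-to-end sketch of PCA via SVD. The helper function pca_svd and the random dataset are illustrative assumptions, not a library API:

```python
import numpy as np

def pca_svd(X, k):
    """Minimal PCA sketch: center, decompose with SVD, project onto k components."""
    # 1. Center the data (zero mean per feature).
    X_centered = X - X.mean(axis=0)

    # 2./3. SVD of the centered data; the rows of Vt are the principal directions,
    #       so no explicit covariance matrix is needed.
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)

    # Variance explained by each component (eigenvalues of the covariance matrix).
    explained_variance = (S ** 2) / (X.shape[0] - 1)

    # 4. Projection: centered data times the top-k principal directions.
    X_transformed = X_centered @ Vt[:k].T
    return X_transformed, explained_variance

# Hypothetical usage: a random 5-dimensional dataset reduced to 2 dimensions.
rng = np.random.default_rng(1)
X = rng.normal(size=(150, 5))
Z, var = pca_svd(X, k=2)
print(Z.shape)   # (150, 2)
```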
Applications of PCA :
→ PCA has a huge range of use cases. When we deal with high-dimensional data, it helps us reduce it. Apart from this, PCA has many other use cases -
1. Denoising
2. Data Correlation Analysis
3. Data Compression
(This blog draws on several articles, research papers and books)
THE END…