Week 5: Linear Regression

garv aggarwal
3 min read

Linear regression is a statistical method that models the relationship between a dependent variable and one or more independent variables by fitting a linear equation to the observed data. This widely used technique is one of the foundational algorithms in machine learning.

Linear regression is a supervised machine learning algorithm. It works by finding the best-fit line (the regression line) through the data points, such that the line minimizes the error, or residual, across the data.

Equation of the Regression Line:

y = mx + c

where:

  • y = dependent variable,

  • x = independent variable,

  • c = intercept,

  • m = slope of the line
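
As a quick illustration, here is a minimal sketch of fitting this line with NumPy; the data points are made up for the example:

```python
import numpy as np

# Hypothetical sample data: x = independent variable, y = dependent variable
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 4.3, 6.2, 8.1, 9.9])

# Fit y = m*x + c by ordinary least squares (degree-1 polynomial fit)
m, c = np.polyfit(x, y, 1)
print(f"slope m = {m:.3f}, intercept c = {c:.3f}")

# Points on the fitted regression line
y_pred = m * x + c
```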

Error or Residual :

The error (or residual) is simply the difference between the actual value and the predicted value for a data point.
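
For example, residuals can be computed per point like this (made-up numbers, assuming NumPy):

```python
import numpy as np

y_actual = np.array([2.1, 4.3, 6.2, 8.1, 9.9])   # observed values
y_pred   = np.array([2.0, 4.0, 6.0, 8.0, 10.0])  # values predicted by the line

# Residual (error) for each data point: actual minus predicted
residuals = y_actual - y_pred
print(residuals)  # [ 0.1  0.3  0.2  0.1 -0.1]
```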

Gradient Descent:

When implementing the algorithm, random values of c and m are first assigned to create an initial fitting line; an optimizer such as gradient descent then iteratively finds the correct values of c and m.

How gradient descent calculates the new values of m and c:

m_new = m − η · (∂E/∂m)
c_new = c − η · (∂E/∂c)

Here, η = learning rate and E = the cost function (MSE over N data points).

Putting the value of E into the partial derivatives:

∂E/∂m = −(2/N) · Σ xᵢ · (yᵢ − (m·xᵢ + c))
∂E/∂c = −(2/N) · Σ (yᵢ − (m·xᵢ + c))
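
Putting these update rules together, a minimal gradient-descent sketch in Python could look like this (NumPy assumed; the data and hyperparameter values are made up):

```python
import numpy as np

def gradient_descent(x, y, lr=0.001, epochs=1000):
    """Fit y = m*x + c by gradient descent on the MSE cost."""
    m, c = 0.0, 0.0              # start from arbitrary initial values
    N = len(x)
    for _ in range(epochs):
        y_pred = m * x + c
        # Gradients of the MSE with respect to m and c (formulas above)
        dE_dm = -(2 / N) * np.sum(x * (y - y_pred))
        dE_dc = -(2 / N) * np.sum(y - y_pred)
        # Step against the gradient, scaled by the learning rate
        m -= lr * dE_dm
        c -= lr * dE_dc
    return m, c

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 4.3, 6.2, 8.1, 9.9])
m, c = gradient_descent(x, y, lr=0.01, epochs=5000)
print(f"m ≈ {m:.3f}, c ≈ {c:.3f}")
```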

Mean Squared Error (MSE):

MSE is the average of the squared differences between actual and predicted values. Squaring the errors penalizes larger errors more than smaller ones, making MSE sensitive to outliers.
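
As a formula, MSE = (1/N) · Σ (yᵢ − ŷᵢ)². A minimal sketch in NumPy (numbers made up):

```python
import numpy as np

y_actual = np.array([3.0, -0.5, 2.0, 7.0])
y_pred   = np.array([2.5,  0.0, 2.0, 8.0])

# Average of the squared differences
mse = np.mean((y_actual - y_pred) ** 2)
print(mse)  # 0.375
```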

Mean Absolute Error (MAE):

MAE measures the average magnitude of the errors in a set of predictions, without considering their direction. Unlike MSE, it does not square the error terms, so it’s less sensitive to outliers.
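
Likewise, MAE = (1/N) · Σ |yᵢ − ŷᵢ|. The same made-up numbers in NumPy:

```python
import numpy as np

y_actual = np.array([3.0, -0.5, 2.0, 7.0])
y_pred   = np.array([2.5,  0.0, 2.0, 8.0])

# Average of the absolute differences (no squaring)
mae = np.mean(np.abs(y_actual - y_pred))
print(mae)  # 0.5
```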

💡
MSE and MAE are also called cost functions.

Learning Rate:

The learning rate determines the step size taken during the gradient descent optimization process. It controls how much the model's parameters (weights and biases) are adjusted at each iteration to minimize the loss function.

In practice, 0.001 is a common starting value for the learning rate, but there is no single "right" value; it is usually tuned experimentally for each problem.
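
One way to see this is to run the gradient-descent sketch from above with a few different learning rates and compare the final cost; the values below are arbitrary choices for illustration:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 4.3, 6.2, 8.1, 9.9])

def final_mse(lr, epochs=1000):
    m, c = 0.0, 0.0
    N = len(x)
    for _ in range(epochs):
        y_pred = m * x + c
        m -= lr * (-(2 / N) * np.sum(x * (y - y_pred)))
        c -= lr * (-(2 / N) * np.sum(y - y_pred))
    return np.mean((y - (m * x + c)) ** 2)

# Too small converges slowly; too large (e.g. 0.1 on this data) diverges
for lr in (0.0001, 0.001, 0.01, 0.05):
    print(f"lr = {lr}: final MSE = {final_mse(lr):.4f}")
```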

Assumptions in Linear Regression:

  1. Linearity: The relationship between X and Y is linear.

  2. Independence: The observations are independent of each other.

  3. Normality: If you plot the residuals of the regression line on a graph, they follow a Normal/Gaussian distribution.

  4. Homoscedasticity: The variance of the residuals remains constant across all values of X (a simple residual check for assumptions 3 and 4 is sketched after this list).
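
As referenced above, a quick way to eyeball assumptions 3 and 4 is to inspect the residuals of a fitted line. This sketch uses NumPy and made-up data; with real data you would normally plot the residuals against x and look for a constant band centered on zero:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 4.3, 6.2, 8.1, 9.9])

m, c = np.polyfit(x, y, 1)      # fit the regression line
residuals = y - (m * x + c)

# Normality: residuals should be roughly bell-shaped around zero
print("mean of residuals:", residuals.mean())

# Homoscedasticity: the spread should not grow or shrink along x
print("residuals:", residuals)
```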

Types of Linear Regression:

1. Simple Linear Regression:

If there is only one independent variable (X), it is simple linear regression.

2. Multiple Linear Regression:

If there is more than one independent variable, it is multiple linear regression.

3. Polynomial Linear Regression:

If the relationship between X and Y is non-linear, it can be modelled with polynomial regression, which fits a polynomial (e.g. quadratic) equation; the model is still "linear" because it is linear in its coefficients.
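
A minimal sketch of the polynomial case, again with NumPy and made-up data that roughly follows a quadratic:

```python
import numpy as np

# Hypothetical non-linear data: y grows roughly like x squared
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.2, 4.1, 9.3, 15.8, 25.1])

# Fit y = b2*x^2 + b1*x + b0 (still linear in the coefficients)
b2, b1, b0 = np.polyfit(x, y, 2)
print(f"y ≈ {b2:.2f}·x² + {b1:.2f}·x + {b0:.2f}")
```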

Model Evaluation Metrics:

R Square (R²):

The R² Score indicates the proportion of variance in the dependent variable that is predictable from the independent variable(s). Essentially, it tells us how well our model explains the variability in the target data.
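
As a formula, R² = 1 − SS_res / SS_tot. A quick sketch using the same made-up numbers as the MSE example:

```python
import numpy as np

y_actual = np.array([3.0, -0.5, 2.0, 7.0])
y_pred   = np.array([2.5,  0.0, 2.0, 8.0])

ss_res = np.sum((y_actual - y_pred) ** 2)            # sum of squared residuals
ss_tot = np.sum((y_actual - y_actual.mean()) ** 2)   # total sum of squares
r2 = 1 - ss_res / ss_tot
print(r2)  # ≈ 0.949
```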

Adjusted R Square:

Adjusted R² = 1 − [(1 − R²) · (N − 1) / (N − P − 1)]

Here, N = no. of data points, P = no. of independent predictors/features
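
Applying the formula above in plain Python (the example numbers are made up):

```python
def adjusted_r2(r2, N, P):
    """Adjusted R² for N data points and P predictors."""
    return 1 - (1 - r2) * (N - 1) / (N - P - 1)

# e.g. an R² of 0.949 from 4 data points and 1 predictor
print(adjusted_r2(0.949, N=4, P=1))  # ≈ 0.9235
```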
