Day 3: Activation Functions - Bringing Life to Neural Networks

So far, we've explored how neural networks are built from neurons, layers, weights, and biases. But there's one more essential piece: activation functions.
Without them, no matter how deep your network is, it behaves like a single linear equation. Activation functions inject non-linearity, making it possible for neural networks to model real-world data.
What Is an Activation Function?
An activation function determines whether a neuron should "activate" (i.e., fire) based on the input it receives. This input isn't the raw data; it's the weighted sum of the inputs plus a bias.
Activation functions allow the network to learn patterns that aren't just straight lines, opening the door to curves, layered representations, and complexity.
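As a minimal sketch (the input, weight, and bias values below are made up purely for illustration), a single neuron computes the weighted sum of its inputs plus the bias, and the activation function then decides what the neuron outputs:

```python
import numpy as np

def relu(z):
    """ReLU activation: passes positive values through, clips negatives to 0."""
    return np.maximum(0, z)

# Hypothetical inputs, weights, and bias for one neuron
x = np.array([0.5, -1.2, 3.0])   # inputs
w = np.array([0.8, 0.4, -0.6])   # weights
b = 0.1                          # bias

z = np.dot(w, x) + b             # weighted sum of inputs plus bias (pre-activation)
a = relu(z)                      # activation function decides the output

print(f"pre-activation z = {z:.2f}, neuron output a = {a:.2f}")
```

Here the pre-activation comes out negative, so ReLU outputs 0 and the neuron effectively does not fire.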
What Role Does Bias Play?
Think of bias as a way to fine-tune when a neuron activates.
Without a bias, the weighted sum is forced to pass through the origin (0, 0), so the activation curve cannot be shifted.
With bias, you shift the curve left or right, giving the model more flexibility.
Bias helps decide when a neuron should activate, and the activation function decides how.
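To make that shift concrete, here is a small sketch (the weight, input, and bias values are arbitrary) where only the bias changes and the sigmoid output moves with it:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

w, x = 2.0, 0.5              # fixed weight and input
for b in (-2.0, 0.0, 2.0):   # only the bias changes
    z = w * x + b
    print(f"bias={b:+.1f} -> pre-activation={z:+.1f}, sigmoid output={sigmoid(z):.3f}")
```

With the same weight and input, a larger bias pushes the neuron toward firing, and a negative bias holds it back.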
Popular Activation Functions
ReLU (Rectified Linear Unit):
f(x) = max(0, x)
Most commonly used in hidden layers.
Simple and efficient: turns all negative values into 0.
Helps mitigate the "vanishing gradient" problem.
Sigmoid:
f(x) = 1 / (1 + e^(-x))
Squeezes values into a range between 0 and 1.
Often used in binary classification problems (e.g., spam or not spam).
Tanh (Hyperbolic Tangent):
f(x) = (e^x - e^(-x)) / (e^x + e^(-x))
Outputs values between -1 and 1.
Centered around 0, so it often performs better than sigmoid in hidden layers.
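All three functions are easy to implement directly. A minimal NumPy sketch (the input values are chosen only for illustration) compares their outputs side by side:

```python
import numpy as np

def relu(x):
    return np.maximum(0, x)

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def tanh(x):
    return np.tanh(x)

z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print("input:  ", z)
print("ReLU:   ", relu(z))                   # negatives clipped to 0
print("sigmoid:", np.round(sigmoid(z), 3))   # squashed into (0, 1)
print("tanh:   ", np.round(tanh(z), 3))      # squashed into (-1, 1), centered at 0
```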
Why Are Activation Functions So Important?
Without activation functions:
A neural network becomes just a linear transformation, no matter how deep.
It can't solve problems like image recognition or language understanding.
With activation functions:
The model learns non-linear patterns.
It can make sense of complex, real-world data.
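A short sketch (random weights, purely illustrative) makes the "just a linear transformation" point concrete: two stacked layers with no activation in between are exactly equivalent to one linear layer.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two "layers" with no activation function between them
W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=4)
W2, b2 = rng.normal(size=(2, 4)), rng.normal(size=2)

x = rng.normal(size=3)

# Forward pass through both layers
deep = W2 @ (W1 @ x + b1) + b2

# The same mapping collapsed into a single linear layer
W, b = W2 @ W1, W2 @ b1 + b2
shallow = W @ x + b

print(np.allclose(deep, shallow))  # True: extra depth added nothing without non-linearity
```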
Summary Table
| Activation | Formula | Output Range | Use Case |
| --- | --- | --- | --- |
| ReLU | max(0, x) | [0, ∞) | Hidden layers (default) |
| Sigmoid | 1 / (1 + e^(-x)) | (0, 1) | Binary classification |
| Tanh | tanh(x) | (-1, 1) | Deep hidden layers |
