Perceptron Loss Function and MLP Simplified

Jubaer
3 min read

Table of Contents

Section 1.2: Perceptron Loss Function

Section 1.3: Drawbacks of the Perceptron

Section 1.4: Understanding the Basics of Multi-Layer Perceptrons

From Mistakes to Mastery: Perceptron Loss

Welcome back! In our last post, we met the perceptron, the basic building block of neural networks. But how does it actually learn? And what happens when the problem gets too tough for one perceptron to handle? Let's find out!

How a Perceptron Says "Oops!": The Loss Function

Imagine teaching a child to identify apples. You show them an apple and ask, "Is this an apple?" If they say yes, great! If they say no, that's an error. The bigger the error, the more you need to correct them.

A loss function (or cost function) is a way of measuring this error. It tells the perceptron how wrong its prediction was compared to the actual, correct answer. The goal of training is to make the total loss as small as possible.

Loss Calculator

Let's see the simplest loss in action. For binary classification, if the prediction is right, the loss is 0. If it's wrong, the loss is 1.

| True Label | Predicted Label | Loss |
| --- | --- | --- |
| 1 (Apple) | 1 (Apple) | 0 (Correct!) |
| 1 (Apple) | 0 (Not Apple) | 1 (Wrong) |
| 0 (Not Apple) | 1 (Apple) | 1 (Wrong) |
| 0 (Not Apple) | 0 (Not Apple) | 0 (Correct!) |
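
In code, this zero-one loss is a one-liner (a minimal sketch; the function name is my own):

```python
def zero_one_loss(y_true: int, y_pred: int) -> int:
    """Return 0 if the prediction matches the true label, else 1."""
    return 0 if y_true == y_pred else 1

print(zero_one_loss(1, 1))  # 0 (Correct!)
print(zero_one_loss(1, 0))  # 1 (Wrong)
```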

Perceptron Loss Function

The Perceptron loss function is designed to update the weights of the model based on misclassifications. Mathematically, for a single training example it can be defined as:

L = max(0, −y·z), where z = w·x + b

Here w·x (together with the bias b) gives the raw output z, and y is the true label (+1 or −1). The total loss is the sum of the individual losses over all training examples:

L_total = Σᵢ max(0, −yᵢ·zᵢ)
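
Translating the formula directly into Python gives something like this (a sketch with my own function names, assuming NumPy arrays for w and x):

```python
import numpy as np

def perceptron_loss(w, b, x, y):
    """Loss for one example: max(0, -y * z), where z = w.x + b."""
    z = np.dot(w, x) + b
    return max(0.0, -y * z)

def total_perceptron_loss(w, b, X, Y):
    """Total loss: the sum of per-example losses over the dataset."""
    return sum(perceptron_loss(w, b, x, y) for x, y in zip(X, Y))
```

Notice that a correctly classified point (y and z sharing the same sign) contributes zero loss, so only misclassified points drive weight updates.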

Simple Numerical Example

Assume:

  • Feature vector: x = [2, −1]

  • True label: y = +1

  • Model weights: w = [0.5, 1]

  • Bias: b = −0.5

Step 1: Compute raw output:

z = w·x + b = (0.5)(2) + (1)(−1) + (−0.5) = 1 − 1 − 0.5 = −0.5

Step 2: Compute Perceptron Loss:

L = max(0, −yz) = max(0, −(1)(−0.5)) = max(0, 0.5) = 0.5

Thus, the point is misclassified (since z = −0.5 < 0 while the true label is +1), and the loss is 0.5.
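
Plugging the example's numbers into a few lines of NumPy reproduces the result:

```python
import numpy as np

w = np.array([0.5, 1.0])    # model weights
b = -0.5                    # bias
x = np.array([2.0, -1.0])   # feature vector
y = 1                       # true label (+1)

z = np.dot(w, x) + b        # 1 - 1 - 0.5 = -0.5
loss = max(0.0, -y * z)     # max(0, 0.5) = 0.5
print(z, loss)              # -0.5 0.5
```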


When One Perceptron Isn't Enough: The XOR Problem ✨

A single perceptron is great, but it has a major weakness: it can only solve problems that are linearly separable. That means it can only draw a single straight line to separate the different groups of data.

What if it can't? Consider the classic "XOR" problem. It has four data points: one class at (0, 1) and (1, 0) (where XOR = 1), the other at (0, 0) and (1, 1) (where XOR = 0). Try to draw one straight line that separates the two classes. You can't!

The Unsolvable Puzzle (for one Perceptron)

This is the limit of a single perceptron.
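
You can check this numerically. A bare-bones perceptron training loop (a toy sketch written for illustration, not a library implementation) never finishes an epoch on XOR without at least one mistake, no matter how long it runs:

```python
import numpy as np

# XOR data, with labels encoded as +1 / -1 for the perceptron update rule
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
Y = np.array([-1, 1, 1, -1])

w, b = np.zeros(2), 0.0
for epoch in range(1000):
    errors = 0
    for x, y in zip(X, Y):
        if y * (np.dot(w, x) + b) <= 0:  # misclassified point
            w += y * x                   # classic perceptron update
            b += y
            errors += 1
    if errors == 0:                      # a separating line was found
        break

print(errors)  # always > 0: no straight line separates XOR
```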

To solve this, we need to level up.

Level Up! Intro to Multi-Layer Perceptrons (MLPs)

If one perceptron is like a single employee making a yes/no decision, a Multi-Layer Perceptron (MLP) is like a whole team of them working together. It's our first real look at a proper neural network!

An MLP has at least three layers:

  • An Input Layer that receives the initial data.

  • One or more Hidden Layers where the real "thinking" happens. These layers allow the network to learn complex, non-linear patterns.

  • An Output Layer that gives the final answer.

By having that middle "hidden" layer, the network can draw more complex shapes than a single straight line, easily solving the XOR problem.
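
To make this concrete, here is a tiny MLP with hand-picked weights (a sketch, not a trained network) that solves XOR exactly. One hidden unit acts as OR, another as AND, and the output fires when OR is on but AND is off:

```python
def step(z):
    """Step activation: 1 if z > 0, else 0."""
    return 1 if z > 0 else 0

def xor_mlp(x1, x2):
    # Hidden layer: two perceptrons
    h_or  = step(x1 + x2 - 0.5)   # fires if at least one input is 1
    h_and = step(x1 + x2 - 1.5)   # fires only if both inputs are 1
    # Output layer: "OR but not AND" -- exactly XOR
    return step(h_or - h_and - 0.5)

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, "->", xor_mlp(a, b))  # prints 0, 1, 1, 0
```

The hidden layer transforms the inputs into a new space where a single straight line can separate the classes, which is exactly what a lone perceptron could not do.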

Conclusion 🎉

So there you have it! Loss functions are the compass that guides a perceptron's learning, always pointing it towards the "right" answer. And when a problem is too complex for one perceptron, we build a team called a Multi-Layer Perceptron, adding hidden layers to unlock incredible problem solving power.

In the next blog, we will dive deep into MLPs and learn about forward propagation and backpropagation. Stay tuned!
