Mastering the Mechanics: Manual Calculations of Forward and Backward Propagation in Neural Networks
INTRODUCTION
Unlocking the intricacies of neural networks, we delve into the manual calculations of forward and backward propagation. By working through the derived equations step by step, we gain insight into how information flows through the network and how its parameters are optimized. Join us on this journey to master the mechanics and empower your understanding of neural network training.
PRE-REQUISITE
Derivation of forward and backward propagation in neural networks
NEURAL NETWORK EXAMPLE
Consider a neural network with sigmoid activations and a learning rate of 0.5,
with inputs x1=0.05, x2=0.1 and
target values t1=0.01, t2=0.99.
The biases are given by b1=0.35 and b2=0.60.
Initially, the randomly initialized weights are w1=0.15, w2=0.20, w3=0.25, w4=0.30, w5=0.40, w6=0.45, w7=0.50, and w8=0.55.
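If you would like to verify the arithmetic as we go, here is a minimal Python setup mirroring the problem statement (the variable names are my own choice, not part of the original problem):

```python
# Problem setup: inputs, targets, biases, learning rate, and initial weights
x1, x2 = 0.05, 0.10   # inputs
t1, t2 = 0.01, 0.99   # target values
b1, b2 = 0.35, 0.60   # biases
lr = 0.5              # learning rate

w1, w2, w3, w4 = 0.15, 0.20, 0.25, 0.30   # input -> hidden weights
w5, w6, w7, w8 = 0.40, 0.45, 0.50, 0.55   # hidden -> output weights
```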
SOLUTION
Step 1: Forward propagation
Step 1.1: Forward propagation for Hidden layers (H1, H2)
EQUATION 1:_________________________(H1):
$$H_1=x_1w_1+x_2w_2+b_1$$
Substituting x1, w1, x2, w2, b1:
$$H_1=(0.05)(0.15)+(0.1)(0.20)+0.35$$
H1:
$$H_1=0.3775$$
Since the sigmoid activation is used,
$$H_{1out}=\sigma(H_1)=\frac{1}{1+e^{-H_1}}$$
Substituting the H1 value we get
$$H_{1out}=0.59326$$
EQUATION 2:_________________________(H2):
$$H_2=x_1w_3+x_2w_4+b_1$$
Similarly, we get
H2out:
$$H_{2out}=0.59688$$
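Here is a quick sketch to check the hidden-layer computations (self-contained, so the relevant constants are repeated):

```python
import math

def sigmoid(z):
    """Sigmoid activation: 1 / (1 + e^-z)."""
    return 1.0 / (1.0 + math.exp(-z))

x1, x2, b1 = 0.05, 0.10, 0.35
w1, w2, w3, w4 = 0.15, 0.20, 0.25, 0.30

H1 = x1 * w1 + x2 * w2 + b1   # 0.3775
H2 = x1 * w3 + x2 * w4 + b1   # 0.3925
H1out = sigmoid(H1)           # 0.593269992
H2out = sigmoid(H2)           # 0.596884378
print(H1out, H2out)
```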
Step 1.2: Forward propagation for Output layer (y1, y2)
EQUATION 3:_________________________(y1):
$$y_1=H_{1out}w_5+H_{2out}w_6+b_2$$
Substituting into the above equation,
$$y_1=(0.59326)(0.4)+(0.59688)(0.45)+0.6$$
y1:
$$y_1=1.105905967$$
Since the sigmoid activation is used,
$$y_{1out}=\sigma(y_1)=\frac{1}{1+e^{-y_1}}$$
y1out:
$$y_{1out}=0.75136507$$
EQUATION 4:_________________________(y2):
$$y_2=H_{1out}w_7+H_{2out}w_8+b_2$$
Similarly, for y2,
$$y_{2out}=\sigma(y_2)=\frac{1}{1+e^{-y_2}}$$
y2out:
$$y_{2out}=0.772928465$$
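The output layer can be checked the same way (H1out and H2out carried over from step 1.1):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

H1out, H2out = 0.593269992, 0.596884378   # from step 1.1
w5, w6, w7, w8, b2 = 0.40, 0.45, 0.50, 0.55, 0.60

y1 = H1out * w5 + H2out * w6 + b2   # 1.105905967
y2 = H1out * w7 + H2out * w8 + b2   # 1.224921404
y1out = sigmoid(y1)                 # 0.751365070
y2out = sigmoid(y2)                 # 0.772928465
print(y1out, y2out)
```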
Step 2: Calculating Error
We know that
$$E_{tot}=\frac{1}{2}\sum_i(target_i-output_i)^2$$
Let E1 and E2 be the errors for y1 and y2 respectively; then
$$E_{tot}=\frac{1}{2}(target_1-y_{1out})^2+\frac{1}{2}(target_2-y_{2out})^2$$
Substituting the values,
$$E_{tot}=\frac{1}{2}(0.01-0.75136507)^2+\frac{1}{2}(0.99-0.772928465)^2$$
Total error:
$$E_{tot}=0.298371109$$
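And the squared-error calculation in code:

```python
t1, t2 = 0.01, 0.99
y1out, y2out = 0.751365070, 0.772928465   # from step 1.2

E1 = 0.5 * (t1 - y1out) ** 2   # 0.274811083
E2 = 0.5 * (t2 - y2out) ** 2   # 0.023560026
Etot = E1 + E2                 # 0.298371109
print(Etot)
```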
Step 3: Backpropagation
Updating weights to reduce the error
We know that
Equation_________(UPDATE)
$$w_{i,new}=w_{i,old}-lr\cdot\frac{\partial E}{\partial w_{i,old}}$$
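In code, the update rule is a one-liner; a minimal sketch (the helper name is mine):

```python
def update(w_old, grad, lr=0.5):
    """One gradient-descent step: w_new = w_old - lr * dE/dw."""
    return w_old - lr * grad
```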
**Step 3.1:** Backpropagate from the output layer to the hidden layer
we have w5,w6,w7,w8
For w5:
By applying the chain rule ___________(A)
$$\frac{\partial E}{\partial w_5}=\frac{\partial E}{\partial y_{1out}}\cdot\frac{\partial y_{1out}}{\partial y_1}\cdot\frac{\partial y_1}{\partial w_5}$$
For the above, we have to find
1] ∂E/∂y1out___________(1)
$$\frac{\partial}{\partial y_{1out}}\left(\frac{1}{2}(target_1-y_{1out})^2+\frac{1}{2}(target_2-y_{2out})^2\right)$$
we get
$$\frac{\partial E}{\partial y_{1out}}=-(target_1-y_{1out})$$
Substituting the values into the above,
$$\frac{\partial E}{\partial y_{1out}}=-(0.01-0.75136507)=0.74136507$$
2] ∂y1out/∂y1___________(2)
From step 1.2, the y1out formula is
$$y_{1out}=\sigma(y_1)$$
Applying the sigmoid derivative,
$$\frac{\partial y_{1out}}{\partial y_1}=y_{1out}(1-y_{1out})$$
Substituting y1out in the above equation we get
$$\frac{\partial y_{1out}}{\partial y_1}=0.186815602$$
3] ∂y1/∂w5___________(3)
From step 1.2, EQUATION 3:
$$\frac{\partial}{\partial w_5}(H_{1out}w_5+H_{2out}w_6+b_2)$$
we get
$$\frac{\partial y_1}{\partial w_5}=H_{1out}=0.593269992$$
Substituting the values of ∂E/∂y1out___________(1),
∂y1out/∂y1___________(2),
∂y1/∂w5___________(3) in equation_____(A),
finally, the gradient for w5:
$$\frac{\partial E}{\partial w_5}=(0.74136507)(0.186815602)(0.593269992)=0.082167041$$
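The same three factors, multiplied in code:

```python
t1 = 0.01
y1out, H1out = 0.751365070, 0.593269992   # from the forward pass

dE_dy1out  = -(t1 - y1out)        # 0.741365070 ... term (1)
dy1out_dy1 = y1out * (1 - y1out)  # 0.186815602 ... term (2)
dy1_dw5    = H1out                # 0.593269992 ... term (3)

dE_dw5 = dE_dy1out * dy1out_dy1 * dy1_dw5   # 0.082167041
print(dE_dw5)
```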
Step 3.1.1: Substitute ∂E/∂w5, lr=0.5, and the old weight w5 into
Equation (UPDATE):
$$w_{5,new}=w_{5,old}-lr\cdot\frac{\partial E}{\partial w_5}=0.4-(0.5)(0.082167041)$$
The updated w5:
$$w_5=0.358916$$
Similarly, for w6, w7, and w8 we get the
updated weights:
$$w6=0.408666186$$
$$w7=0.511301270$$
$$w8=0.561370121$$
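These four updates share a pattern: each gradient is a "delta" term for the corresponding output neuron times the hidden activation feeding that weight. A sketch of the pattern:

```python
lr = 0.5
t1, t2 = 0.01, 0.99
H1out, H2out = 0.593269992, 0.596884378
y1out, y2out = 0.751365070, 0.772928465

# delta_k = (dE/dy_k_out) * (dy_k_out/dy_k) for each output neuron
delta1 = -(t1 - y1out) * y1out * (1 - y1out)   #  0.138498562
delta2 = -(t2 - y2out) * y2out * (1 - y2out)   # -0.038098237

w5 = 0.40 - lr * delta1 * H1out   # 0.358916480
w6 = 0.45 - lr * delta1 * H2out   # 0.408666186
w7 = 0.50 - lr * delta2 * H1out   # 0.511301270
w8 = 0.55 - lr * delta2 * H2out   # 0.561370121
```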
Step 3.2 Backpropagate from the hidden layer to the input layer
we have w1, w2, w3, w4
Note that H1out feeds into both outputs y1 and y2, so it contributes to both error terms E1 and E2. Its gradient must therefore sum the contributions from both outputs:
$$\frac{\partial E}{\partial H_{1out}}=\frac{\partial E_1}{\partial H_{1out}}+\frac{\partial E_2}{\partial H_{1out}}$$
For w1:
By applying the chain rule ____________(B)
$$\frac{\partial E}{\partial w_1}=\frac{\partial E}{\partial H_{1out}}\cdot\frac{\partial H_{1out}}{\partial H_1}\cdot\frac{\partial H_1}{\partial w_1}$$
For the above, we have to find
1] ∂E/∂H1out_____________(1)
For the E1 branch, we chain through y1, reusing terms (1) and (2) from step 3.1 and multiplying by w5 (since ∂y1/∂H1out = w5):
$$\frac{\partial E_1}{\partial H_{1out}}=(0.74136507)(0.186815602)(0.40)=0.055399425$$
For the E2 branch, we chain through y2 in the same way (with ∂y2/∂H1out = w7):
$$\frac{\partial E_2}{\partial y_{2out}}=-(0.99-0.772928465)=-0.217071535$$
$$\frac{\partial y_{2out}}{\partial y_2}=0.772928465(1-0.772928465)=0.175510053$$
$$\frac{\partial E_2}{\partial H_{1out}}=(-0.217071535)(0.175510053)(0.50)=-0.019049119$$
Adding the two contributions,
$$\frac{\partial E}{\partial H_{1out}}=0.055399425-0.019049119=0.036350306$$
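In code (note that the original weights w5 and w7 are used here, not the ones updated in step 3.1: within a single pass, every gradient is taken with respect to the old weights):

```python
delta1 = 0.138498562    # dE1/dy1, from step 3.1
delta2 = -0.038098237   # dE2/dy2, computed the same way for y2

w5, w7 = 0.40, 0.50     # original weights, before the step 3.1 update

dE_dH1out = delta1 * w5 + delta2 * w7   # 0.055399425 - 0.019049119 = 0.036350306
print(dE_dH1out)
```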
2] ∂H1out/∂H1_____________(2)
From step 1.1 we know that
$$H_{1out}=\sigma(H_1)$$
Applying the sigmoid derivative
we get
$$\frac{\partial H_{1out}}{\partial H_1}=H_{1out}(1-H_{1out})$$
By substituting the values in the above equation,
$$\frac{\partial H_{1out}}{\partial H_1}=0.593269992(1-0.593269992)$$
we get
$$\frac{\partial H_{1out}}{\partial H_1}=0.241300709$$
3] ∂H1/∂w1_____________(3)
From step 1.1, EQUATION 1:_________________________(H1),
$$\frac{\partial}{\partial w_1}(x_1w_1+x_2w_2+b_1)=x_1$$
we get
$$\frac{\partial H_1}{\partial w_1}=x_1$$
Substituting the value we get
$$\frac{\partial H_1}{\partial w_1}=0.05$$
Substituting the values of ∂E/∂H1out_____________(1),
∂H1out/∂H1_____________(2),
∂H1/∂w1_____________(3) in equation (B), we get
finally, the gradient of w1:
$$\frac{\partial E}{\partial w_1}=(0.036350306)(0.241300709)(0.05)=0.000438568$$
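Putting the three terms together in code:

```python
H1out, x1 = 0.593269992, 0.05

dE_dH1out  = 0.036350306            # term (1), from above
dH1out_dH1 = H1out * (1 - H1out)    # 0.241300709 ... term (2)
dH1_dw1    = x1                     # 0.05        ... term (3)

dE_dw1 = dE_dH1out * dH1out_dH1 * dH1_dw1   # 0.000438568
print(dE_dw1)
```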
Step 3.2.1: Substitute ∂E/∂w1, lr=0.5, and the old weight w1 into
equation (UPDATE):
$$w_{1,new}=0.15-(0.5)(0.000438568)$$
Updated w1:
$$w_1=0.149781$$
Similarly, calculating for w2, w3, and w4 (for w3 and w4 the chain runs through H2out instead of H1out),
we finally get
$$w2=0.199561$$
$$w3=0.249751$$
$$w4=0.299502$$
Likewise, we keep updating the weights iteratively in order to reduce the error.
Note: The above example is shown for one forward and backward pass.
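To close the loop, here is a compact end-to-end sketch that repeats the forward and backward pass many times. It is my own consolidation of the steps above (biases kept fixed, as in the worked example), not code from the original derivation:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

x1, x2, t1, t2 = 0.05, 0.10, 0.01, 0.99
b1, b2, lr = 0.35, 0.60, 0.5
w = [0.15, 0.20, 0.25, 0.30, 0.40, 0.45, 0.50, 0.55]  # w1..w8

for step in range(10000):
    # forward pass
    H1out = sigmoid(x1 * w[0] + x2 * w[1] + b1)
    H2out = sigmoid(x1 * w[2] + x2 * w[3] + b1)
    y1out = sigmoid(H1out * w[4] + H2out * w[5] + b2)
    y2out = sigmoid(H1out * w[6] + H2out * w[7] + b2)

    # backward pass: every gradient uses the current (old) weights
    d1 = -(t1 - y1out) * y1out * (1 - y1out)             # output delta for y1
    d2 = -(t2 - y2out) * y2out * (1 - y2out)             # output delta for y2
    dH1 = (d1 * w[4] + d2 * w[6]) * H1out * (1 - H1out)  # hidden delta for H1
    dH2 = (d1 * w[5] + d2 * w[7]) * H2out * (1 - H2out)  # hidden delta for H2
    grads = [dH1 * x1, dH1 * x2, dH2 * x1, dH2 * x2,
             d1 * H1out, d1 * H2out, d2 * H1out, d2 * H2out]

    # apply the UPDATE rule to all eight weights at once
    w = [wi - lr * g for wi, g in zip(w, grads)]

Etot = 0.5 * (t1 - y1out) ** 2 + 0.5 * (t2 - y2out) ** 2
print(Etot)   # the error shrinks steadily as the passes accumulate
```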
CONCLUSION
In this blog, we have worked through a complete numerical example of neural network training, manually calculating every step of forward and backward propagation. Doing the calculations by hand gives a deeper understanding of the network's inner workings and of how its parameters are optimized. I encourage you to work out the example yourself, as hands-on practice will solidify your grasp of this fascinating topic.
Hope you enjoyed solving the problem !!!
Practice on your own !!!
Stay tuned !!!
Thank you !!!