PyTorch Autograd — From Tensors to Optimization

PyTorch's autograd engine is the magic behind automatic differentiation. As machine learning engineers or deep learning practitioners, we often define a loss function and expect the framework to figure out how to update the weights. PyTorch handles this with a dynamic computational graph and intuitive syntax.
Basic Autograd Example 1 – Scalar Operations
Goal: compute the gradients of the scalar output y = w * x + b with respect to the inputs x, w, and b.
import torch
# Create tensors.
x = torch.tensor(1., requires_grad=True)
w = torch.tensor(2., requires_grad=True)
b = torch.tensor(3., requires_grad=True)
# Build a computational graph.
y = w * x + b # y = 2 * 1 + 3 = 5
# Compute gradients.
y.backward()
# Print out the gradients.
print(x.grad) # x.grad = 2
print(w.grad) # w.grad = 1
print(b.grad) # b.grad = 1
What’s Happening?
We create scalar tensors x, w, and b and enable gradient tracking with requires_grad=True.
The equation y = w * x + b constructs a computational graph dynamically.
Calling .backward() triggers reverse-mode differentiation and computes the partial derivatives:
$$\frac{\partial y}{\partial x} = w = 2, \quad \frac{\partial y}{\partial w} = x = 1, \quad \frac{\partial y}{\partial b} = 1.$$
Output
tensor(2.)
tensor(1.)
tensor(1.)
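As a sanity check, the same values can be obtained with torch.autograd.grad, which returns the gradients directly as a tuple instead of storing them in .grad. A minimal sketch that rebuilds the same graph:

import torch

x = torch.tensor(1., requires_grad=True)
w = torch.tensor(2., requires_grad=True)
b = torch.tensor(3., requires_grad=True)
y = w * x + b

# Returns (dy/dx, dy/dw, dy/db) without touching the .grad attributes.
dx, dw, db = torch.autograd.grad(y, (x, w, b))
print(dx, dw, db)  # tensor(2.) tensor(1.) tensor(1.)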
Takeaways
.backward() computes gradients for scalar outputs.
Gradients are stored in the .grad attributes of tensors.
This is the foundation of how neural networks learn!
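One detail worth keeping in mind: gradients accumulate in .grad across repeated calls to .backward(), so they are normally zeroed between steps (optimizers do this via zero_grad()). A minimal sketch:

import torch

x = torch.tensor(1., requires_grad=True)
w = torch.tensor(2., requires_grad=True)
b = torch.tensor(3., requires_grad=True)

# Two backward passes on freshly built graphs accumulate into .grad.
for _ in range(2):
    y = w * x + b
    y.backward()

print(w.grad)   # tensor(2.) -- dy/dw = 1, accumulated twice
w.grad.zero_()  # reset before the next backward pass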
Basic Autograd Example 2 – Training a Linear Model
Now that we understand scalar autograd, let’s move on to a real-world mini training scenario using PyTorch modules.
Train a single-layer (fully connected) neural network using:
A forward pass
Loss calculation
Backward pass
A single optimizer step
import torch
import torch.nn as nn
# Input and target tensors of shape (10, 3) and (10, 2)
x = torch.randn(10, 3)
y = torch.randn(10, 2)
# Build a fully connected layer (3 input -> 2 output)
linear = nn.Linear(3, 2)
print('w:', linear.weight)
print('b:', linear.bias)
# Define loss and optimizer
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(linear.parameters(), lr=0.01)
# Forward pass
pred = linear(x)
# Compute loss
loss = criterion(pred, y)
print('loss:', loss.item())
# Backward pass
loss.backward()
# Gradients
print('dL/dw:', linear.weight.grad)
print('dL/db:', linear.bias.grad)
# Optimization step
optimizer.step()
# Optional low-level manual update (not recommended: .data bypasses autograd tracking)
# linear.weight.data.sub_(0.01 * linear.weight.grad.data)
# Forward pass again to observe loss decrease
pred = linear(x)
loss = criterion(pred, y)
print('loss after 1 step optimization:', loss.item())
Breakdown
| Step | Description |
| --- | --- |
| x, y | Random input and target tensors |
| nn.Linear(3, 2) | A simple layer: 3 inputs → 2 outputs |
| criterion | Mean Squared Error loss |
| optimizer | Stochastic Gradient Descent (SGD) |
| Forward pass | Get predictions from the model |
| loss.backward() | Compute gradients of the loss w.r.t. model weights |
| optimizer.step() | Update model weights to reduce the loss |
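For a sense of the shapes involved (an illustrative check, not part of the snippet above): nn.Linear(3, 2) stores a weight matrix of shape (2, 3) and a bias of shape (2,), so the forward pass maps the (10, 3) input to a (10, 2) output.

import torch
import torch.nn as nn

linear = nn.Linear(3, 2)
print(linear.weight.shape)               # torch.Size([2, 3])
print(linear.bias.shape)                 # torch.Size([2])
print(linear(torch.randn(10, 3)).shape)  # torch.Size([10, 2])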
Output
w: Parameter containing:
tensor([[ 0.0429, 0.2674, 0.0835],
[ 0.3527, -0.5326, -0.4342]], requires_grad=True)
b: Parameter containing:
tensor([-0.5735, -0.1420], requires_grad=True)
loss: 1.3224560022354126
dL/dw: tensor([[ 0.2497, 0.3206, 0.4442],
[ 0.0946, -0.6343, -0.4770]])
dL/db: tensor([-0.5793, -0.0815])
loss after 1 step optimization: 1.3091014623641968
You’ll notice that after one optimization step, the loss value decreases, which confirms that gradient descent is working!
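To go beyond a single step, the same forward/backward/step pattern is usually wrapped in a loop, with optimizer.zero_grad() clearing the accumulated gradients each iteration. A minimal sketch reusing the x, y, linear, criterion, and optimizer defined above:

for step in range(100):
    optimizer.zero_grad()      # clear gradients from the previous step
    pred = linear(x)           # forward pass
    loss = criterion(pred, y)  # compute loss
    loss.backward()            # backward pass: populate .grad
    optimizer.step()           # SGD update
    if (step + 1) % 20 == 0:
        print(f'step {step + 1}, loss {loss.item():.4f}')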
🧠 Interview Tips
| Question | What to Remember |
| --- | --- |
| What is autograd? | PyTorch's engine for automatic differentiation |
| What does .backward() do? | It computes gradients for all tensors with requires_grad=True |
| What does optimizer.step() do? | It updates parameters using the computed gradients |
| When does .grad get populated? | Only after calling .backward() |
| What's the difference between optimizer.step() and a manual update? | optimizer.step() is cleaner and supports multiple strategies (SGD, Adam, etc.) |
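To make the last row concrete, here is a sketch of what a plain SGD update looks like when written by hand; the manual version must run under torch.no_grad() so the update itself is not recorded by autograd:

import torch
import torch.nn as nn

model = nn.Linear(3, 2)
x, y = torch.randn(4, 3), torch.randn(4, 2)
lr = 0.01

loss = nn.functional.mse_loss(model(x), y)
loss.backward()

# Manual SGD update: roughly what optimizer.step() does for plain SGD.
with torch.no_grad():
    for p in model.parameters():
        p -= lr * p.grad

# The optimizer version is one line and swaps cleanly for Adam, RMSprop, etc.:
# optimizer = torch.optim.SGD(model.parameters(), lr=lr)
# optimizer.step()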