Introduction to PyTorch and CNN Project

PyTorch Overview
PyTorch is a Python library for building deep learning projects. It simplifies array-based computation and supports dynamic neural networks with automatic differentiation, which is central to deep learning.
Key Features:
Simplicity and Flexibility: PyTorch is known for its simple and flexible Python interface, making it easier to use compared to other deep learning libraries.
GPU Acceleration: PyTorch supports strong GPU acceleration, which is essential for handling large-scale computations efficiently.
Dynamic Neural Networks: PyTorch allows for the creation of dynamic neural networks, which can change during runtime, offering more flexibility in model building.
Development and Community:
Developed by FAIR: PyTorch was developed by the Facebook AI Research (FAIR) division to handle large-scale image analysis tasks like object detection, segmentation, and classification.
Open-Source: It is free and open-source under the modified BSD license, with contributions from a large community of developers.
Support and Stability:
Cloud Platform Support: PyTorch is supported by major cloud platforms such as Amazon Web Services, Google Cloud Platform, and Microsoft Azure.
Regular Maintenance: It is mature and stable, with regular updates and maintenance.
Hardware Support: PyTorch supports CPU, GPU, TPU, and parallel processing, allowing for distributed training across multiple GPUs and machines.
Beginner-Friendly: PyTorch can be run in a browser using Google Colaboratory (Colab), which requires no installation or configuration, making it accessible for beginners.
We prefer to use Google Colab, but if your system has a GPU you can install and use PyTorch locally.
Installation: https://pytorch.org/get-started/locally/
CUDA (Compute Unified Device Architecture) allows PyTorch to use NVIDIA GPUs for faster computations.
Without GPU acceleration, training that finishes in minutes or hours can instead take hours or even days.
import torch
torch.cuda.is_available()
If it returns False, you are not using a GPU. In Google Colab, change the runtime type to T4 GPU from the 'Runtime' menu.
Tensors
Tensors are essential data structures in PyTorch used to store data at various stages of deep learning.
They are multi-dimensional arrays that can represent scalars (0D), vectors (1D), matrices (2D), and higher-dimensional data.
Tensors in PyTorch are similar to NumPy's ndarrays but have additional advantages like faster operations on GPUs and the ability to be distributed across multiple CPUs and GPUs.
Tensors created with requires_grad=True keep track of the computation graph that produced them, which is what makes automatic gradient computation possible.
In deep learning, tensors are used to store data at different stages, like images or text, and process them through neural networks.
first_tensor = torch.tensor([[1,2,3,4],[5,6,7,8]])
second_tensor = torch.tensor([[2,3,5,4],[6,4,6,3]])
# addition
print(first_tensor+second_tensor)
# subtraction
print(first_tensor-second_tensor)
By default, a tensor is allocated on the CPU. To use the GPU for faster computation, we explicitly allocate the tensor on the GPU, as demonstrated below.
device = 'cuda' if torch.cuda.is_available() else 'cpu'
print(device)
ten_a = torch.tensor([[1,2,3,4],[5,6,7,8]], device = device)
ten_b = torch.tensor([[2,3,5,4],[6,4,6,3]], device = device)
multi_ten = ten_a * ten_b
print(multi_ten)
Refer: https://colab.research.google.com/drive/1Iw9jtZbngRrFQZv06-SwurlPRhTU7n2L?usp=sharing
Moving Tensor between CPUs and GPUs
Tensors are moved between CPUs and GPUs to leverage the GPU's faster processing capabilities, especially for large-scale neural network training.
CPU to GPU: When training large neural networks or handling high-dimensional data (like images) that require faster computation, we move tensors to the GPU. This is because GPUs are optimized for parallel processing and can significantly speed up training.
GPU to CPU: After training, if the output tensors need further processing and the libraries used (like NumPy) only support CPU data, we move the tensors back to the CPU. This ensures compatibility with these libraries for further data manipulation.
Methods to Move Tensors:
CPU to GPU (three ways): Use tensor.cuda(), tensor.to('cuda'), or tensor.to('cuda:0').
GPU to CPU: Use tensor.cpu(). If the tensor has requires_grad=True and will be converted to NumPy, use tensor.detach().cpu() so that it is first detached from the computation graph.
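The round trip described above can be sketched as follows. This is a minimal illustration, not the notebook's code: the variable names (weights, result_np) are made up for the example, and it falls back to the CPU when no GPU is visible so it runs anywhere.

```python
import torch

# Pick the GPU when one is visible to PyTorch, otherwise stay on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# A small tensor that tracks gradients; it is created on the CPU by default.
weights = torch.tensor([[1.0, 2.0], [3.0, 4.0]], requires_grad=True)

# CPU -> device: .to() returns a tensor on the target device
# (the same tensor is returned if it already lives there).
weights_on_device = weights.to(device)
result = weights_on_device * 2

# Device -> CPU -> NumPy: detach() drops the autograd history so that
# .numpy() is allowed, and .cpu() brings the data back to host memory.
result_np = result.detach().cpu().numpy()
print(result_np)
```

Calling .numpy() directly on result would raise an error because the tensor still requires gradients, which is why detach() comes first.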
Ways to Create Tensors
Creating Tensors from Python Objects: You can create tensors from lists, tuples, or NumPy arrays using the torch.tensor function.
Predefined Functions: PyTorch provides functions like torch.ones() and torch.zeros() to create tensors filled with a fixed value, and torch.empty() to allocate a tensor without initializing its contents.
Random Tensors: Functions like torch.rand(), torch.randn(), and torch.randint() generate tensors with random values from different distributions.
Similar Properties: Use functions like torch.ones_like() to create tensors with properties (dtype, device, layout) similar to another tensor.
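A short sketch of each creation route listed above. The shapes and values are arbitrary choices for illustration.

```python
import torch

# From Python objects
from_list = torch.tensor([[1, 2], [3, 4]])   # nested list -> 2x2 integer tensor
from_tuple = torch.tensor((1.5, 2.5, 3.5))   # tuple -> 1D float tensor

# Predefined fill values
zeros = torch.zeros(2, 3)
ones = torch.ones(2, 3)
uninitialized = torch.empty(2, 3)  # contents are whatever was in memory

# Random values (seeded so repeated runs give the same numbers)
torch.manual_seed(0)
uniform = torch.rand(2, 3)                # uniform on [0, 1)
normal = torch.randn(2, 3)                # standard normal distribution
integers = torch.randint(0, 10, (2, 3))   # integers in [0, 10)

# Same shape, dtype, and device as an existing tensor
ones_like_list = torch.ones_like(from_list)
print(ones_like_list)
```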
Tensor Attributes
Device Location: Use the device attribute to find out where the tensor is stored (CPU or GPU).
Data Type: The dtype attribute reveals the data type of the tensor elements (e.g., int64).
Dimensions and Rank: The shape attribute provides the dimensions of the tensor, and ndim gives the number of dimensions (rank).
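As a quick illustration, the attributes above can be read directly from any tensor. The tensor t here is just an example value.

```python
import torch

t = torch.tensor([[1, 2, 3], [4, 5, 6]])

print(t.device)  # where the tensor is stored
print(t.dtype)   # data type of the elements
print(t.shape)   # size along each dimension
print(t.ndim)    # number of dimensions (rank)
```

Note that these are attributes, so they are accessed without parentheses.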
Tensor Data Types
Data Type Specification: You can specify the data type of a tensor using the dtype argument, such as torch.int8 for 8-bit integers or torch.float32 for 32-bit floating-point numbers.
Casting Tensors: Tensors can be cast to different data types using methods like float() or the to() method.
Automatic Casting: When an operation mixes data types, PyTorch automatically promotes the result to the larger data type so that precision is not lost.
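The three points above can be sketched in a few lines. The variable names are illustrative only.

```python
import torch

# Specify the data type at creation time
small_ints = torch.tensor([1, 2, 3], dtype=torch.int8)
floats = torch.tensor([0.5, 1.5, 2.5], dtype=torch.float32)
big_ints = torch.tensor([10, 20, 30], dtype=torch.int64)

# Explicit casting
as_float = small_ints.float()          # int8 -> float32
as_int64 = small_ints.to(torch.int64)  # int8 -> int64

# Automatic promotion when dtypes are mixed in one operation
mixed_float = small_ints + floats      # int8 + float32 -> float32
mixed_int = small_ints + big_ints      # int8 + int64 -> int64

print(mixed_float.dtype, mixed_int.dtype)
```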
Tensor Operations
Indexing and Slicing: Similar to NumPy, you can access single elements (indexing) and ranges of elements (slicing) in tensors.
Combining and Splitting: Use torch.stack to combine tensors along a new dimension and torch.unbind to split tensors along a specified dimension.
Conditional Indexing: You can extract data that meets specific criteria, such as values less than a certain number.
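A small sketch of these operations, using an example 2x4 tensor. The threshold of 5 in the conditional index is an arbitrary choice.

```python
import torch

x = torch.tensor([[1, 2, 3, 4], [5, 6, 7, 8]])

# Indexing and slicing
element = x[1, 2]         # row 1, column 2
first_row = x[0]          # the whole first row
middle_cols = x[:, 1:3]   # columns 1 and 2 of every row

# Combining and splitting
a = torch.tensor([1, 2, 3])
b = torch.tensor([4, 5, 6])
stacked = torch.stack([a, b])         # new leading dimension, shape (2, 3)
rows = torch.unbind(stacked, dim=0)   # tuple of two tensors of shape (3,)

# Conditional indexing: keep only the values below 5
small = x[x < 5]

print(element, middle_cols, stacked.shape, small)
```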
Mathematical functions:
Pointwise Operations: Perform operations on each point in the tensor individually (e.g., add(), mul(), div()).
Reduction Operations: Reduce numbers down to a single number or a smaller set of numbers, reducing the tensor's dimensionality (e.g., mean(), median(), mode()).
Comparison Functions: Compare values within a tensor or between tensors (e.g., finding min/max values, sorting).
Linear Algebra Functions: Enable matrix operations essential for deep learning computations.
Spectral and Other Math Operations: Useful for data transformations or analysis.
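The first three categories above can be illustrated with a pair of small example tensors. This is only a sketch; linear algebra functions are covered in the next section.

```python
import torch

a = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
b = torch.tensor([[10.0, 20.0], [30.0, 40.0]])

# Pointwise operations act on matching elements
added = torch.add(a, b)
multiplied = torch.mul(a, b)
divided = torch.div(b, a)

# Reduction operations collapse one or all dimensions
overall_mean = a.mean()        # single value over every element
column_means = a.mean(dim=0)   # one mean per column

# Comparison functions
largest = a.max()
sorted_desc, order = torch.sort(torch.tensor([3, 1, 2]), descending=True)

print(added, overall_mean, column_means, sorted_desc, order)
```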
Linear Algebra Operations
Matrix Products: PyTorch provides functions like torch.matmul and torch.mm for matrix multiplication. torch.matmul supports broadcasting, while torch.mm does not.
Multi-Dot Function: The torch.linalg.multi_dot function is used for calculating the matrix product of multiple two-dimensional tensors.
Eigen Decomposition: The torch.linalg.eig function computes the eigenvalues and eigenvectors of a square matrix, which is essential for various deep learning computations.
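A minimal sketch of the functions named above. The matrices are small example values chosen so the results are easy to check by hand.

```python
import torch

A = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
B = torch.tensor([[0.0, 1.0], [1.0, 0.0]])

# torch.mm: strictly 2D x 2D matrix multiplication
mm_result = torch.mm(A, B)

# torch.matmul: also broadcasts, here one 2x2 matrix against a batch of five
batch = torch.ones(5, 2, 2)
matmul_result = torch.matmul(batch, A)   # shape (5, 2, 2)

# multi_dot: product of a chain of 2D matrices, equivalent to A @ B @ A
chain = torch.linalg.multi_dot([A, B, A])

# eig: eigenvalues and eigenvectors of a square matrix (returned as complex tensors)
D = torch.tensor([[2.0, 0.0], [0.0, 3.0]])
eigenvalues, eigenvectors = torch.linalg.eig(D)

print(mm_result, matmul_result.shape, chain, eigenvalues)
```

Passing the 3D batch tensor to torch.mm instead of torch.matmul would raise an error, since torch.mm accepts only two-dimensional inputs.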
Automatic Differentiation
Automatic differentiation (Autograd) is a technique used to efficiently compute gradients for complex functions, which is essential for training neural networks. Training a neural network consists of two steps: forward propagation and backward propagation.
Forward Propagation: Input data is passed through the network to make predictions.
Backward Propagation: The error between predicted and actual outputs is propagated backward through the network to compute gradients, which are then used to adjust the network's parameters.
Gradient Calculation: Automatic differentiation computes the derivatives (gradients) of the loss function with respect to the network's parameters. This helps in optimizing the network by minimizing the loss.
PyTorch's autograd package automates this process, making it easier to perform backpropagation and access individual gradients. This is crucial for training and optimizing deep learning models.
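To make this concrete, here is a minimal sketch that uses a single scalar instead of a network. The function y = x^2 + 2x is an arbitrary example; its derivative is 2x + 2, which is 8 at x = 3.

```python
import torch

# requires_grad=True tells autograd to record operations on x
x = torch.tensor(3.0, requires_grad=True)

# Forward pass: build the computation graph
y = x ** 2 + 2 * x

# Backward pass: compute dy/dx and store it in x.grad
y.backward()

print(x.grad)  # tensor(8.)
```

In a real model the same backward() call is made on the loss, and every parameter receives its gradient in its own .grad attribute.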
Developing Deep Learning Model
Please Refer:
https://colab.research.google.com/drive/1XGJ_ReAPnexH0G8IDHI_OwY4DsqzQcLq?usp=sharing
Stages of Deep Learning Model Training
The process includes data preparation, model development, and model deployment.
Data Preparation: Involves loading data in various formats (text, images, videos, audio) and converting them into numeric values (tensors).
Model Development: Consists of designing the model, training it with training data, and validating its performance using validation data to prevent overfitting.
Model Deployment: The final step where the model is saved and deployed to a production environment, such as a cloud server or edge device.
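The three stages can be sketched end to end on a toy problem. Everything here is an assumption made for illustration and is not taken from the linked notebook: the synthetic data following y = 2x + 1, the single nn.Linear layer, the SGD settings, and the in-memory buffer standing in for a file on disk.

```python
import io

import torch
from torch import nn

torch.manual_seed(0)

# 1. Data preparation: synthetic numeric data, already in tensor form
x = torch.linspace(-1, 1, 100).unsqueeze(1)
y = 2 * x + 1
x_train, y_train = x[::2], y[::2]    # even rows for training
x_val, y_val = x[1::2], y[1::2]      # odd rows for validation

# 2. Model development: design, train, validate
model = nn.Linear(1, 1)
loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

model.train()
for epoch in range(200):
    optimizer.zero_grad()                     # clear old gradients
    loss = loss_fn(model(x_train), y_train)   # forward pass
    loss.backward()                           # backward pass
    optimizer.step()                          # parameter update

model.eval()
with torch.no_grad():                         # no gradients needed for validation
    val_loss = loss_fn(model(x_val), y_val).item()

# 3. Model deployment: save the learned parameters
buffer = io.BytesIO()                         # stand-in for a file path
torch.save(model.state_dict(), buffer)

print(f"validation loss: {val_loss:.6f}")
```

In practice the state dict would be written to a file and loaded again in the serving environment with load_state_dict.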
Different Types of Data Used in the Deep Learning Process
Training Data: This is the data used to train the model. The model learns from this data by adjusting its parameters to minimize errors. It's the largest portion of your dataset.
Validation Data: This data is used to tune the model's hyperparameters and to check for overfitting. Overfitting occurs when the model performs well on training data but poorly on unseen data. Validation data helps ensure the model generalizes well to new data.
Testing Data: This is the data used to evaluate the final model's performance. It provides an unbiased evaluation of the model after it has been trained and validated. Testing data is never used during the training or validation phases.
Usage:
Training Data: Used during the model training phase to adjust model parameters.
Validation Data: Used during the model validation phase to tune hyperparameters and prevent overfitting.
Testing Data: Used after training and validation to assess the model's performance on unseen data.
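One common way to produce these three sets is torch.utils.data.random_split. The sketch below assumes a 70/15/15 split and a made-up dataset of 100 samples; both are illustrative choices, not recommendations from the original notebook.

```python
import torch
from torch.utils.data import TensorDataset, random_split

# A made-up dataset: 100 samples with one feature and a binary label
features = torch.arange(100, dtype=torch.float32).unsqueeze(1)
labels = (features.squeeze(1) % 2).long()
dataset = TensorDataset(features, labels)

# A seeded generator makes the split reproducible
generator = torch.Generator().manual_seed(42)
train_set, val_set, test_set = random_split(
    dataset, [70, 15, 15], generator=generator
)

print(len(train_set), len(val_set), len(test_set))  # 70 15 15
```

The three subsets do not overlap, so the test set stays unseen until the final evaluation.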
Written by

Omkar Kasture
MERN Stack Developer, Machine learning & Deep Learning