Single Layer Neural Network Using PyTorch

Nitin Sharma

In this article, we will explore the Iris flower dataset, a well-known and historically significant dataset in the field of machine learning. Originally introduced by the statistician Ronald Fisher in 1936, this dataset has been widely used for classification tasks. It consists of 150 samples from three different species of Iris flowers—Setosa, Versicolor, and Virginica—each characterized by four features: sepal length, sepal width, petal length, and petal width.

We will utilize the PyTorch framework to develop a classification model that can accurately identify the species of Iris flowers based on these features. Throughout the article, we will walk through the process step by step, from data loading and preprocessing to building, training, and evaluating our model. By the end, you will have a solid understanding of how to apply machine learning techniques to this dataset using PyTorch.

We will utilize the Lightning framework of PyTorch, which simplifies the process of building and training deep learning models. This framework provides a high-level interface that promotes best practices, enhances code organization, and facilitates efficient model training and testing. By leveraging Lightning, we can focus on developing our model's architecture and experiment with different training strategies, while the framework handles the boilerplate code and optimization tasks for us.

We will construct the neural network architecture as illustrated below. The model will take two features as input and consist of one hidden layer that contains two neurons. Each of these neurons will utilize the ReLU (Rectified Linear Unit) activation function to introduce non-linearity into the model. The final output layer will be designed to classify the input data into one of three distinct categories. This configuration aims to effectively capture the underlying patterns in the data for accurate classification.

Start by installing the Lightning framework:

%%capture
!pip install lightning

Next, we import all the libraries we need:

import torch # torch will allow us to create tensors.
import torch.nn as nn # torch.nn allows us to create a neural network.
# nn.functional gives us access to the activation and loss functions.
import torch.nn.functional as F 
from torch.optim import Adam # optim contains many optimizers. This time we're using Adam

import lightning as L # lightning has tons of cool tools that make neural networks easier
# these are needed for the training data
from torch.utils.data import TensorDataset, DataLoader 

import pandas as pd # We'll use pandas to work with the data as a DataFrame
# We'll use this to create training and testing datasets
from sklearn.model_selection import train_test_split 
from sklearn.preprocessing import MinMaxScaler
from sklearn.datasets import load_iris

We load the Iris dataset using scikit-learn:

iris = load_iris(as_frame=True)
df = iris.data

The dataset consists of 150 samples in total, 50 for each of the three Iris species: Setosa, Versicolor, and Virginica.

df.shape
(150, 4)
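It's also worth a quick look at the first few rows with df.head(): each row holds the four measurements for one flower, and the first row is the familiar Setosa sample (5.1, 3.5, 1.4, 0.2).

df.head()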

To start our analysis, we need to divide the dataset into training and testing subsets. The first step in this process is to separate the relevant columns into two objects: a DataFrame for the input values and a Series for the labels.

The first DataFrame, which we will name "input_values," will contain the features that we will use to make our predictions. Specifically, this DataFrame will include the measurements of the petal and sepal widths, which are critical for our predictive model.

The second, labeled "label_values", will hold the target variable we aim to predict. It contains the species classifications as a pandas Series of integer codes, which will let us assess the accuracy of our predictions once the model is trained.

By clearly defining these two DataFrames, we set the foundation for an organized approach to model training and evaluation.

In this example, we will keep the neural network simple by using only the values for petal width and sepal width as inputs. First, we'll isolate the columns we want from those we don't need by passing the DataFrame (df) a list of the column names we want to retrieve: ['petal width (cm)', 'sepal width (cm)'].

input_values = df[['petal width (cm)', 'sepal width (cm)']]
label_values = iris.target

The pandas factorize() method returns two outputs: an array of numeric codes with the same length as the input, and an array of the unique values that each code corresponds to.

classes_as_numbers, classes = label_values.factorize()
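As a quick sanity check (this assumes scikit-learn's standard encoding, where the target Series already holds the integers 0, 1, and 2 in species order), we can inspect both outputs alongside the species names they map to:

print(classes_as_numbers[:5])  # integer codes, one per sample
print(classes)                 # the unique values each code stands for
print(iris.target_names)       # ['setosa' 'versicolor' 'virginica']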

We will separate the variables, specifically input_values and classes_as_numbers, into distinct training and testing datasets. This process is essential for building a robust machine learning model and helps us evaluate its performance effectively. To accomplish this, we will utilize the train_test_split() function from the sklearn library. This function allows us to randomly partition our data, ensuring that we have a subset for training the model and a separate subset for testing its accuracy and reliability.

input_train, input_test, label_train, label_test = train_test_split(input_values,
                                                                    classes_as_numbers,
                                                                    test_size=0.25,
                                                                    stratify=classes_as_numbers)
input_train.shape
(112, 2)
input_test.shape
(38, 2)

Since our neural network has three outputs, one for each species (as illustrated in the drawing of the neural network above), we need to convert the numbers in label_train into arrays with three elements, where each element corresponds to one output of the neural network. We will use the following encoding: [1.0, 0.0, 0.0] for Setosa, [0.0, 1.0, 0.0] for Versicolor, and [0.0, 0.0, 1.0] for Virginica. The good news is that F.one_hot() performs this encoding for us. We then call .type(torch.float32) so the values are stored as floats, the format our loss function expects.

one_hot_label_train = F.one_hot(torch.tensor(label_train)).type(torch.float32)
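A quick check shows the encoding in action (the exact rows you see depend on your train/test split):

print(label_train[:3])         # e.g. [0 2 1]
print(one_hot_label_train[:3]) # e.g. tensor([[1., 0., 0.], [0., 0., 1.], [0., 1., 0.]])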

To enhance the effectiveness of our machine learning models, it is important to normalize the input variables so that their values fall within a range of 0 to 1. Normalization standardizes the data, ensuring that all features contribute equally during the training process. This scaling helps to improve the model's convergence and overall performance. To achieve this, we will utilize the MinMaxScaler, a tool provided by the scikit-learn library, which efficiently transforms the data by adjusting the minimum and maximum values accordingly.

# Initialize the scaler and fit it on the training data only
scaler = MinMaxScaler()
input_train_normalized = scaler.fit_transform(input_train)
# Reuse the training fit for the test data, so both sets share the same scale
input_test_normalized = scaler.transform(input_test)
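As a sanity check, each scaled training column should now span exactly 0 to 1, while the test columns can fall slightly outside that range because they reuse the training fit:

print(input_train_normalized.min(axis=0), input_train_normalized.max(axis=0))
print(input_test_normalized.min(axis=0), input_test_normalized.max(axis=0))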

To effectively train our neural network, we need to organize our training data into a DataLoader. DataLoaders are particularly useful for handling large datasets, as they facilitate the processing of data in manageable batches. This approach not only allows us to shuffle the dataset at the beginning of each epoch, enhancing the training process by reducing potential overfitting, but it also lets us work with a smaller subset of the data if we're aiming for a quick, preliminary run—perfect for debugging our code.

To start, we will convert our normalized training inputs, input_train_normalized, into PyTorch tensors using torch.tensor(). This step is crucial because neural networks in PyTorch operate on tensors.

Once we have our input data in tensor format, we'll pair the input tensors with our labels, one_hot_label_train, to form a TensorDataset. This dataset acts as a wrapper that matches each input with its corresponding label, ensuring that during training the model learns from the correct label for each input.

Finally, we'll use the TensorDataset to create the DataLoader. By doing so, we can specify parameters such as batch size and whether we would like to shuffle the data. With everything set up in this manner, the DataLoader will streamline the process of feeding data to our neural network during training, enhancing both efficiency and ease of use.

# Convert the normalized training inputs into tensors
input_train_tensors = torch.tensor(input_train_normalized).type(torch.float32)
# Convert the normalized test inputs into tensors
input_test_tensors = torch.tensor(input_test_normalized).type(torch.float32)
# Pair each input tensor with its one-hot label, then wrap it all in a DataLoader
train_dataset = TensorDataset(input_train_tensors, one_hot_label_train)
train_dataloader = DataLoader(train_dataset)
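We accept the defaults here (a batch size of 1 and no shuffling), which is fine for a dataset this small. For reference, the batched, shuffled loader mentioned above would look like this (illustrative only; it is not used in the rest of the article):

# Illustrative alternative: 16 samples per batch, reshuffled every epoch
shuffled_dataloader = DataLoader(train_dataset, batch_size=16, shuffle=True)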

To build a neural network with PyTorch Lightning, we create a new class that inherits from L.LightningModule. This approach makes it easier to train the neural network.

Our new class will include the following methods:

  1. __init__(): This method initializes the weights and biases and handles other housekeeping tasks.

  2. forward(): This method performs a forward pass through the neural network.

  3. configure_optimizers(): This method sets up the optimizer. Although there are many optimizers available, for this tutorial, we will use the Adam optimizer.

  4. training_step(): This method takes the training data, passes it to the forward() method, calculates the loss, and logs the loss values.

By implementing these methods, we will create a functional and efficient neural network ready for training.

class MultipleInsOuts(L.LightningModule):

  def __init__(self):
    super().__init__()

    L.seed_everything(seed=42)
    self.input_to_hidden = nn.Linear(in_features=2, out_features=2, bias=True)
    self.hidden_to_output = nn.Linear(in_features=2, out_features=3, bias=True)
    self.loss = nn.MSELoss(reduction='sum')

  def forward(self, input):
    ## First, we run the input values through the linear
    ## transformation into the hidden layer
    hidden = self.input_to_hidden(input)
    ## Then we run those values through a ReLU activation function
    ## and pass the result on to the output layer
    output_values = self.hidden_to_output(torch.relu(hidden))
    return output_values

  def configure_optimizers(self):
        ## configuring the optimizer
        ## consists of passing it the weights and biases we want
        ## to optimize, which are all in self.parameters(),
        ## and setting the learning rate with lr=0.001.
        return Adam(self.parameters(), lr=0.001)

  def training_step(self, batch, batch_idx):
        ## The first thing we do is split 'batch'
        ## into the input and label values.
        inputs, labels = batch

        ## Then we run the inputs through the neural network
        outputs = self.forward(inputs)

        ## Then we calculate the loss...
        loss = self.loss(outputs, labels)

        ## ...and log it so we can track training progress
        self.log('train_loss', loss)

        return loss

Training our new neural network involves creating a model from the new class, MultipleInsOuts.

model = MultipleInsOuts()
INFO: Seed set to 42
INFO:lightning.fabric.utilities.seed:Seed set to 42

We will create a Lightning Trainer (L.Trainer) to optimize our model's parameters, starting with 100 epochs. Training over multiple epochs gives the optimizer repeated passes over the data to refine the weights and biases.

trainer = L.Trainer(max_epochs=100)
trainer.fit(model, train_dataloaders=train_dataloader)

Let's test the model using the test data:

# Run the input_test_tensors through the neural network
predictions = model(input_test_tensors)

## Select the output with the highest value for each sample
predicted_labels = torch.argmax(predictions, dim=1) ## dim=1 takes the argmax across each row, i.e. across the three outputs per sample

torch.sum(torch.eq(torch.tensor(label_test), predicted_labels)) / len(predicted_labels)
tensor(0.8947)

We get roughly 89% accuracy on the test set.

With our model now trained, we can use it to make predictions on new data by passing the model a tensor of normalized petal and sepal widths, as sketched below.
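Here is a minimal sketch, using a made-up flower with a petal width of 0.2 cm and a sepal width of 3.5 cm (the column order must match training: petal width first, then sepal width):

# Hypothetical new flower: petal width 0.2 cm, sepal width 3.5 cm
new_flower = scaler.transform([[0.2, 3.5]])  # reuse the scaler fitted on the training data
new_tensor = torch.tensor(new_flower).type(torch.float32)
predicted_index = torch.argmax(model(new_tensor), dim=1)
print(iris.target_names[predicted_index.item()])  # map the index back to a species name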

The Jupyter notebook is available on GitHub at iris_PyTorch.
