Understanding TensorFlow Sequential Models


In machine learning, especially deep learning, the Sequential API in TensorFlow is one of the most straightforward and powerful tools for building neural networks. By stacking layers one after the other, we can easily define complex models for tasks like classification, regression, and more. In this blog post, we will explore how to use the Sequential model in TensorFlow, focusing on dense (fully connected) and convolutional layers.

Building a Sequential Model

To begin with, we use the Sequential class to initialize our model. A Sequential model is simply a linear stack of layers where we can add one layer at a time. Below is an example of how to start:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

model = Sequential()  # Initialize an empty sequential model

Now that we have our model, we can start adding layers.

Adding Dense Layers

Dense layers are the basic building blocks of most neural networks. They are fully connected layers in which each neuron in one layer is connected to every neuron in the next. When adding a dense layer, you specify the number of neurons (units) and, optionally, an activation function.

model.add(Dense(Num))  # Adds a fully connected layer with 'Num' neurons

  • input_shape=(Num,): The input shape is specified only in the first layer. This defines the shape of the input data. For instance, if your input is a vector with 784 values (e.g., a flattened 28x28 image), you set input_shape=(784,).

  • Activation Functions:

    • 'relu': The Rectified Linear Unit (ReLU) is the most commonly used activation function, defined as f(x) = max(0, x). It is cheap to compute and helps mitigate vanishing gradients, which speeds up training.

    • 'sigmoid': Outputs values between 0 and 1, making it suitable for binary classification problems.

    • 'tanh': Outputs values between -1 and 1.

    • 'leaky_relu': A variation of ReLU that allows a small, non-zero gradient for negative inputs instead of zeroing them out. (Depending on your TensorFlow version, this may need to be applied via the separate LeakyReLU layer rather than a string name.)

    • 'softmax': Generally used in the output layer for classification tasks. It converts the output into a probability distribution across classes.

For example:

model.add(Dense(128, input_shape=(784,), activation='relu'))  # 128 neurons, input shape (784,)
model.add(Dense(1, activation='sigmoid'))  # Binary classification layer
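
As a side note, the same two-layer model can equivalently be built by passing the layers as a list to the Sequential constructor. A minimal sketch:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# Equivalent to the two model.add() calls above
model = Sequential([
    Dense(128, input_shape=(784,), activation='relu'),
    Dense(1, activation='sigmoid'),
])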

Viewing the Model

Once the layers are added, you can view a summary of your model using:

model.summary()

This provides an overview of the layers, output shapes, and the number of trainable parameters in your network.
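
For the two dense layers defined above, the summary looks roughly like this (exact formatting varies across TensorFlow versions); the parameter counts are weights plus biases, e.g. 784 × 128 + 128 = 100,480 for the first layer:

Layer (type)                 Output Shape              Param #
=================================================================
dense (Dense)                (None, 128)               100480
dense_1 (Dense)              (None, 1)                 129
=================================================================
Total params: 100,609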

Adding Convolutional Layers (Conv2D)

For image data, you often use convolutional layers to extract spatial features from the input. A convolutional layer applies filters (kernels) to the input data to detect patterns like edges or textures.

from tensorflow.keras.layers import Conv2D, Flatten

model.add(Conv2D(Num, kernel_size=Num, input_shape=(Num, Num, Num)))

  • Filters: Specifies the number of filters (i.e., the depth of the output).

  • Kernel size: Defines the size of the convolutional window, for example, (3, 3).

  • Input shape: This is the shape of the input data. For instance, a 32x32 RGB image has an input shape of (32, 32, 3), where the 3 corresponds to the three color channels.

Once the convolutional layers are added, we need to flatten the output so that it can be fed into a dense layer.

model.add(Flatten())  # Flattens the output to a one-dimensional vector
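
Filling in the placeholders, a minimal sketch for 32x32 RGB images might look like this (32 filters and a 3x3 kernel are just illustrative choices):

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, Flatten

model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=(32, 32, 3)))  # 32 filters, 3x3 window
model.add(Flatten())  # Flatten the feature maps into a vector for the dense layers that follow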

Output Layer with Softmax

Finally, for classification tasks, especially multi-class classification, the softmax activation function is commonly used in the output layer. This activation function converts the outputs into a probability distribution across the classes.

model.add(Dense(10, activation='softmax'))  # 10 classes, softmax for probability distribution
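
To see what softmax does with a layer's raw outputs, here is a small standalone sketch (the printed values are approximate):

import tensorflow as tf

logits = tf.constant([2.0, 1.0, 0.1])  # Raw, unnormalized scores for three classes
probs = tf.nn.softmax(logits)          # Convert to a probability distribution
print(probs.numpy())                   # Roughly [0.659, 0.242, 0.099]; the values sum to 1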

Summary

The TensorFlow Sequential API makes it easy to build complex neural networks layer by layer. With the combination of dense, convolutional, and flatten layers, you can build models for a wide variety of tasks, from image classification to text processing. Here's a recap of the basic steps:

  1. Initialize a Sequential model.

  2. Add Dense layers with proper activation functions (relu, sigmoid, etc.).

  3. Use Conv2D layers for image data, followed by Flatten to prepare data for dense layers.

  4. End the network with a softmax output layer for multi-class classification (or a single sigmoid unit for binary classification) so the outputs can be read as probabilities.

With these tools, you're equipped to tackle a wide range of machine learning problems.
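
As a recap, here is a minimal end-to-end sketch that puts these steps together for a hypothetical 10-class problem on 32x32 RGB images:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, Flatten, Dense

model = Sequential()                        # Step 1: initialize a Sequential model
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=(32, 32, 3)))  # Step 3: convolutional layer
model.add(Flatten())                        # Step 3: flatten for the dense layers
model.add(Dense(128, activation='relu'))    # Step 2: dense layer with ReLU
model.add(Dense(10, activation='softmax'))  # Step 4: softmax output over 10 classes
model.summary()                             # Inspect layers, shapes, and parameter counts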
