How is Graph stored in memory?

Piyush BajajPiyush Bajaj
5 min read

Introduction

Graphs are fundamental data structures used to model relationships between entities in various fields such as computer science, social networks, transportation systems, and more. When working with graphs in computer programs, one of the key challenges is storing them efficiently in memory.

Here in this article we will talk about storing graph in different data structures and their efficiency.

Understanding Graph Representation

At its core, a graph comprises vertices (nodes) and edges. These edges can be directed or undirected and may have weights associated with them. The choice of representation depends on the characteristics of the graph and the operations to be performed.

Data Structures for Graph Storage

  1. Adjacency Matrix

This is a 2D array where each cell represents the presence or absence of an edge between two vertices. While suitable for dense graphs, adjacency matrices consume more memory, especially for sparse graphs.

Representation: An adjacency matrix is a 2D array where each cell matrix[i][j]matrix[i][j] represents the presence or absence of an edge between vertex i and vertex j. For weighted graphs, the cell may contain the weight of the edge.

Adjacency Matrix:
0 1 1 0 0 
1 0 1 1 0 
1 1 0 0 1 
0 1 0 0 0 
0 0 1 0 0

In this matrix, Rows and columns represent vertices. A value of 1 indicates the presence of an edge between two vertices.

    0  1  2  3  4
  +--------------
0 | 0  1  1  0  0
1 | 1  0  1  1  0
2 | 1  1  0  0  1
3 | 0  1  0  0  0
4 | 0  0  1  0  0

Each row and column corresponds to a vertex, and the presence of an edge is indicated by a 1 in the corresponding cell.

Performance:

  • Edge Existence Check: O(1)

  • Edge Addition/Removal: O(1)

  • Space Complexity: O(V2), where V is the number of vertices

  • Time Complexity for Traversal: O(V2)

Pros:

  • Fast edge existence check and edge addition/removal.

  • Suitable for dense graphs with a large number of edges.

  • Easy implementation and intuitive for small graphs.

Cons:

  • High memory consumption, especially for sparse graphs.

  • Inefficient for graphs with many isolated vertices.

  1. Adjacency List

Each vertex is associated with a list of its neighboring vertices. This approach is memory-efficient for sparse graphs as it only stores existing edges.

Adjacency List:
0: 1 2
1: 0 2 3
2: 0 1 4
3: 1
4: 2

Representation: An adjacency list is a collection of lists, where each vertex maintains a list of its neighboring vertices. Each line represents a vertex. The numbers after the colon denote the neighboring vertices of the vertex. For example, vertex 0 has neighbors 1 and 2, vertex 1 has neighbors 0, 2, and 3, and so on.

Performance:

  • Edge Existence Check: O(degree(v))

  • Edge Addition/Removal: O(1) if self-loops are allowed, O(degree(v)) otherwise

  • Space Complexity: O(V+E), where Vis the number of vertices and E is the number of edges

  • Time Complexity for Traversal: O(V+E)

Pros:

  • Memory-efficient for sparse graphs.

  • Suitable for graphs with varying degrees of connectivity.

  • Efficient for traversals and graph algorithms.

Cons:

  • Slower edge existence check compared to adjacency matrix for dense graphs.

  • Requires more memory overhead for storing pointers/references.

  1. Edge List

A simple list of all the edges in the graph, with each edge represented by a pair of vertices. Edge lists are memory-efficient for sparse graphs but may be less efficient for certain operations.

Representation: An edge list is a simple list containing all edges in the graph, each represented by a pair of vertices. Each pair of numbers represents an edge between two vertices. For example, (0, 1) indicates an edge between vertex 0 and vertex 1, and so on.

Edge List:
(0, 1) (0, 2) (1, 2) (1, 3) (2, 4)

Performance:

  • Edge Existence Check: O(E)

  • Edge Addition/Removal: O(1)

  • Space Complexity: O(E), where E is the number of edges

  • Time Complexity for Traversal: O(E)

Pros:

  • Memory-efficient for sparse graphs.

  • Easy to implement and understand.

  • Suitable for certain graph algorithms like Kruskal's algorithm for minimum spanning trees.

Cons:

  • Slower edge existence check compared to adjacency matrix and adjacency list.

  • Inefficient for certain operations like finding all neighbors of a vertex.

  1. Adjacency Set

Similar to the adjacency list, but it uses sets to store neighboring vertices. This can offer faster edge presence checks.

Representation: An adjacency set is similar to the adjacency list, but it uses sets instead of lists to store neighboring vertices. In an adjacency list, each vertex maintains a list (or other suitable data structure) of its neighboring vertices directly. However, in an adjacency set representation, each vertex maintains a set (often implemented using a hash set or a balanced binary search tree) of its neighboring vertices. The use of a set ensures that duplicate edges are not stored and allows for efficient edge presence checks.

Adjacency Set:
0: {1, 2}
1: {0, 2, 3}
2: {0, 1, 4}
3: {1}
4: {2}

Performance:

  • Edge Existence Check: O(log⁡(degree(v)))

  • Edge Addition/Removal: O(log⁡(degree(v)))

  • Space Complexity: O(V+E), where V is the number of vertices and E is the number of edges

  • Time Complexity for Traversal: O(V+E)

Pros:

  • Fast edge existence check and edge addition/removal.

  • Memory-efficient for sparse graphs.

  • Suitable for graphs with varying degrees of connectivity.

Cons:

  • Slightly slower performance compared to adjacency list due to overhead of set operations.

  • Requires additional memory overhead for storing sets.

Choosing the Right Data Structure

The choice of data structure depends on factors such as graph size, density, memory constraints, and the operations to be performed. For example, adjacency matrices are suitable for dense graphs with frequent edge operations, while adjacency lists are preferable for sparse graphs with memory constraints.

Performance Comparison

Comparing the performance of different data structures for common graph operations reveals trade-offs. While adjacency matrices offer fast edge operations, they consume more memory. Adjacency lists, on the other hand, are memory-efficient but may have slower edge operations for dense graphs.

By understanding the characteristics of different data structures and their trade-offs, we developers can make informed decisions when choosing the right representation for their graphs.

For more info: follow me on instagram @smrtdvlpr

Hope you enjoyed reading this :)

0
Subscribe to my newsletter

Read articles from Piyush Bajaj directly inside your inbox. Subscribe to the newsletter, and don't miss out.

Written by

Piyush Bajaj
Piyush Bajaj