Data Structures - Best Way To Build Your Logic
This blog covers the core concepts of data structures in a way that is easy to remember. Behind every click, swipe, and search, there's a complex dance of data and instructions. Data structures and algorithms are the choreography that makes it all work.
What are Data Structures? -> They organize data in a specific format, allowing for efficient storage, retrieval, and manipulation. For example, imagine a library. Books are data, and the shelves are data structures. A well-organized library with sections and Dewey Decimal classifications makes finding a book efficient.
Data Structures Are of Two Types - a) Linear Data Structures b) Non-Linear Data Structures
Linear Data Structures - In linear data structures, data elements are organized sequentially, i.e., each element is connected to its previous and next element. Examples include:
a. Arrays b. Linked Lists c. Stacks d. Queues
Non-Linear Data Structures - In non-linear data structures, elements are not organized sequentially. Instead, each element may be connected to multiple other elements. Examples include:
a. Trees (Binary Trees, Binary Search Trees, AVL Trees, B-Trees, etc.) b. Graphs (Directed and Undirected Graphs, Weighted Graphs, etc.) c. Heaps (Binary Heap, Fibonacci Heap, etc.) d. Hash Tables
Arrays - An array is a fundamental data structure that stores elements of the same data type in contiguous memory locations, providing an efficient way to represent and manipulate collections of items. Arrays can be of fixed size (static arrays) or resizable (dynamic arrays), depending on the programming language and implementation.
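For instance, here is a short Python sketch contrasting a fixed-size array (via the built-in `array` module) with a dynamic list that grows as needed; the variable names are purely illustrative.

```python
from array import array

# Static-style array: all elements must share one type ('i' = signed int)
numbers = array('i', [10, 20, 30, 40])
print(numbers[2])        # O(1) access by index -> 30
numbers[2] = 35          # O(1) update in place

# Dynamic array: Python lists resize automatically as elements are appended
scores = [85, 92]
scores.append(78)        # amortized O(1) append
print(scores)            # [85, 92, 78]
```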
Stacks - A stack is a fundamental data structure that follows the Last In, First Out (LIFO) principle. In a stack, elements are added and removed from the same end, known as the top of the stack. It can be visualized as a collection of items stacked on top of each other, where only the top item is accessible. Stacks support two primary operations: push and pop.
Here are the key characteristics and operations of a stack:
Push: This operation adds an element to the top of the stack. The new element becomes the top of the stack.
Pop: This operation removes the element at the top of the stack. After the pop operation, the next element below the removed element becomes the new top of the stack.
Peek (or Top): This operation allows you to view the element at the top of the stack without removing it.
Empty Stack: A stack is considered empty when it contains no elements. Attempting to pop from an empty stack typically results in an error (underflow), depending on the implementation.
Full Stack: In some implementations, a stack may have a maximum capacity. Attempting to push onto a full stack typically results in an error (overflow).
Dynamic Size: In most implementations, stacks can dynamically grow and shrink as elements are added and removed. Dynamic resizing allows stacks to accommodate varying numbers of elements efficiently.
Applications: Stacks are used in various applications such as expression evaluation, function call management (call stack), backtracking algorithms, parsing, undo mechanisms in text editors, and more.
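As a concrete illustration of these operations, here is a minimal Python sketch of push, pop, and peek using a plain list, where the end of the list serves as the top of the stack.

```python
stack = []

# Push: add elements to the top of the stack
stack.append('a')
stack.append('b')
stack.append('c')

# Peek: view the top element without removing it
print(stack[-1])        # 'c'

# Pop: remove and return the top element (LIFO order)
print(stack.pop())      # 'c'
print(stack.pop())      # 'b'

# Empty check: popping from an empty list raises IndexError (underflow)
print(len(stack) == 0)  # False, 'a' is still on the stack
```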
Linked List - A linked list is a linear data structure where elements, called nodes, are stored in memory not necessarily in contiguous locations. Instead, each node contains a reference or pointer to the next node in the sequence, forming a chain-like structure. The last node typically points to null, indicating the end of the list. Unlike arrays, linked lists do not have a fixed size and can dynamically grow or shrink as needed.
Here are the key characteristics and operations of a linked list:
Nodes: Each node in a linked list consists of two parts: data and a reference to the next node. The data part stores the actual value, while the reference part (pointer or link) points to the next node in the sequence.
Head: The first node of the linked list is called the head. It serves as the starting point for traversing the list.
Tail: The last node of the linked list is called the tail. Its reference points to null, indicating the end of the list.
Singly Linked List: In a singly linked list, each node has a reference to the next node in the sequence. Traversal is only possible in one direction, from the head to the tail.
Doubly Linked List: In a doubly linked list, each node has references to both the next and previous nodes in the sequence. This allows for traversal in both directions.
Circular Linked List: In a circular linked list, the last node's reference points back to the head of the list, forming a circle. This structure enables continuous traversal from any node to any other node in the list.
Insertion and Deletion: Elements can be efficiently inserted or deleted from a linked list by adjusting the references of neighboring nodes. Insertion and deletion at the beginning and end of the list typically have constant time complexity O(1), while insertion and deletion at arbitrary positions may require traversing the list, resulting in linear time complexity O(n).
Random Access: Unlike arrays, linked lists do not support direct random access to elements based on index. Accessing an element at a specific position may require traversing the list from the beginning, resulting in linear time complexity O(n).
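The sketch below shows a minimal singly linked list in Python with O(1) insertion at the head and an O(n) traversal; the class and method names are illustrative, not part of any standard library.

```python
class Node:
    """A single node holding a value and a reference to the next node."""
    def __init__(self, data):
        self.data = data
        self.next = None

class SinglyLinkedList:
    def __init__(self):
        self.head = None

    def insert_at_head(self, data):
        # O(1): the new node points at the old head and becomes the new head
        node = Node(data)
        node.next = self.head
        self.head = node

    def traverse(self):
        # O(n): follow next references until we reach None (end of the list)
        current = self.head
        while current:
            print(current.data, end=" -> ")
            current = current.next
        print("None")

lst = SinglyLinkedList()
for value in [3, 2, 1]:
    lst.insert_at_head(value)
lst.traverse()  # 1 -> 2 -> 3 -> None
```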
Queues - A queue is a linear data structure that follows the First In, First Out (FIFO) principle. In a queue, elements are inserted at the rear (enqueue operation) and removed from the front (dequeue operation). It can be visualized as a line of people waiting for a service, where the first person to join the line is the first to be served, and new people join the line at the rear.
Here are the key characteristics and operations of a queue:
Enqueue: This operation adds an element to the rear of the queue. The newly added element becomes the last in line to be processed.
Dequeue: This operation removes the element at the front of the queue. The next element in line becomes the new front of the queue.
Front: This operation allows you to view the element at the front of the queue without removing it.
Rear (or Back): This operation allows you to view the element at the rear of the queue without removing it.
Empty Queue: A queue is considered empty when it contains no elements. Attempting to dequeue from an empty queue typically results in an error (underflow), depending on the implementation.
Full Queue: In some implementations, a queue may have a maximum capacity. Attempting to enqueue onto a full queue typically results in an error (overflow).
Dynamic Size: Like stacks, queues can dynamically grow and shrink as elements are added and removed. Dynamic resizing allows queues to accommodate varying numbers of elements efficiently.
Applications: Queues are used in various applications such as job scheduling, task management, breadth-first search algorithms, printer queues, message queuing systems, and more.
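A convenient way to sketch these operations in Python is `collections.deque`, which supports O(1) appends at the rear and O(1) removals from the front.

```python
from collections import deque

queue = deque()

# Enqueue: add elements at the rear
queue.append('job-1')
queue.append('job-2')
queue.append('job-3')

# Front / Rear: inspect without removing
print(queue[0])             # 'job-1' (front)
print(queue[-1])            # 'job-3' (rear)

# Dequeue: remove from the front (FIFO order)
print(queue.popleft())      # 'job-1'
print(queue.popleft())      # 'job-2'

# Empty check: popleft() on an empty deque raises IndexError (underflow)
print(len(queue) == 0)      # False, 'job-3' remains
```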
Trees - Trees are hierarchical data structures composed of nodes, where each node has a value and can have zero or more child nodes. The node at the top is called the root node, and nodes with no children are called leaf nodes. Trees are used for organizing data in a hierarchical manner, making them efficient for tasks like searching, sorting, and organizing data relationships.
Here's an explanation of some common types of trees:
Binary Tree:
In a binary tree, each node can have at most two children, typically called the left child and the right child.
Binary trees can be used to efficiently represent hierarchical structures like file systems, expression trees, and decision trees.
Binary trees can be traversed in different ways, such as in-order, pre-order, and post-order traversal.
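To make the traversals concrete, here is a small Python sketch of a binary tree with a recursive in-order traversal (left subtree, then node, then right subtree); pre-order and post-order differ only in where the node itself is visited.

```python
class TreeNode:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left    # left child
        self.right = right  # right child

def in_order(node):
    """Visit the left subtree, then the node itself, then the right subtree."""
    if node is None:
        return
    in_order(node.left)
    print(node.value, end=" ")
    in_order(node.right)

#        1
#       / \
#      2   3
#     / \
#    4   5
root = TreeNode(1, TreeNode(2, TreeNode(4), TreeNode(5)), TreeNode(3))
in_order(root)  # 4 2 5 1 3
```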
Binary Search Tree (BST):
A binary search tree is a type of binary tree where the left child of a node contains values less than the node's value, and the right child contains values greater than the node's value.
BSTs allow for efficient searching, insertion, and deletion operations. Searching for a value in a BST typically has a time complexity of O(log n) in balanced trees.
However, if the tree is unbalanced (skewed), the time complexity can degrade to O(n), making balancing operations essential for maintaining efficiency.
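A minimal BST sketch in Python might look like the following; insertion and search both descend left or right depending on how the key compares to the current node, and this version simply ignores duplicate keys.

```python
class BSTNode:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def insert(node, key):
    """Insert key into the subtree rooted at node; return the (possibly new) root."""
    if node is None:
        return BSTNode(key)
    if key < node.key:
        node.left = insert(node.left, key)
    elif key > node.key:
        node.right = insert(node.right, key)
    return node  # duplicates are ignored in this sketch

def search(node, key):
    """Return True if key exists in the subtree rooted at node."""
    if node is None:
        return False
    if key == node.key:
        return True
    return search(node.left, key) if key < node.key else search(node.right, key)

root = None
for key in [8, 3, 10, 1, 6]:
    root = insert(root, key)
print(search(root, 6))   # True
print(search(root, 7))   # False
```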
AVL Tree:
An AVL tree is a self-balancing binary search tree where the heights of the two child subtrees of any node differ by at most one.
AVL trees ensure that the tree remains balanced after insertion and deletion operations, which guarantees efficient search, insertion, and deletion operations with a worst-case time complexity of O(log n).
Balancing operations, such as rotations, are performed whenever necessary to maintain the balance factor of nodes.
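As an illustration of one such balancing operation, here is a hedged sketch of a single left rotation in Python; a complete AVL implementation would also track balance factors after every insertion or deletion and choose among the four rotation cases (left, right, left-right, right-left).

```python
class AVLNode:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.height = 1  # height of the subtree rooted at this node

def height(node):
    return node.height if node else 0

def rotate_left(z):
    """Rotate the subtree rooted at z to the left and return the new root.

    Used when z's right subtree has grown too tall.
    """
    y = z.right
    z.right = y.left   # y's left subtree moves under z
    y.left = z         # z becomes y's left child
    # Recompute heights bottom-up: z first, then its new parent y
    z.height = 1 + max(height(z.left), height(z.right))
    y.height = 1 + max(height(y.left), height(y.right))
    return y
```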
B-Tree:
A B-tree is a balanced tree data structure that generalizes binary search trees to allow for more than two children per node.
B-trees are commonly used in databases and file systems because they are optimized for systems that read and write large blocks of data.
B-trees maintain balance by ensuring that all leaf nodes are at the same level, and nodes have a minimum and maximum number of children.
B-trees have efficient search, insertion, and deletion operations with a time complexity of O(log n), and their wide nodes keep the tree shallow, minimizing the number of disk I/O operations needed per lookup.
Graph:
A graph is a non-linear data structure consisting of a set of vertices (also called nodes) and a set of edges (also called arcs or links) that connect pairs of vertices.
Graphs are widely used to represent relationships between entities, such as connections in a social network, roads in a transportation network, or dependencies between tasks in a project.
Graphs can be categorized into two main types based on the nature of their edges: directed graphs and undirected graphs.
Directed Graph (Digraph):
In a directed graph, each edge has a direction associated with it, indicating a one-way relationship between vertices.
For example, if there is an edge from vertex A to vertex B, it means that there is a directed connection from A to B, but not necessarily from B to A.
Directed graphs are used to model asymmetric relationships, such as traffic flow in road networks, dependencies in software modules, or follower-followee relationships in social networks.
Undirected Graph:
In an undirected graph, edges have no inherent direction, meaning that the relationship between vertices is bidirectional.
If there is an edge between vertices A and B in an undirected graph, it implies that there is a connection from A to B and vice versa.
Undirected graphs are used to represent symmetric relationships, such as friendships in social networks, connections in a computer network, or physical links between locations in a map.
Weighted Graph:
In a weighted graph, each edge is associated with a numerical value or weight that represents the cost, distance, or any other quantitative measure of the relationship between vertices.
The weights on the edges can represent various factors, such as the distance between two locations, the cost of traversing a road, or the time required to transmit data between nodes in a network.
Weighted graphs are used in various applications, including route planning, network optimization, and resource allocation, where the quantitative measure of relationships plays a crucial role in decision-making.
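One common way to represent these variations in code is an adjacency list. The sketch below stores an undirected weighted graph as a dictionary mapping each vertex to its (neighbor, weight) pairs; the vertices and weights are made up for illustration.

```python
# Undirected weighted graph as an adjacency list:
# each edge (u, v, w) is recorded in both directions.
edges = [("A", "B", 4), ("A", "C", 2), ("B", "C", 5), ("C", "D", 1)]

graph = {}
for u, v, w in edges:
    graph.setdefault(u, []).append((v, w))
    graph.setdefault(v, []).append((u, w))  # drop this line for a directed graph

print(graph["C"])  # [('A', 2), ('B', 5), ('D', 1)]

# Total weight of edges incident to vertex C
print(sum(w for _, w in graph["C"]))  # 8
```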
Heaps - Heaps are specialized tree-based data structures that satisfy the heap property. The heap property states that for a heap data structure:
In a max heap, the value of each node is greater than or equal to the values of its children.
In a min heap, the value of each node is less than or equal to the values of its children.
Heaps are often used in priority queues and various sorting algorithms. There are different types of heaps, with the two most common ones being Binary Heap and Fibonacci Heap:
Binary Heap:
A binary heap is a complete binary tree (a binary tree in which every level, except possibly the last, is completely filled, and all nodes are as far left as possible).
In a binary heap, the values of parent nodes are greater (for max heap) or smaller (for min heap) than the values of their children.
Binary heaps are typically implemented using arrays because the complete binary tree property allows for efficient use of contiguous memory.
Operations such as insertion, deletion, and finding the maximum or minimum value have a time complexity of O(log n) due to the height of the binary tree.
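Python's built-in `heapq` module implements a binary min heap on top of a plain list, which makes these operations easy to see; a max heap can be simulated by pushing negated values.

```python
import heapq

heap = []

# Insert: O(log n) per push; heapq maintains the min-heap property
for value in [7, 2, 9, 4]:
    heapq.heappush(heap, value)

# Find-min: the smallest element is always at index 0
print(heap[0])              # 2

# Delete-min: O(log n); elements come out in ascending order
print(heapq.heappop(heap))  # 2
print(heapq.heappop(heap))  # 4
```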
Fibonacci Heap:
A Fibonacci heap is a more complex data structure that supports a wider range of operations and has better amortized time complexities for certain operations compared to binary heaps.
A Fibonacci heap is composed of a collection of heap-ordered trees, with the tree roots linked together in a circular, doubly linked list.
Fibonacci heaps have several unique features, including the ability to merge heaps in constant time, decrease-key operation in constant amortized time, and find-min or find-max operations in constant amortized time.
Due to its more complex structure, Fibonacci heaps are more memory-intensive and have higher constant factors in their time complexities compared to binary heaps.
Fibonacci heaps are commonly used in algorithms where multiple decrease-key operations are performed, such as Dijkstra's algorithm for shortest paths and Prim's algorithm for minimum spanning trees.
Hash Tables - A hash table, also known as a hash map, is a data structure that implements an associative array abstract data type. It uses a technique called hashing to map keys to values. Hash tables offer efficient insertion, deletion, and lookup operations, with an average time complexity of O(1) for these operations, making them widely used in computer science and programming.
Here's how hash tables work:
Hash Function:
A hash function is a mathematical function that takes a key as input and produces a hash value, which is typically an integer.
The hash function maps keys of arbitrary size to fixed-size hash values.
The goal of a good hash function is to minimize collisions, where different keys produce the same hash value.
Array Storage:
Hash tables typically use an array as the underlying storage.
Each element in the array is called a bucket or slot, and it can store a key-value pair.
Hashing:
To store a key-value pair in a hash table, the key is first hashed to produce a hash value.
The hash value is then used as an index to determine the bucket where the key-value pair will be stored.
If there is a collision (i.e., multiple keys hash to the same index), various collision resolution techniques are used to handle it.
Collision Resolution:
Collision resolution strategies include separate chaining and open addressing.
Separate chaining involves storing multiple key-value pairs in the same bucket, typically using a linked list or another data structure.
Open addressing involves searching for an alternative empty slot when a collision occurs, using techniques like linear probing, quadratic probing, or double hashing.
Lookup, Insertion, and Deletion:
To retrieve the value associated with a key, the key is hashed to find the corresponding bucket, and then the value is looked up in that bucket.
To insert a new key-value pair, the key is hashed to determine the bucket, and the pair is inserted into that bucket.
To delete a key-value pair, the key is hashed to find the corresponding bucket, and then the pair is removed from that bucket.
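To tie these steps together, here is a minimal Python sketch of a hash table that resolves collisions with separate chaining; the bucket count, class name, and methods are illustrative (in practice, Python's built-in dict already provides all of this).

```python
class HashTable:
    def __init__(self, num_buckets=8):
        # Each bucket is a list of (key, value) pairs (separate chaining)
        self.buckets = [[] for _ in range(num_buckets)]

    def _index(self, key):
        # Hash the key and map the hash value onto a bucket index
        return hash(key) % len(self.buckets)

    def put(self, key, value):
        bucket = self.buckets[self._index(key)]
        for i, (k, _) in enumerate(bucket):
            if k == key:                 # key already present: overwrite
                bucket[i] = (key, value)
                return
        bucket.append((key, value))      # otherwise append a new pair

    def get(self, key):
        bucket = self.buckets[self._index(key)]
        for k, v in bucket:
            if k == key:
                return v
        raise KeyError(key)

    def delete(self, key):
        idx = self._index(key)
        self.buckets[idx] = [(k, v) for k, v in self.buckets[idx] if k != key]

table = HashTable()
table.put("apple", 3)
table.put("banana", 5)
print(table.get("apple"))   # 3
table.delete("apple")
```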