What is a B+tree?


These are self-balancing data structures used to search, sequential access, insert, and delete data quickly. You might be wondering binary search of the ordered list is also quick, but here is the thing suppose if the array size is 1,000,000, then the number of steps it will take is 20, or we can say it will do 20 disk reads. Reading from a disk is significantly slower as it is a secondary disk. We can assume each disk read take 10ms
then 20 reads (actually, the last 6-7 reads can be neglected, discussed later ) will take 130ms - 140ms
this is significantly slower. B+tree can do this in 3 disk reads or 30ms
.
Some Pre-requisites
What is a Disk Block?
It is the smallest unit of data that we can read in a single I/O operation. Even if we want only a specific part of the disk block, we will have to load the whole disk block into the main memory and, find out the part we want and discard everything else. It is usually 4KB
. That's why the last 6-7 steps in binary search can be neglected because the search space there in less than one disk block ( assuming 100 records per block);
Sparse Index and Dense Index
Dense Index - Maintaining a search key-value index for every record.
Sparse Index - Maintaining a search key-value index for a range of records rather than each record
The need for B+tree
If we store data in sequential order, then the cost of operations can be substantial
Read - O(N) , talking about naive implementation
Update - O(N), can be thought of as inserting a new record and deleting the old record
Insert - We have to find the position first where to insert and then shift all records to its right by one index and then insert, this can also be O(N)
Delete - Same as in insertion, first search which records to delete and then shift of records, this can also take O(N)
All these operations can be done in logarithmic time complexity in B+tree. It is a type of multilevel index but differs from the structure.
Here, we will discuss B+tree, as it is mostly used, however, the concept of B-tree is quite similar, but it differs in its structure.
Terminology
Branching Factor - It is the maximum number of children a node can have
Leaf Nodes - These nodes hold pointer to actual pointer to the actual record, along with its search key
Internal Nodes - These nodes have child nodes that can have a max of
b
orbranching factor
child nodes a min ofceil(b/2)
. The min rule does not apply to the root node.
Structure of nodes
Fig- here b = 5
This is a structure of internal node, p_i
refers to pointers to child nodes and k_i
refers to search keys. The p_i
where i = 1,2,3..., b - 2
are pointers to the child whose search key value ranges from k_(i-1) <= key < k_i
p_0
points to the child whose search key values are less than k_0
and p_(b-1)
points to the child whose values are greater than and equal to k_(b-2)
.
As we can notice, the number of search keys is 1 less than the number of children, this is a property of B+tree and is quite evident.
The structure of the leaf node is similar to that of internal nodes, but rather than pointing to child nodes, they point to the actual record, such that,
p_i
where i = 0, 1, 2,..., b - 2
for search key k_i
points to data and p_(b-1)
points to the next ordered leaf node. This property of B+tree greatly helps for range queries, and this is what differs between B+tree and B-tree.
Operations
Read - Now, let's again take the example of when we have 1,000,000 records. As we have discussed earlier in one disk block, we can store 100 records, therefore the branching factor for our B-tree would be 100. Thus the height of the tree will be ( log b ( 1000000 ) = 3 ) . This means we only have to do 3 disk block reads and we will get our desired record.
Insert - For insertions, we leave some space overhead for new elements, the efficiency B+tree provides outweighs the space overhead given. When a new element is needed to be inserted. We first find its would be position, and then insert is there, we can do this quickly because we only have to do this for one disk block. In case the number of keys after insertion exceeds the max amount, then the node gets divided into two nodes. This way the B+tree maintains its balance, and is therefore called self-balancing. The actual method is beyond the scope of this blog.
Deletion - The process is the same as insertion, we find and then delete the record. Here if the search - keys after deletion, goes beyond the minimum limit then we join two nodes together and then update the tree.
Update - Same as read, or can be thought of as insertion and deletion
Range Query - Let's take an example, we want to find a record from
L
toU
, we will search forL
and then read disk blocks, using the linked list type linking property of leaf nodes to access records if the range exceeds disk block records
The cost of operation for each method is ( log b N ) where N is number of records
Conclusion
Hence, the B+ tree is a widely used data structure for its efficiency in handling large files and data. It grows slower (meaning in size, not to be confused with time, but time also increases as levels increase) as the data grows. It is widely used in relational databases, even MongoDB uses this in its wiredtiger storage engine. It is also used in file systems.
Resources
Subscribe to my newsletter
Read articles from Siddhartha Khuntia directly inside your inbox. Subscribe to the newsletter, and don't miss out.
Written by
