Indexing in DBMS
B-Tree and B+ Tree Indexing
In the previous article, we studied basic indexing concepts like primary index, secondary index, dense index, sparse index, and multilevel indexing.
When multilevel indexing grows further, it forms tree-based structures.
The most widely used tree-based indexing structures in DBMS are:
- B-Tree
- B+ Tree
These structures ensure efficient search, insertion, and deletion while maintaining balance.
1. Why Do We Need Tree-Based Indexing?
Simple indexing has limitations:
- If the index becomes very large, searching becomes slow.
- Inserting new keys may require shifting entries.
- Maintaining sorted order becomes costly.
To solve these problems, balanced tree structures are used.
Tree-based indexing ensures:
- Logarithmic search time
- Balanced height
- Efficient insertion and deletion
- Reduced disk access
2. B-Tree in DBMS
A B-Tree is a self-balancing search tree used for indexing.
It maintains sorted data and allows searches, insertions, and deletions in logarithmic time.
2.1 Properties of B-Tree
Let order = m
- Each node can have at most m children.
- Each internal node can have at most m ā 1 keys.
- Except for the root, every node must have at least ām/2ā children.
- All leaf nodes are at the same level.
- The tree remains balanced.
2.2 Structure of B-Tree
- Keys are stored in both internal and leaf nodes.
- Internal nodes guide the search.
- Leaf nodes also contain actual keys and data pointers.
2.3 Search in B-Tree
Search starts from the root:
- Compare key values.
- Move to the appropriate child.
- Continue until found, or the leaf is reached.
Time Complexity:
O(log n)
2.4 Insertion in B-Tree
- Insert the key in the appropriate leaf node.
- If node overflows:
- Split the node.
- Promote the middle key to the parent.
- Splitting may propagate upward.
The tree remains balanced.
3. B+ Tree in DBMS
B+ Tree is an improved version of B-Tree.
It is the most widely used indexing structure in real databases.
3.1 Key Difference from B-Tree
In B+ Tree:
- Internal nodes store only keys.
- Leaf nodes store:
- Keys
- Pointers to actual records
- All records are stored at the leaf level.
- Leaf nodes are linked together.
3.2 Properties of B+ Tree
- All records are stored in leaf nodes.
- Internal nodes act only as index nodes.
- Leaf nodes are linked using pointers.
- Tree remains balanced.
- All leaf nodes are at the same level.
3.3 Why the B+ Tree is Preferred
- Faster range queries
- Sequential access is efficient
- Better disk block utilization
- Smaller internal nodes
- Higher fan-out (more children per node)
- Shorter tree height
Because internal nodes store only keys, more keys fit in a block, reducing tree height.
4. B-Tree vs B+ Tree
| Feature | B-Tree | B+ Tree |
| Data stored in | Internal + Leaf | Leaf only |
| Leaf nodes linked | No | Yes |
| Range queries | Less efficient | Very efficient |
| Disk block usage | Less optimal | More optimal |
| Used in DBMS | Rarely | Widely used |
In practical DBMS systems, the B+ Tree is preferred.
5. Advantages of Tree-Based Indexing
- Balanced structure
- Logarithmic search time
- Efficient insertion and deletion
- Good for range queries
- Reduced disk I/O
Summary
In this article, we studied:
- Need for tree-based indexing
- B-Tree and its properties
- B+ Tree and its properties
- Differences between B-Tree and B+ Tree
- Why the B+ Tree is preferred in DBMS
B+ Tree is the backbone of indexing in most relational database systems.