Indexing in DBMS
Introduction and Types of Index
After studying file organization, we now understand how records are stored on disk. However, even with good file organization, searching for specific records can still be slow.
To improve search efficiency, DBMS uses Indexing.
Indexing is a data structure technique used to speed up data retrieval operations.
1. What is Indexing?
Indexing is a mechanism used by DBMS to quickly locate records without scanning the entire file.
An index is similar to the index of a book:
- Instead of reading every page,
- You look at the index,
- Find the page number,
- And directly go to that page.
Similarly, in DBMS:
- The index stores key values
- Along with pointers to the actual records on disk.
2. Why Do We Need Indexing?
Without indexing:
- Searching requires a full table scan.
- Time complexity becomes O(n).
- Disk I/O operations increase.
With indexing:
- Direct access to records.
- Fewer disk accesses.
- Faster query execution.
Indexing is especially important when:
- The table is large.
- Search operations are frequent.
- Range queries are common.
3. Basic Structure of an Index
An index contains:
- Search key (attribute used for searching)
- Pointer (address of record or block)
Example:
| Roll_No | Pointer |
| 101 | Block 1 |
| 102 | Block 3 |
| 103 | Block 5 |
The index is usually stored separately from the data file.
4. Types of Indexing
There are different types of indexing based on structure and storage.
We will first understand basic classification.
5. Primary Index
A Primary Index is created on the primary key of the table.
Characteristics:
- The data file is sorted based on the primary key.
- The index is built on that sorted key.
- Usually, a sparse index.
Example:
If records are sorted by Student_ID, the index is built on Student_ID.
5.1 Features of Primary Index
- One primary index per table.
- Efficient for range queries.
- Requiresa sorted file.
6. Secondary Index
A Secondary Index is created on a non-primary key attribute.
Characteristics:
- The data file may not be sorted.
- Multiple secondary indexes can exist.
- Usually, a dense index.
Example:
Creatingan index on the Department in a Student table.
6.1 Features of Secondary Index
- Can exist on multiple attributes.
- Improves search on non-key attributes.
- Requires extra storage.
7. Dense Index
In a Dense Index:
- There is an index entry for every search key value.
- Every record has a corresponding index entry.
Example:
If a file has 1000 records, the index will have 1000 entries.
Advantages:
- Faster search.
- Direct record access.
Disadvantages:
- Larger index size.
- More storage is required.
8. Sparse Index
In a Sparse Index:
- Index entries are created only for some search key values.
- Usually one entry per block.
Example:
If each block stores 100 records, the index may store one entry per block.
Advantages:
- Smaller index size.
- Less storage is required.
Disadvantages:
- Slightly slower than dense index.
- Requires a sorted file.
9. Multilevel Index
When a single-level index becomes too large:
- Searching the index itself becomes slow.
Solution:
Create an index on the index.
This is called Multilevel Indexing.
Structure:
Data File
↓
Level 1 Index
↓
Level 2 Index
This reduces search time significantly.
Multilevel indexing forms the foundation of tree-based indexing (like B+ Trees).
10. Comparison of Index Types
| Type | Sorted File Required | Entries per Record | Use Case |
| Primary Index | Yes | Sparse | Primary key search |
| Secondary Index | No | Dense | Non-key search |
| Dense Index | No | One per record | Fast retrieval |
| Sparse Index | Yes | One per block | Storage efficient |
Summary
In this article, we studied:
- What indexing is
- Why is indexing needed
- Structure of an index
- Primary and Secondary index
- Dense and Sparse index
- Multilevel indexing
Indexing improves performance by reducing disk I/O and speeding up search operations.