Indexing in DBMS

Introduction and Types of Index

After studying file organization, we now understand how records are stored on disk. However, even with good file organization, searching for specific records can still be slow.

To improve search efficiency, DBMS uses Indexing.

Indexing is a data structure technique used to speed up data retrieval operations.


1. What is Indexing?

Indexing is a mechanism used by DBMS to quickly locate records without scanning the entire file.

An index is similar to the index of a book:

  • Instead of reading every page,
  • You look at the index,
  • Find the page number,
  • And directly go to that page.

Similarly, in DBMS:

  • The index stores key values
  • Along with pointers to the actual records on disk.

2. Why Do We Need Indexing?

Without indexing:

  • Searching requires a full table scan.
  • Time complexity becomes O(n).
  • Disk I/O operations increase.

With indexing:

  • Direct access to records.
  • Fewer disk accesses.
  • Faster query execution.

Indexing is especially important when:

  • The table is large.
  • Search operations are frequent.
  • Range queries are common.

3. Basic Structure of an Index

An index contains:

  • Search key (attribute used for searching)
  • Pointer (address of record or block)

Example:

Roll_No

Pointer

101

Block 1

102

Block 3

103

Block 5

The index is usually stored separately from the data file.


4. Types of Indexing

There are different types of indexing based on structure and storage.

We will first understand basic classification.


5. Primary Index

A Primary Index is created on the primary key of the table.

Characteristics:

  • The data file is sorted based on the primary key.
  • The index is built on that sorted key.
  • Usually, a sparse index.

Example:

If records are sorted by Student_ID, the index is built on Student_ID.


5.1 Features of Primary Index

  • One primary index per table.
  • Efficient for range queries.
  • Requiresa sorted file.

6. Secondary Index

A Secondary Index is created on a non-primary key attribute.

Characteristics:

  • The data file may not be sorted.
  • Multiple secondary indexes can exist.
  • Usually, a dense index.

Example:

Creatingan index on the Department in a Student table.


6.1 Features of Secondary Index

  • Can exist on multiple attributes.
  • Improves search on non-key attributes.
  • Requires extra storage.

7. Dense Index

In a Dense Index:

  • There is an index entry for every search key value.
  • Every record has a corresponding index entry.

Example:

If a file has 1000 records, the index will have 1000 entries.

Advantages:

  • Faster search.
  • Direct record access.

Disadvantages:

  • Larger index size.
  • More storage is required.

8. Sparse Index

In a Sparse Index:

  • Index entries are created only for some search key values.
  • Usually one entry per block.

Example:

If each block stores 100 records, the index may store one entry per block.

Advantages:

  • Smaller index size.
  • Less storage is required.

Disadvantages:

  • Slightly slower than dense index.
  • Requires a sorted file.

9. Multilevel Index

When a single-level index becomes too large:

  • Searching the index itself becomes slow.

Solution:

Create an index on the index.

This is called Multilevel Indexing.

Structure:

Data File

Level 1 Index

Level 2 Index

This reduces search time significantly.

Multilevel indexing forms the foundation of tree-based indexing (like B+ Trees).


10. Comparison of Index Types

Type

Sorted File Required

Entries per Record

Use Case

Primary Index

Yes

Sparse

Primary key search

Secondary Index

No

Dense

Non-key search

Dense Index

No

One per record

Fast retrieval

Sparse Index

Yes

One per block

Storage efficient


Summary

In this article, we studied:

  • What indexing is
  • Why is indexing needed
  • Structure of an index
  • Primary and Secondary index
  • Dense and Sparse index
  • Multilevel indexing

Indexing improves performance by reducing disk I/O and speeding up search operations.