Tree-based indexing is very efficient for range queries and ordered data retrieval.

Now we study another important indexing technique: Hash Indexing.

Hash indexing is mainly used for fast equality-based searches.

What is Hash Indexing?

Hash indexing uses a hash function to compute the location (bucket) where a record should be stored.

Instead of searching through a tree, we directly compute the address using:

Address = h(Key)

Where:

  • h is the hash function

  • The key is the search key

This allows direct access to data.

How Hash Indexing Works

  • Step 1: Apply the hash function to the key

  • Step 2: Compute the bucket number

  • Step 3: Access the bucket directly

Example:

If:

h(Student_ID) = Student_ID mod 10

For:

Student_ID = 123
h(123) = 3

The record is stored in bucket 3.

Static Hashing

In static hashing:

  • The number of buckets is fixed.

  • The hash function remains constant.

Advantages of Static Hashing

  • Very fast equality search

  • Simple to implement

Disadvantages of Static Hashing

  • Hash collisions may occur

  • If the file grows, the buckets may overflow

  • Performance degrades due to overflow chains

Hash Collision

A collision occurs when two keys map to the same bucket.

Example:

h(101) = 1
h(111) = 1

Both are stored in bucket 1.

Collision resolution methods include:

  • Chaining

  • Open addressing

Dynamic Hashing

Static hashing is not suitable when the database size changes frequently.

Dynamic hashing adjusts the number of buckets automatically.

There are two main types:

  • Extendible Hashing

  • Linear Hashing

We will focus on Extendible Hashing.

Extendible Hashing

Extendible hashing uses:

  • A directory

  • Buckets

  • Global depth

  • Local depth

Key Concepts in Extendible Hashing

Global Depth:

  • Number of bits used to index the directory.

Local Depth:

  • Number of bits used by a bucket.

How Extendible Hashing Works

  1. Apply the hash function.

  2. Use the first few bits to access the directory.

  3. The directory points to the bucket.

If the bucket overflows:

  • Split the bucket

  • Increase local depth

  • If needed, double the directory size

This allows flexible growth without full reorganization.

Advantages of Extendible Hashing

  • Efficient space utilization

  • No long overflow chains

  • Better performance than static hashing

Hash Indexing vs B+ Tree

FeatureHash IndexingB+ Tree
Equality SearchVery FastFast
Range QueryNot SupportedVery Efficient
Ordered DataNoYes
Dynamic GrowthWith Dynamic HashingYes
Disk AccessVery LowLow

When to Use Hash Indexing

Use hash indexing when:

  • Queries are mostly equality-based

  • Example: WHERE ID = 101

  • Range queries are not required

Do not use hash indexing when:

  • Range queries are frequent

  • Ordered traversal is required

Summary

In this article, we studied:

  • Hash Indexing

  • Static Hashing

  • Collision handling

  • Dynamic Hashing

  • Extendible Hashing

  • Comparison with B+ Tree