Tree-based indexing is very efficient for range queries and ordered data retrieval.
Now we study another important indexing technique: Hash Indexing.
Hash indexing is mainly used for fast equality-based searches.
What is Hash Indexing?
Hash indexing uses a hash function to compute the location (bucket) where a record should be stored.
Instead of searching through a tree, we directly compute the address using:
Address = h(Key)
Where:
his the hash functionThe key is the search key
This allows direct access to data.
How Hash Indexing Works
Step 1: Apply the hash function to the key
Step 2: Compute the bucket number
Step 3: Access the bucket directly
Example:
If:
h(Student_ID) = Student_ID mod 10
For:
Student_ID = 123
h(123) = 3
The record is stored in bucket 3.
Static Hashing
In static hashing:
The number of buckets is fixed.
The hash function remains constant.
Advantages of Static Hashing
Very fast equality search
Simple to implement
Disadvantages of Static Hashing
Hash collisions may occur
If the file grows, the buckets may overflow
Performance degrades due to overflow chains
Hash Collision
A collision occurs when two keys map to the same bucket.
Example:
h(101) = 1
h(111) = 1
Both are stored in bucket 1.
Collision resolution methods include:
Chaining
Open addressing
Dynamic Hashing
Static hashing is not suitable when the database size changes frequently.
Dynamic hashing adjusts the number of buckets automatically.
There are two main types:
Extendible Hashing
Linear Hashing
We will focus on Extendible Hashing.
Extendible Hashing
Extendible hashing uses:
A directory
Buckets
Global depth
Local depth
Key Concepts in Extendible Hashing
Global Depth:
Number of bits used to index the directory.
Local Depth:
Number of bits used by a bucket.
How Extendible Hashing Works
Apply the hash function.
Use the first few bits to access the directory.
The directory points to the bucket.
If the bucket overflows:
Split the bucket
Increase local depth
If needed, double the directory size
This allows flexible growth without full reorganization.
Advantages of Extendible Hashing
Efficient space utilization
No long overflow chains
Better performance than static hashing
Hash Indexing vs B+ Tree
| Feature | Hash Indexing | B+ Tree |
|---|---|---|
| Equality Search | Very Fast | Fast |
| Range Query | Not Supported | Very Efficient |
| Ordered Data | No | Yes |
| Dynamic Growth | With Dynamic Hashing | Yes |
| Disk Access | Very Low | Low |
When to Use Hash Indexing
Use hash indexing when:
Queries are mostly equality-based
Example:
WHERE ID = 101Range queries are not required
Do not use hash indexing when:
Range queries are frequent
Ordered traversal is required
Summary
In this article, we studied:
Hash Indexing
Static Hashing
Collision handling
Dynamic Hashing
Extendible Hashing
Comparison with B+ Tree