Hashing in DBMS
⏱ Estimated reading time: 3 min
Hashing in DBMS is a technique used to store and retrieve data quickly based on a key value. It uses a hash function to convert a key (such as RollNo, EmpID, AccountNo) into a hash address (bucket number) where the record is stored.
Hashing is primarily used in file organization, indexing, and memory tables to achieve O(1) average search time.
Why Hashing is Used?
Hashing provides:
-
Very fast search (constant-time lookup)
-
Efficient insertion and deletion
-
Best performance for exact-match queries
Example → Find student with RollNo = 105
Hashing is NOT suitable for range queries (e.g., marks between 70 and 90).
Components of Hashing
1️⃣ Hash Function
A mathematical function that converts a key value into an integer address.
Example:
h(x) = x mod 10
If key = 105 → h(105) = 5
Record goes to bucket 5.
A good hash function must ensure:
-
Uniform distribution
-
Minimum collisions
-
Fast computation
2️⃣ Bucket
A bucket is a location where records mapped by hash function are stored.
3️⃣ Collision
A collision occurs when two different keys hash to the same bucket.
Example:
h(15) = 5
h(25) = 5
Both try to store at address 5 → collision.
Collisions are unavoidable, so DBMS uses collision resolution techniques.
Collision Resolution Techniques
1️⃣ Open Hashing (Separate Chaining)
Each bucket stores a linked list of records.
Advantages:
-
Simple and easy to implement
-
No limit on number of collisions
Disadvantages:
-
More memory needed
-
Longer search time when list grows
2️⃣ Closed Hashing (Open Addressing)
If bucket is full, DBMS finds another empty bucket.
Types:
a) Linear Probing
Check next bucket sequentially:
h(k), h(k)+1, h(k)+2 ...
b) Quadratic Probing
Use square increments:
h(k)+1², h(k)+2², h(k)+3² ...
c) Double Hashing
Use a second hash function when collision occurs.
Types of Hashing in DBMS
1️⃣ Static Hashing
-
Hash function does NOT change
-
Number of buckets is fixed
-
Suitable for small or stable databases
Limitations:
-
Overflow occurs as data grows
-
Performance degrades
2️⃣ Dynamic Hashing
-
Hash table grows or shrinks dynamically
-
Solves overflow and distribution problems
Types:
a) Extendible Hashing
-
Uses directory of pointers
-
Directory doubles when buckets overflow
-
Very efficient for large databases
b) Linear Hashing
-
Gradually splits buckets without doubling
-
Smooth expansion
-
No directory required
Dynamic hashing is widely used in modern DBMS indexing.
Advantages of Hashing
-
Very fast access for exact match queries
-
Efficient insertion, deletion, search
-
Suitable for large databases
-
Simple implementation compared to B-trees
Disadvantages of Hashing
-
Not suitable for range queries
-
Collisions degrade performance
-
Storage overhead in chaining
-
Rehashing is costly in dynamic hashing
Conclusion
Hashing is a powerful technique in DBMS used to achieve fast data retrieval. It maps keys to specific locations using a hash function and handles collisions using various strategies. Hashing is ideal for equality searches, such as account lookup, employee ID search, etc., but it is not suitable for range-based queries. Modern DBMS systems extensively use dynamic hashing for scalability and efficient performance.
Register Now
Share this Post
← Back to Tutorials