Hashing and Hash Tables
β± Estimated reading time: 3 min
Hashing and Hash Tables
Hashing is a powerful technique used to store and retrieve data efficiently. It converts input data (called a key) into a fixed-size value known as a hash code or hash value, which determines the index where the data is stored in a table called a hash table.
What is Hashing?
Hashing is a process of mapping data of arbitrary size to fixed-size values using a hash function. It provides fast data access in constant time on average, making it ideal for searching, insertion, and deletion operations.
// Example (in C++)
int hashFunction(int key) {
return key % 10; // returns index between 0β9
}
Here, each key is converted into an index by taking its remainder when divided by 10. The resulting index is used to store the data in the hash table.
What is a Hash Table?
A hash table is a data structure that stores keyβvalue pairs. It uses a hash function to compute the index for storing and retrieving values quickly.
Basic Operations on Hash Table:
- Insert(key, value): Compute hash index using hash function and store the key-value pair.
- Search(key): Compute the hash index and retrieve the value if present.
- Delete(key): Locate the key using hash index and remove it.
Hash Function
A hash function is a function that takes a key as input and returns an index in the hash table. It should distribute keys uniformly to reduce collisions.
Good Hash Function Properties:
- Fast to compute
- Uniformly distributes keys
- Minimizes collisions
- Deterministic (same input β same output)
What is a Collision?
A collision occurs when two keys produce the same hash index. Since multiple keys cannot occupy the same slot, a collision resolution technique is required.
Collision Resolution Techniques:
- Chaining: Each table index stores a linked list. When multiple keys map to the same index, they are added to the list.
- Open Addressing: If a collision occurs, the algorithm searches for the next available index using:
- Linear Probing: Check the next slot (index + 1, index + 2, ...).
- Quadratic Probing: Use a quadratic sequence (index + 1Β², index + 2Β², ...).
- Double Hashing: Use a second hash function to calculate the next slot.
Example of Hash Table (using Modulus Function)
| Key | Hash Function (key % 10) | Index |
|---|---|---|
| 25 | 25 % 10 = 5 | 5 |
| 37 | 37 % 10 = 7 | 7 |
| 47 | 47 % 10 = 7 (collision) | 7 β handle using chaining |
Applications of Hashing
- Databases (indexing and key-value lookups)
- Compiler symbol tables
- Data caching and memory management
- Password verification (storing hash of passwords)
- Blockchain and cryptography
Advantages of Hashing
- Provides constant-time average search, insert, and delete operations.
- Efficient for large datasets with key-based access.
Disadvantages of Hashing
- Performance depends on hash function quality.
- Collisions can reduce efficiency.
- Not suitable for ordered data access.
Conclusion
Hashing and hash tables offer one of the most efficient methods for data storage and retrieval. By using an effective hash function and proper collision handling, they ensure fast and scalable performance in real-world applications.
Register Now
Share this Post
β Back to Tutorials