Normalization in DBMS
⏱ Estimated reading time: 3 min
Normalization is a systematic process used in DBMS to organize data in tables to reduce redundancy and avoid anomalies such as insertion, update, and deletion anomalies. It divides a database into multiple related tables and ensures that data is stored efficiently, consistently, and logically.
The main goal of normalization is to:
✔ Eliminate data redundancy
✔ Ensure data integrity
✔ Remove anomalies
✔ Make database structure flexible and scalable
Normalization was introduced by E.F. Codd, the father of the relational model.
Why Normalization is Needed?
Without normalization, problems occur:
1. Update Anomaly
Updating a value in multiple places leads to inconsistency.
2. Insert Anomaly
We cannot insert data because some other values are missing.
3. Delete Anomaly
Deleting a record removes useful information unintentionally.
Normalization solves all of these by decomposing data properly.
Functional Dependency (FD)
Normalization is based on functional dependencies.
Definition:
A functional dependency X → Y means attribute Y depends on attribute X.
Example:
StudentID → StudentName
(If we know StudentID, we can uniquely determine StudentName.)
Normal Forms in DBMS
1️⃣ First Normal Form (1NF)
A table is in 1NF if:
-
All values are atomic (indivisible)
-
No repeating groups or multivalued attributes
Example
Wrong (Not in 1NF):
| Student | Subjects |
|---|---|
| A | Math, Science |
Correct (1NF):
| Student | Subject |
|---|---|
| A | Math |
| A | Science |
2️⃣ Second Normal Form (2NF)
A table is in 2NF if:
-
It is already in 1NF
-
No partial dependency (non-key attribute depends on part of a composite key)
Example
Composite Key: (StudentID, CourseID)
FD: StudentID → StudentName (partial dependency)
Solution → Split the table.
3️⃣ Third Normal Form (3NF)
A table is in 3NF if:
-
It is in 2NF
-
No transitive dependency
Transitive dependency:
A → B and B → C, then A → C
Example
StudentID → Department
Department → HOD
StudentID indirectly determines HOD → Transitive dependency
Solve → Create separate Department table.
4️⃣ Boyce–Codd Normal Form (BCNF)
A stronger version of 3NF.
A table is in BCNF if:
For every functional dependency X → Y, X should be a super key.
BCNF removes anomalies that 3NF sometimes fails to eliminate.
5️⃣ Fourth Normal Form (4NF)
A table is in 4NF if:
-
It is in BCNF
-
It has no multivalued dependencies
6️⃣ Fifth Normal Form (5NF)
A table is in 5NF if:
-
It has no join dependency
This is used rarely in practical databases.
Example: Normalizing a Table
Consider a table:
| Student | Course | Instructor |
|---|---|---|
| A | Math | Dr. Rao |
| A | Physics | Dr. Roy |
| B | Math | Dr. Rao |
Problems
-
Redundant instructor names (Update anomaly)
-
Cannot insert a new course without a student (Insert anomaly)
Solution — Apply Normalization
1NF: Already atomic.
2NF: All attributes depend on composite key (Student, Course). Fine.
3NF: Instructor depends on Course → Transitive dependency.
So we split it:
Student_Course Table
(Student, Course)
Course_Instructor Table
(Course, Instructor)
Now:
✔ No redundancy
✔ No anomalies
✔ Consistent data
Conclusion
Normalization is a vital technique in DBMS used to organize data efficiently. It improves database reliability by reducing redundancy and preventing anomalies. The most widely used normal forms in real systems are 1NF, 2NF, and 3NF, while BCNF is applied for better accuracy. Higher normal forms (4NF, 5NF) are used in complex database scenarios.
Normalization ensures a clean, consistent, and optimized database design.
Register Now
Share this Post
← Back to Tutorials