Normalization in DBMS
Introduction and First Normal Form (1NF)
Normalization is a systematic process used in database design to organize data efficiently. It helps reduce redundancy and eliminate undesirable characteristics like update, insertion, and deletion anomalies.
In this article, we will understand:
- Why is normalization needed
- Problems in unnormalized tables
- Concept of redundancy
- First Normal Form (1NF)
- How to convert a table into 1NF
1. What is Normalization?
Normalization is the process of organizing data in a database to:
- Reduce data redundancy
- Improve data integrity
- Avoid anomalies
- Make the database structure efficient
It divides large tables into smaller, well-structured tables while maintaining relationships between them.
Normalization is based on Functional Dependencies.
2. Why Do We Need Normalization?
When databases are not properly designed, they may contain repeated data and inconsistencies. This leads to problems during insertion, updating, or deletion of data.
Let us understand these problems.
3. Problems in Unnormalized Tables
Consider the following table:
| Student_ID | Student_Name | Course_ID | Course_Name | Instructor |
| 101 | Aditi | C1 | DBMS | Sharma |
| 101 | Aditi | C2 | OS | Verma |
| 102 | Rahul | C1 | DBMS | Sharma |
Here, student and course details are stored together.
This causes the following anomalies.
3.1 Update Anomaly
If the instructor for DBMS changes from Sharma to Mehta, we must update it in multiple rows.
If we miss one row, the data becomes inconsistent.
3.2 Insertion Anomaly
If we want to add a new course that currently has no students, we cannot insert it without adding a student.
So we cannot store course information independently.
3.3 Deletion Anomaly
If student 102 is deleted and he was the only student enrolled in DBMS, we might lose course information unintentionally.
This leads to loss of important data.
4. Concept of Redundancy
Redundancy means unnecessary repetition of data.
In the above table:
- Student_Name is repeated for the same Student_ID.
- Course_Name and Instructor are repeated for the same Course_ID.
Redundancy:
- Wastes storage space
- Causes inconsistency
- Makes maintenance difficult
Normalization removes this redundancy step by step.
5. First Normal Form (1NF)
A relation is said to be in First Normal Form (1NF) if:
- All attribute values are atomic (indivisible).
- There are no repeating groups or multivalued attributes.
- Each cell contains only a single value.
5.1 Understanding Atomic Values
Atomic means the value cannot be further divided.
Incorrect (Not Atomic):
| Student_ID | Phone_Numbers |
| 101 | 98765, 91234 |
Here, Phone_Numbers contains multiple values in one cell.
Correct (Atomic):
| Student_ID | Phone_Number |
| 101 | 98765 |
| 101 | 91234 |
Now each cell contains a single value.
5.2 Removing Repeating Groups
Consider this table:
| Student_ID | Student_Name | Course1 | Course2 |
| 101 | Aditi | DBMS | OS |
Here, Course1 and Course2 represent repeating groups.
To convert into 1NF:
| Student_ID | Student_Name | Course |
| 101 | Aditi | DBMS |
| 101 | Aditi | OS |
Now:
- Each column has a single value.
- No repeating attributes exist.
6. Converting UNF to 1NF (Step-by-Step)
Step 1: Identify multivalued attributes or repeating groups.
Step 2: Separate repeating values into individual rows.
Step 3: Ensure every column contains atomic values.
Step 4: Define a primary key for uniqueness.
7. Important Points About 1NF
- 1NF does not remove redundancy completely.
- It only ensures atomicity and structured data.
- Partial and transitive dependencies may still exist.
- Further normalization (2NF and 3NF) is required.
Summary
- Normalization organizes data to reduce redundancy and anomalies.
- Unnormalized tables suffer from update, insertion, and deletion anomalies.
- First Normal Form (1NF) ensures:
- Atomic values
- No repeating groups
- Single value per cell