- Database Normalization is a technique of organizing the data in the database.
- Normalization is the systematic approach of decomposing tables to eliminate data redundancy and undesirable characteristics like insertion, Update and Deletion Anomalies.
Why do we need normalization
- To avoid redundency
- To avoid insert, update, delete Anomalies
Types of Normal Form
- First Normal Form (1NF)
- Second Normal Form (2NF)
- Third Normal Form (3NF)
- Boyce Cotts Normal Form (BCNF)
Basically Normal form can be upto 6NF, but we mostly normalize our database table upto 3NF and sometime to 4NF i.e Boyce Codds Normal form. Here we will be discussing upto 3rd normal form only. So lets begin from 1NF :
First Normal Form (1NF)
A Database table is said to in 1NF if it satisfies the following conditions
- Each table cell should contain a single value
- Each record needs to be unique.
Lets take an example. Suppose we have a student table in which we need to save the student record. An student can have the following columns Name, city state, zip code, Phone. here we have the possibility taht the student can have two phone numbers and we have to store this information to the table, which is stored as shown in the student table.
Here the above table voilates the condition of 1NF and hence we can conclude the above table is not in 1NF form.
So, to make the table in 1NF we store the record of each student so that the each cell in the student table have exactly one value i.e atomic value as show in the table below:
Now this table satisfies the condition for 1NF and hence we conclude that the table is in First Normal Form.
Second Normal Form (2NF)
To be in second normal form the table should satisfies the following conditions
- Should be in 1NF.
- Should not have partial dependency in the table
Let me explain what exactly partial dependency is !!
Partial Dependency in a database table occurs when a non-prime attribute is functionally dependent on part of a candidate key. i.e Partial Dependency exists, when for a composite primary key, any attribute in the table depends only on a part of the primary key and not on the complete primary key.
To remove Partial dependency, we can divide the table, remove the attribute which is causing partial dependency, and move it to some other table where it fits in well.
Third Normal Form (3NF)
Third Normal Form is an upgrade to Second Normal Form. When a table is in the Second Normal Form and has no transitive dependency, then it is in the Third Normal Form.
For a table to be in the third normal form
- It should be in the Second Normal form.
- And it should not have Transitive Dependency.
In this table Student_id is Primary Key, but city and state depends upon zip. The dependency between zip anbd other field is called transitive dependency. Hence to apply 3NF, we need to move the city and state to new table, with zip as primary key as show below:
New Student Table
1 2 3 First Normal Form : All attribute are atomic Second Normal form : No Partial Dependecies Third Normal form : No transitive Dependecies