Database normalization is the process of designing relational databases to eliminate data redundancy, maintain integrity, and enable efficient data operations (updates, inserts, deletes). Normalization follows a series of rules called normal forms.
Purpose of Normalization
- Maintaining data integrity: Keeps data consistent and prevents contradictions.
- Eliminating data redundancy: Prevents the same data from existing in multiple places, saving storage and reducing the effort required for updates.
- Ensuring proper data operations: Eliminates anomalies that can occur during data operations (update anomalies, insertion anomalies, deletion anomalies) for efficient data management.
Functional Dependencies
A key concept for understanding normalization is functional dependency. This refers to a relationship where determining the value of one attribute uniquely determines the value of another. Written as A -> B, read as “A functionally determines B” or “B is functionally dependent on A.”
Full Functional Dependency
A relationship where an attribute is functionally determined by all columns of the primary key. It cannot be determined by only a part of the primary key.
Partial Functional Dependency
A relationship where an attribute is functionally determined by part of the primary key. This becomes an issue with composite primary keys.
Transitive Functional Dependency
A relationship where an attribute is functionally determined through a non-key attribute. If A -> B and B -> C, resulting in A -> C, then C is transitively functionally dependent on A.
First Normal Form (1NF)
From an unnormalized form, eliminate repeating groups (multi-valued attributes) so that all attributes hold a single value.
Conversion Steps
- Set a primary key for the unnormalized table.
- Separate the repeating groups into a new table.
- Set a primary key for the separated table and establish a foreign key for linking back to the original table.
Second Normal Form (2NF)
From first normal form, eliminate partial functional dependencies. This applies to tables with composite primary keys.
Conversion Steps
- Find tables with composite primary keys.
- Separate attributes that are functionally dependent on only part of the primary key into a new table with that partial key as the primary key.
- Keep the separated table’s primary key as a foreign key in the original table.
Third Normal Form (3NF)
From second normal form, eliminate transitive functional dependencies.
Conversion Steps
- Find non-key attributes that are functionally dependent on other non-key attributes (transitively dependent attributes).
- Separate the transitively dependent attributes into a new table with the determining non-key attribute as the primary key.
- Keep the separated table’s primary key as a foreign key in the original table.
Normalization involves trade-offs with database performance and complexity. Excessive normalization can lead to performance degradation due to increased table joins, so it is important to choose the appropriate normalization level based on system requirements.