📚 Day 1 of My Learning Journey: Understanding SQL Normalization

Today, I studied SQL normalization through a helpful DataCamp tutorial. Here’s a quick summary of what I learned today.

What is SQL Normalization?

Normalization is the process of organizing data in a database to reduce redundancy and improve data integrity. It helps in breaking down large, complex tables into smaller, related ones, making the database more efficient and easier to maintain.

In simple terms, it is a process of designing a database effectively to avoid data redundancy.

Normalization is often used in data integrity, efficiency in querying and storage optimization.

Why Do We Need Normalization?

  1. To eliminate data redundancy: Avoid storing the same piece of data in multiple places.

  2. To ensure logical data dependencies: Organize data so that relationships between attributes make sense.

  3. To enhance data integrity: Maintain consistent, accurate, and reliable data across the database.

  4. To optimize storage: Reduce unnecessary data duplication and save space.

  5. To improve query performance: Streamline data retrieval and updates for better speed and efficiency.

  6. To simplify data updates: Make modifications easier without risking inconsistencies.

Before jumping to learn different levels of normalization, understand some basic terminology.

Candidate Key:

  • Set of columns which uniquely identify a record.

  • A table can have multiple candidate keys because there can be multiple set of columns which uniquely identify a record/row in a table.

Non-key Attribute:

  • Columns/attributes which are not part of primary key or candidate key.

Partial Dependency:

  • If a candidate key is a combination of two columns (or multiple columns) then every non-key attribute should be fully dependent on all the columns of the candidate key. If there is any non-key attribute that depends only on one of the candidate key columns, then it is partial dependency.

Transitive Dependency:

  • Let’s say you have a table T which has 3 columns A, B, and C

  • If A is functionally dependent on B and B is functionally dependent on C then we can say that A is functionally dependent on C.

Different levels of normalization:

  1. First Normal Form (1NF):

    • Every column/attribute need to have a single attribute.

    • Each row should be unique. Either through a single or multiple columns. Not mandatory to have a primary key.

  2. Second Normal Form (2NF):

    • Must meet all requirements of 1NF.

    • Every non-key attribute must be fully dependent on the candidate key (not just part of it).

      i.e. if a non-key attribute is partially dependent on candidate key then split them into separate tables.

    • Every table should have primary key and relationship between the tables should be formed using foreign key.

  3. Third Normal Form (3NF):

    • Must meet all requirements of 2NF.

    • Non-primary attributes should not depend on other non-primary attributes (no transitive dependency).

  4. Boyce-Codd Normal Form (BCNF):

    • A stronger version of 3NF.

    • Even if there is more than one candidate key, every determinant must be a candidate key.

  5. Fourth Normal Form (4NF):

    • Must meet all requirements of BCNF.

    • No multi-valued dependencies (one attribute should not imply multiple independent values for another attribute).

  6. Fifth Normal Form (5NF):

    • Must meet all requirements of 4NF.

    • Deals with complex join dependencies; tables should be broken down to eliminate redundancy in multi-join scenarios.

Generally, normalization takes place until 3NF in most of the companies.

For better understanding of Normalization, I watched this youtube video by techTFQ.

Conclusion

Mastering SQL normalization is an essential skill for anyone working with relational databases. It ensures that data is organized logically, which not only reduces redundancy but also makes updates and queries more efficient. While normalization can sometimes make querying more complex, the benefits of improved data integrity and easier maintenance are clear. By practicing and applying these concepts, you can design robust, scalable databases that will serve your applications well for years to come.

What's Next?

  1. Review more examples of NFs

  2. Practice normalizing sample tables manually.

Thanks for reading! 🌟
I’ll be posting my learning journey here daily — feel free to follow along or share tips if you have any! 🚀

0
Subscribe to my newsletter

Read articles from Jahnavi Sri Kavya Bollimuntha directly inside your inbox. Subscribe to the newsletter, and don't miss out.

Written by

Jahnavi Sri Kavya Bollimuntha
Jahnavi Sri Kavya Bollimuntha