Stand with Ukraine

Stand with Ukraine logo
Quick donation

Data Normalization and Denormalization: When Each Should Be Used?

ITOMYCH STUDIO
December 23, 2021
7 min to read
Data Normalization and Denormalization: When Each Should Be Used?

What is data normalization?

Database normalization is the process of organizing data into a linked table. This significantly increases the integrity and productivity of the query. To normalize a table, you must divide it into sub-tables and establish relationships between them.

Thus, database normalization is a practical optimization of table structures. Much attention here is paid to setting up relationships between tables.

There are several types of database normalization process:

  • First normal form (1 NF)
  • Second normal form (2 NF)
  • Third normal form (3 NF)
  • Boyce-Codd's normal form or fourth normal form (BCNF or 4 NF)
  • Fifth normal form (5 NF)
  • Sixth normal form (6 NF)

Data normalization avoids data redundancy by distributing information to different addresses of one database.

In addition to cumbersome maintenance and updating of the database, a large amount of information can also be inconsistent. Anomalies of inserting, updating and deleting data are becoming frequent. In addition, redundant data significantly increases the size of the database and takes up too much disk space.

What is data denormalization?

This optimization method is applied after normalization. Denormalization is the deliberate bringing of a database structure to a state that doesn't meet the requirements of normalization. Denormalization is used to add redundancy to fulfill a query quickly. In this case, the number of tables decreases, in contrast to normalization.

Fewer connections reduce errors. But there are drawbacks to this method: denormalization can make it difficult to update and refine the code.

The difference between data normalization and denormalization

We've already found that normalization and denormalization are the two pillars of one process — changing the structure of databases. Now you might be interested in seeing the border line between the two more clearly. Satisfy your wish!

Normalization:

  • Removes redundant data from the database and stores non-null and sequential data in it;
  • Cleans the database of unused data by cutting down data inconsistencies;
  • Ensures higher performance thanks to the usage of optimized memory;
  • Augments the number of tables used (which isn't a problem);
  • Gives full support for data integrity, i.e. adding or removing data from the table, which, in turn, precludes inconsistencies in the relationship of tables;
  • Works perfectly well if you need to insert, update, or delete data.

Denormalization:

  • Сombines multiple tables to simplify queries;
  • Focuses on achieving faster queries;
  • Reduces the number of tables used.
  • Is characterised by the possible memory loss (consider this carefully!);
  • Doesn't support data integrity;
  • Fits your needs when you need questions to be translated into the tables format.

Examples of use

Let's also look at examples of normalization and denormalization in action. Here's the table we can use as the data template:

If we normalize this data according to the company's parameters, we'll get such a result:

In the same way, we can create a table for the product, which will include the ID, name, and quantity of the product.

Let's move on! Now we've got 2 tables:

And what denormalization brings here? Thanks to it, we'll get the following table:

As you may see, the use of normalization and denormalization is an integral part of working with databases. The benefit of using this or that method should be considered in each case. With the right choice, you will never face data loss problems or scaling difficulties.