Decomposition in DBMS

Decomposition in DBMS involves dividing a table into multiple tables, aiming to eradicate redundancy, inconsistencies, and anomalies. This process, represented as {X1, X2,……Xn}, ensures dependency preservation and losslessness. When a relational model's relation lacks appropriate normal form, decomposition becomes necessary to address issues like information loss, anomalies, and redundancy, ultimately enhancing the overall design quality and efficiency of the database.

There are two types of decomposition as shown below:

Type of Decomposition

Rules for Decomposition

Whenever we decompose a relation, there are certain properties that must be satisfied to ensure no information is lost while decomposing the relations. These properties are:

Lossless Join Decomposition.
Dependency Preserving.

Lossless Join Decomposition

A lossless Join decomposition ensures two things:

No information is lost while decomposing from the original relation.
If we join back the sub decomposed relations, the same relation that was decomposed is obtained.

We can follow certain rules to ensure that the decomposition is a lossless join decomposition Let’s say we have a relation R and we decomposed it into R1 and R2, then the rules are:

The union of attributes of both the sub relations R1 and R2 must contain all the attributes of original relation R.

R1 ∪ R2 = R
The intersection of attributes of both the sub relations R1 and R2 must not be null, i.e., there should be some attributes that are present in both R1 and R2.

R1 ∩ R2 ≠ ∅
The intersection of attributes of both the sub relations R1 and R2 must be the superkey of R1 or R2, or both R1 and R2.

R1 ∩ R2 = Super key of R1 or R2

Let’s see an example of a lossless join decomposition. Suppose we have the following relation EmployeeProjectDetail as:

<EmployeeProjectDetail>

Employee_Code	Employee_Name	Employee_Email	Project_Name	Project_ID
101	John	john@demo.com	Project103	P03
101	John	john@demo.com	Project101	P01
102	Ryan	ryan@example.com	Project102	P02
103	Stephanie	stephanie@abc.com	Project102	P02

Now, we decompose this relation into EmployeeProject and ProjectDetail relations as:

<EmployeeProject>

Employee_Code	Project_ID	Employee_Name	Employee_Email
101	P03	John	john@demo.com
101	P01	John	john@demo.com
102	P04	Ryan	ryan@example.com
103	P02	Stephanie	stephanie@abc.com

The primary key of the above relation is {Employee_Code, Project_ID}.

<ProjectDetail>

Project_ID	Project_Name
P03	Project103
P01	Project101
P04	Project104
P02	Project102

The primary key of the above relation is {Project_ID}.

Now, let’s see if this is a lossless join decomposition by evaluating the rules discussed above:

Let’s first check the EmployeeProject ∪ ProjectDetail:

<EmployeeProject ∪ ProjectDetail>

Employee_Code	Project_ID	Employee_Name	Employee_Email	Project_Name
101	P03	John	john@demo.com	Project103
101	P01	John	john@demo.com	Project101
102	P04	Ryan	ryan@example.com	Project104
103	P02	Stephanie	stephanie@abc.com	Project102

As we can see all the attributes of EmployeeProject and ProjectDetail are in EmployeeProject ∪ ProjectDetail relation and it is the same as the original relation. So the first condition holds.

Now let’s check the EmployeeProject ∩ ProjectDetail:

<EmployeeProject ∩ ProjectDetail>

Project_ID
P03
P01
P04
P02

As we can see this is not null, so the the second condition holds as well. Also the EmployeeProject ∩ ProjectDetail = Project_Id. This is the super key of the ProjectDetail relation, so the third condition holds as well.

Now, since all three conditions hold for our decomposition, this is a lossless join decomposition.

Lossless vs Lossy Decomposition

In a lossy decomposition, one or more of these conditions would fail and we will not be able to recover Complete information as present in the original relation. For example, let's say we decompose our original relation EmployeeProjectDetail into EmployeeProject and ProjectDetail relations as:

<EmployeeProject>

Employee_Code	Employee_Name	Employee_Email
101	John	john@demo.com
102	Ryan	ryan@example.com
103	Stephanie	stephanie@abc.com

The primary key of the above relation is {Employee_Code}.

<ProjectDetail>

Project_ID	Project_Name
P03	Project103
P01	Project101
P04	Project104
P02	Project102

The primary key of the above relation is {Project_ID}.

Now, the intersection EmployeeProject ∩ ProjectDetail is null. Therefore there is no way for us to map a project to its employees. Thus this is a lossy decomposition.

Dependency Preserving

The second property of lossless decomposition is dependency preservation which says that after decomposing a relation R into R1 and R2, all dependencies of the original relation R must be present either in R1 or R2 or they must be derivable using the combination of functional dependencies present in R1 and R2.

Let’s understand this from the same example above:

<EmployeeProjectDetail>

Employee_Code	Employee_Name	Employee_Email	Project_Name	Project_ID
101	John	john@demo.com	Project103	P03
101	John	john@demo.com	Project101	P01
102	Ryan	ryan@example.com	Project104	P04
103	Stephanie	stephanie@abc.com	Project102	P02

In this relation we have the following FDs:

Employee_Code -> {Employee_Name, Employee_Email}
Project_ID - > Project_Name

Now, after decomposing the relation into EmployeeProject and ProjectDetail as:

<EmployeeProject>

Employee_Code	Project_ID	Employee_Name	Employee_Email
101	P03	John	john@demo.com
101	P01	John	john@demo.com
102	P04	Ryan	ryan@example.com
103	P02	Stephanie	stephanie@abc.com

In this relation we have the following FDs:

Employee_Code -> {Employee_Name, Employee_Email}

<ProjectDetail>

Project_ID	Project_Name
P03	Project103
P01	Project101
P04	Project104
P02	Project102

In this relation we have the following FDs:

Project_ID - > Project_Name

As we can see that all FDs in EmployeeProjectDetail are either part of the EmployeeProject or the ProjectDetail, So this decomposition is dependency preserving.

Conclusion

Decomposition is the process of breaking an original relation into multiple sub relations.
Decomposition helps to remove anomalies, redundancy, and other problems in a DBMS.
Decomposition can be lossy or lossless.
An ideal decomposition should be lossless join decomposition and dependency preserving.

Rules for Decomposition

Lossless Join Decomposition

Lossless vs Lossy Decomposition

Dependency Preserving

Conclusion

Read More: