Super Key in DBMS
Overview
Super Key is a crucial concept in DBMS that will act as a foundation for concepts like Primary Keys, Candidate Keys, Normal Forms, and much more!
Super keys are just attributes (or a single attribute) that can uniquely identify all rows in a relational database.
What are Keys?
The point of a database is to store data to be retrieved easily in the future. During retrieval, it would be convenient to get back relevant data based on as few parameters as possible.
It’s all to do with convenience. For example, in a database of Students , it would be hard to find a student based on his grades or age. A slightly better method would be to query the student’s name. However, in many cases, more than one student can have the same name. To distinguish between these students, we use the concept of roll numbers.
This way, no two students can have the same roll number, which means, we can uniquely identify all students from their roll numbers.
Keys in DBMS are just that.
A key is an attribute or a set of attributes that can uniquely identify an entire row in a database.
In simple databases, one attribute is enough to uniquely identify all rows. However, that’s not always the case. Consider the following example,
There are 3 sections (A, B, C) in the 5th grade of a certain school. Based on the alphabetic order, the students are given a roll number, which uniquely identifies them in a classroom. However, since there are 3 such sections, there will be many students who have the same roll number but are in different sections.
In this case, we can uniquely identify a student using their section and roll numbers together. For example, (5B, 12) represents exactly one student, who is in 5B and has the roll number 12.
Various types of Keys in Database Management Systems
There are various classes of keys in DBMS. Some of them are:
1) Super Keys
A Super Key is essentially just a key, i.e. it can uniquely identify all the attributes in a database.
2) Composite Keys
A Composite Key is a key that contains more than one attribute. In the student table mentioned above, the key – (Section, Roll Number) is a Composite Key. This key can contain any number of attributes (greater than 1). Trivially, the key involving all the columns in the table is the largest Composite Key possible.
3) Candidate Keys
A Candidate Key is a key that contains the least possible attributes, and that maintains the criteria that it can uniquely identify any table row. Again, in the student table mentioned above, Roll Number cannot be a candidate key, since it cannot identify a student across 5th grade. Similarly, the key (Section, Roll Number, Name) cannot be a candidate key since we can make do with (Section, Roll Number) as a key, which has 1 less attribute. Therefore, the key (Section, Roll Number) is a candidate key.
4) Primary Key
One candidate key from the set of all possible candidate keys is chosen to be the primary key. This primary key is used to identify rows once decided, which reduces the complexity of data retrieval since we would rely on only 1 key for most queries. A primary key cannot have null values for obvious reasons.
5) Alternate Keys
After a primary key is chosen from the set of candidate keys, the leftover keys are called Alternate Keys.
6) Foreign Key
A Foreign Key in table X is a primary key in another table Y, which is used to identify the rows in table Y from the point of view of table X. For example, if each college student had a proctor, we could put the details of the proctor on the student table itself. But, since many students can have the same proctor, doing so will result in redundant data. To eliminate this redundancy, we can create a separate proctor table and mention the proctor’s id for each student in the student table.
In this scenario, proctor_Id is the foreign key in the Student table and is used to cross-reference the proctor’s details.
Introduction to Super Keys in DBMS
Since a super key is just a key that can uniquely identify a set of attributes, all candidate keys come under the bracket of super keys. The set of all Super Keys is a superset of all Candidate Keys.
Essentially, the super keys with the least number of attributes form the candidate keys. With this, we can create all super keys by just pairing the candidate keys with other table columns.
Super keys are important since they’re the starting point of keys, normal forms in DBMS, and more!
Importance of Super Key
The main purpose of a super key is just to identify rows in the table. In many cases, you can't identify a table with any random column, since a column with duplicates will not be able to identify a unique row.
Super Keys remove this ambiguity and make data retrieval easy.
Examples of Super Key in DBMS
Consider a database that stores customer orders and the products they have purchased. In this scenario, the super key could be a composite key made up of the customer ID and the product ID, as this would ensure that each customer order is uniquely identified based on the products they have purchased.Another example could be a database that stores orders placed by customers, including the order number, customer name, and order date. In this case, the super key could be the combination of order number and order date, as this would ensure that each order is uniquely identified.
Super Keys vs. Candidate Keys
Feature | Super Key | Candidate Key |
---|---|---|
Definition | A set of one or more attributes that can uniquely identify a tuple in a relation. | A minimal set of attributes that can uniquely identify a tuple in a relation. |
Unique | Yes | Yes |
Minimal | No | Yes |
Number of Keys | One or more | One |
Purpose | Used for functional dependency determination and normalization. | Used for primary key selection and uniqueness constraint. |
Example | {ID, Name} | {ID} |
Consider the following example,
In this table, we can't identify rows uniquely by section or roll number alone. However, we can club them together to form the key - (section, roll number), which can identify all rows uniquely. Therefore, (section, roll number) is a super key.
Interestingly, we can't identify any row with less than 1 column, which makes (section, roll number) a Candidate Key.
We can do the same with the key (first name, last name) for this table specifically, but it is possible for 2 people to have the same first and last names, which is why it is avoided.
We can create other super keys using the candidate keys. For example, the key (section, roll number, first name) is also a Super Key. It is not a candidate key, since it doesn't use the least number of columns required to uniquely identify a row.
Conclusion
- Super key is the key that can uniquely identify any row in a database.
- Candidate Keys are super keys with the least number of columns.
- We can generate the set of all super keys using the candidate keys as a base.
Thank you for taking the time to read this article!