DBMS Keys: Primary, Candidate, Super, and Foreign Key
Overview
- Keys are used to uniquely identify any record or row of data from the table. It is also used to establish and identify relationships between tables.
- Keys are of different types eg: Super key, Candidate key, Primary Key, Foreign key, etc.
What are keys in DBMS?
Keys in DBMS are introduced to avoid redundancy in data. A table represents a big box with unique records, and keys help to identify those records efficiently. Keys in DBMS can be single or a group of attributes that uniquely identify the records. Using all attributes as a key is less efficient than choosing the minimum combination of attributes that provide the same result.
Why do we Need DBMS Keys?
Keys in DBMS are essential for identifying unique records in a table. In a database application, data must be organized into separate tables to avoid redundancy, which can create complexity when identifying common relationships between entities. For instance, in a user and pizza shop ledger example, each user has a unique identifier along with other attributes, and each transaction in the ledger contains information about the user and the pizza they bought. This creates redundancy because the user data is copied into the ledger. However, it is necessary to have this redundancy to link each transaction to the user who made it. Keys in DBMS help ensure data integrity and consistency by uniquely identifying each record in a table, and there are different types of keys, such as primary keys, foreign keys, candidate keys, alternate keys, and composite keys.
User Table
sno | user_name | address | contact-number |
---|---|---|---|
1 | user1 | delhi | 964XXXXXXX |
2 | user2 | haryana | 888XXXXXXX |
3 | user3 | punjab | 784XXXXXXX |
Pizza Ledger Table
sno | user_name | address | contact-number | p_id | p_name | p_type | p_size |
---|---|---|---|---|---|---|---|
1 | user1 | delhi | 964XXXXXXX | 3 | onion | mania | normal |
3 | user3 | punjab | 784XXXXXXX | 2 | onion | mania | medium |
2 | user2 | haryana | 888XXXXXXX | 3 | corn | burst | large |
If you can see from the above table how we’re unnecessarily writing the same data of the user. Even information related to pizza is redundant (this is where you need normalization concepts in DBMS).
In this kind of database, one can query about user-specific details, pizza-specific details, or even query related to both, like fetch all the users who have bought any kind of pizza. And, to fetch the results, we want KEYS.
So, keys in DBMS come to the rescue. DBMS keys are also used to create relationships between tables. We can use a type of key to solve this redundancy (see types of keys in the DBMS section to know more).
Relating a table with another table is a part of the basic to complex level of applications. And, DBMS keys are the way towards it. Let’s learn the type of keys in DBMS, each with an example.
Types of DBMS Keys
Mainly we’ve 8 different keys in DBMS. See the list below.
- Super Key
- Candidate Key
- Primary Key
- Alternate Key
- Foreign Key
- Composite Key
- Compound Key
- Surrogate Key
Before discussing these keys, let’s first understand an important nomenclature. Have a look at the table given below.
SNo | Name | Age |
---|---|---|
1 | A | 19 |
2 | A | 10 |
3 | BC | 18 |
From the knowledge you gain from the article until now, can you find the key to uniquely identify the record? (Hey, don’t scroll down. Please try!)
Yes, it’s SNo. And, if you saw values inside, the Age attribute also has unique values. But do you think using it is valuable in the long run? Age is a kind of value that has a very chance of repetition. So, that’s why we avoid using it as a key. And, we use SNo to identify each record of the given table uniquely.
Note: In real life, we don’t make a table first and then decide which attribute(s) are eligible to be taken as keys. We decide all these things beforehand (while designing the database), so here we’re doing it just for learning purposes.
Back to nomenclature. To describe the above example on paper, we can use SNo -> Name, Age.
(by the way, this is called functional dependency, but you can ignore this term for now)
This means that from SNo, we can get the values of three things, i.e., Name, Age, and SNo itself.
In general, if we say, A, B, C, D, E are attributes of a table, and we’ve given AB -> CDE.
This means that given values of A and B, we can find values of C, D, and E.
Got it? Now it’s time to learn about Different DBMS key.
1. Super Key
Everything we learned above is nothing but a super key. A super key is a set of all the keys (with single or multiple attributes) which can uniquely identify the records of the table.
Let’s break this down with an example.
Table Alpha
A | B | C |
---|---|---|
Text | Text | Text |
- | - | - |
- | - | - |
- | - | - |
We’ve 3 attributes, A, B, and C. Since, knowing what data the table has is not the primary concern for the time being, instead of filling some random data into the table, we define it with dashes (-).
A possible set :
- ABC -> ABC
- AB -> C
- AC -> B
- BC -> A
- A -> BC
- B -> AC
- C -> AB
This is a set of keys, and we can use any of those to identify the record. BUT don’t you think we should be efficient by choosing a key out of all? What, according to you, is the best option to go for the above example? If you’re thinking of one of A, B, or C, you’re right. It’s a good choice because given a single value, we can fetch all other values whereas, in other available keys, we should know 2 or 3 attributes beforehand to fetch values of all other attributes. Keeping this in mind, let’s discuss other keys.
2. Candidate key
From all the super keys available, the candidate key is the one whose proper subset is not a super key. Don’t get it? Let’s see it with an example.
Using the same table Alpha,
But now, the given set of the super key is:
ABC-> ABC
AB -> C
BC -> A
A -> BC
Given ABC key, we’re saying if we find its proper subset, i.e., {A, B, C, AB, AC, BC} and any of them can be a key, then ABC can’t become a candidate key.
AB, which comes from the proper subset of ABC, is also a key. Therefore, ABC is not a candidate key.
Note: If x is a proper subset of y, then x must NOT have one element that is included in y. Example,
X = {1,2,3} Y = {1,2,3,4}
Here, we can say X is a proper subset of Y since X doesn’t have 4 in it.
Similarly, AB is also not a candidate key because we have got A in its proper subset which, can be used as a key. Therefore, at last, we found that A is the ONLY candidate key available for the above schema.
You should keep in mind that we can also have multiple candidate keys. For the above example, if we have a set of the super key as,
- ABC-> ABC
- AB -> C
- BC -> A
- A -> BC
- B -> AC
- C -> AB
In this, we got A, B, and C as candidate keys. How?
3. Primary key
We learned to find candidate keys from a set of super keys. Now, the primary key is nothing but a candidate key which has given the right to be called the primary key. All other candidate keys can also be used as a primary key, but the database administrator (or you) can choose a single key out of those to be a primary key. We can use the primary key to uniquely identify the records of a table.
Using table Alpha,
Now we have a set of the super key as,
- ABC-> ABC
- AB -> C
- BC -> A
- A -> BC
- B -> AC
- C -> AB
Out of those, A, B, and C are valid candidate keys.
From A, B, and C, we can choose either A, B, or C as primary key.
I’m choosing A, you can choose B, and we both are fine. Got it 🙂
4. Alternate key
We learned the primary key is nothing but a candidate key given primary key rights. But what do we call all other candidate keys? Yes, you’re right. We call them alternate keys. Alternate keys are nothing but candidate keys that don’t get rights of the primary key.
A | B | C |
---|---|---|
- | - | - |
- | - | - |
- | - | - |
We have a set of the super key as,
ABC-> ABC
AB -> C
BC -> A
A -> BC
B -> AC
C -> AB
Out of those, A, B, and C are valid candidate keys. If we choose A as the primary key then B and C will become alternate keys.
5. Foreign key
In starting off our discussion, we discussed that any basic database includes 2 or more tables at least. And, in most of the cases, you’ll find some relation between them. So, in order to relate 2 tables with each other, we use the foreign key. We make the primary key of a table as a foreign key in another table of interest. Let’s see this in action.
Suppose we’ve table user and order, and the order table wants to use the contents of the table user. The functionality we’re looking for is that we are able to get each user and their corresponding order data. There’s one bad way and one good way to design this.
Let’s see a bad way first.
Table user
User_id (primary key) | Name | Age |
---|---|---|
1 | A | 18 |
2 | B | 19 |
3 | C | 16 |
Table order
S no | Name | Age | Order_id (primary key) | Order_name | Order_price |
---|---|---|---|---|---|
1 | A | 18 | 1 | Pizza burst | 500 |
2 | B | 19 | 2 | Pizza mania | 150 |
3 | C | 16 | 3 | Cheese Burger | 120 |
Why is this a bad way? First, we’re unnecessarily copying the entire table of the user inside the order table. Secondly, if a user changes their name, we’ve to modify the user table and the order table to reflect this change. And it’s just 2 tables. Imagine if we’ve a bunch of tables then it becomes cumbersome to reflect all those changes. Isn’t it?
So, what is a good way? Use foreign key. See the design below.
Sno (foreign key) | Order_id (primary key) | Order_name | Order_price |
---|---|---|---|
1 | 1 | Pizza burst | 500 |
2 | 2 | Pizza mania | 150 |
3 | 3 | Cheese Burger | 120 |
Do you get the point now? Instead of copying the entire table records, we set SNo (primary key in the user table) as a foreign key in the order table. We are leveraging the power of the primary key as it’s capable of uniquely identifying the records. So why copy? Instead, refer to those through its primary key and set it as a foreign key in another table.
Now, if you want to change anything inside the user, only the user table has to be modified, and all the changes will automatically reflect everywhere.
6. Composite key
A | B | C |
---|---|---|
- | - | - |
- | - | - |
- | - | - |
We have a set of the super key as,
- ABC-> ABC
- AB -> C
- BC -> A
Now, 2 possible candidate keys here would be AB and BC. Suppose we make AB the primary key. Since AB consists of 2 attributes (A and B), therefore it can be referred to as a composite key.
7. Unique key
Unique key looks very similar to primary key but they’re not and have a very different usage altogether. Once we set an attribute as a unique key then its column value should be unique, although only ONE value could be null. Let’s say you want the name of each user to be unique (for some reason) then you can use this key.
Also, a unique key can be applied to multiple attributes, not just one like in the primary key. And we can even modify the unique key attribute, which is not possible in the case of the primary key.
See example.
Id (primary key) | Name | Age (unique key) |
---|---|---|
1 | A | 500 |
2 | B | – |
3 | C | 120 |
Note: – means null here.
We set a unique key on the Age attribute, which ensures that no value inside the Age attribute is duplicated, and at the same time, it can accept one value to be null (if required).
8. Surrogate key
A surrogate key is a unique identifier that uniquely identifies the object or entity. It is used for representing the existence of data analysis. It represents an outside entity as a database object but is not visible to the user and application.
See this through an example.
key | Order_name | Order_price |
---|---|---|
142 | Pizza burst | 500 |
256 | Pizza mania | 160 |
781 | 1 | 120 |
Here, we can use the key attribute as a surrogate key because the value of the Key attribute is different for different locations and names of the order.
Conclusion
- Keys in DBMS are nothing but a way to uniquely identify each table's record.
- Learned about the super key and how to find candidate keys from it. We pick one key from a set of candidate keys as a primary key and name all other keys as alternate keys.
- If that primary key contains 2 or more attributes, then it's called a composite key.
- Unique key, which allows the value of the attributes to be unique (accepting one value as null if required), and
- Surrogate key, which is nothing but a unique identifier of an object or entity and is different at different locations.