What is a SURROGATE Key in SQL?

A SURROGATE Key in SQL is a primary key that serves as a unique identifier for each row in the table. We can identify a unique row using this key. SURROGATE Keys have no commercial significance. This sort of key is either generated by the database or by another application (not supplied by the user). A SURROGATE Primary Key has only one requirement: each row must have a unique value for that column. SURROGATE keys are frequently referred to as artificial keys or identity keys. It is suitable for usage in data warehouses.

The following qualities should be present in a SURROGATE key:

A Unique Value
The system generates the key, which means it is generated automatically.
The key is hidden from the user because they have no business meaning (it is not a component of the application) and is not composed of several keys.
The key has no semantic meaning.

Consider the following example for a better understanding of the SURROGATE key in SQL.

Assume you need to maintain track of good data in a table. The data includes information on each well's geographic location and depth, as indicated in the Well table below:

Longitude	Latitude	Depth
220	140	5.6
220	160	5.6
220	170	7.5
340	170	8.2
340	510	9.4

The table above shows two wells with the same longitude, latitude, or depth. As a result, you can't select the primary key from one of these three columns because they don't uniquely identify the row.

You'll need to create a SURROGATE key column, which can be a unique auto-number column. Here's an example of an auto-number SURROGATE field called WellId in the table:

WellId	Longitude	Latitude	Depth
1	220	140	5.6
2	220	160	5.6
3	220	170	7.5
4	340	170	8.2
5	340	510	9.4

Note: SURROGATE Keys are never used for any business logic other than simple Create, Read, Update, and Delete (CRUD) operations.

How to Create the SURROGATE Key in SQL?

Many database management systems have features that allow you to create a SURROGATE key in your table. The column name is usually taken from the table name plus a suffix: Id of key - ContactId, SupplierKey, etc.

Common ways for creating SURROGATE keys in SQL Server are:

A monotonically increasing integer, such as when using an identity column.
A globally unique identifier (GUID) data type.

Using an IDENTITY Column

The IDENTITY property is intended to scale out over all distributions in the dedicated SQL pool while maintaining load performance. As a result, the deployment of IDENTITY is oriented toward achieving these objectives.

Due to the distributed architecture of the data warehouse, the IDENTITY property does not ensure the sequence in which the SURROGATE values are allocated. The IDENTITY property is intended to scale out over all distributions in the dedicated SQL pool while maintaining load performance.

An IDENTITY column is used by SQL Server to implement the auto-increment feature. A table containing an identity column automatically increases its value whenever a new record is inserted into the table.

Output:

Using a GUID Column

SQL Server GUID is a binary data type of 16 bytes. The GUIDs are formed in such a way that it is practically impossible for their values to ever get duplicated. These values must be unique across databases and servers.

GUID values are stored in SQL Server databases using the UNIQUEIDENTIFIER data type. The NEWID() function in SQL Server can be used to produce a GUID value.

Output:

How Can We Use a SURROGATE Key in SQL?

SURROGATE Keys can be used in a variety of ways, as discussed below:

Auto Incremental Key in Database SURROGATE

An auto-incremented key can implement a SURROGATE Key. SQL Server supports an IDENTITY column for the auto-increment feature. It enables the creation of an exclusive number each time a new record is added to the database table.

Manual Incremental Key in Database

A manual incremental key can implement a SURROGATE Key. We can obtain the maximum value of a column using the max() function, and this value is incremented by one. When a table contains a large amount of data, this technique suffers from a performance issue.

Example:

Globally Unique Identifiers (GUID)

GUID is a Microsoft standard that expands on the Universally Unique Identifier (UUID) concept. In SQL Server, we can produce a new GUID by using the NEWID() function. It's a 16-bit GUID.

Example:

NEWSEQUENTIALID() can be used with the uniqueidentifier table column's DEFAULT constraints. The NEWSEQUENTIALID() function cannot be used as a query reference.

Universally Unique Identifier (UUID)

UUIDs are 128-bit values generated by hashing the Ethernet card ID and the SQL Server's current data time.

Advantages of SURROGATE Key in SQL

A SURROGATE key can be extremely useful for analytical purposes for the following reasons:

SURROGATE keys are unique. As the system generates SURROGATE keys, the system cannot create and store a duplicate value.
SURROGATE keys apply the same set of rules to all records. A program generates the system-generated value, which results in the SURROGATE key value. Any key generated by the software will apply uniform rules to each record.
SURROGATE keys stand the test of time. There will be no need to alter the key in the future because SURROGATE keys lack any context or commercial purpose.
SURROGATE keys can have any number of values. Sequential, timestamp and random keys have no practical limits to unique combinations.
Object Relational Mapping (ORM) frameworks like Entity Framework, N-Hibernate, and others are optimized for use with SURROGATE Keys. It is very easy to apply them over the composite keys.
It enables a higher level of standardization, preventing data duplication inside the database.

Disadvantages of SURROGATE Key in SQL

Extra column(s)/indexes for SURROGATE keys will necessitate additional disc space.
Extra column(s)/indexes for SURROGATE keys will necessitate more IO when inserting/updating data.
Key value has no connection to the data; hence the design technically violates 3NF (i.e., normalization)
If another unique constraint isn't defined on the natural key, the table's natural key values may have duplicate values.
It adds the sequential number by a random amount.
A SURROGATE Key requires some administrative overhead to keep up with.
A SURROGATE Key requires more disc space to be stored.
It is impossible to use the SURROGATE key value as a search key.
It is challenging to distinguish between test and production data. For example, it's challenging to determine if production data was put into a test environment because SURROGATE key values are automatically generated values with no meaningful context.

Conclusion

A SURROGATE key in SQL is a value generated by the system (it could be a GUID, sequence, unique identifier, etc.) that is used to uniquely identify a record in a table. The key could be made up of one or more columns (i.e., Composite Key).
A SURROGATE key in SQL can be used to reduce coupling (i.e., Composite Key) because it has no business value and is not tied to any external application connected to a database.
A large natural key can reduce a database's performance but using a SURROGATE key may improve performance because SURROGATE keys are usually integer values.
A smaller index on a primary key will perform better on JOIN operations. While using many natural main keys, JOIN operations can become extremely complex.

Learn More

Advanced concepts of SQL