Correlated Subquery - Scaler Topics

Overview

A subquery is a query that is enclosed within another query in SQL. When a subquery returns data, this data is used by the outer query.

In other words, this data is a prerequisite for the outer query to work. In this case, instead of writing two separate queries, we write one inside of the other. However, a correlated subquery needs the values of the main query to work. How? Let's find out!

What is A Correlated Subquery?

A correlated subquery uses the values of the outer or main query. This query is dependent on the outer one and hence cannot be executed in an independent manner. The main query can be in the form of the following:

SELECT
UPDATE
DELETE

A correlated subquery is used in row-by-row processing and gets evaluated once for each row processed by the outer query. A correlated subquery is a way to read all the rows in a table. It is used when a subquery is expected to return different values for each row the main query looks at, so it reads all of them. Since it gets executed repeatedly, it is also known as repeating subquery.

correlated subquery image

We shall cover examples involving correlated subqueries in the sections ahead.

Example

Let us consider a table vehicles from some database that lists the most preferred car models in a city. The table columns are:

Cars
car_model
car_name
brand
category
market_price

We will write a query to find all the cars whose market price equals the lowest price in the same category.

Following is the MySQL Code :

MySQL:

Output:

car_name	market_price	category
A	$18000	Hatchback
X	$18000	Hatchback
ABC	$20000	Sedan
XYZ	$20000	Sedan
Z	$20000	Sedan
I	$24000	SUV

Explanation: In the example above, we have a SELECT statement within a SELECT statement. For every car evaluated by the outer or main query, the lowest price in its category is found with the help of the subquery.

In case the price of any car is equal to the least price of all cars in the same category, it is added to the result. Thus, the correlated subquery is executed for every row evaluated by the outer query.

Nested Subqueries Vs Correlated Subqueries

In a nested query, the inner query executes only once and runs first. The outer query uses the data returned by the inner query.

On the other hand, the correlated subqueries execute after the outer query and use values obtained from it, as discussed above. Let us look at some examples to understand better.

We have tables employees and teams from a sample database.

Employees:

employees
employee_id
name
age
team_id
salary

Teams:

teams
team_id
name
strength

For Nested Subqueries :

We shall write a query to find all the employees from the " Finance " team.

Output:

name	employee_id	team_id
James	1234	2
Jack	7483	2
Mary	7980	2
Anna	5476	2
Dan	8392	2

So in the above query, the inner query returns the id for the team "Finance" and the outer query returns the "employees" who have the same "team_id".

As you can see, the inner query can also be executed as a standalone query.

team_id
2

"team_id" for the team "Finance" is returned.

For Correlated Subquery:

We shall write a query to find all the employees having salaries greater than the average salary in their team.

Output:

name	employee_id	salary	team_id
Emma	3636	$11700	1
George	1284	$11900	1
Jack	7483	$10000	2
Anna	5476	$10500	2
Chris	3897	$12000	3
Jennifer	2736	$9000	4
Eric	9876	$9850	4

Here, the inner subquery is dependent on the outer for values, hence can not be executed alone. Row-by-row processing takes place here.

Using EXISTS with a Correlated Subquery

The EXISTS keyword in SQL is used to check the existence of a particular record in a subquery. If the subquery returns a record, then the EXISTS keyword returns TRUE. Following is the syntax of a subquery while using EXISTS keyword:

Syntax:

Let us look at an example to understand better. Let us consider two tables, "BLOGS" and "AUTHORS" from some sample databases.

AUTHORS
Author_ID
Author_Name
Contact_Address
Total_Blogs

BLOGS
Blog_ID
Blog_Name
Author_ID
Total_Views

We shall write a query to find the authors who have written blogs with more than 20,000 views.

So as you see, the inner query is dependent on the outer query for its values. The inner query returns TRUE or FALSE according to the condition applied.

Using NOT EXISTS With A Correlated Subquery

NOT EXISTS is exactly the opposite of the EXISTS keyword. It checks for the existence of particular records in the subquery. If the record is not present, it returns TRUE; otherwise FALSE.

Following is the syntax of a subquery while using the NOT EXISTS keyword :

Syntax:

The following example shall help you understand better:

In a section before, we considered an example to find the authors who have written blogs with more than 20,000 views. Let us see how the query gets modified using NOT EXISTS.

AUTHORS
Author_ID
Author_Name
Contact_Address
Total_Blogs

BLOGS
Blog_ID
Blog_Name
Author_ID
Total_Views

So as you can see, the condition Total_Views < 20000 is exactly the opposite to get the same results as EXISTS using NOT EXISTS.

Correlated Vs Uncorrelated Subqueries In SQL

Let us see how correlated and uncorrelated subqueries differ :

Correlated Subqueries	Uncorrelated subqueries
A correlated subquery is dependent on the outer query for its values.	An uncorrelated subquery is not dependent on the outer query for its values
It can not be executed as a standalone query.	It is possible to execute an uncorrelated subquery as a standalone query.
It gets executed multiple times, also called repeating subquery.	It gets executed only once.
Since it gets executed for every row, thus slower in performance	An uncorrelated subquery is faster than the correlated subquery
Most commonly, these subqueries are used with keywords exits and not exists	Mainly used with IN and NOT INclauses.

Conclusion

A correlated subquery is dependent on the outer query for its values.
It is mainly used in the row-by-row processing of tables.
The most commonly used keywords with correlated subqueries are EXISTS and NOT EXISTS.