SQL GROUP BY
When we work with a database table, we might come across conditions where we have to group the records based on their categories. So in that case, we use GROUP BY in SQL to group those records based on our needs. We can use the WHERE clause with the ORDER BY clause to group data based on our needs.
What is GROUP BY?
GROUP BY is a SQL query clause used to group data rows based on one or more columns in a table. The query returns aggregated data instead of individual rows when using GROUP BY. The GROUP BY clause is often combined with aggregate functions such as SUM, AVG, COUNT, MAX, and MIN to calculate statistics or summarize data.
It's important to note that only the columns listed in the GROUP BY clause and aggregate functions can be included in the SELECT statement when using GROUP BY. Any other columns not included in the GROUP BY clause or an aggregate function will result in an error.
GROUP BY Syntax
Sample Table
Student Table:
Suppose we have a table name student containing details of all students in the CS department. So we will use this table's sample data to group students based on their departments.
Student_id | Name | Section | Marks | Grade |
---|---|---|---|---|
1 | Chandler | A | 92 | A |
2 | Monica | B | 90 | A |
3 | Phoebe | C | 93 | A |
4 | Joey | C | 60 | C |
5 | Rachel | B | 65 | C |
6 | Ross | A | 96 | A |
Emp Table:
This table contains details of employees' roles in a particular company and has data fields as EmpID, Role, Dept.
EmpID | Role | Dept |
---|---|---|
1 | SDE-1 | Engineering |
2 | Trainee | IT |
3 | Trainee | IT |
4 | SDE-1 | Engineering |
5 | SDE-1 | Engineering |
6 | Technical Architect | Engineering |
SQL GROUP BY Examples
GROUP BY with a Single Column
When we are grouping the Table using a single column, the output has all the rows with the same value of that particular column.
Let us understand the scenario in which we had to get all maximum marks from each section in CS department.
Syntax:
Example:
Let's take the above example. So for that, we will group all the entries by sections from which they belong and will apply max function on marks to get maximum marks from each section.
Output:
We will simply get the maximum marks of students section-wise along with their section names.
Section | Marks |
---|---|
B | 90 |
C | 93 |
A | 96 |
GROUP BY with Multiple Columns
When we are grouping the Table using multiple columns, the output has all the rows with the same value of both of those columns.
Let us understand grouping for multiple columns by example in which we have to get the count of all employees grouped by Role and Dept.
Syntax:
Example:
We will group all the entries by dept and role of the employees and will apply the count function on the entries to get a count by dept and role.
Output:
We will simply get an output table with a count corresponding to the dept and role of the employees.
Count(Emp_Id) | Role | Dept |
---|---|---|
1 | Technical Architect | Engineering |
2 | Trainee | IT |
3 | SDE-1 | Engineering |
GROUP BY in SQL with Join Clause
GROUP BY can be used in SQL queries with a JOIN clause to group data from multiple tables based on some common columns. When using GROUP BY with JOIN, the resulting data is grouped based on the columns specified in the GROUP BY clause across all tables in the JOIN statement.
For example, let's say we have two tables, employees and departments, with the following columns:
The employees table has the following attributes:
- employee_id
- first_name
- last_name
- department_id
The departments table has the following attributes:
- department_id
- department_name
- manager_name
To join these two tables and group the data by department name, we can use the following SQL query:
This query will join the employees and departments tables on the department_id column. The resulting data will be grouped using the department name attribute. The SELECT statement will obtain the department name and the count of employees in each department using the COUNT() function and the employee_id column from the employees table.
GROUP BY with Having Clause
We can use the HAVING clause to specify which groups will be included in the final result set. We also can't utilize aggregate functions like SUM(), COUNT(), and so on with the WHERE clause, as WHERE clause is used to place conditions on columns. So If we want to use aggregate functions on a group of data(group by), we must use the HAVING clause.
Syntax:
Example:
We will take the above student table and will apply the filter condition for getting the maximum marks from each section and max marks should be more than 90.
Output:
We got the values of the section having the maximum marks greater value 90.
Section | Marks |
---|---|
C | 93 |
A | 96 |
Conclusion
- We use GROUP BY in SQL to group those records based on our needs.
- The GROUP BY clause in SQL is typically used to organize similar types of data into groups. If multiple rows in a specific column have the same entries, it will group all the rows with the same entries.
- When we are grouping the Table using a single column, the output has all the rows with the same value as that particular column.
- When we are grouping the Table using multiple columns, the output has all the rows with the same value of both of those columns.
- We can use the HAVING clause to apply filter conditions for getting data for the final result set.