How to Optimize Query Performance in MySQL Databases?
MySQL is one of the most widely used relational database management systems, yet it still requires periodic optimization, especially for large and intricate data sets. Regular optimization is critical for keeping the system performing well.
Optimizing queries in MySQL typically involves configuring, profiling, and monitoring performance at various levels. You don't necessarily need extensive SQL knowledge to tune MySQL performance.
The built-in MySQL query optimizer optimizes the execution of your queries. However, improperly built queries can hinder the optimizer. Your queries still affect database speed even if you use other optimization approaches such as good schema design or indexing.
A Quick Introduction to Query Processing
The processing of a query in MySQL involves a series of steps which, when understood, provide valuable insight into how to optimize your queries. Although these steps are internally complex, they can be summarized as follows:
- The MySQL client utilizes the MySQL Client/Server Protocol to transmit your query to the MySQL server.
- On the MySQL server, the query is parsed, preprocessed, and optimized, and the MySQL query optimizer produces a query execution plan. Before execution, the optimizer may request statistics about the tables referenced in your query from the storage engine.
- The Query Execution Engine carries out the execution plan by calling the storage engine through special handler interfaces.
- The results of the query are sent from the MySQL server to the MySQL client.
In the upcoming sections, we will look at how to obtain information about the default execution plan and at query optimizations that can affect the plan.
Identify and Analyze Slow Queries in MySQL
Although slow queries can be inconvenient, the positive side is that we can spot and address problems.
MySQL's response time is the length of time it takes to process your query. It is the most significant metric used to assess the speed of your query. In other words, a query or transaction's performance is directly correlated with the amount of time it takes to respond.
A query's speed can be affected by a number of variables, including the hardware configuration, improper index usage, authorization settings, poor schema design, and the queries themselves. This last element, the query itself, is the one that this guide emphasizes.
You provide the MySQL server a set of instructions to follow each time you send a query to it.
A query with only one parameter is an example of a basic instruction. On the other hand, sophisticated instructions can include complex subqueries and joins. Your queries will take time to process no matter how they are sent to the MySQL server.
Fortunately, there are a lot of built-in tools that let you evaluate the performance of your query. This process is called query profiling: it identifies how MySQL spends its time while processing and executing your query. The following sections discuss various query monitoring tools and show how to use them to analyze your query's performance.
The slow_query_log
Examining the slow query log is one method for locating slow queries. You can use this built-in feature to log queries that take longer than the long_query_time system variable allows. The default limit is 10 seconds, so MySQL logs any query that takes longer than that. Like query profiling with INFORMATION_SCHEMA.PROFILING, the slow query log is not turned on by default; you activate it by setting the slow_query_log variable to 'ON'.
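A minimal sketch of turning the log on, assuming your account has the privileges needed to change global system variables:

```sql
-- Enable the slow query log for the running server.
-- The setting does not survive a restart unless it is also
-- added to the server configuration file.
SET GLOBAL slow_query_log = 'ON';
```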
You can verify the value of a global variable by running the following query:
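For instance, the current state of the slow query log can be checked like this (a sketch; the same pattern works for other global variables):

```sql
-- Show whether the slow query log is currently ON or OFF.
SHOW GLOBAL VARIABLES LIKE 'slow_query_log';
```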
You can change the time limit, in seconds, by setting the long_query_time variable:
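A sketch of lowering the threshold; the value 5 is an arbitrary choice for illustration:

```sql
-- Log every query that runs longer than 5 seconds.
-- A GLOBAL change applies to connections opened afterwards;
-- use SET SESSION to change the current connection only.
SET GLOBAL long_query_time = 5;
```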
Additionally, you can move the log file from its default location, hostname-slow.log in /var/lib/mysql, to any other location of your choice:
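A sketch using the slow_query_log_file system variable; the path is only an example, and the MySQL server process must be able to write to it:

```sql
-- Point the slow query log at a custom file.
SET GLOBAL slow_query_log_file = '/tmp/slow_queries.log';
```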
Once a query exceeds the time limit you set, the MySQL server records it in the slow query log file, which you can inspect to see the slow queries. To demonstrate, suppose you set the slow query log file path to "/tmp/slow_queries.log" and leave long_query_time at the default of 10 seconds. You can intentionally slow down a query with the SLEEP() function to simulate a performance problem and test how the system behaves. For example, SLEEP(10) introduces a 10-second delay.
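A sketch of such a deliberately slow statement:

```sql
-- Pause for 10 seconds so the statement becomes a slow-log candidate.
-- If it does not appear in the log, use a slightly larger value
-- or lower long_query_time, since only queries that run longer
-- than the threshold are recorded.
SELECT SLEEP(10);
```

After the statement returns, the entry should be visible in the file configured through slow_query_log_file.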
Once you have finished examining the log, you can disable this feature by setting the global variable slow_query_log to 'OFF'.
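For completeness, a sketch of switching the log off again:

```sql
-- Stop recording slow queries.
SET GLOBAL slow_query_log = 'OFF';
```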
The INFORMATION_SCHEMA.PROFILING table
The INFORMATION_SCHEMA.PROFILING table stores profiling data for the queries you run during the current interactive session. Profiling is disabled by default, but you can turn it on for your current session by setting the profiling session variable. Note that the profiling data is lost when the session ends. If you would like to follow along, you can download and install the sakila sample database that is referenced throughout this article.
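Profiling for the current session can be switched on with a statement along these lines (a sketch):

```sql
-- Enable query profiling for this session only.
SET SESSION profiling = 1;
```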
Once you are done with the above step, select the database you want to use:
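Assuming the sakila sample database has been installed under its default name:

```sql
-- Make sakila the default database for the statements that follow.
USE sakila;
```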
You then execute your query (without using the EXPLAIN statement).
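The original query is not shown here, so the following is only a stand-in that assumes the sakila customer table:

```sql
-- Any ordinary statement works; it is recorded by the profiler.
SELECT * FROM customer;
```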
Then run the SHOW PROFILES statement.
The SHOW PROFILES statement lists the most recent queries of the current session. Three columns are present:
- Query_ID - A distinct numeric identifier for a query.
- Duration - The amount of time the query took to run.
- Query - The statement that the MySQL server ran.
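A sketch of reading the profiling data, both through SHOW PROFILES and directly from the INFORMATION_SCHEMA.PROFILING table. The query ID 1 is an assumption; substitute an ID from your own SHOW PROFILES output:

```sql
-- List the profiled statements of the current session.
SHOW PROFILES;

-- Break one statement down into its execution stages.
SELECT SEQ, STATE, DURATION
FROM INFORMATION_SCHEMA.PROFILING
WHERE QUERY_ID = 1
ORDER BY SEQ;
```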
The EXPLAIN Statement
The EXPLAIN statement is another query analysis tool offered by MySQL. It displays the execution plan that the MySQL server will use. It is easy to use: to see how MySQL will execute a query before running it, simply put the EXPLAIN keyword in front of the query.
If you end the SQL statement with "\G" in place of ";", MySQL presents the result in a vertical format, which is easier to read when the output contains many long columns.
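The original example query is not shown, so the sketch below uses a stand-in on the sakila customer table. The \G terminator is a feature of the mysql command-line client and may not work in other tools:

```sql
-- Show the execution plan without running the query.
-- \G replaces the semicolon and prints one column per line.
EXPLAIN SELECT * FROM customer\G
```

The rows field of the output is the optimizer's estimate of how many rows it has to examine.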
The previous example indicates that the server can obtain the result by scanning only 598 rows. EXPLAIN is also often used to examine how MySQL executes joins.
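A sketch of such a join, assuming the sakila customer, address, and city tables. The join order that EXPLAIN reports depends on the optimizer's statistics, so it may differ between installations:

```sql
-- Each row of the EXPLAIN output describes one table in the join,
-- listed in the order in which MySQL plans to read them.
EXPLAIN
SELECT customer.first_name, customer.last_name, city.city
FROM customer
JOIN address ON customer.address_id = address.address_id
JOIN city ON address.city_id = city.city_id\G
```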
Based on this outcome, we can observe that MySQL intends to commence with the city table and then search the address table, followed by the customer table. However, the question is, what is the reason behind selecting this particular route?
The answer is the join optimizer. Join optimization is among the most effective techniques MySQL uses to speed up queries that involve joins. The optimizer considers the joins in multiple orders, estimates the cost of each order, and selects the cheapest one that yields the same result as the original query order.
The EXPLAIN statement can help in identifying the appropriate locations to add indexes to tables, enabling queries to operate more quickly by utilizing the indexes to locate rows within those tables.
The EXPLAIN ANALYZE Statement
Our discussion of profiling techniques would not be complete without mentioning EXPLAIN ANALYZE.
EXPLAIN ANALYZE operates similarly to the EXPLAIN statement, but with greater detail. It provides additional statistics about the query's execution.
Another distinction is that EXPLAIN does not run the query, whereas EXPLAIN ANALYZE does, because it has to talk to the storage engine in order to retrieve the missing statistics. Let's illustrate this profiling method. To profile a query with EXPLAIN ANALYZE, simply put those keywords in front of the query.
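A sketch on the sakila join used earlier. EXPLAIN ANALYZE is available from MySQL 8.0.18 onward, and because it really executes the query, it is best tried on statements that are safe to run:

```sql
-- Run the query and report estimated versus actual figures
-- for each step of the plan.
EXPLAIN ANALYZE
SELECT customer.first_name, customer.last_name, city.city
FROM customer
JOIN address ON customer.address_id = address.address_id
JOIN city ON address.city_id = city.city_id;
```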
The output shows how the MySQL server executed the plan and offers several additional statistics, including:
- The actual time in milliseconds required to fetch the first row and all rows in a table.
- The estimated cost of the MySQL query.
- The actual number of rows scanned.
- The number of loops that were performed.
The estimated cost of a query execution plan covers all of its operations, such as CPU and I/O operations.
Inspecting the last_query_cost session variable
Checking the session status variable Last_query_cost is another way to obtain the estimated cost of a query.
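A sketch of reading the value, assuming a query on the sakila film table has just been run in the same session:

```sql
-- Run the query whose cost you are interested in...
SELECT * FROM film WHERE film_id < 100;

-- ...then read the optimizer's cost estimate for it.
SHOW STATUS LIKE 'Last_query_cost';
```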
The result shows the cost of the most recently compiled query plan, as estimated by MySQL's query optimizer. It is a practical way to compare two or more versions of a query. In the example that follows, we execute two related queries and compare their costs. The results show that the first query's execution plan is cheaper:
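The original pair of queries is not shown, so the two below are hypothetical stand-ins on the sakila film table. The first reads a narrow primary-key range and would normally be expected to report the lower cost, but the actual figures depend on your data and MySQL version:

```sql
-- Version 1: restrict the rows by primary key.
SELECT title FROM film WHERE film_id BETWEEN 1 AND 50;
SHOW STATUS LIKE 'Last_query_cost';

-- Version 2: read the whole table.
SELECT title FROM film;
SHOW STATUS LIKE 'Last_query_cost';
```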
Only retrieve the data you need
When the server scans all the rows requested by your query, it locks any resources accessible in the session, making it impossible for the client to interrupt the process without forcibly ending the session. Additionally, the client cannot selectively access the data until the server has scanned all the rows.
If your query is requesting too much data, the MySQL server may spend excessive time scanning numerous rows of data. Fortunately, this can be resolved by following these steps:
Fetch only the number of rows you need
Suppose you're developing a movie database website similar to IMDB that only needs to retrieve around 10 rows of movie information per page. In most cases, you wouldn't fetch all the data like this:
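An unbounded query of this kind, assuming the sakila film table, would look like:

```sql
-- Returns every row and every column of the table.
SELECT * FROM film;
```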
Fetching all 1000 rows of the table when we need only a handful would require the MySQL server to perform a full table scan.
To address this issue, we can use a common database technique known as pagination. With the LIMIT and OFFSET clauses, we can specify how many rows we need and the row from which retrieval should start. Teaching pagination is not the goal of this guide (we'll address it in another article). Nonetheless, our optimized query would resemble the following:
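A sketch of the first page of ten films. The ORDER BY clause is an addition made here so that pages come back in a stable order:

```sql
-- Page 1: rows 1 to 10. For page n, use OFFSET (n - 1) * 10.
SELECT *
FROM film
ORDER BY film_id
LIMIT 10 OFFSET 0;
```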
Fetch only the columns you need
Although we improved the queries in the last example, they are not yet fully optimized. If we examine them, we can see that we retrieved all table columns by using SELECT * (select all). A more efficient approach is to name only the columns we require in our query. To see which columns our film table has, let's take a look:
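A sketch of inspecting the table and then narrowing the column list. Which columns a listing page really needs is an assumption made for illustration:

```sql
-- List the columns of the film table.
DESCRIBE film;

-- Request only the columns the page displays.
SELECT film_id, title, release_year
FROM film
ORDER BY film_id
LIMIT 10 OFFSET 0;
```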
Fetch the Columns You Need when Joining Tables
As in the last example, you can speed up your joins by requesting only the columns you actually need from the result set. In the previous join example, let's imagine you simply want to retrieve the first and last names of every customer. We would write our query as follows:
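A sketch on the sakila customer, address, and city tables:

```sql
-- Only two columns are returned, although three tables are joined.
SELECT customer.first_name, customer.last_name
FROM customer
JOIN address ON customer.address_id = address.address_id
JOIN city ON address.city_id = city.city_id;
```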
By doing this, you prevent the MySQL server from reading every column of the joined tables.
Use Caching Strategies to Prevent Fetching the Same Data Repeatedly
If your application needs the same piece of data repeatedly, caching it after the first query and reusing it afterwards can be more efficient than executing the query every time.
The only exceptions are when the queried data must always be current or when the data changes frequently. The Query Cache, an integrated cache layer that was once part of MySQL, was removed in MySQL 8.0 due to scalability issues. Fortunately, several programs, such as Redis and Memcached, can help with caching data.
Avoid LIKE Expressions with Leading Wildcards
LIKE clauses work in combination with wildcards. These are characters that help in finding data that satisfies complicated requirements.
The placement of wildcard characters within a LIKE clause can cause unexpected performance problems. The main culprit is usually a leading wildcard, because MySQL cannot use indexes effectively when the pattern starts with one.
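The difference can be sketched on the sakila film table, whose title column is indexed. The search strings are arbitrary examples:

```sql
-- Leading wildcard: the index on title cannot be used to narrow
-- the search, so every title has to be checked.
EXPLAIN SELECT film_id, title FROM film WHERE title LIKE '%DINOSAUR';

-- Trailing wildcard only: the index can locate the matching range.
EXPLAIN SELECT film_id, title FROM film WHERE title LIKE 'ACADEMY%';
```

Comparing the type and rows fields of the two plans shows how differently the optimizer treats them.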
UNION vs. UNION ALL
Although MySQL's query optimizer doesn't perform as many optimizations on UNION queries as it does on other queries, you can assist the optimizer by specifying clauses like LIMIT, ORDER BY, and WHERE. If you do not need to eliminate duplicate rows from the query's result, however, UNION ALL is an even better choice.
This is because leaving out ALL makes MySQL add the DISTINCT option to the temporary table, which forces the server to do costly checks of every single row for uniqueness. The main lesson here is to avoid UNION without ALL, and DISTINCT, unless they are absolutely required.
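A sketch of the two forms, assuming the sakila actor and customer tables, both of which have first_name and last_name columns:

```sql
-- UNION removes duplicates, which requires a uniqueness check.
SELECT first_name, last_name FROM actor
UNION
SELECT first_name, last_name FROM customer;

-- UNION ALL simply appends the second result to the first.
SELECT first_name, last_name FROM actor
UNION ALL
SELECT first_name, last_name FROM customer;
```

The two statements return the same rows only when no duplicates exist, so UNION ALL is a safe substitute only when duplicates are impossible or acceptable.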
Optimize JOIN queries
Where the result allows it, use INNER JOIN rather than OUTER JOIN. Having indexes on the columns listed in the USING and ON clauses will also help you optimize the JOIN query.
GROUP BY and ORDER BY
Expressions in ORDER BY and GROUP BY should refer only to columns from a single table so that the server can use indexes to perform these operations.
Optimize COUNT() queries
The special aggregate function COUNT() has two uses. First, when you supply a column name or expression, COUNT() counts the number of rows in which that expression has a non-NULL value.
The second use is the COUNT(*) expression, which counts the total number of rows in a query result. Consequently, use this version when you need to know how many rows the query result contains.
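Both uses can be sketched in a single statement on the sakila film table, where original_language_id is a nullable column:

```sql
SELECT
    COUNT(*) AS total_rows,                            -- every row
    COUNT(original_language_id) AS rows_with_language  -- non-NULL values only
FROM film;
```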
How to help the MySQL optimizer
In certain situations, the optimizer might not pick the optimal execution path. One cause may be that MySQL has to make "educated" guesses because it lacks sufficient knowledge of the current data. To address this issue, there are a few methods that can be used:
First, you can use the EXPLAIN statement to find out how MySQL processes the query, as discussed earlier.
Another technique is to use index hints, such as USE INDEX, IGNORE INDEX, and FORCE INDEX, on a table. FORCE INDEX, for example, tells MySQL that scanning the table is very costly compared with using the given index. Index hints give you direct control over which indexes the query optimizer considers for the execution plan. It is preferable to employ them as a last resort, after trying the other optimization strategies. Index hints are written directly after the table name in the query.
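A sketch of the three hints, assuming the sakila film table and its idx_title index on the title column:

```sql
-- Suggest an index to the optimizer.
SELECT film_id, title FROM film USE INDEX (idx_title)
WHERE title LIKE 'A%';

-- Tell the optimizer not to use an index.
SELECT film_id, title FROM film IGNORE INDEX (idx_title)
WHERE title LIKE 'A%';

-- Like USE INDEX, but a table scan is treated as very expensive.
SELECT film_id, title FROM film FORCE INDEX (idx_title)
WHERE title LIKE 'A%';
```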
To control the optimizer's strategies more selectively, you can use optimizer hints at various scope levels, such as global, query block, table, and index level. These hints apply to a single statement and can be combined with index hints. Although they look like comments, the query parser recognizes them. Optimizer hints are written inside a /*+ ... */ comment placed directly after the statement keyword, such as SELECT.
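A sketch with one global hint and one index-level hint, assuming the sakila film table and its idx_title index. Which hints are available depends on the MySQL version:

```sql
-- Global hint: abort the statement if it runs longer than 1000 ms.
SELECT /*+ MAX_EXECUTION_TIME(1000) */ film_id, title
FROM film;

-- Index-level hint: do not use range access on idx_title.
SELECT /*+ NO_RANGE_OPTIMIZATION(film idx_title) */ film_id, title
FROM film
WHERE title LIKE 'A%';
```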
- Update the key distribution statistics of the scanned table by running "ANALYZE TABLE table_name".
- Use STRAIGHT_JOIN, either globally for the whole query or at the table level.
- Rephrase your query: your query may occasionally need to be refactored to speed up response time. The previous sections covered several such cases.
- Consider adding indexes to your tables. Indexes can speed up JOIN operations and data retrieval in MySQL, improving the responsiveness of your queries. However, indexes are best suited to tables that do not undergo frequent batch inserts or updates, and they may not be worthwhile for small tables. You also shouldn't index everything, because indexes use more disk space and slow down inserts and updates.
- Prioritize schema optimization to maximize the speed at which you can access tables. A badly designed database schema makes MySQL process your queries more slowly and with more system resources.
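The first, second, and fourth points above can be sketched as follows on the sakila schema. The index name idx_film_release_year is hypothetical and chosen only for illustration:

```sql
-- Refresh the key distribution statistics the optimizer relies on.
ANALYZE TABLE film;

-- Global form: join all tables in the order they are listed.
SELECT STRAIGHT_JOIN customer.last_name, city.city
FROM city
JOIN address ON address.city_id = city.city_id
JOIN customer ON customer.address_id = address.address_id;

-- Table-level form: read address before customer.
SELECT customer.last_name, address.address
FROM address
STRAIGHT_JOIN customer ON customer.address_id = address.address_id;

-- Add an index on a column that is filtered on frequently.
CREATE INDEX idx_film_release_year ON film (release_year);
```

Forcing a join order overrides the optimizer, so it is worth confirming with EXPLAIN that the forced plan is really cheaper.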
Schema Optimization and Indexing
Query optimization is of great importance in a relational database like MySQL, as it helps you identify where the database spends most of its time. It also shows you how the quality of the queries you submit to the MySQL server influences the response time.
Optimizing queries is only one aspect of enhancing MySQL's performance. Other options to consider include indexing and schema optimization. These aspects are interconnected: schemas and indexes influence query speed, and query execution speed in turn affects overall database performance.
Conclusion
- Optimizing queries is a crucial aspect of improving MySQL performance.
- Understanding how MySQL executes queries helps identify where the database spends its time and how the quality of queries impacts response time.
- Query optimization is just one piece of the puzzle; schema optimization and indexing are also critical options to consider, as they can affect query speed and vice versa.
- It's essential to strike a balance between these different aspects to optimize MySQL performance fully.
- By applying best practices for query optimization, schema optimization, and indexing, you can improve the performance of your MySQL database and achieve faster query execution.