Are Your Database Queries Costing The Business Time and Revenue?


Why Should You Care
The performance of a business website or application can determine if a company gains or loses customers. Customer expectations are rising, so every part, including the database design, must be optimized to meet these expectations.
For junior backend engineers, the difference between junior and senior roles often involves attention to detail, like asking: “Why is there a delay in data retrieval?”
The insights in this article can boost your professional standing and reduce negative feedback or job insecurity.
Common signs of performance issues in an application
The following inexhaustive list of symptoms indicates performance issues in an enterprise application.
Users Complaints
Slow page loading times
Error Messages
High churn rate
The interface is not responsive on a mobile device.
Several factors could contribute to the symptoms listed above. However, the main focus of this article is the fixes that can be carried out on the backend side of the infrastructure, as the database queries retrieving the data, if not painstakingly scripted, can also contribute to this side effect.
Identifying Inefficient Database Queries
This is where we go deeper. If you're not technically inclined, please bear with me as this is how I communicate with engineers.
My Engineers would say:
“But I don’t write raw SQL... I use ORMs.”
Yes, ORMs often work well for us. However, in an Agile environment, we frequently change the product based on user feedback. Some changes may need custom queries that ORMs can't handle, allowing us to embed custom scripts.
Manual processes can lead to errors and inefficiencies. That's why we're here to learn how to spot inefficient queries.
Here are some examples of poorly scripted queries:
Wasteful use of the SELECT statement: This is illustrated below
-- A script that retrieves all records in the bookings table SELECT * FROM bookings -- -------------------------------------- -- This query also retreives all columns each row fetched. -- Thus, increasing the required bandwidth for the interfaced applications
The use of Subqueries - While subqueries can be powerful, they hurt code readability. Each nested query becomes a nested loop, which can be inefficient as per Big O notation O(n^m).
Example:
-- A script that retrieves properties with average rating > 4.0 using a non-subquery: SELECT p.property_id, p.name, p.location FROM properties AS p WHERE (SELECT AVG(rating) FROM reviews AS r WHERE r.property_id = p.property_id) > 4.0; -- This correlated subquery retreives the users who have made over 3 bookings on our platform: SELECT u.user_id, CONCAT(u.first_name, " ", u.last_name) AS `name`, u.email FROM users AS u WHERE (SELECT COUNT(*) FROM bookings AS b WHERE b.user_id = u.user_id) > 3;
Filtering a large table with a non-indexed column slows down queries. Indexed columns speed up search queries.
Large tables not partitioned can cause slow responses due to the extra work needed to fetch data.
How To Optimize SQL Database Queries
Analyze Query Performance - Back to our bookings table
Assuming we only need some columns and not all, we can save bandwidth by running the script below instead:
-- A script that retrieves all records in the bookings table SELECT b.bookind_id, b.booking_date, b.status FROM bookings AS b -- -------------------------------------- -- This query only retreives the needed columns each row fetched. -- Thus, saving bandwidth
Refactor Inefficient Queries - A good place to start with refactoring is replacing all subqueries with JOINS and CTEs. A more readable and efficient option for the subquery above is shown below:
-- A script that retrieves properties with average rating > 4.0 using a non-subquery: SELECT p.property_id, p.name, p.location FROM properties AS p INNER JOIN reviews AS r ON r.property_id = p.property_id) WHERE r.rating > 4.0; -- Both correlated and Non-correlated subquery can be improved using CTEs and Windows Fuction which is a topic for another day.
Optimize Database Schema - Pay more attention to both the conceptual and logical design phases of the Database:
Establish a foreign key relationship with related tables using the same data types for columns to improve JOINs operation, Normalise the data from First Normal Form(1NF) to at least Third Normal Form(3NF), Make use of the data types that meet the minimum storage requirement for data (Use CHAR data type where Possible Instead of VARCHAR)Continuously Monitor and Iterate - Database query optimization is an iterative process. For insights into how effective your applied optimization techniques are, you need to analyze the query before and after optimization. There are a plethora of tools at your disposal, which is beyond the scope of this article.
The ones worthy of note in the MYSQL world are and
EXPLAIN ANALYZE
and `EXPLAIN
`An example of the basic usage of these tools is shown below:
-- Inspect the query to identify inefficiency -- EXPLAIN EXPLAIN ANALYZE SELECT b.booking_id, b.start_date, b.end_date, b.total_price, b.status AS booking_status, b.created_at AS booking_created_at, u.user_id AS guest_id, CONCAT(u.first_name, ' ', u.last_name) AS guest_name, u.email AS guest_email, p.property_id, p.name AS property_name, p.location AS property_location, p.pricepernight, p.host_id, CONCAT(h.first_name, ' ', h.last_name) AS host_name, pay.payment_id, pay.amount AS payment_amount, pay.payment_date, pay.payment_method FROM bookings b JOIN users u ON b.user_id = u.user_id JOIN properties p ON b.property_id = p.property_id JOIN users h ON p.host_id = h.user_id LEFT JOIN payments pay ON b.booking_id = pay.booking_id WHERE b.status = 'confirmed' AND b.start_date >= '2024-01-01' AND b.end_date <= '2024-12-31' ORDER BY b.created_at DESC\G;
-- Refactor the query to reduce execution time, by using indexing.
CREATE INDEX idx_booking_user ON bookings(user_id);
CREATE INDEX idx_booking_property ON bookings(property_id);
CREATE INDEX idx_booking_filter ON bookings(status, start_date, end_date);
CREATE INDEX idx_payment_booking ON payments(booking_id);
CREATE INDEX idx_property_host ON properties(host_id);
EXPLAIN ANALYZE
SELECT
b.booking_id,
b.start_date,
b.end_date,
b.total_price,
b.status AS booking_status,
b.created_at AS booking_created_at,
u.user_id AS guest_id,
CONCAT(u.first_name, ' ', u.last_name) AS guest_name,
u.email AS guest_email,
p.property_id,
p.name AS property_name,
p.location AS property_location,
pay.payment_id,
pay.amount AS payment_amount,
pay.payment_date,
pay.payment_method
FROM
bookings b
JOIN
users u ON b.user_id = u.user_id
JOIN
properties p ON b.property_id = p.property_id
LEFT JOIN
payments pay ON b.booking_id = pay.booking_id
WHERE
b.status = 'confirmed'
AND b.start_date >= '2024-01-01'
AND b.end_date <= '2024-12-31'
ORDER BY
b.created_at DESC;
The output of the EXPLAIN
command gives a rough estimate of the number of rows to be traversed, the query type(An indicator of how the table was looked up), and more. More realistic and extensive information is provided by the output of the EXPLAIN ANALYZE
command
Best Practices for SQL Query Optimization
Proper query structuring - Partitioning frequently accessed large tables based on an indexed column is also a good practice.
Indexing strategies: A good candidate column for indexing is the column used with the
WHERE
,JOIN
,GROUP BY
and aggregate functionsUse of query optimization tools and software: Two popular ones are
EXPLAIN
andEXPLAIN ANALYZE
Regularly updating and maintaining the database
Training for developers on efficient query practices
Lastly, never underestimate the power of that Database Administrator guy close to you. Consult him for better query analysis and recommendations.
What’s Next
Performance monitoring is not a one-off exercise, as the goal is to continuously monitor and refine database performance. Iteratively analyze query execution plans and make schema adjustments based on needs by following best practices.
Subscribe to my newsletter
Read articles from Godwin Lawal directly inside your inbox. Subscribe to the newsletter, and don't miss out.
Written by

Godwin Lawal
Godwin Lawal
Godwin is an experienced professional with a strong background in Software Engineering, specializing in Database management, API design, and Implementation, Software delivery, and System Automation