MySQL Query Optimization: Practical Techniques for Developers
Learn advanced MySQL optimization techniques including indexing strategies, query optimization, and performance tuning for high-traffic applications.
StalkTechie
Author
MySQL Query Optimization: Practical Techniques for Developers
In high-traffic applications, inefficient MySQL queries can lead to slow response times, high server load, and poor user experience. This guide covers practical optimization techniques, from indexing and query writing to schema design, caching, and monitoring tools. We'll focus on MySQL 8.0+ features, with real-world examples applicable to frameworks like Laravel, Django, or Node.js.
Understanding Query Execution with EXPLAIN
Always start optimization by analyzing queries using EXPLAIN. It shows how MySQL executes a query, including indexes used, rows scanned, and join types.
EXPLAIN SELECT * FROM products WHERE price > 100 AND category_id = 5 ORDER BY created_at DESC LIMIT 10;
Key columns in output:
- type: ALL (full scan - bad), range (good), ref (okay).
- rows: Estimated rows examined; lower is better.
- Extra: "Using filesort" (avoid for large sets), "Using temporary" (costly).
- key: Index used; NULL means no index.
Use EXPLAIN ANALYZE (MySQL 8.0.18+) for actual execution times.
EXPLAIN ANALYZE SELECT ...;
Tip: Run in production-like data volumes; small datasets hide issues.
Indexing Strategies
Indexes speed up reads but slow writes. Create them on frequently filtered/sorted/joined columns.
Basic Indexes
CREATE INDEX idx_price ON products(price);
CREATE INDEX idx_category_active ON products(category_id, is_active);
Composite indexes: Order matters (leftmost prefix rule). For queries filtering on category then price, index (category_id, price).
covering Indexes
Include extra columns to avoid table lookups (index-only scans).
CREATE INDEX idx_cover ON products(category_id, price) INCLUDE (name, slug); -- MySQL 8.0.13+ with INCLUDE
Full-Text Indexes for Search
ALTER TABLE products ADD FULLTEXT(name, description);
SELECT * FROM products WHERE MATCH(name, description) AGAINST('wireless headphone' IN NATURAL LANGUAGE MODE);
For boolean search or relevance scoring.
Hash vs BTREE
Default BTREE for ranges; HASH for equality (Memory engine only).
Avoid Over-Indexing
- Monitor with
SHOW INDEX FROM table;. - Drop unused: Use pt-index-usage tool from Percona Toolkit.
- Indexes on low-cardinality columns (e.g., boolean) are inefficient.
Writing Efficient Queries
Select Only Needed Columns
Avoid SELECT *; it increases IO and prevents covering indexes.
SELECT id, name, price FROM products WHERE ...; -- Good
Use JOINs Wisely
Prefer INNER JOIN over subqueries. Use STRAIGHT_JOIN for forcing order if optimizer chooses poorly.
SELECT p.name, c.name FROM products p INNER JOIN categories c ON p.category_id = c.id WHERE p.price > 100;
Avoid Functions on Indexed Columns
Breaks index usage: Bad - WHERE DATE(created_at) = '2023-01-01'; Good - WHERE created_at >= '2023-01-01' AND created_at < '2023-01-02'.
Pagination Optimization
Use keyset pagination over OFFSET (avoids full scans).
-- Offset (slow on large offsets)
SELECT * FROM products ORDER BY id DESC LIMIT 10 OFFSET 10000;
-- Keyset (efficient)
SELECT * FROM products WHERE id < :last_id ORDER BY id DESC LIMIT 10;
Aggregate Queries
Use indexes on GROUP BY/ORDER BY columns.
SELECT category_id, COUNT(*) FROM products GROUP BY category_id; -- Index on category_id
Schema Design Best Practices
- Normalization vs Denormalization: Normalize for consistency; denormalize for read-heavy (e.g., add redundant columns).
- Data Types: Use INT over VARCHAR for IDs; DECIMAL for money; ENUM for fixed options (but sparingly).
- Partitioning: For large tables (e.g., by date).
CREATE TABLE sales (
id BIGINT AUTO_INCREMENT,
sale_date DATE,
amount DECIMAL(10,2),
PRIMARY KEY (id, sale_date)
) PARTITION BY RANGE (YEAR(sale_date)) (
PARTITION p2023 VALUES LESS THAN (2024),
PARTITION p2024 VALUES LESS THAN (2025)
);
Caching Strategies
- Query Cache: Deprecated in 8.0; use application caching (Redis/Memcached).
- Redis for Frequent Queries: Cache results with TTL.
// Laravel example
$products = Cache::remember('top_products', 3600, function () {
return Product::where('is_featured', true)->get();
});
Configuration Tuning
Edit my.cnf/my.ini:
query_cache_size=0(off in 8.0+).innodb_flush_log_at_trx_commit=2(performance vs durability).tmp_table_size=64M,max_heap_table_size=64Mfor large temp tables.table_open_cache=4000for many tables.
Use mysqltuner.pl script for recommendations.
Monitoring and Profiling
- Slow Query Log: Enable with
long_query_time=1; analyze with pt-query-digest. - Performance Schema: Query
performance_schema.events_statements_summary_by_digestfor top queries. - Tools: MySQL Workbench, pm-top, New Relic, or Percona Monitoring.
- SHOW PROCESSLIST; To kill long-running queries.
Advanced Techniques
Query Rewriting and Hints
SELECT ... STRAIGHT_JOIN ...; -- Force join order
SELECT ... USE INDEX(idx_price); -- Hint index
Subquery Optimization
MySQL 8.0+ optimizes CTEs and window functions.
WITH ranked_products AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY category_id ORDER BY price DESC) as rn
FROM products
)
SELECT * FROM ranked_products WHERE rn = 1;
Sharding and Replication
For scale: Read replicas for queries; Vitess or ProxySQL for sharding.
Connection Pooling
Use ProxySQL or application pools to reduce overhead.
Common Pitfalls and Fixes
- N+1 Queries: Fix with eager loading (e.g., JOIN or ORM selects).
- Lock Contention: Use SELECT ... FOR UPDATE sparingly; optimize transactions.
- Large IN Clauses: Batch or use temp tables.
- ORM Overhead: Profile raw queries; use DB::listen in Laravel.
Testing and Benchmarking
- sysbench for load testing.
- Generate test data with Faker or procedures.
- A/B test changes on staging with real query logs.
Optimization is iterative: Measure, change, measure again. Tools like EXPLAIN and monitoring are your allies. For cloud (AWS RDS, Google Cloud SQL), leverage auto-scaling and read replicas. Always backup before schema changes!