目录
JOIN优化
驱动表 vs 被驱动表
- 外层循环的表是驱动表,内层循环的表是被驱动表
Table | Join Type |
---|---|
t1 | range |
t2 | ref |
t3 | ALL |
t1 是 t2 的驱动表,t2 是 t1 的被驱动表
t2 是 t3 的驱动表,t3 是 t2 的被驱动表
JOIN调优原则-1
- 用小表驱动大表
- 一般无需人工考虑,关联查询优化器会自动选择最优的执行顺序
- 如果优化器抽风,可使用 STRAIGHT_JOIN, 强制MySQL先读取左边的表,在读取右边的表
- 如果有where条件,应当要能够使用索引,并尽可能地减少外层循环的数据量
1. 示例表
-- 小表
mysql> SELECT COUNT(*) FROM employees;
+----------+
| COUNT(*) |
+----------+
| 300025 |
+----------+
1 row in set (0.02 sec)
-- 大表
mysql> SELECT COUNT(*) FROM salaries;
+----------+
| COUNT(*) |
+----------+
| 2844047 |
+----------+
1 row in set (0.10 sec)
2. employees是驱动表,salaries是被驱动表
-- 1. 当id相同的时候,那么就从上往下执行
-- 2. 当id不同的时候,那么就根据id大的先执行
-- 3. 下面列子就会先执行第一行
mysql> EXPLAIN
-> SELECT * FROM employees e
-> LEFT JOIN salaries s on e.emp_no = s.emp_no
-> WHERE e.emp_no = 10001;
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
| 1 | SIMPLE | e | NULL | const | PRIMARY | PRIMARY | 4 | const | 1 | 100.00 | NULL |
| 1 | SIMPLE | s | NULL | ref | PRIMARY | PRIMARY | 4 | const | 17 | 100.00 | NULL |
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
2 rows in set, 1 warning (0.00 sec)
3. employees是驱动表,dept_emp是被驱动表
dept_emp是驱动表,departments是被驱动
mysql> EXPLAIN
-> SELECT * FROM employees e
-> LEFT JOIN dept_emp de on e.emp_no = de.emp_no
-> LEFT JOIN departments d on de.dept_no = d.dept_no
-> WHERE e.emp_no = 10001;
+----+-------------+-------+------------+--------+---------------+---------+---------+----------------------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+--------+---------------+---------+---------+----------------------+------+----------+-------+
| 1 | SIMPLE | e | NULL | const | PRIMARY | PRIMARY | 4 | const | 1 | 100.00 | NULL |
| 1 | SIMPLE | de | NULL | ref | PRIMARY | PRIMARY | 4 | const | 1 | 100.00 | NULL |
| 1 | SIMPLE | d | NULL | eq_ref | PRIMARY | PRIMARY | 16 | employees.de.dept_no | 1 | 100.00 | NULL |
+----+-------------+-------+------------+--------+---------------+---------+---------+----------------------+------+----------+-------+
3 rows in set, 1 warning (0.00 sec)
4. 尽管查询语句是大表驱动小表, 关联查询优化器会自动选择最优的执行顺序
mysql> EXPLAIN
-> SELECT * FROM salaries s
-> LEFT JOIN employees e on s.emp_no = e.emp_no
-> WHERE e.emp_no = 10001;
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
| 1 | SIMPLE | e | NULL | const | PRIMARY | PRIMARY | 4 | const | 1 | 100.00 | NULL |
| 1 | SIMPLE | s | NULL | ref | PRIMARY | PRIMARY | 4 | const | 17 | 100.00 | NULL |
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
2 rows in set, 1 warning (0.00 sec)
5. 使用 STRAIGHT_JOIN, 强制MySQL先读取左边的表,在读取右边的表
mysql> EXPLAIN
-> SELECT * FROM salaries s
-> STRAIGHT_JOIN employees e on s.emp_no = e.emp_no
-> WHERE e.emp_no = 10001;
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
| 1 | SIMPLE | s | NULL | ref | PRIMARY | PRIMARY | 4 | const | 17 | 100.00 | NULL |
| 1 | SIMPLE | e | NULL | const | PRIMARY | PRIMARY | 4 | const | 1 | 100.00 | NULL |
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
2 rows in set, 1 warning (0.00 sec)
JOIN调优原则-2
- join的字段尽量创建索引
- join字段的类型要保持一致
- 尽量减少扫描的行数(explain-rows)
- 尽量控制在百万以内(经验之谈,仅供参考)
示例创表语句
create table `test_user`
(
`id` int auto_increment,
`name` varchar(45) NOT NULL DEFAULT '',
PRIMARY KEY (`id`)
);
INSERT INTO test_user (id, name)
VALUES (1, 'Eddie');
CREATE TABLE `test_user_ext`
(
`id` int(11) NOT NULL AUTO_INCREMENT,
`user_id` varchar(45) NOT NULL DEFAULT '',
`email` varchar(50) NOT NULL DEFAULT '',
PRIMARY KEY (`id`),
KEY `test_user_ext_user_id_index` (`user_id`)
);
INSERT INTO test_user_ext (id, user_id, email)
VALUES (1, '1', 'eddie@qq.com');
问题查询 (隐式转换)
mysql> EXPLAIN
-> SELECT * FROM test_user u
-> LEFT JOIN test_user_ext tue on u.id = tue.user_id
-> WHERE u.id = 1;
+----+-------------+-------+------------+-------+-----------------------------+---------+---------+-------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+-----------------------------+---------+---------+-------+------+----------+-------------+
| 1 | SIMPLE | u | NULL | const | PRIMARY | PRIMARY | 4 | const | 1 | 100.00 | NULL |
| 1 | SIMPLE | tue | NULL | ALL | test_user_ext_user_id_index | NULL | NULL | NULL | 1 | 100.00 | Using where |
+----+-------------+-------+------------+-------+-----------------------------+---------+---------+-------+------+----------+-------------+
2 rows in set, 3 warnings (0.01 sec)
会发现 test_user_ext 表出现 type=ALL 全表扫描的情况,
为什么出现这种情况呢?
因为join字段的类型不同时,索引无法使用:
u.id(int) = tue.user_id(varchar) 两个关联字段类型不一样而导致
通过终端查看 show warnings 获得警告信息
mysql> use employees;
mysql> EXPLAIN SELECT * FROM test_user u LEFT JOIN test_user_ext tue on u.id = tue.user_id WHERE u.id = 1;
mysql> show warnings;
+---------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Level | Code | Message |
+---------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Warning | 1739 | Cannot use ref access on index 'test_user_ext_user_id_index' due to type or collation conversion on field 'user_id' |
| Warning | 1739 | Cannot use range access on index 'test_user_ext_user_id_index' due to type or collation conversion on field 'user_id' |
| Note | 1003 | /* select#1 */ select '1' AS `id`,'Eddie' AS `name`,`employees`.`tue`.`id` AS `id`,`employees`.`tue`.`user_id` AS `user_id`,`employees`.`tue`.`email` AS `email` from `employees`.`test_user` `u` left join `employees`.`test_user_ext` `tue` on(('1' = `employees`.`tue`.`user_id`)) where true |
+---------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
3 rows in set (0.00 sec)
解决办法
varchar类型修改为int类型即可
ALTER TABLE test_user_ext MODIFY user_id int DEFAULT 0 NOT NULL
mysql> EXPLAIN SELECT * FROM test_user u LEFT JOIN test_user_ext tue on u.id = tue.user_id WHERE u.id = 1;
+----+-------------+-------+------------+-------+-----------------------------+-----------------------------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+-----------------------------+-----------------------------+---------+-------+------+----------+-------+
| 1 | SIMPLE | u | NULL | const | PRIMARY | PRIMARY | 4 | const | 1 | 100.00 | NULL |
| 1 | SIMPLE | tue | NULL | ref | test_user_ext_user_id_index | test_user_ext_user_id_index | 4 | const | 1 | 100.00 | NULL |
+----+-------------+-------+------------+-------+-----------------------------+-----------------------------+---------+-------+------+----------+-------+
2 rows in set, 1 warning (0.00 sec)
JOIN调优原则-3
- 参与join的表不要太多
- 阿里编程规约建议不超过3张 (比如业务就是很复杂,只能拆分语句,Java编写不同方法调用)
- 如果被驱动表join字段用不了索引,且内存较为充足,可以考虑把参数:join buffer 设置得大一些
复杂语句拆分示例
-- 拆分前
SELECT * FROM employees e
LEFT JOIN dept_emp de on e.emp_no = de.emp_no
LEFT JOIN departments d on de.dept_no = d.dept_no
WHERE e.emp_no = 10001;
-- 拆分后
SELECT * FROM employees WHERE emp_no = 10001;
SELECT * FROM dept_emp WHERE emp_no = 10001; ### d005
SELECT * FROM departments WHERE dept_no = 'd005';
比如使用Mybatis, 可以拆分三个不同的方法,得到数据,这里举例说明,一般三个Join是可以接受的
不要以为写复杂的语句就很牛,要么你是DBA,要么你就想将来接手项目的人崩溃。拆分代码方法性能绝对比你多张表join 数据库循环更有效率。
吐槽
在本人工作经历,曾经有一些自认很牛叉的人,关联十张或者更多join, 然后一堆函数和字段返回再次查询,然后在 UNION ALL。
无论怎么跟他们解释,都是感觉自己一条SQL写到尾。返回数据不是更快捷,可惜将来接手的人,看到Mybatis的XML文件,一个方法过百行,十几个join,就无语了。
只能说一句,你们要么就是很懒的人不想写代码,要么就想把项目搞崩。