MySQL 优化JOIN语句详解 | Eddie'Blog
MySQL 优化JOIN语句详解

MySQL 优化JOIN语句详解

eddie 325 2021-05-21

目录

JOIN优化

驱动表 vs 被驱动表

  • 外层循环的表是驱动表,内层循环的表是被驱动表
TableJoin Type
t1range
t2ref
t3ALL

t1 是 t2 的驱动表,t2 是 t1 的被驱动表

t2 是 t3 的驱动表,t3 是 t2 的被驱动表

JOIN调优原则-1

  • 用小表驱动大表
    • 一般无需人工考虑,关联查询优化器会自动选择最优的执行顺序
    • 如果优化器抽风,可使用 STRAIGHT_JOIN, 强制MySQL先读取左边的表,在读取右边的表
    • 如果有where条件,应当要能够使用索引,并尽可能地减少外层循环的数据量

1. 示例表

-- 小表
mysql> SELECT COUNT(*) FROM employees;
+----------+
| COUNT(*) |
+----------+
|   300025 |
+----------+
1 row in set (0.02 sec)

-- 大表
mysql> SELECT COUNT(*) FROM salaries;
+----------+
| COUNT(*) |
+----------+
|  2844047 |
+----------+
1 row in set (0.10 sec)

2. employees是驱动表,salaries是被驱动表

-- 1. 当id相同的时候,那么就从上往下执行
-- 2. 当id不同的时候,那么就根据id大的先执行
-- 3. 下面列子就会先执行第一行
mysql> EXPLAIN
    -> SELECT * FROM employees e 
    -> LEFT JOIN salaries s on e.emp_no = s.emp_no
    -> WHERE e.emp_no = 10001;
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type  | possible_keys | key     | key_len | ref   | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
|  1 | SIMPLE      | e     | NULL       | const | PRIMARY       | PRIMARY | 4       | const |    1 |   100.00 | NULL  |
|  1 | SIMPLE      | s     | NULL       | ref   | PRIMARY       | PRIMARY | 4       | const |   17 |   100.00 | NULL  |
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
2 rows in set, 1 warning (0.00 sec)

3. employees是驱动表,dept_emp是被驱动表

dept_emp是驱动表,departments是被驱动

mysql> EXPLAIN
    -> SELECT * FROM employees e 
    -> LEFT JOIN dept_emp de on e.emp_no = de.emp_no
    -> LEFT JOIN departments d on de.dept_no = d.dept_no
    -> WHERE e.emp_no = 10001;
+----+-------------+-------+------------+--------+---------------+---------+---------+----------------------+------+----------+-------+
| id | select_type | table | partitions | type   | possible_keys | key     | key_len | ref                  | rows | filtered | Extra |
+----+-------------+-------+------------+--------+---------------+---------+---------+----------------------+------+----------+-------+
|  1 | SIMPLE      | e     | NULL       | const  | PRIMARY       | PRIMARY | 4       | const                |    1 |   100.00 | NULL  |
|  1 | SIMPLE      | de    | NULL       | ref    | PRIMARY       | PRIMARY | 4       | const                |    1 |   100.00 | NULL  |
|  1 | SIMPLE      | d     | NULL       | eq_ref | PRIMARY       | PRIMARY | 16      | employees.de.dept_no |    1 |   100.00 | NULL  |
+----+-------------+-------+------------+--------+---------------+---------+---------+----------------------+------+----------+-------+
3 rows in set, 1 warning (0.00 sec)

4. 尽管查询语句是大表驱动小表, 关联查询优化器会自动选择最优的执行顺序

mysql> EXPLAIN
    -> SELECT * FROM salaries s
    -> LEFT JOIN employees e on s.emp_no = e.emp_no
    -> WHERE e.emp_no = 10001;
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type  | possible_keys | key     | key_len | ref   | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
|  1 | SIMPLE      | e     | NULL       | const | PRIMARY       | PRIMARY | 4       | const |    1 |   100.00 | NULL  |
|  1 | SIMPLE      | s     | NULL       | ref   | PRIMARY       | PRIMARY | 4       | const |   17 |   100.00 | NULL  |
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
2 rows in set, 1 warning (0.00 sec)

5. 使用 STRAIGHT_JOIN, 强制MySQL先读取左边的表,在读取右边的表

mysql> EXPLAIN
    -> SELECT * FROM salaries s
    -> STRAIGHT_JOIN employees e on s.emp_no = e.emp_no
    -> WHERE e.emp_no = 10001;
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type  | possible_keys | key     | key_len | ref   | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
|  1 | SIMPLE      | s     | NULL       | ref   | PRIMARY       | PRIMARY | 4       | const |   17 |   100.00 | NULL  |
|  1 | SIMPLE      | e     | NULL       | const | PRIMARY       | PRIMARY | 4       | const |    1 |   100.00 | NULL  |
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
2 rows in set, 1 warning (0.00 sec)

JOIN调优原则-2

  • join的字段尽量创建索引
    • join字段的类型要保持一致
  • 尽量减少扫描的行数(explain-rows)
    • 尽量控制在百万以内(经验之谈,仅供参考)

示例创表语句

create table `test_user`
(
    `id`   int auto_increment,
    `name` varchar(45) NOT NULL DEFAULT '',
    PRIMARY KEY (`id`)
);

INSERT INTO test_user (id, name)
VALUES (1, 'Eddie');

CREATE TABLE `test_user_ext`
(
    `id`      int(11)     NOT NULL AUTO_INCREMENT,
    `user_id` varchar(45) NOT NULL DEFAULT '',
    `email`   varchar(50) NOT NULL DEFAULT '',
    PRIMARY KEY (`id`),
    KEY `test_user_ext_user_id_index` (`user_id`)
);

INSERT INTO test_user_ext (id, user_id, email)
VALUES (1, '1', 'eddie@qq.com');

问题查询 (隐式转换)

mysql> EXPLAIN
    -> SELECT * FROM test_user u
    -> LEFT JOIN test_user_ext tue on u.id = tue.user_id
    -> WHERE u.id = 1;
+----+-------------+-------+------------+-------+-----------------------------+---------+---------+-------+------+----------+-------------+
| id | select_type | table | partitions | type  | possible_keys               | key     | key_len | ref   | rows | filtered | Extra       |
+----+-------------+-------+------------+-------+-----------------------------+---------+---------+-------+------+----------+-------------+
|  1 | SIMPLE      | u     | NULL       | const | PRIMARY                     | PRIMARY | 4       | const |    1 |   100.00 | NULL        |
|  1 | SIMPLE      | tue   | NULL       | ALL   | test_user_ext_user_id_index | NULL    | NULL    | NULL  |    1 |   100.00 | Using where |
+----+-------------+-------+------------+-------+-----------------------------+---------+---------+-------+------+----------+-------------+
2 rows in set, 3 warnings (0.01 sec)

会发现 test_user_ext 表出现 type=ALL 全表扫描的情况,

为什么出现这种情况呢?

因为join字段的类型不同时,索引无法使用:

u.id(int) = tue.user_id(varchar) 两个关联字段类型不一样而导致

通过终端查看 show warnings 获得警告信息
mysql> use employees;
mysql> EXPLAIN SELECT * FROM test_user u LEFT JOIN test_user_ext tue on u.id = tue.user_id WHERE u.id = 1;
mysql> show warnings;
+---------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Level   | Code | Message                                                                                                                                                                                                                                                                                          |
+---------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Warning | 1739 | Cannot use ref access on index 'test_user_ext_user_id_index' due to type or collation conversion on field 'user_id'                                                                                                                                                                              |
| Warning | 1739 | Cannot use range access on index 'test_user_ext_user_id_index' due to type or collation conversion on field 'user_id'                                                                                                                                                                            |
| Note    | 1003 | /* select#1 */ select '1' AS `id`,'Eddie' AS `name`,`employees`.`tue`.`id` AS `id`,`employees`.`tue`.`user_id` AS `user_id`,`employees`.`tue`.`email` AS `email` from `employees`.`test_user` `u` left join `employees`.`test_user_ext` `tue` on(('1' = `employees`.`tue`.`user_id`)) where true |
+---------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
3 rows in set (0.00 sec)
解决办法

varchar类型修改为int类型即可

ALTER TABLE test_user_ext MODIFY user_id int DEFAULT 0 NOT NULL

mysql> EXPLAIN SELECT * FROM test_user u LEFT JOIN test_user_ext tue on u.id = tue.user_id WHERE u.id = 1;
+----+-------------+-------+------------+-------+-----------------------------+-----------------------------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type  | possible_keys               | key                         | key_len | ref   | rows | filtered | Extra |
+----+-------------+-------+------------+-------+-----------------------------+-----------------------------+---------+-------+------+----------+-------+
|  1 | SIMPLE      | u     | NULL       | const | PRIMARY                     | PRIMARY                     | 4       | const |    1 |   100.00 | NULL  |
|  1 | SIMPLE      | tue   | NULL       | ref   | test_user_ext_user_id_index | test_user_ext_user_id_index | 4       | const |    1 |   100.00 | NULL  |
+----+-------------+-------+------------+-------+-----------------------------+-----------------------------+---------+-------+------+----------+-------+
2 rows in set, 1 warning (0.00 sec)

JOIN调优原则-3

  • 参与join的表不要太多
    • 阿里编程规约建议不超过3张 (比如业务就是很复杂,只能拆分语句,Java编写不同方法调用)
  • 如果被驱动表join字段用不了索引,且内存较为充足,可以考虑把参数:join buffer 设置得大一些

复杂语句拆分示例

-- 拆分前
SELECT * FROM employees e 
LEFT JOIN dept_emp de on e.emp_no = de.emp_no
LEFT JOIN departments d on de.dept_no = d.dept_no
WHERE e.emp_no = 10001;

-- 拆分后
SELECT * FROM employees WHERE emp_no = 10001;
SELECT * FROM dept_emp WHERE emp_no = 10001;        ### d005
SELECT * FROM departments WHERE dept_no = 'd005';

比如使用Mybatis, 可以拆分三个不同的方法,得到数据,这里举例说明,一般三个Join是可以接受的


不要以为写复杂的语句就很牛,要么你是DBA,要么你就想将来接手项目的人崩溃。拆分代码方法性能绝对比你多张表join 数据库循环更有效率。

吐槽

在本人工作经历,曾经有一些自认很牛叉的人,关联十张或者更多join, 然后一堆函数和字段返回再次查询,然后在 UNION ALL。

无论怎么跟他们解释,都是感觉自己一条SQL写到尾。返回数据不是更快捷,可惜将来接手的人,看到Mybatis的XML文件,一个方法过百行,十几个join,就无语了。

只能说一句,你们要么就是很懒的人不想写代码,要么就想把项目搞崩。