MySQL Query Optimization: More Than Just Indexes

In this blog post, I'll discuss whether optimizing indexes is always the key to improving MySQL query performance (spoiler: it isn't).

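When we look at MySQL query performance, the first thing we usually check is whether the statement uses the right indexes to retrieve its data. That habit rests on the assumption that finding the data is the most expensive operation, and therefore the one to focus on when optimizing a MySQL query. However, this is not always the case.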

Let's look at this query as an illustration:

mysql> show create table tbl \G
*************************** 1. row ***************************
Table: tbl
Create Table: CREATE TABLE `tbl` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`k` int(11) NOT NULL DEFAULT '0',
`g` int(10) unsigned NOT NULL,
PRIMARY KEY (`id`),
KEY `k_1` (`k`)
) ENGINE=InnoDB AUTO_INCREMENT=2340933 DEFAULT CHARSET=latin1
1 row in set (0.00 sec)
mysql> explain select g,count(*) c from tbl where k<1000000 group by g having c>7 \G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: tbl
partitions: NULL
type: ALL
possible_keys: k_1
key: NULL
key_len: NULL
ref: NULL
rows: 998490
filtered: 50.00
Extra: Using where; Using temporary; Using filesort
1 row in set, 1 warning (0.00 sec)
mysql> select g,count(*) c from tbl where k<1000000 group by g having c>7;
+--------+----+
| g | c |
+--------+----+
| 28846 | 8 |
| 139660 | 8 |
| 153286 | 8 |
...
| 934984 | 8 |
+--------+----+
22 rows in set (6.80 sec)

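Looking at this query, many people would assume that its main problem is the full table scan. One might ask, "Why doesn't the MySQL optimizer use the index on (k)?" (By the way, it's because the condition is not selective enough.) That line of thinking can lead someone to force the index, which makes performance even worse: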

mysql> select g,count(*) c from tbl force index(k_1) where k<1000000 group by g having c>7;
+--------+----+
| g | c |
+--------+----+
| 28846 | 8 |
| 139660 | 8 |
...
| 934984 | 8 |
+--------+----+
22 rows in set (9.37 sec)
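
The remark that the condition is not selective enough can be checked directly. The following is a minimal sketch, not something from the original post: it counts what fraction of tbl satisfies k<1000000, relying on MySQL evaluating a comparison to 1 or 0. The column aliases are names chosen here for illustration.

```sql
-- Illustrative check (not part of the original post).
-- In MySQL a comparison evaluates to 1 (true) or 0 (false), so SUM() counts the matches.
-- k is declared NOT NULL, so no NULL handling is needed.
SELECT COUNT(*)                    AS total_rows,
       SUM(k < 1000000)            AS matching_rows,
       SUM(k < 1000000) / COUNT(*) AS matching_fraction
FROM tbl;
```

If matching_fraction turns out to be large (the EXPLAIN output above estimates filtered at 50.00 for the table scan), then reading those rows through the secondary index on k means an extra primary-key lookup for each matching row. That is consistent with the forced-index run being slower than the full scan.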

Or someone might extend the index on (k) to (k,g), making it a covering index for this query, but that doesn't improve performance either:

mysql> alter table tbl drop key k_1, add key(k,g);
Query OK, 0 rows affected (5.35 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> explain select g,count(*) c from tbl where k<1000000 group by g having c>7 \G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: tbl
partitions: NULL
type: range
possible_keys: k
key: k
key_len: 4
ref: NULL
rows: 499245
filtered: 100.00
Extra: Using where; Using index; Using temporary; Using filesort
1 row in set, 1 warning (0.00 sec)

mysql> select g,count(*) c from tbl where k<1000000 group by g having c>7;
+--------+----+
| g | c |
+--------+----+
| 28846 | 8 |
| 139660 | 8 |
...
| 915436 | 8 |
| 934984 | 8 |
+--------+----+
22 rows in set (6.80 sec)

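This is clearly misguided optimization: it assumes that what matters most is how quickly all rows matching k<1000000 can be found. In this example, that is not the problem. In fact, a query that touches all the same columns but does not use GROUP BY runs 10 times as fast: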

mysql> select sum(g) from tbl where k<1000000;
+--------------+
| sum(g) |
+--------------+
| 500383719481 |
+--------------+
1 row in set (0.68 sec)

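For this example, whether an index lookup is used should not be the main concern. Instead, we should look at how to optimize the GROUP BY, which is responsible for about 90% of the query response time.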
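
The 90% figure follows from the two timings shown above, under the simplifying assumption that the whole difference between the two runs comes from GROUP BY processing (the two queries also differ in computing SUM(g) rather than COUNT(*)). A quick worked check, with group_by_share as an alias chosen here:

```sql
-- Timings taken from the output above: 6.80 sec with GROUP BY, 0.68 sec without.
-- Assumption: the difference is the cost of GROUP BY (temporary table and filesort).
SELECT (6.80 - 0.68) / 6.80 AS group_by_share;  -- 6.12 / 6.80 = 0.9, about 90%
```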

In my next blog post, I will cover GROUP BY in MySQL in more detail to help optimize queries like these further.

Original source:

https://www.percona.com/blog/2018/01/30/is-indexing-always-the-key-to-mysql-query-performance/
