MyISAM表一例查询性能问题排查

问题背景

线上相同结构的两个MyISAM表,数据量差不多,执行的SQL查询计划基本一致,但是执行时间却相差很大

mysql> select * from auction_info_d__201402 where unit_id=7016 and day='2014-03-27';
Empty set (0.31 sec)
mysql> select * from auction_info_d__201403 where unit_id=7016 and day='2013-03-27';
Empty set (19.81 sec)

MySQL版本:

mysql> select @@version;
+------------+
| @@version  |
+------------+
| 5.1.48-log |
+------------+
1 row in set (0.00 sec)


查询计划基本一样

mysql> explain select * from auction_info_d__201402 where unit_id=7016 and day='2014-03-27';
+----+-------------+-----------------------------+------+---------------+---------+---------+-------+--------+-------------+
| id | select_type | table                       | type | possible_keys | key     | key_len | ref   | rows   | Extra       |
+----+-------------+-----------------------------+------+---------------+---------+---------+-------+--------+-------------+
|  1 | SIMPLE      | auction_info_d__201402      | ref  | PRIMARY       | PRIMARY | 4       | const | 326584 | Using where |
+----+-------------+-----------------------------+------+---------------+---------+---------+-------+--------+-------------+
1 row in set (0.00 sec)
 
mysql> explain select * from auction_info_d__201403 where unit_id=7016 and day='2013-03-27';
+----+-------------+-----------------------------+------+---------------+---------+---------+-------+--------+-------------+
| id | select_type | table                       | type | possible_keys | key     | key_len | ref   | rows   | Extra       |
+----+-------------+-----------------------------+------+---------------+---------+---------+-------+--------+-------------+
|  1 | SIMPLE      | auction_info_d__201403      | ref  | PRIMARY       | PRIMARY | 4       | const | 224574 | Using where |
+----+-------------+-----------------------------+------+---------------+---------+---------+-------+--------+-------------+
1 row in set (0.00 sec)

排查过程

这两个查询都是使用primary key(只有一个索引:PRIMARY KEY (`unit_id`,`aucid`,`day`))中的unit_id查出数据,然后采用day进行过滤,因为auction_info_d__201402保存的是2月份的数据,而auction_info_d__201403是3月份的数据,因此两个select命中的数据集都应该为空

mysql> select min(day) as mind, max(day) as maxd from auction_info_d__201402 where unit_id=7016;
+------------+------------+
| mind       | maxd       |
+------------+------------+
| 2014-02-01 | 2014-02-28 |
+------------+------------+
1 row in set (0.14 sec)
 
mysql> select min(day) as mind, max(day) as maxd from auction_info_d__201403 where unit_id=7016; 
+------------+------------+
| mind       | maxd       |
+------------+------------+
| 2014-03-01 | 2014-03-31 |
+------------+------------+
1 row in set (0.11 sec)

从explain中可以看出两个select根据unit_id条件过滤数据量相差不大,应该不至于性能相差这么大,首先查询数据量大小和MyISAM相关配置:

$ ll auction_info_d__201402.*
-rw-rw---- 1 mysql dba      10379 Feb  2 09:47 auction_info_d__201402.frm
-rw-rw---- 1 mysql dba 3845544846 Mar  1 06:01 auction_info_d__201402.MYD
-rw-rw---- 1 mysql dba  691238912 Mar  1 06:13 auction_info_d__201402.MYI
$ ll auction_info_d__201403.*
-rw-rw---- 1 mysql dba      10379 Mar  2 05:50 auction_info_d__201403.frm
-rw-rw---- 1 mysql dba 4325708816 Apr  1 06:06 auction_info_d__201403.MYD
-rw-rw---- 1 mysql dba  859683840 Apr  1 10:00 auction_info_d__201403.MYI

MyISAM key_buffer_size:

mysql> show global variables like '%key_buffer%';
+-----------------+-----------+
| Variable_name   | Value     |
+-----------------+-----------+
| key_buffer_size | 104857600 |
+-----------------+-----------+
1 row in set (0.00 sec)

可以看出key_buffer_size约100M的样子,auction_info_d__201402和auction_info_d__201403的索引文件都大于600M,key_buffer_size设置得太小了,调大应该可以增加性能,但是这也无法解释为什么auction_info_d__201402上的查询远远快于auction_info_d__201403上的查询,因此尝试使用profile寻找原因:

mysql> show profile all for query 1;
+--------------------------------+----------+----------+------------+-------------------+---------------------+--------------+---------------+---------------+-------------------+-------------------+-------------------+-------+------------------+---------------+-------------+
| Status                         | Duration | CPU_user | CPU_system | Context_voluntary | Context_involuntary | Block_ops_in | Block_ops_out | Messages_sent | Messages_received | Page_faults_major | Page_faults_minor | Swaps | Source_function  | Source_file   | Source_line |
+--------------------------------+----------+----------+------------+-------------------+---------------------+--------------+---------------+---------------+-------------------+-------------------+-------------------+-------+------------------+---------------+-------------+
| starting                       | 0.000071 | 0.000000 |   0.000000 |                 0 |                   0 |            0 |             0 |             0 |                 0 |                 0 |                 0 |     0 | NULL             | NULL          |        NULL |
| checking query cache for query | 0.000048 | 0.000000 |   0.000000 |                 0 |                   0 |            0 |             0 |             0 |                 0 |                 0 |                 0 |     0 | unknown function | sql_cache.cc  |        1446 |
| Opening tables                 | 0.000517 | 0.000000 |   0.000000 |                 1 |                   0 |          136 |             0 |             0 |                 0 |                 2 |                 1 |     0 | unknown function | sql_base.cc   |        4517 |
| System lock                    | 0.000034 | 0.000000 |   0.000000 |                 0 |                   0 |            0 |             0 |             0 |                 0 |                 0 |                 0 |     0 | unknown function | lock.cc       |         258 |
| Table lock                     | 0.000035 | 0.000000 |   0.000000 |                 0 |                   0 |            0 |             0 |             0 |                 0 |                 0 |                 0 |     0 | unknown function | lock.cc       |         269 |
| init                           | 0.000082 | 0.000000 |   0.000000 |                 0 |                   0 |            0 |             0 |             0 |                 0 |                 0 |                 2 |     0 | unknown function | sql_select.cc |        2484 |
| optimizing                     | 0.000027 | 0.000000 |   0.000000 |                 0 |                   0 |            0 |             0 |             0 |                 0 |                 0 |                 0 |     0 | unknown function | sql_select.cc |         820 |
| statistics                     | 0.000079 | 0.000000 |   0.000000 |                 0 |                   0 |            0 |             0 |             0 |                 0 |                 0 |                 4 |     0 | unknown function | sql_select.cc |        1011 |
| preparing                      | 0.000031 | 0.000000 |   0.000000 |                 0 |                   0 |            0 |             0 |             0 |                 0 |                 0 |                 0 |     0 | unknown function | sql_select.cc |        1033 |
| executing                      | 0.000020 | 0.000000 |   0.000000 |                 0 |                   0 |            0 |             0 |             0 |                 0 |                 0 |                 0 |     0 | unknown function | sql_select.cc |        1755 |
| Sending data                   | 0.312744 | 0.000000 |   0.000000 |               237 |                   3 |         2112 |             0 |             0 |                 0 |               115 |               347 |     0 | unknown function | sql_select.cc |        2309 |
| end                            | 0.000072 | 0.000000 |   0.000000 |                 0 |                   0 |            0 |             0 |             0 |                 0 |                 0 |                 0 |     0 | unknown function | sql_select.cc |        2530 |
| query end                      | 0.000022 | 0.000000 |   0.000000 |                 0 |                   0 |            0 |             0 |             0 |                 0 |                 0 |                 0 |     0 | unknown function | sql_parse.cc  |        4986 |
| freeing items                  | 0.000060 | 0.000000 |   0.000000 |                 0 |                   0 |            0 |             0 |             0 |                 0 |                 0 |                 0 |     0 | unknown function | sql_parse.cc  |        6012 |
| logging slow query             | 0.000021 | 0.000000 |   0.000000 |                 0 |                   0 |            0 |             0 |             0 |                 0 |                 0 |                 0 |     0 | unknown function | sql_parse.cc  |        1687 |
| cleaning up                    | 0.000023 | 0.000000 |   0.000000 |                 0 |                   0 |            0 |             0 |             0 |                 0 |                 0 |                 0 |     0 | unknown function | sql_parse.cc  |        1655 |
+--------------------------------+----------+----------+------------+-------------------+---------------------+--------------+---------------+---------------+-------------------+-------------------+-------------------+-------+------------------+---------------+-------------+
16 rows in set (0.00 sec)
 
mysql> show profile all for query 2;
+--------------------------------+-----------+----------+------------+-------------------+---------------------+--------------+---------------+---------------+-------------------+-------------------+-------------------+-------+------------------+---------------+-------------+
| Status                         | Duration  | CPU_user | CPU_system | Context_voluntary | Context_involuntary | Block_ops_in | Block_ops_out | Messages_sent | Messages_received | Page_faults_major | Page_faults_minor | Swaps | Source_function  | Source_file   | Source_line |
+--------------------------------+-----------+----------+------------+-------------------+---------------------+--------------+---------------+---------------+-------------------+-------------------+-------------------+-------+------------------+---------------+-------------+
| starting                       |  0.000070 | 0.000000 |   0.000000 |                 0 |                   0 |            0 |             0 |             0 |                 0 |                 0 |                 0 |     0 | NULL             | NULL          |        NULL |
| checking query cache for query |  0.000041 | 0.000000 |   0.000000 |                 0 |                   0 |            0 |             0 |             0 |                 0 |                 0 |                 0 |     0 | unknown function | sql_cache.cc  |        1446 |
| Opening tables                 |  0.000070 | 0.000000 |   0.000000 |                 0 |                   0 |            0 |             0 |             0 |                 0 |                 0 |                 2 |     0 | unknown function | sql_base.cc   |        4517 |
| System lock                    |  0.000022 | 0.000000 |   0.000000 |                 0 |                   0 |            0 |             0 |             0 |                 0 |                 0 |                 0 |     0 | unknown function | lock.cc       |         258 |
| Table lock                     |  0.000028 | 0.000000 |   0.000000 |                 0 |                   0 |            0 |             0 |             0 |                 0 |                 0 |                 0 |     0 | unknown function | lock.cc       |         269 |
| init                           |  0.000046 | 0.000000 |   0.000000 |                 0 |                   0 |            0 |             0 |             0 |                 0 |                 0 |                 0 |     0 | unknown function | sql_select.cc |        2484 |
| optimizing                     |  0.000027 | 0.000000 |   0.000000 |                 0 |                   0 |            0 |             0 |             0 |                 0 |                 0 |                 0 |     0 | unknown function | sql_select.cc |         820 |
| statistics                     |  0.000072 | 0.000000 |   0.000000 |                 0 |                   0 |            0 |             0 |             0 |                 0 |                 0 |                 1 |     0 | unknown function | sql_select.cc |        1011 |
| preparing                      |  0.000030 | 0.000000 |   0.000000 |                 0 |                   0 |            0 |             0 |             0 |                 0 |                 0 |                 0 |     0 | unknown function | sql_select.cc |        1033 |
| executing                      |  0.000021 | 0.000000 |   0.000000 |                 0 |                   0 |            0 |             0 |             0 |                 0 |                 0 |                 0 |     0 | unknown function | sql_select.cc |        1755 |
| Sending data                   | 19.815528 | 0.000000 |   0.000000 |             74035 |                 371 |       601616 |           736 |             0 |                 0 |               221 |              6291 |     0 | unknown function | sql_select.cc |        2309 |
| end                            |  0.000062 | 0.000000 |   0.000000 |                 0 |                   0 |            0 |             0 |             0 |                 0 |                 0 |                 0 |     0 | unknown function | sql_select.cc |        2530 |
| query end                      |  0.000022 | 0.000000 |   0.000000 |                 0 |                   0 |            0 |             0 |             0 |                 0 |                 0 |                 0 |     0 | unknown function | sql_parse.cc  |        4986 |
| freeing items                  |  0.000048 | 0.000000 |   0.000000 |                 0 |                   0 |            0 |             0 |             0 |                 0 |                 0 |                 0 |     0 | unknown function | sql_parse.cc  |        6012 |
| logging slow query             |  0.000021 | 0.000000 |   0.000000 |                 0 |                   0 |            0 |             0 |             0 |                 0 |                 0 |                 0 |     0 | unknown function | sql_parse.cc  |        1687 |
| logging slow query             |  0.000551 | 0.000000 |   0.000000 |                 2 |                   0 |           80 |             8 |             0 |                 0 |                 1 |                 0 |     0 | unknown function | sql_parse.cc  |        1697 |
| cleaning up                    |  0.000026 | 0.000000 |   0.000000 |                 0 |                   0 |            0 |             0 |             0 |                 0 |                 0 |                 0 |     0 | unknown function | sql_parse.cc  |        1655 |
+--------------------------------+-----------+----------+------------+-------------------+---------------------+--------------+---------------+---------------+-------------------+-------------------+-------------------+-------+------------------+---------------+-------------+
17 rows in set (0.00 sec)

有了一些发现:慢查询对应的profile中Sending result阶段context switch和page fault比较异常,问题原因逐渐明显:慢查询产生了更多的page置换,page置换消耗了大量时间,导致cpu context switch次数变多,这个主要是因为key_buffer_size设置得太小引起的,但是这个也同样无法解释为什么另外一个查询没有产生同样的问题

因此询问应用开发人员业务场景,被告知auction_info_d__201402是历史数据,不会更新,auction_info_d__201403目前每天凌晨都会load一次当天数据进去,线上其它时间只有select操作,猜想load导致了索引文件碎片化,从而使得随机IO变多:

mysql> show status like 'key_read%';
+-------------------+----------+
| Variable_name     | Value    |
+-------------------+----------+
| Key_read_requests | 19653701 |
| Key_reads         | 337870   |
+-------------------+----------+
2 rows in set (0.00 sec)
 
mysql> select * from auction_info_d__201402 where unit_id=7016 and day='2014-03-27';
Empty set (0.32 sec)
 
mysql> show status like 'key_read%';
+-------------------+----------+
| Variable_name     | Value    |
+-------------------+----------+
| Key_read_requests | 19705651 |
| Key_reads         | 338072   |
+-------------------+----------+
2 rows in set (0.00 sec)
 
mysql> select * from auction_info_d__201403 where unit_id=7016 and day='2013-03-27';
Empty set (19.81 sec)
 
mysql> show status like 'key_read%';
+-------------------+----------+
| Variable_name     | Value    |
+-------------------+----------+
| Key_read_requests | 19831609 |
| Key_reads         | 340363   |
+-------------------+----------+
2 rows in set (0.00 sec)

对比发现:
1. auction_info_d__201402(简称02)上的查询key_reads只增加了2,说明都在内存中命中了,而auction_info_d__201403(简称03)上查询这个增量是12k,cache出现大量miss需要从磁盘中加载数据,这解释了profile的结果
2. 02上查询Key_read_requests的增长为5w多,而03上查询这个增量是12w,说明03上查询需要访问的page数量比较多,因此需要执行Analyze table

MySQL Manual上建议key_buffer_size的设置应该使得:

Key_reads/Key_read_requests < 0.01

而从线上信息计算得到的比值为:(340363/19831609) = 0.017,因此建议增大key_buffer_size

解决方案

1. 增大key_buffer_size,缓冲更多的索引信息,减小page fault
2. load数据后执行一次ANALYZE LOCAL TABLE(使用LOCAL是为了避免影响到备库)更新索引及统计信息

发表回复

您的电子邮箱地址不会被公开。 必填项已用 * 标注