使用DEBUG_SYNC排查MySQL truncate问题

By | 2013 年 12 月 7 日

线上问题

DBA用脚本发送truncate操作清理MySQL表中数据,脚本执行一段时间使用ctrl+c想取消操作,却导致slave同步binlog后数据不一致,复制中断,之后通过select发现master表中数据依旧存在,但是truncate对应的binlog被记录记录,而slave中的表数据被清理,因为binlog复制过来后truncate操作执行成功

重现问题

这个问题比较诡异,truncate实际上是失败,但是binlog记录下来,我们知道binlog记录一个完整事务,只有当事务成功执行完才会写binlog,只有问题能够重现才可能找到原因,遂测试truncate发送后马上ctrl+c取消,然而每次truncate都成功,表中数据被清理,binlog也记录下来,后来思考DBA当时使用的是脚本,发送的是多条truncat操作,这样的话可以让truncate执行过程中被打断的位置尽可能的多,这样复现问题的可能性也高一些,经过多轮尝试,终于重现了:

执行truncate脚本并ctrl+c取消

mysql -uroot -Nse "show tables in tt;"|while read table; do mysql -uroot -e "truncate tt.$table";done
Ctrl-C -- query aborted.
^CCtrl-C -- sending "KILL 230" to server ...
Ctrl-C -- query aborted.
ERROR 2013 (HY000) at line 1: Lost connection to MySQL server during query
^CCtrl-C -- sending "KILL QUERY 233" to server ...
Ctrl-C -- query aborted.
ERROR 1317 (70100) at line 1: Query execution was interrupted
^CCtrl-C -- sending "KILL QUERY 235" to server ...
Ctrl-C -- query aborted.

查看binlog

root@tt 09:19:11>show binlog events in 'mysql-bin.000174';
+------------------+-----+-------------+------------+-------------+--------------------------------------------+
| Log_name         | Pos | Event_type  | Server_id  | End_log_pos | Info                                       |
+------------------+-----+-------------+------------+-------------+--------------------------------------------+
| mysql-bin.000174 |   4 | Format_desc | 2552760298 |         107 | Server ver: 5.5.18.2-tb-log, Binlog ver: 4 |
| mysql-bin.000174 | 107 | Query       | 2552760298 |         180 | truncate tt.t1                             |
| mysql-bin.000174 | 180 | Query       | 2552760298 |         254 | truncate tt.t10                            |
| mysql-bin.000174 | 254 | Query       | 2552760298 |         327 | truncate tt.t4                             |
| mysql-bin.000174 | 327 | Query       | 2552760298 |         400 | truncate tt.t2                             |
| mysql-bin.000174 | 400 | Query       | 2552760298 |         473 | truncate tt.t3                             |
| mysql-bin.000174 | 473 | Query       | 2552760298 |         546 | truncate tt.t5                             |
| mysql-bin.000174 | 546 | Query       | 2552760298 |         619 | truncate tt.t7                             |
| mysql-bin.000174 | 619 | Query       | 2552760298 |         692 | truncate tt.t6                             |
| mysql-bin.000174 | 692 | Query       | 2552760298 |         765 | truncate tt.t8                             |
| mysql-bin.000174 | 765 | Query       | 2552760298 |         838 | truncate tt.t9                             |
+------------------+-----+-------------+------------+-------------+--------------------------------------------+

查看表中数据量

mysql -uroot -Nse "show tables in tt;"|while read table; do echo -e $table'\t'`mysql -uroot -e "select count(*) as rows from tt.$table"`;done
t1      rows 0
t10     rows 0
t2      rows 0
t3      rows 69108864
t4      rows 2000000
t5      rows 2000000
t6      rows 2000000
t7      rows 2000000
t8      rows 0
t9      rows 0

分析问题

问题得到重现,就离找到原因和解决办法不远了,难掩心中激动,从发送ctrl+c出发分析:从上来的log来看,ctrl+c实际上发送了kill query到server,在代码中也证实了,而kill query做的事情很简单:将对应thd的killed成员置为THD::KILL_QUERY(sql_kill函数),因此可以猜测可能的原因:truncate执行过程中,sql_kill将thd->killed置为THD::KILL_QUERY,truncate在某处检查thd的killed状态,发现问题导致退出,用gdb跟踪代码路径:

#0  Truncate_statement::handler_truncate (this=<value optimized out>, thd=0x78c11480, table_ref=0x2ba5dc004c30, is_tmp_table=<value optimized out>)
    at sql/sql_truncate.cc:226
#1  0x00000000007a9c20 in Truncate_statement::truncate_table (this=0x2ba5dc005160, thd=0x78c11480, table_ref=0x2ba5dc004c30)
    at sql/sql_truncate.cc:475
#2  0x00000000007a9cde in Truncate_statement::execute (this=0x2ba5dc005160, thd=0x78c11480) at sql/sql_truncate.cc:529
#3  0x0000000000587a1e in mysql_execute_command (thd=0x78c11480) at sql/sql_parse.cc:4813
#4  0x000000000058a094 in mysql_parse (thd=0x78c11480, rawbuf=<value optimized out>, length=11, parser_state=0x2ba5d807fbd0)
    at sql/sql_parse.cc:6047
#5 0x000000000058bfce in dispatch_command (command=COM_QUERY, thd=0x78c11480, packet=0x2ba5d807fcb8 "", packet_length=11)

在Truncate_statement::truncate_table中发现了可能原因:

    ...
    error= handler_truncate(thd, table_ref, FALSE);
    binlog_stmt= !error || error != HA_ERR_WRONG_COMMAND;
  ...
  /* DDL is logged in statement format, regardless of binlog format. */
  if (binlog_stmt)
    error|= write_bin_log(thd, !error, thd->query(), thd->query_length());
  ...

在handler_truncate执行失败时(执行成功返回0),如果返回值不为HA_ERR_WRONG_COMMAND,会导致binlog被记录下来,即出现文章开头的问题,进入到handler_truncate中继续分析:

/*
  Open and truncate a locked table.
 
  @param  thd           Thread context.
  @param  table_ref     Table list element for the table to be truncated.
  @param  is_tmp_table  True if element refers to a temp table.
 
  @retval  0    Success.
  @retval  > 0  Error code.
*/
int Truncate_statement::handler_truncate(THD *thd, TABLE_LIST *table_ref,
                                         bool is_tmp_table)
{
  ...
  /* Open the table as it will handle some required preparations. */
  if (open_and_lock_tables(thd, table_ref, FALSE, flags))
    DBUG_RETURN(1);
 
  /* Whether to truncate regardless of foreign keys. */
  if (! (thd->variables.option_bits & OPTION_NO_FOREIGN_KEY_CHECKS))
    error= fk_truncate_illegal_if_parent(thd, table_ref->table);
 
  if (!error && (error= table_ref->table->file->ha_truncate()))
    table_ref->table->file->print_error(error, MYF(0));
  ...

table->file->ha_truncate()才是真正进入到storage engine层,一般进入到innodb层后就基本不检查thd->killed状态了,这里找到了一个可疑的点:open_and_lock_tables,如果此处执行失败,handler_truncate返回1(不为HA_ERR_WRONG_COMMAND),Truncate_statement::truncate_table中的binlog_stmt为1,最终导致的结果和我们遇到的问题完全符合,需要确认,跟着代码继续跟踪,找到了导致问题的backtrace:

#0  open_table (thd=0x78c11480, table_list=0x2ba5dc004c30, mem_root=0x2ba5d807e3d0, ot_ctx=0x2ba5d807e410) at sql/sql_base.cc:2683
#1  0x000000000054d5c1 in open_and_process_table (thd=0x78c11480, start=0x2ba5d807e498, counter=0x2ba5d807e4bc, flags=130, prelocking_strategy=0x2ba5d807e5f0)
    at sql/sql_base.cc:4482
#2  open_tables (thd=0x78c11480, start=0x2ba5d807e498, counter=0x2ba5d807e4bc, flags=130, prelocking_strategy=0x2ba5d807e5f0)
    at sql/sql_base.cc:4930
#3  0x000000000054e194 in open_and_lock_tables (thd=<value optimized out>, tables=0x2ba5dc004c30, derived=false, flags=130, 
    prelocking_strategy=<value optimized out>) at sql/sql_base.cc:5527
#4  0x00000000007a93f9 in open_and_lock_tables (this=<value optimized out>, thd=0x78c11480, table_ref=0x2ba5dc004c30, is_tmp_table=<value optimized out>)
    at sql/sql_base.h:475
#5  Truncate_statement::handler_truncate (this=<value optimized out>, thd=0x78c11480, table_ref=0x2ba5dc004c30, is_tmp_table=<value optimized out>)
    at sql/sql_truncate.cc:226
#6  0x00000000007a9c20 in Truncate_statement::truncate_table (this=0x2ba5dc005160, thd=0x78c11480, table_ref=0x2ba5dc004c30)
    at sql/sql_truncate.cc:475
#7  0x00000000007a9cde in Truncate_statement::execute (this=0x2ba5dc005160, thd=0x78c11480) at sql/sql_truncate.cc:529
#8  0x0000000000587a1e in mysql_execute_command (thd=0x78c11480) at sql/sql_parse.cc:4813
#9  0x000000000058a094 in mysql_parse (thd=0x78c11480, rawbuf=<value optimized out>, length=11, parser_state=0x2ba5d807fbd0)
    at sql/sql_parse.cc:6047
#10 0x000000000058bfce in dispatch_command (command=COM_QUERY, thd=0x78c11480, packet=0x2ba5d807fcb8 "", packet_length=11)
    at sql/sql_parse.cc:1273

而在open_table中找到了证据:

  ...
  /* an open table operation needs a lot of the stack space */
  if (check_stack_overrun(thd, STACK_MIN_SIZE_FOR_OPEN, (uchar *)&alias))
    DBUG_RETURN(TRUE);
 
  if (thd->killed)
    DBUG_RETURN(TRUE);
  ...

因此跟踪源码找到了一个可能的解释:truncate table执行过程中ctrl+c将导致thd->killed被设置,open_table在thd->killed的情况下返回TRUE,最终会导致Truncate_statement::handler_truncate函数在执行table->file->ha_truncate()之前提前返回TRUE

DEBUG_SYNC还原问题

以上是一个合理的解释,但是需要将问题还原出来以证明猜测,这个时候就需要DEBUG_SYNC了,在Truncate_statement::handler_truncate中open_and_lock_tables之前添加一个DEBUG_SYNC点before_open_and_lock_tables,在sql_kill之后增加一个DEBUG_SYNC点after_kill,线程1执行truncate,在before_open_and_lock_tables处等待,直到线程2执行到after_kill,之后线程1继续执行

Index: sql/sql_parse.cc
===================================================================
--- sql/sql_parse.cc    (revision 4589)
+++ sql/sql_parse.cc    (working copy)
@@ -4109,6 +4109,7 @@
       goto error;
     }
     sql_kill(thd, (ulong)it->val_int(), lex->type & ONLY_KILL_QUERY);
+    DEBUG_SYNC(thd, "after_kill");
     break;
   }
 #ifndef NO_EMBEDDED_ACCESS_CHECKS
Index: sql/sql_truncate.cc
===================================================================
--- sql/sql_truncate.cc (revision 4589)
+++ sql/sql_truncate.cc (working copy)
@@ -222,6 +222,8 @@
     table_ref->mdl_request.ticket= NULL;
   }
 
+  DEBUG_SYNC(thd, "before_open_and_lock_tables");
+
   /* Open the table as it will handle some required preparations. */
   if (open_and_lock_tables(thd, table_ref, FALSE, flags))
     DBUG_RETURN(1);

构造test case:truncate_bug.test

source include/have_binlog_format_row_or_statement.inc;
 
connection default;
use test;
create table t1(id int auto_increment primary key, a int, b int) engine=innodb;
insert into t1(a, b) values(1,2),(2,4),(3,6),(4,8),(5,10);
set debug_sync = "before_open_and_lock_tables signal truncate_before_lock wait_for kill_finish";
send truncate t1; 
 
  connect(con1,localhost,root,,);
  connection con1;
  set debug_sync = "now wait_for truncate_before_lock";
  set debug_sync = "after_kill signal kill_finish";
  send kill query 2;
 
connection default;
--ERROR 1317
reap;
 
connection con1;
reap;
show binlog events in "master-bin.000001' from 107;                         
select * from t1; 
drop table t1;

在mysql-test下执行test case:
./mtr –record truncate_bug
查看执行结果:

use test;
create table t1(id int auto_increment primary key, a int, b int) engine=innodb;
insert into t1(a, b) values(1,2),(2,4),(3,6),(4,8),(5,10);
set debug_sync = "before_open_and_lock_tables signal truncate_before_lock wait_for kill_finish";
truncate t1; 
set debug_sync = "now wait_for truncate_before_lock";
set debug_sync = "after_kill signal kill_finish";
kill query 2;
ERROR 70100: Query execution was interrupted                                                                                          
show binlog events in 'master-bin.000001' from 107;
Log_name        Pos     Event_type      Server_id       End_log_pos     Info
master-bin.000001       107     Query   1       248     use `test`; create table t1(id int auto_increment primary key, a int, b int) e
ngine=innodb
master-bin.000001       248     Query   1       316     BEGIN
master-bin.000001       316     Intvar  1       344     INSERT_ID=1
master-bin.000001       344     Query   1       464     use `test`; insert into t1(a, b) values(1,2),(2,4),(3,6),(4,8),(5,10)
master-bin.000001       464     Xid     1       491     COMMIT /* xid=16 */
select * from t1; 
id      a       b   
1       1       2   
2       2       4   
3       3       6   
4       4       8   
5       5       10  
drop table t1;

观察到:binlog中truncate被记录下来,而表中数据依旧存在,验证了猜想是正确的,问题出在ctrl+c导致Truncate_statement::handler_truncate中open_and_lock_tables返回1,找到原因后,修复就很简单了,最简单的修复:将open_and_lock_tables失败返回1改成返回HA_ERR_WRONG_COMMAND;复杂一点的修复:open_and_lock_tables返回一个负数,在Truncate_statement::truncate_table中增加相应的判断:

binlog_stmt= !error || (error > 0 && error != HA_ERR_WRONG_COMMAND);

于是向官方report了一个bug并提供了fast fix:MySQL bug#71070

truncate进入innodb后还可以通过ctrl+c或者kill取消吗?

有了上面的基础,验证这个就很easy了,在Truncate_statement::handler_truncate中open_and_lock_tables之后增加一个DEBUG_SYNC点after_open_and_lock_tables,然后执行下面的test case:

source include/have_binlog_format_row_or_statement.inc;
 
connection default;
use test;
create table t1(id int auto_increment primary key, a int, b int) engine=innodb;
insert into t1(a, b) values(1,2),(2,4),(3,6),(4,8),(5,10);
set debug_sync = "after_open_and_lock_tables signal truncate_after_lock wait_for kill_finish";
send truncate t1; 
 
  connect(con1,localhost,root,,);
  connection con1;
  set debug_sync = "now wait_for truncate_after_lock";
  set debug_sync = "after_kill signal kill_finish";
  send kill query 2;
 
connection default;
#--ERROR 1317;                                                                                                         
reap;
 
connection con1;
reap;
# ignore the first binlog
show binlog events in 'master-bin.000001' from 107;
select * from t1; 
drop table t1;

执行结果:

use test;
create table t1(id int auto_increment primary key, a int, b int) engine=innodb;
insert into t1(a, b) values(1,2),(2,4),(3,6),(4,8),(5,10);
set debug_sync = "after_open_and_lock_tables signal truncate_after_lock wait_for kill_finish";
truncate t1; 
set debug_sync = "now wait_for truncate_after_lock";
set debug_sync = "after_kill signal kill_finish";
kill query 2;
show binlog events in 'master-bin.000001' from 107;                                                                                   
Log_name        Pos     Event_type      Server_id       End_log_pos     Info
master-bin.000001       107     Query   1       248     use `test`; create table t1(id int auto_increment primary key, a int, b int) e
ngine=innodb
master-bin.000001       248     Query   1       316     BEGIN
master-bin.000001       316     Intvar  1       344     INSERT_ID=1
master-bin.000001       344     Query   1       464     use `test`; insert into t1(a, b) values(1,2),(2,4),(3,6),(4,8),(5,10)
master-bin.000001       464     Xid     1       491     COMMIT /* xid=16 */
master-bin.000001       491     Query   1       565     use `test`; truncate t1
select * from t1; 
id      a       b   
drop table t1;

结论

一旦truncate操作执行完了open_table_and_lock_tables进入到innodb层,通过kill或者ctrl+c是无法取消的!

发表评论

电子邮件地址不会被公开。 必填项已用*标注