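Symptom
Calls to the backend API that insert data into table t1 hung until they timed out. The suspicion was that another transaction held a lock on t1 and had not released it.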
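If the wait ends with MySQL error 1205 (Lock wait timeout exceeded), the limit that was hit is InnoDB's row lock wait timeout, which defaults to 50 seconds. A quick check, assuming the timeout comes from MySQL itself rather than from the API layer in front of it:

```sql
-- How long a statement waits for a row lock before failing with error 1205
SHOW VARIABLES LIKE 'innodb_lock_wait_timeout';
```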
Analysis
While the lock wait is in progress, run the following commands to inspect transaction and lock information:
select * from information_schema.INNODB_TRX
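This lists the transactions that are currently running, including their lock information. It returns rows whenever any transaction is active.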
mysql> select * from information_schema.INNODB_TRX\G
*************************** 1. row ***************************
trx_id: 40549
trx_state: RUNNING
trx_started: 2024-10-11 08:13:30
trx_requested_lock_id: NULL
trx_wait_started: NULL
trx_weight: 5
trx_mysql_thread_id: 41
trx_query: NULL
trx_operation_state: NULL
trx_tables_in_use: 0
trx_tables_locked: 2
trx_lock_structs: 3
trx_lock_memory_bytes: 1136
trx_rows_locked: 3
trx_rows_modified: 2
trx_concurrency_tickets: 0
trx_isolation_level: REPEATABLE READ
trx_unique_checks: 1
trx_foreign_key_checks: 1
trx_last_foreign_key_error: NULL
trx_adaptive_hash_latched: 0
trx_adaptive_hash_timeout: 0
trx_is_read_only: 0
trx_autocommit_non_locking: 0
1 row in set (0.00 sec)
select * from information_schema.INNODB_LOCKS
This shows lock information. It returns rows only while a lock wait is in progress, that is, when one transaction has requested a lock that another transaction holds. A deadlock is not required.
mysql> select * from information_schema.INNODB_LOCKS\G
*************************** 1. row ***************************
lock_id: 40554:938:3:1
lock_trx_id: 40554
lock_mode: X
lock_type: RECORD
lock_table: `allen`.`student`
lock_index: PRIMARY
lock_space: 938
lock_page: 3
lock_rec: 1
lock_data: supremum pseudo-record
*************************** 2. row ***************************
lock_id: 40549:938:3:1
lock_trx_id: 40549
lock_mode: S
lock_type: RECORD
lock_table: `allen`.`student`
lock_index: PRIMARY
lock_space: 938
lock_page: 3
lock_rec: 1
lock_data: supremum pseudo-record
2 rows in set, 1 warning (0.00 sec)
select * from information_schema.INNODB_LOCK_WAITS
This also returns rows only while a lock wait is in progress. It shows which transaction is blocked and which transaction is holding the lock it is waiting for.
mysql> select * from information_schema.INNODB_LOCK_WAITS\G
*************************** 1. row ***************************
requesting_trx_id: 40554
requested_lock_id: 40554:938:3:1
blocking_trx_id: 40549
blocking_lock_id: 40549:938:3:1
1 row in set, 1 warning (0.00 sec)
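The three tables can be joined so that each waiting transaction is shown next to the one blocking it. The following is a minimal sketch for MySQL 5.7 and earlier; the column aliases are chosen here for readability and are not part of the schema:

```sql
-- MySQL 5.7 and earlier: pair each waiting transaction with its blocker
SELECT
  r.trx_id              AS waiting_trx_id,
  r.trx_mysql_thread_id AS waiting_thread,
  r.trx_query           AS waiting_query,
  b.trx_id              AS blocking_trx_id,
  b.trx_mysql_thread_id AS blocking_thread,
  b.trx_query           AS blocking_query,
  b.trx_started         AS blocking_trx_started
FROM information_schema.INNODB_LOCK_WAITS w
JOIN information_schema.INNODB_TRX b ON b.trx_id = w.blocking_trx_id
JOIN information_schema.INNODB_TRX r ON r.trx_id = w.requesting_trx_id;
```

blocking_query is NULL when the blocking transaction is idle between statements, as in the INNODB_TRX output above where trx_query is NULL. blocking_thread is the connection id shown by SHOW PROCESSLIST, which helps trace the lock back to the client that holds it.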
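Of the three commands above, INNODB_LOCKS and INNODB_LOCK_WAITS can only be used before MySQL 8.0. They were removed in 8.0 and replaced by performance_schema.data_locks and performance_schema.data_lock_waits, while INNODB_TRX is still available. On MySQL 8.0 and later, the transaction and lock information can be read from the InnoDB status output using the commands below.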
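On MySQL 8.0 and later, the same waiting/blocking relationship can also be queried directly from the performance_schema tables data_locks and data_lock_waits. This is a sketch rather than a complete diagnostic, and the column aliases are chosen here for readability:

```sql
-- MySQL 8.0 and later: who is waiting, who is blocking, and on which lock
SELECT
  w.REQUESTING_ENGINE_TRANSACTION_ID AS waiting_trx_id,
  w.BLOCKING_ENGINE_TRANSACTION_ID   AS blocking_trx_id,
  l.OBJECT_SCHEMA,
  l.OBJECT_NAME,
  l.INDEX_NAME,
  l.LOCK_TYPE,
  l.LOCK_MODE,
  l.LOCK_STATUS,
  l.LOCK_DATA
FROM performance_schema.data_lock_waits w
JOIN performance_schema.data_locks l
  ON l.ENGINE_LOCK_ID = w.BLOCKING_ENGINE_LOCK_ID;

-- The sys schema offers a ready-made summary of the same information
SELECT waiting_pid, waiting_query, blocking_pid, blocking_query, wait_age
FROM sys.innodb_lock_waits;
```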
First run
SET GLOBAL innodb_status_output_locks = ON;
and then run
show engine innodb status;
mysql> show engine innodb status\G
...
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 421571571083608, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421571571082688, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421571571080848, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421571571079008, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 40571, ACTIVE 5 sec inserting
mysql tables in use 1, locked 1
LOCK WAIT 2 lock struct(s), heap size 1136, 1 row lock(s)
MySQL thread id 43, OS thread handle 140096265193216, query id 1332 172.17.0.1 root update
/* ApplicationName=DBeaver 7.3.5 - SQLEditor <Script-15.sql> */ insert into student (num,age,name) values(003,10,'小明')
------- TRX HAS BEEN WAITING 5 SEC FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 938 page no 3 n bits 72 index PRIMARY of table `allen`.`student` trx id 40571 lock_mode X insert intention waiting
Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0
0: len 8; hex 73757072656d756d; asc supremum;;
------------------
TABLE LOCK table `allen`.`student` trx id 40571 lock mode IX
RECORD LOCKS space id 938 page no 3 n bits 72 index PRIMARY of table `allen`.`student` trx id 40571 lock_mode X insert intention waiting
Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0
0: len 8; hex 73757072656d756d; asc supremum;;
---TRANSACTION 40566, ACTIVE 10 sec
3 lock struct(s), heap size 1136, 3 row lock(s), undo log entries 2
MySQL thread id 41, OS thread handle 140096265463552, query id 1330 172.17.0.1 root
TABLE LOCK table `allen`.`student` trx id 40566 lock mode IS
RECORD LOCKS space id 938 page no 3 n bits 72 index PRIMARY of table `allen`.`student` trx id 40566 lock mode S
Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0
0: len 8; hex 73757072656d756d; asc supremum;;
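As the output shows, when an insert … select runs, it places shared locks on the rows and gaps matched by the select. Here the blocking transaction holds an S lock on the supremum pseudo-record of the PRIMARY index, which covers the gap after the last row, so the insert intention lock requested by the waiting insert cannot be granted. The locks are taken to keep the binlog consistent with the data. Without them, new rows could be inserted into t1 while insert into t2 select from t1 is still running, and the insert into t1 could be written to the binlog before the insert … select from t1. When the binlog is replayed on the replica, insert into t1 would then run first and insert into t2 select from t1 second, leaving the replica inconsistent with the source.
After tracking down the transaction that runs the insert … select, it turned out that the insert … select itself finishes quickly, but the transaction goes on to perform time-consuming operations after that statement. Because InnoDB holds these locks until the transaction commits or rolls back, they stay in place for the whole duration.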
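The blocking behavior can be reproduced with two sessions. This is a minimal sketch, assuming the default REPEATABLE READ isolation level; the schemas of t1 and t2 are hypothetical and exist only for illustration:

```sql
-- Hypothetical tables for illustration only
CREATE TABLE t1 (id INT PRIMARY KEY, v INT) ENGINE=InnoDB;
CREATE TABLE t2 (id INT PRIMARY KEY, v INT) ENGINE=InnoDB;
INSERT INTO t1 VALUES (1, 10), (2, 20);

-- Session A: run the insert ... select and leave the transaction open
START TRANSACTION;
INSERT INTO t2 SELECT * FROM t1;
-- (no COMMIT yet: the shared next-key locks on t1 are still held)

-- Session B: blocks until session A commits or innodb_lock_wait_timeout expires
INSERT INTO t1 VALUES (3, 30);

-- Session A: committing releases the locks and lets session B proceed
COMMIT;
```

While session B is blocked, the lock queries shown earlier should report session A as the blocking transaction.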
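Solution
Move the time-consuming operations out of the transaction that runs the insert … select and into a separate transaction, so that they no longer delay the release of the locks that the insert … select holds on the queried table.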
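The split can be sketched as follows. The table names t1 and t2 follow the example in the analysis, and the slow work is represented by a placeholder comment because the original statements are not shown:

```sql
-- Transaction 1: only the insert ... select, committed immediately
START TRANSACTION;
INSERT INTO t2 SELECT * FROM t1;
COMMIT;  -- the shared locks on t1 are released here

-- Transaction 2: the slow follow-up work, which no longer holds locks on t1
START TRANSACTION;
-- ... time-consuming statements go here ...
COMMIT;
```

One trade-off to keep in mind is that the two steps are no longer atomic: if the second transaction fails, the rows written by the first are already committed, so the application has to tolerate or compensate for that.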