MySQL DeadLock故障排查全過程記錄
【作者】
劉博:攜程技術(shù)保障中心數(shù)據(jù)庫高級(jí)經(jīng)理,主要關(guān)注Sql server和Mysql的運(yùn)維和故障處理。
【環(huán)境】
版本號(hào):5.6.21
隔離級(jí)別:REPEATABLE READ
【問題描述】
接到監(jiān)控報(bào)警,有一個(gè)線上的應(yīng)用DeadLock報(bào)錯(cuò),每15分鐘會(huì)準(zhǔn)時(shí)出現(xiàn),報(bào)錯(cuò)統(tǒng)計(jì)如下圖:
登錄Mysql服務(wù)器查看日志:
mysql> show engine innodb status\G *** (1) TRANSACTION: TRANSACTION 102973, ACTIVE 11 sec starting index read mysql tables in use 3, locked 3 LOCK WAIT 4 lock struct(s), heap size 1136, 3 row lock(s) MySQL thread id 6, OS thread handle 140024996574976, query id 83 localhost us updating UPDATE TestTable SET column1 = 1, Column2 = sysdate(), Column3= '026' Column4 = 0 AND column5 = 485 AND column6 = 'SEK' *** (1) WAITING FOR THIS LOCK TO BE GRANTED: RECORD LOCKS space id 417 page no 1493 n bits 1000 index idx_column6 of table test.TestTable trx id 102973 lock_mode X waiting Record lock, heap no 859 PHYSICAL RECORD: n_fields 2; compact format; info bits 0 0: len 3; hex 53454b; asc SEK;; 1: len 8; hex 80000000007e1452; asc ~ R;; *** (2) TRANSACTION: TRANSACTION 102972, ACTIVE 26 sec starting index read mysql tables in use 3, locked 3 219 lock struct(s), heap size 24784, 2906 row lock(s), undo log entries 7 MySQL thread id 5, OS thread handle 140024996841216, query id 84 localhost us updating UPDATE TestTable SET Column1 = 1, Column2 = sysdate(), Column3 = '026' Column4 = 0 AND Column5 = 485 AND Column6 = 'SEK' *** (2) HOLDS THE LOCK(S): RECORD LOCKS space id 417 page no 1493 n bits 1000 index idx_Column6 of table test.TestTable trx id 102972 lock_mode X Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0 0: len 8; hex 73757072656d756d; asc supremum;; Record lock, heap no 859 PHYSICAL RECORD: n_fields 2; compact format; info bits 0 0: len 3; hex 53454b; asc SEK;; 1: len 8; hex 80000000007e1452; asc ~ R;;
*** (2) WAITING FOR THIS LOCK TO BE GRANTED: RECORD LOCKS space id 601 page no 89642 n bits 1000 index idx_column6 of table test.TestTable trx id 32231892482 lock_mode X locks rec but not gap waiting Record lock, heap no 38 PHYSICAL RECORD: n_fields 2; compact format; info bits 0 0: len 3; hex 53454b; asc SEK;; 1: len 8; hex 80000000007eea14; asc ~ ;;
大致一看,更新同一索引的同一行,應(yīng)該是一個(gè)Block,報(bào)TimeOut的錯(cuò)才對(duì),怎么會(huì)報(bào)DeadLock?
【初步分析】
先分析下(2) TRANSACTION,TRANSACTION 32231892482。
等待的鎖信息為:
0: len 3; hex 53454b; asc SEK;; 1: len 8; hex 80000000007eea14; asc
持有的鎖信息為:
0: len 3; hex 53454b; asc SEK;; 1: len 8; hex 80000000007eeac4; asc
再先分析下(1) TRANSACTION,TRANSACTION 32231892617。
等待的鎖信息為:
0: len 3; hex 53454b; asc SEK;; 1: len 8; hex 80000000007eeac4; asc
于是可以畫出的死鎖表,兩個(gè)資源相互依賴,造成死鎖:
TRANSACTION | Hold | Wait |
---|---|---|
32231892617 | 53454b\80000000007eea14 | 53454b\80000000007eeac4 |
32231892482 | 53454b\80000000007eeac4 | 53454b\80000000007eea14 |
讓我們再看一下explain結(jié)果:
mysql>desc UPDATE TestTable SET Column1=1, Column2 = sysdate(),Column3 = '025' Column4 = 0 AND Column5 = 477 AND Column6 = 'SEK' \G;
*************************** 1. row ***************************
id: 1
select_type: UPDATE
table: TestTable
partitions: NULL
type: index_merge
possible_keys: column5_index,idx_column5_column6_Column1,idxColumn6
key: column5_index,idxColumn6
key_len: 8,9
ref: NULL
rows: 7
filtered: 100.00
Extra: Using intersect(column5_index,idxColumn6); Using where
可以看到 EXTRA 列:
Using intersect(column5_index,idxColumn6)
從5.1開始,引入了 index merge 優(yōu)化技術(shù),對(duì)同一個(gè)表可以使用多個(gè)索引分別進(jìn)行條件掃描。
相關(guān)文檔:http://dev.mysql.com/doc/refman/5.7/en/index-merge-optimization.html
The Index Merge method is used to retrieve rows with several range scans and to merge their results into one. The merge can produce unions, intersections, or unions-of-intersections of its underlying scans. This access method merges index scans from a single table; it does not merge scans across multiple tables.
【模擬與驗(yàn)證】
根據(jù)以上初步分析,猜測應(yīng)該就是intersect造成的,于是在測試環(huán)境模擬驗(yàn)證,開啟2個(gè)session模擬死鎖:
時(shí)間序列 | Session1 | Session2 |
---|---|---|
1 | Begin; | |
2 | UPDATE TestTable SET Column2 = sysdate() Column4 = 0 AND Column5 = 47 AND Column6 = 'SEK 執(zhí)行成功,影響7行 |
|
3 | Begin; | |
4 | UPDATE TestTable SET Column2 = sysdate(),Column4 = 0 AND Column5 = 485 AND Column6 = 'SEK'; 被Blocking |
|
5 | UPDATE TestTable SET Column2 = sysdate(),Column4 = 0 AND Column5 = 485 AND Column6 = 'SEK'; 執(zhí)行成功 |
ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction |
依據(jù)以上信息可以發(fā)現(xiàn)Session2雖然被Block了,但也獲取了一些Session1在時(shí)間序列5時(shí)所需資源的X鎖,可以再開啟一個(gè)查詢select count(Column5) from TestTable where Column5 = 485,設(shè)置SET TRANSACTION ISOLATION LEVEL SERIALIZABLE,去查詢Column5 = 485的行,觀察鎖等待的信息:
mysql> SELECT r.trx_id waiting_trx_id, r.trx_mysql_thread_id waiting_thread, r.trx_query waiting_query, b.trx_id
mysql> SELECT r.trx_id waiting_trx_id, r.trx_mysql_thread_id waiting_thread, r.trx_query waiting_query, b.trx_id blocking_trx_id, b.trx_mysql_thread_id blocking_thread, b.trx_query blocking_query FROM information_schema.innodb_lock_waits w INNER JOIN information_schema.innodb_trx b ON b.trx_id = w.blocking_trx_id INNER JOIN information_schema.innodb_trx r ON r.trx_id = w.requesting_trx_id \G; *************************** 1. row *************************** waiting_trx_id: 103006 waiting_thread: 36 waiting_query: UPDATE TestTable SET Column1 = 1, Column2 = sysdate(), Column3 = '026' Column4 = 0 AND Column5 = 485 AND Column6 = 'SEK' blocking_trx_id: 103003 blocking_thread: 37 blocking_query: NULL *************************** 2. row *************************** waiting_trx_id: 421500433538672 waiting_thread: 39 waiting_query: select count(Column5) from TestTable where Column5 = 485 blocking_trx_id: 103006 blocking_thread: 36 blocking_query: UPDATE TestTable SET Column1 = 1, Column2 = sysdate(), Column3 = '026' Column4 = 0 AND Column5 = 485 AND Column6 = 'SEK' 2 rows in set, 1 warning (0.00 sec) mysql> select * from information_schema.innodb_lock_waits \G; *************************** 1. row *************************** requesting_trx_id: 103006 requested_lock_id: 103006:417:1493:859 blocking_trx_id: 103003 blocking_lock_id: 103003:417:1493:859 *************************** 2. row *************************** requesting_trx_id: 421500433538672 requested_lock_id: 421500433538672:417:749:2 blocking_trx_id: 103006 blocking_lock_id: 103006:417:749:2 2 rows in set, 1 warning (0.00 sec)
mysql> select * from INNODB_LOCKS \G; *************************** 1. row *************************** lock_id: 103006:417:1493:859 lock_trx_id: 103006 lock_mode: X lock_type: RECORD lock_table: test.TestTable lock_index: idxColumn6 lock_space: 417 lock_page: 1493 lock_rec: 859 lock_data: 'SEK', 8262738 *************************** 2. row *************************** lock_id: 103003:417:1493:859 lock_trx_id: 103003 lock_mode: X lock_type: RECORD lock_table:test.TestTable lock_index: idxColumn6 lock_space: 417 lock_page: 1493 lock_rec: 859 lock_data: 'SEK', 8262738 *************************** 3. row *************************** lock_id: 421500433538672:417:749:2 lock_trx_id: 421500433538672 lock_mode: S lock_type: RECORD lock_table: test.TestTable lock_index: column5_index lock_space: 417 lock_page: 749 lock_rec: 2 lock_data: 485, 8317620 *************************** 4. row *************************** lock_id: 103006:417:749:2 lock_trx_id: 103006 lock_mode: X lock_type: RECORD lock_table: test.TestTable lock_index: column5_index lock_space: 417 lock_page: 749 lock_rec: 2 lock_data: 485, 8317620 4 rows in set, 1 warning (0.00 sec)
可以看到Session2,trx_id 103006阻塞了trx_id 421500433538672,而trx_id 421500433538672 requested_lock也正好是lock_data: 485, 8317620。由此可見Session2雖然別block了,但是還是獲取到了Index column5_index相關(guān)的鎖。被Block是因?yàn)閕ntersect的原因,還需要idxColumn6的鎖,至此思路已經(jīng)清晰,對(duì)整個(gè)分配鎖的信息簡化一下,如下表格(請求到的鎖用青色表示,需獲取但未獲取到的鎖用紅色表示):
時(shí)間點(diǎn) | Session1 | Session2 |
---|---|---|
1 | 477 SEK | |
2 | 485 SEK | |
3 | 485 SEK | 死鎖發(fā)生 |
可以看到485 SEK這兩個(gè)資源形成了一個(gè)環(huán)狀,最終發(fā)生死鎖。
【解決方法】
- 最佳的方法是添加column5和Column6的聯(lián)合索引。
- 我們環(huán)境當(dāng)時(shí)的情況發(fā)現(xiàn)Column6的篩選度非常低,就刪除了Column6的索引。
10:55左右刪除索引后,報(bào)錯(cuò)沒有再發(fā)生:
總結(jié)
以上就是這篇文章的全部內(nèi)容了,希望本文的內(nèi)容對(duì)大家的學(xué)習(xí)或者工作具有一定的參考學(xué)習(xí)價(jià)值,如果有疑問大家可以留言交流,謝謝大家對(duì)腳本之家的支持。
相關(guān)文章
mysql執(zhí)行sql文件報(bào)錯(cuò)Error: Unknown storage engine‘InnoDB’的解決方法
最近在執(zhí)行一個(gè)innoDB類型sql文件的時(shí)候,發(fā)現(xiàn)系統(tǒng)報(bào)錯(cuò)了,通過查找相關(guān)的資料終于解決了,所以下面這篇文章主要給大家介紹了關(guān)于mysql執(zhí)行sql文件時(shí)報(bào)錯(cuò)Error: Unknown storage engine 'InnoDB'的解決方法,需要的朋友可以參考借鑒,下面來一起看看吧。2017-07-07升級(jí)到mysql-connector-java8.0.27的注意事項(xiàng)
這篇文章主要介紹了升級(jí)到mysql-connector-java8.0.27的注意事項(xiàng),凡是升級(jí)總會(huì)碰到點(diǎn)問題,換了連接器后部署果然報(bào)錯(cuò)了,下面小編給大家分享解決方法,需要的朋友可以參考下2021-12-12MySQL表的CURD操作(數(shù)據(jù)的增刪改查)
數(shù)據(jù)庫本質(zhì)上是一個(gè)文件系統(tǒng),通過標(biāo)準(zhǔn)的SQL語句對(duì)數(shù)據(jù)進(jìn)行CURD操作,下面這篇文章主要給大家介紹了關(guān)于MySQL表的CURD操作的相關(guān)資料,文中通過實(shí)例代碼介紹的非常詳細(xì),需要的朋友可以參考下2023-02-02MYSQL數(shù)據(jù)庫查詢按日期分組統(tǒng)計(jì)詳細(xì)代碼
這篇文章主要給大家介紹了關(guān)于MYSQL數(shù)據(jù)庫查詢按日期分組統(tǒng)計(jì)的相關(guān)資料,按日期分組是指按照時(shí)間維度對(duì)數(shù)據(jù)進(jìn)行分類匯總統(tǒng)計(jì),常用于查詢分析具有時(shí)間屬性的數(shù)據(jù),例如訂單量、用戶活躍等,需要的朋友可以參考下2024-01-01