分享3個MySQL查詢容易踩的坑
前言
在使用MySQL的過程中,相比大家都踩過不少坑,下面我將列舉日常生活中比較高頻的三個”坑“,也歡迎大家踴躍分享自己的心路歷程 ??!
1、無 Order by 的隨機問題
簡單來說,就是在沒有 Order By 的情況下,如果SELECT的字段不同,返回的記錄順序是隨機的,不一定一樣。
因為返回的記錄順序跟插入順序、主鍵順序、以及字段上是否建立了索引等因素都有關聯(lián),可以通過下面的例子進行驗證:
假設有這樣的一張用戶點擊表:
create table t_user_click ( id int(11) auto_increment primary key, obj varchar(64) default '' not null comment '點擊對象', click smallint default 0 not null comment '點擊數(shù)', remark varchar(64) default '' not null comment '備注', created_at timestamp default CURRENT_TIMESTAMP not null comment '創(chuàng)建時間', updated_at timestamp default CURRENT_TIMESTAMP not null on update CURRENT_TIMESTAMP comment '更新時間' ) comment '用戶點擊表';
插入幾條測試數(shù)據(jù):
INSERT INTO t_user_click (id, obj, click, remark, created_at, updated_at) VALUES (1, 'aaa', 10, '', '2022-10-18 14:44:57', '2022-10-18 14:44:57'); INSERT INTO t_user_click (id, obj, click, remark, created_at, updated_at) VALUES (2, 'bbb', 20, '', '2022-10-18 14:44:57', '2022-10-18 14:44:57'); INSERT INTO t_user_click (id, obj, click, remark, created_at, updated_at) VALUES (3, 'ccc', 30, '', '2022-10-18 14:44:57', '2022-10-18 14:44:57'); INSERT INTO t_user_click (id, obj, click, remark, created_at, updated_at) VALUES (4, 'ddd', 40, '', '2022-10-18 14:44:57', '2022-10-18 14:44:57'); INSERT INTO t_user_click (id, obj, click, remark, created_at, updated_at) VALUES (5, 'eee', 50, '', '2022-10-18 14:44:57', '2022-10-18 14:44:57'); INSERT INTO t_user_click (id, obj, click, remark, created_at, updated_at) VALUES (6, 'fff', 60, '', '2022-10-18 14:44:57', '2022-10-18 14:44:57');
首先來看插入順序帶來的影響,在順序插入幾條順序后,不管查詢多少次,獲取的結果都與插入順序一致:
# 全部字段 mysql> select * from t_user_click where click > 0; +----+-----+-------+--------+---------------------+---------------------+ | id | obj | click | remark | created_at | updated_at | +----+-----+-------+--------+---------------------+---------------------+ | 1 | aaa | 10 | | 2022-10-18 14:44:57 | 2022-10-18 14:44:57 | | 2 | bbb | 20 | | 2022-10-18 14:44:57 | 2022-10-18 14:44:57 | | 3 | ccc | 30 | | 2022-10-18 14:44:57 | 2022-10-18 14:44:57 | | 4 | ddd | 40 | | 2022-10-18 14:44:57 | 2022-10-18 14:44:57 | ... # 與上述獲取全部字段的SQL結果一致 mysql> select id, click from t_user_click where click > 0; ...
其次來看主鍵順序這個因素,當該表進行過 DELETE/REPLACE/UPDATE
操作時,不會再按照插入順序排序了,而是會按照主鍵ID進行排序。
# 進行update操作 UPDATE t_user_click SET id=7 WHERE id=3; # 此時按照主鍵排序,obj=ccc 變成最后一條,而不是第3條 mysql> select * from t_user_click where click > 0; +----+-----+-------+--------+---------------------+---------------------+ | id | obj | click | remark | created_at | updated_at | +----+-----+-------+--------+---------------------+---------------------+ | 1 | aaa | 10 | | 2022-10-18 14:44:57 | 2022-10-18 14:44:57 | | 2 | bbb | 20 | | 2022-10-18 14:44:57 | 2022-10-18 14:44:57 | | 4 | ddd | 40 | | 2022-10-18 14:44:57 | 2022-10-18 14:44:57 | | 5 | eee | 50 | | 2022-10-18 14:44:57 | 2022-10-18 14:44:57 | | 6 | fff | 60 | | 2022-10-18 14:44:57 | 2022-10-18 14:44:57 | | 7 | ccc | 30 | | 2022-10-18 14:44:57 | 2022-10-18 14:46:20 | +----+-----+-------+--------+---------------------+---------------------+ # 與上述獲取全部字段的SQL結果一致 mysql> select id, click from t_user_click where click > 0; ...
最后來看有無索引這個因素,當我們在 click
字段上建立索引時,結果又不一樣了:
# 建索引 create index t_user_click_click_index on t_user_click (click); # id=7 在最后一條 mysql> select * from t_user_click where click > 0; +----+-----+-------+--------+---------------------+---------------------+ | id | obj | click | remark | created_at | updated_at | +----+-----+-------+--------+---------------------+---------------------+ | 1 | aaa | 10 | | 2022-10-18 14:44:57 | 2022-10-18 14:44:57 | | 2 | bbb | 20 | | 2022-10-18 14:44:57 | 2022-10-18 14:44:57 | | 4 | ddd | 40 | | 2022-10-18 14:44:57 | 2022-10-18 14:44:57 | | 5 | eee | 50 | | 2022-10-18 14:44:57 | 2022-10-18 14:44:57 | | 6 | fff | 60 | | 2022-10-18 14:44:57 | 2022-10-18 14:44:57 | | 7 | ccc | 30 | | 2022-10-18 14:44:57 | 2022-10-18 14:46:20 | +----+-----+-------+--------+---------------------+---------------------+ 6 rows in set (0.00 sec) # id=7 在第3條 mysql> select id, click from t_user_click where click > 0; +----+-------+ | id | click | +----+-------+ | 1 | 10 | | 2 | 20 | | 7 | 30 | | 4 | 40 | | 5 | 50 | | 6 | 60 | +----+-------+ # 我們可以通過 explain 命令來看兩種查詢方式的不同: # 兩種方式一個沒走索引,一個走了索引,從而導致返回結果的不同 mysql> explain select * from t_user_click where click > 0; type:ALL possible_keys:t_user_click_click_index key:NULL Extra:Using where mysql> explain select id, click from t_user_click where click > 0; type:index possible_keys:t_user_click_click_index key:t_user_click_click_index Extra:Using where; Using index
2、Order by + Limit 的隨機問題
簡單來說就是進行**Order by
的字段如果不唯一,則MySQL返回的記錄是隨機的,常見的表現(xiàn)就是數(shù)據(jù)分頁后出現(xiàn)重復**。
不過,這種隨機也不是隨機算法那種打亂的隨機,它跟數(shù)據(jù)的插入順序,以及索引的建立也有一定關系,可以看看下面的例子。
假設有一張這樣的用戶表:
create table t_user_list ( id int(11) auto_increment primary key, name varchar(64) default '' not null comment '名稱', age smallint default 0 not null comment '年齡', created_at timestamp default CURRENT_TIMESTAMP not null comment '創(chuàng)建時間', updated_at timestamp default CURRENT_TIMESTAMP not null on update CURRENT_TIMESTAMP comment '更新時間' ) comment '用戶表';
當你按順序插入測試數(shù)據(jù),然后再去分頁查詢,你可能會發(fā)現(xiàn)每次返回的結果都是固定的,也不會出現(xiàn)隨機的情況(這時候跟插入順序有關)。
# 插入測試數(shù)據(jù) INSERT INTO t_user_list (id, name, age, created_at, updated_at) VALUES (1, 'aaa', 1, '2022-10-18 12:55:19', '2022-10-18 12:55:18'); INSERT INTO t_user_list (id, name, age, created_at, updated_at) VALUES (2, 'bbb', 2, '2022-10-18 12:55:18', '2022-10-18 12:55:18'); INSERT INTO t_user_list (id, name, age, created_at, updated_at) VALUES (3, 'ccc', 3, '2022-10-18 12:55:18', '2022-10-18 12:55:18'); INSERT INTO t_user_list (id, name, age, created_at, updated_at) VALUES (4, 'ddd', 4, '2022-10-18 12:55:18', '2022-10-18 12:55:18'); INSERT INTO t_user_list (id, name, age, created_at, updated_at) VALUES (5, 'eee', 5, '2022-10-18 12:55:18', '2022-10-18 12:55:18'); INSERT INTO t_user_list (id, name, age, created_at, updated_at) VALUES (6, 'ggg', 6, '2022-10-18 12:55:19', '2022-10-18 12:55:19'); INSERT INTO t_user_list (id, name, age, created_at, updated_at) VALUES (7, 'iii', 7, '2022-10-18 12:55:17', '2022-10-18 12:55:19'); # 分頁查詢 mysql> select * from t_user_list order by created_at limit 0, 3; +----+------+-----+---------------------+---------------------+ | id | name | age | created_at | updated_at | +----+------+-----+---------------------+---------------------+ | 7 | iii | 7 | 2022-10-18 12:55:17 | 2022-10-18 12:55:19 | | 2 | bbb | 2 | 2022-10-18 12:55:18 | 2022-10-18 12:55:18 | | 3 | ccc | 3 | 2022-10-18 12:55:18 | 2022-10-18 12:55:18 | +----+------+-----+---------------------+---------------------+ 3 rows in set (0.00 sec) mysql> select * from t_user_list order by created_at limit 3, 3; +----+------+-----+---------------------+---------------------+ | id | name | age | created_at | updated_at | +----+------+-----+---------------------+---------------------+ | 4 | ddd | 4 | 2022-10-18 12:55:18 | 2022-10-18 12:55:18 | | 5 | eee | 5 | 2022-10-18 12:55:18 | 2022-10-18 12:55:18 | | 1 | aaa | 1 | 2022-10-18 12:55:19 | 2022-10-18 12:55:18 | +----+------+-----+---------------------+---------------------+ 3 rows in set (0.00 sec)
而當你這時候新建一個 created_at
索引,并重新插入新的數(shù)據(jù)時,如果你再次查詢,你會驚奇地發(fā)現(xiàn)數(shù)據(jù)重復了。
# 新建索引 create index t_user_list_created_at_index on t_user_list (created_at); # 插入新的測試數(shù)據(jù) INSERT INTO db_article.t_user_list (id, name, age, created_at, updated_at) VALUES (8, 'jjj', 8, '2022-10-18 12:55:18', '2022-10-18 12:55:19'); INSERT INTO db_article.t_user_list (id, name, age, created_at, updated_at) VALUES (9, 'kkk', 9, '2022-10-18 12:55:18', '2022-10-18 12:55:19'); INSERT INTO db_article.t_user_list (id, name, age, created_at, updated_at) VALUES (10, 'mmm', 10, '2022-10-18 12:55:18', '2022-10-18 12:55:19'); # 再次分頁查詢 mysql> select * from t_user_list order by created_at limit 0, 3; +----+------+-----+---------------------+---------------------+ | id | name | age | created_at | updated_at | +----+------+-----+---------------------+---------------------+ | 7 | iii | 7 | 2022-10-18 12:55:17 | 2022-10-18 12:55:19 | | 2 | bbb | 2 | 2022-10-18 12:55:18 | 2022-10-18 12:55:18 | | 3 | ccc | 3 | 2022-10-18 12:55:18 | 2022-10-18 12:55:18 | +----+------+-----+---------------------+---------------------+ 3 rows in set (0.00 sec) # id=3 的記錄重復了 mysql> select * from t_user_list order by created_at limit 3, 3; +----+------+-----+---------------------+---------------------+ | id | name | age | created_at | updated_at | +----+------+-----+---------------------+---------------------+ | 3 | ccc | 3 | 2022-10-18 12:55:18 | 2022-10-18 12:55:18 | | 4 | ddd | 4 | 2022-10-18 12:55:18 | 2022-10-18 12:55:18 | | 5 | eee | 5 | 2022-10-18 12:55:18 | 2022-10-18 12:55:18 | +----+------+-----+---------------------+---------------------+
關于記錄重復的問題,個人的猜想是一開始沒有 created_at
索引,此時磁盤的數(shù)據(jù)塊的順序與插入順序一致,所以返回的結果一直是固定的;但是加了索引后,并且增加了 created_at
一致的幾條數(shù)據(jù),導致索引重建數(shù)據(jù)塊順序發(fā)生變化,從而記錄重復(瞎猜的,大家可以一起討論下)
3、聚合函數(shù) + Limit的不準確問題
簡單來說,就是在對數(shù)據(jù)進行 Limit分頁時,同時使用聚合函數(shù)(比如SUM、COUNT等)對當前分頁的結果進行聚合,則最終得到的聚合結果是不準確的。
還是以第一點的例子和數(shù)據(jù)為例:
mysql> select * from t_user_click; +----+-----+-------+--------+---------------------+---------------------+ | id | obj | click | remark | created_at | updated_at | +----+-----+-------+--------+---------------------+---------------------+ | 1 | aaa | 10 | | 2022-10-18 14:44:57 | 2022-10-18 14:44:57 | | 2 | bbb | 20 | | 2022-10-18 14:44:57 | 2022-10-18 14:44:57 | | 4 | ddd | 40 | | 2022-10-18 14:44:57 | 2022-10-18 14:44:57 | | 5 | eee | 50 | | 2022-10-18 14:44:57 | 2022-10-18 14:44:57 | | 6 | fff | 60 | | 2022-10-18 14:44:57 | 2022-10-18 14:44:57 | | 7 | ccc | 30 | | 2022-10-18 14:44:57 | 2022-10-18 14:46:20 | +----+-----+-------+--------+---------------------+---------------------+ # 沒分頁的總數(shù) mysql> select sum(click) from t_user_click; +------------+ | sum(click) | +------------+ | 210 | +------------+ 1 row in set (0.00 sec) # 分頁后的總數(shù) mysql> select sum(click) from t_user_click limit 3; +------------+ | sum(click) | +------------+ | 210 | +------------+ # 可以通過子查詢解決 select sum(tmp.click) from (select click from t_user_click limit 3) as tmp;
可以看到,在有Limit的情況下,得到的聚合結果其實是所有記錄的總和,并不是三條記錄的總和。原因就是SELECT語句執(zhí)行時有一定順序,分別是 FROM、ON、JOIN、WHERE、GROUP BY、HAVING、SELECT、DISTINCT、ORDER BY和LIMIT。
在上述SQL中,SELECT優(yōu)先執(zhí)行,即先執(zhí)行 select sum(click) from t_user_click
,這時已經(jīng)得到結果為210,最終再執(zhí)行 limit 3
剔除不符合要求的記錄。
總結
最后小結一下,本文主要介紹了MySQL查詢中三個比較容易踩的坑,從中我們可以得到:
1、SELECT返回的順序跟多種因素有關,如插入順序、主鍵ID順序、索引順序等,如果你返回的結果要求有順序,則記得加上 order by
。
2、加上了 order by
也不要高興地太早,還需要檢查 order by
的字段是否唯一,如果不唯一,返回的結果也有可能是隨機的。
3、在進行聚合查詢時,切記看有無 LIMIT 子句,有的話記得加上子查詢避免查詢的結果不準確。
到此這篇關于3個MySQL查詢容易踩坑的文章就介紹到這了,更多相關MySQL查詢易踩的坑內容請搜索腳本之家以前的文章或繼續(xù)瀏覽下面的相關文章希望大家以后多多支持腳本之家!
相關文章
Mysql計算字段長度函數(shù)之CHAR_LENGTH函數(shù)
mysql中計算字符串長度有兩個函數(shù)分別為char_length和length,char_length函數(shù)可以計算unicode字符,下面這篇文章主要給大家介紹了關于Mysql計算字段長度函數(shù)之CHAR_LENGTH函數(shù)的相關資料,需要的朋友可以參考下2023-05-05MySQL 8.0.18 Hash Join不支持left/right join左右連接問題
在MySQL 8.0.18中,增加了Hash Join新功能,它適用于未創(chuàng)建索引的字段,做等值關聯(lián)查詢。這篇文章給大家介紹MySQL 8.0.18 Hash Join不支持left/right join左右連接,感興趣的朋友一起看看吧2019-11-11