MySQL尾部空格處理方法詳解
在之前的這篇文章Oracle/MySQL/PG/SQL Server關系數據庫中NULL與空字符串的區(qū)別中,簡單對比、介紹了一下MySQL的NULL/空格/尾部空格處理方式,主要對比了NULL與長度為空的字符串,其實很多地方沒有分析到位就一筆帶過了。這篇文章重新來細說一下MySQL的尾部空格處理方式。在MySQL中,有幾個因素會影響MySQL如何處理空格。這里簡單淺析一下.
MySQL的排序規(guī)則有一個屬性Pad Attributes屬性,這個屬性的設置會影響數據庫如何處理尾部空格(是否忽略尾部空格),如下官方文檔描述
Collation?Pad?Attributes Collations?based?on?UCA?9.0.0?and?higher?are?faster?than?collations?based?on?UCA?versions?prior?to?9.0.0.?They?also?have?a?pad?attribute?of?NO?PAD,?in?contrast?to?PAD?SPACE?as?used?in?collations?based?on?UCA?versions?prior?to?9.0.0.?For?comparison?of?nonbinary?strings,?NO?PAD?collations?treat?spaces?at?the?end?of?strings?like?any?other?character?(see?Trailing?Space?Handling?in?Comparisons). Comparison?of?nonbinary?string?values?(CHAR,?VARCHAR,?and?TEXT)?that?have?a?NO?PAD?collation?differ?from?other?collations?with?respect?to?trailing?spaces.?For?example,?'a'?and?'a?'?compare?as?different?strings,?not?the?same?string.?This?can?be?seen?using?the?binary?collations?for?utf8mb4.?The?pad?attribute?for?utf8mb4_bin?is?PAD?SPACE,?whereas?for?utf8mb4_0900_bin?it?is?NO?PAD.?Consequently,?operations?involving?utf8mb4_0900_bin?do?not?add?trailing?spaces,?and?comparisons?involving?strings?with?trailing?spaces?may?differ?for?the?two?collations
官方文檔,關于比較中尾部空格處理介紹如下:
Trailing?Space?Handling?in?Comparisons MySQL?collations?have?a?pad?attribute,?which?has?a?value?of?PAD?SPACE?or?NO?PAD: ??Most?MySQL?collations?have?a?pad?attribute?of?PAD?SPACE. ??The?Unicode?collations?based?on?UCA?9.0.0?and?higher?have?a?pad?attribute?of?NO?PAD;?see?Section?10.10.1,?“Unicode?Character?Sets”. For?nonbinary?strings?(CHAR,?VARCHAR,?and?TEXT?values),?the?string?collation?pad?attribute?determines?treatment?in?comparisons?of?trailing?spaces?at?the?end?of?strings: ??For?PAD?SPACE?collations,?trailing?spaces?are?insignificant?in?comparisons;?strings?are?compared?without?regard?to?trailing?spaces. ??NO?PAD?collations?treat?trailing?spaces?as?significant?in?comparisons,?like?any?other?character. The?differing?behaviors?can?be?demonstrated?using?the?two?utf8mb4?binary?collations,?one?of?which?is?PAD?SPACE,?the?other?of?which?is?NO?PAD.
我們先來看看PAD SPACE與NO PAD的具體意義:
PAD SPACE:在排序和比較運算中,忽略字符串尾部空格。
NO PAD:在排序和比較運算中,字符串尾部空格當成普通字符,不能忽略。
SELECT?*? FROM?INFORMATION_SCHEMA.COLLATIONS ???????WHERE?CHARACTER_SET_NAME?=?'utf8mb4'; mysql>?select?@@version; +-----------+ |?@@version?| +-----------+ |?8.0.33????| +-----------+ 1?row?in?set?(0.00?sec) mysql>?SELECT?*? ????->?FROM?INFORMATION_SCHEMA.COLLATIONS ????->????????WHERE?CHARACTER_SET_NAME?=?'utf8mb4' ????->???AND?COLLATION_NAME?IN('utf8mb4_general_ci','utf8mb4_0900_ai_ci'); +--------------------+--------------------+-----+------------+-------------+---------+---------------+ |?COLLATION_NAME?????|?CHARACTER_SET_NAME?|?ID??|?IS_DEFAULT?|?IS_COMPILED?|?SORTLEN?|?PAD_ATTRIBUTE?| +--------------------+--------------------+-----+------------+-------------+---------+---------------+ |?utf8mb4_0900_ai_ci?|?utf8mb4????????????|?255?|?Yes????????|?Yes?????????|???????0?|?NO?PAD????????| |?utf8mb4_general_ci?|?utf8mb4????????????|??45?|????????????|?Yes?????????|???????1?|?PAD?SPACE?????| +--------------------+--------------------+-----+------------+-------------+---------+---------------+ 2?rows?in?set?(0.00?sec) mysql>
下面我們來測試演練一下,首先新建一個測試數據庫gsp
CREATE?DATABASE?IF?NOT?EXISTS?gsp??? DEFAULT?CHARACTER?SET?utf8mb4 DEFAULT?COLLATE?utf8mb4_general_ci; mysql>?create?table?test(id?int?not?null,?name?varchar(16)); Query?OK,?0?rows?affected?(0.01?sec) mysql>?insert?into?test(id?,?name)?values(1,?null); Query?OK,?1?row?affected?(0.00?sec) mysql>?insert?into?test(id?,?name)?values(2,?''); Query?OK,?1?row?affected?(0.01?sec) mysql>?insert?into?test(id?,?name)?values(3,?'?');?--?包含一個空格 Query?OK,?1?row?affected?(0.00?sec) mysql>?insert?into?test(id?,?name)?values(4,?'??');--?包含兩個空格 Query?OK,?1?row?affected?(0.00?sec) mysql>? #length獲取字符串字節(jié)長度的內置函數, #char_length獲取字符串長度的內置函數 #hex將字符或數字轉換為十六進制格式 mysql>?select?id,?length(name),?char_length(name),?hex(name)?from?test; +----+--------------+-------------------+-----------+ |?id?|?length(name)?|?char_length(name)?|?hex(name)?| +----+--------------+-------------------+-----------+ |??1?|?????????NULL?|??????????????NULL?|?NULL??????| |??2?|????????????0?|?????????????????0?|???????????| |??3?|????????????1?|?????????????????1?|?20????????| |??4?|????????????2?|?????????????????2?|?2020??????| +----+--------------+-------------------+-----------+ 4?rows?in?set?(0.00?sec) mysql>?select?*?from?test?where?name=''; +----+------+ |?id?|?name?| +----+------+ |??2?|??????| |??3?|??????| |??4?|??????| +----+------+ 3?rows?in?set?(0.00?sec) mysql> mysql>?select?*?from?test?where?name='?';?--?包含一個空格 +----+------+ |?id?|?name?| +----+------+ |??2?|??????| |??3?|??????| |??4?|??????| +----+------+ 3?rows?in?set?(0.00?sec) mysql>?select?*?from?test?where?name='??';?--?包含兩個空格 +----+------+ |?id?|?name?| +----+------+ |??2?|??????| |??3?|??????| |??4?|??????| +----+------+ 3?rows?in?set?(0.00?sec) mysql>
官方文檔中也要一個例子簡單說明,兩者比較時,如何處理尾部空格。如下所示,相當直觀、明了:
mysql>?SET?NAMES?utf8mb4?COLLATE?utf8mb4_bin; mysql>?SELECT?'a?'?=?'a'; +------------+ |?'a?'?=?'a'?| +------------+ |??????????1?| +------------+ mysql>?SET?NAMES?utf8mb4?COLLATE?utf8mb4_0900_bin; mysql>?SELECT?'a?'?=?'a'; +------------+ |?'a?'?=?'a'?| +------------+ |??????????0?| +------------+
我們上面的例子中,gsp數據庫的排序規(guī)則為utf8mb4_general_ci,它的Pad Attributes屬性為PAD SPACE,在排序和比較運算中,數據庫將忽略字符串尾部空格,所以上面幾個SQL的查詢結果是一樣的。如果你數據庫的排序規(guī)則選擇utf8mb4_0900_ai_ci,那么Pad Attributes屬性為NO PAD,此時,在排序和比較運算中,字符串尾部空格當成普通字符處理,不能忽略.我們知道MySQL的排序規(guī)則分為服務器級別,數據庫級別、表級別、列級別、SQL語句級別。優(yōu)先級順序為:SQL語句級別>列級別>表級別>數據庫級別>服務器級別。你可以重新建一個數據庫,設置數據庫排序規(guī)則為utf8mb4_0900_ai_ci,也可以修改其他級別的排序規(guī)則...。這里我們就直接修改表的列級別排序規(guī)則。
腳本如下所示:
drop?table?test; create?table?test(id?int?not?null,?name?varchar(16)?collate?utf8mb4_0900_ai_ci?); insert?into?test(id?,?name)?values(1,?null); insert?into?test(id?,?name)?values(2,?''); insert?into?test(id?,?name)?values(3,?'?');?--?包含一個空格 insert?into?test(id?,?name)?values(4,?'??');--?包含兩個空格
然后,我們測試一下,如下所示,此時由于數據庫將字符串尾部空格當成普通字符,所以此時的查詢結果就不同了。如下所示
mysql>?select?*?from?test?where?name='';?--?長度為零的空字符串 +----+------+ |?id?|?name?| +----+------+ |??2?|??????| +----+------+ 1?row?in?set?(0.00?sec) mysql>?select?*?from?test?where?name='?';?--?含一個空格 +----+------+ |?id?|?name?| +----+------+ |??3?|??????| +----+------+ 1?row?in?set?(0.00?sec) mysql>?select?*?from?test?where?name='??';--?包含兩個空格 +----+------+ |?id?|?name?| +----+------+ |??4?|??????| +----+------+ 1?row?in?set?(0.00?sec) mysql>
另外,我們這里測試的是VARCHAR類型,如果字段類型為CHAR呢?
其實呢,對于CHAR類型和VARCHA類型,它們的存儲略有區(qū)別:
CHAR(N):當插入的字符數小于N,它會在字符串的右邊補充空格,直到總字符數達到N再進行存儲;當查詢返回數據時默認會將字符串尾部的空格去掉,除非SQL_MODE設置PAD_CHAR_TO_FULL_LENGTH。
VARCHAR(N):當插入的字符數小于N,它不會在字符串的右邊補充空格,insert內容原封不動的進行存儲;如果原本字符串右邊有空格,在存儲和查詢返回時都會保留空格
The length of a CHAR column is fixed to the length that you declare when you create the table. The length can be any value from 0 to 255. When CHAR values are stored, they are right-padded with spaces to the specified length. When CHAR values are retrieved, trailing spaces are removed unless the PAD_CHAR_TO_FULL_LENGTH SQL mode is enabled.
PAD_CHAR_TO_FULL_LENGTH
By default, trailing spaces are trimmed from CHAR column values on retrieval. If PAD_CHAR_TO_FULL_LENGTH is enabled, trimming does not occur and retrieved CHAR values are padded to their full length. This mode does not apply to VARCHAR columns, for which trailing spaces are retained on retrieval.
Note: As of MySQL 8.0.13, PAD_CHAR_TO_FULL_LENGTH is deprecated. Expect it to be removed in a future version of MySQL.
注意事項:
PAD_CHAR_TO_FULL_LENGTH只影響CHAR類型,不影響VARCHAR類型。
MySQL 8.0.13后,PAD_CHAR_TO_FULL_LENGTH參數過時/廢棄了。這個參數可能在后續(xù)的MySQL版本中被移除。
當前版本(MySQL 8.0.33)中,暫時還可以在SQL_MODE中設置這個參數,不過默認不會設置此參數。那么我們來測試驗證一下:
drop?table?test; create?table?test(id?int?not?null,?name?char(16)??); insert?into?test(id?,?name)?values(1,?null); insert?into?test(id?,?name)?values(2,?''); insert?into?test(id?,?name)?values(3,?'?');?--?包含一個空格 insert?into?test(id?,?name)?values(4,?'??');--?包含兩個空格 mysql>?show?variables?like?'sql_mode'\G ***************************?1.?row?*************************** Variable_name:?sql_mode ????????Value:?ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_ENGINE_SUBSTITUTION 1?row?in?set?(0.00?sec) mysql>?--如下所示,SQL_MODE沒有設置PAD_CHAR_TO_FULL_LENGTH時,查詢返回數據時默認會將字符串尾部的空格去掉,所以你看到長度為零 mysql>?select?id,?length(name),?char_length(name),hex(name)?from?test; +----+--------------+-------------------+-----------+ |?id?|?length(name)?|?char_length(name)?|?hex(name)?| +----+--------------+-------------------+-----------+ |??1?|?????????NULL?|??????????????NULL?|?NULL??????| |??2?|????????????0?|?????????????????0?|???????????| |??3?|????????????0?|?????????????????0?|???????????| |??4?|????????????0?|?????????????????0?|???????????| +----+--------------+-------------------+-----------+ 4?rows?in?set?(0.00?sec) mysql>
那么我們手工設置一下當前會話的SQL_MODE,然后對比測試一下:
mysql>?set?sql_mode='PAD_CHAR_TO_FULL_LENGTH'; Query?OK,?0?rows?affected,?1?warning?(0.01?sec) mysql>?show?variables?like?'sql_mode'; +---------------+-------------------------+ |?Variable_name?|?Value???????????????????| +---------------+-------------------------+ |?sql_mode??????|?PAD_CHAR_TO_FULL_LENGTH?| +---------------+-------------------------+ 1?row?in?set?(0.01?sec) mysql>?select?id,?length(name),?char_length(name),hex(name)?from?test; +----+--------------+-------------------+----------------------------------+ |?id?|?length(name)?|?char_length(name)?|?hex(name)????????????????????????| +----+--------------+-------------------+----------------------------------+ |??1?|?????????NULL?|??????????????NULL?|?NULL?????????????????????????????| |??2?|???????????16?|????????????????16?|?20202020202020202020202020202020?| |??3?|???????????16?|????????????????16?|?20202020202020202020202020202020?| |??4?|???????????16?|????????????????16?|?20202020202020202020202020202020?| +----+--------------+-------------------+----------------------------------+ 4?rows?in?set?(0.00?sec) mysql>
通過上面的分析講述,我們知道當數據庫的排序規(guī)則的Pad Attributes屬性為NO PAD時,此時SQL_MODE的PAD_CHAR_TO_FULL_LENGTH設置與否將會影響查詢結果。我們新建一個gsp2數據庫,數據庫排序規(guī)則為utf8mb4_0900_ai_ci,下面我們通過實驗對比一下就知道了:
mysql>?use?gsp2; Database?changed mysql>?show?variables?like?'sql_mode'\G ***************************?1.?row?*************************** Variable_name:?sql_mode ????????Value:?ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_ENGINE_SUBSTITUTION 1?row?in?set?(0.00?sec) --?對于CHAR類型,查詢返回數據時默認會將字符串尾部的空格去掉,所以name=''會得到三條記錄 mysql>?select?*?from?test?where?name=''; +----+------+ |?id?|?name?| +----+------+ |??2?|??????| |??3?|??????| |??4?|??????| +----+------+ 3?rows?in?set?(0.01?sec) mysql>?select?*?from?test?where?name='?'; Empty?set?(0.00?sec) mysql>?select?*?from?test?where?name='??'; Empty?set?(0.00?sec) mysql>?set?sql_mode='PAD_CHAR_TO_FULL_LENGTH'; Query?OK,?0?rows?affected,?1?warning?(0.00?sec) mysql>?show?variables?like?'sql_mode'\G ***************************?1.?row?*************************** Variable_name:?sql_mode ????????Value:?PAD_CHAR_TO_FULL_LENGTH 1?row?in?set?(0.01?sec) --?SQL_MODE設置PAD_CHAR_TO_FULL_LENGTH。查詢返回數據時,字符串尾部的空格不會去掉,此時,name字段時16個空格,故而下面查詢條件查不到數據。 mysql>??select?*?from?test?where?name=''; Empty?set?(0.01?sec) mysql>?select?*?from?test?where?name='?'; Empty?set?(0.00?sec) mysql>?select?*?from?test?where?name='??'; Empty?set?(0.00?sec) mysql>
總結
關于MySQL的尾部空格是否忽略,以及對查詢結果的影響,既跟數據庫的排序規(guī)則有關(確切來說,是跟數據庫排序規(guī)則的Pad Attributes有關),其實還跟字符類型和SQL_MODE是否設置PAD_CHAR_TO_FULL_LENGTH有關。
到此這篇關于MySQL尾部空格處理方法詳解的文章就介紹到這了,更多相關MySQL尾部空格處理內容請搜索腳本之家以前的文章或繼續(xù)瀏覽下面的相關文章希望大家以后多多支持腳本之家!
相關文章
mysql怎么關閉sql_mode=ONLY_FULL_GROUP_BY模式
前段時間在項目開發(fā)過程中發(fā)現了系統異常,打開日志查看發(fā)現了如下的這個報錯,查找相關資料終于解決了,這篇文章主要給大家介紹了關于mysql怎么關閉sql_mode=ONLY_FULL_GROUP_BY模式的相關資料,需要的朋友可以參考下2024-01-01