5 Ways to Check Whether a Character Is Chinese in Python, with Examples

Updated: 2023-06-29 09:32:53 Author: dingdongkk

This article introduces five ways to check whether a character is Chinese in Python, each illustrated with example code. It may be a useful reference for anyone learning or using Python.

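1. Using Python's built-in ord()

The ord() function converts a character to its Unicode code point, which can then be checked against the range for Chinese characters (U+4E00 to U+9FFF, the CJK Unified Ideographs block):

Example code: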

def is_chinese(char):
    # Compare the code point with the CJK Unified Ideographs block (U+4E00 to U+9FFF)
    return 0x4E00 <= ord(char) <= 0x9FFF
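
As a quick check of the range test above, the following sketch prints the code points that ord() returns for a few sample characters. The helper name `in_cjk_basic_block` and the sample characters are illustrative choices, not part of the original code. Note that fullwidth punctuation such as '，' falls outside U+4E00 to U+9FFF, so it is not treated as a Chinese character.

```python
def in_cjk_basic_block(char):
    """Return True if a single character lies in U+4E00..U+9FFF."""
    return 0x4E00 <= ord(char) <= 0x9FFF

# Illustrative samples: an ideograph, a Latin letter, a digit, a fullwidth comma
samples = ['中', 'a', '1', '，']
for ch in samples:
    print(ch, hex(ord(ch)), in_cjk_basic_block(ch))

# Collect the results so they can be inspected programmatically
results = {ch: in_cjk_basic_block(ch) for ch in samples}
```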

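2. Using Python's built-in unicodedata library

The unicodedata module in the standard library can look up the official Unicode name of a character, which can be used to check whether the character is Chinese.

Example code: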

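import unicodedata

def is_chinese(char):
    # name() raises ValueError for unnamed characters unless a default is given
    name = unicodedata.name(char, '')
    # Match ideographs only; a bare 'CJK' test would also accept CJK radicals and strokes
    return name.startswith('CJK UNIFIED IDEOGRAPH')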
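
To see what the name lookup actually returns, the sketch below prints the Unicode names of a few characters. The helper `char_name` and the chosen sample characters are assumptions made for illustration. It shows two things worth keeping in mind: some characters with 'CJK' in their name are not ideographs, and some characters (such as control characters) have no name at all.

```python
import unicodedata

def char_name(char):
    # The second argument is returned when the character has no Unicode name
    return unicodedata.name(char, '')

ideograph = char_name('中')     # a CJK unified ideograph
radical = char_name('\u2e80')   # a character from the CJK Radicals Supplement block
newline = char_name('\n')       # control characters have no name

for label, name in [('中', ideograph), ('U+2E80', radical), ('newline', newline)]:
    print(label, repr(name))
```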

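3. Using regular expressions

Regular expressions can also be used to check whether a character is Chinese. For example, [^\u4e00-\u9fa5] matches any character outside the common Chinese range, while [^\x00-\xff] matches any character outside the single-byte range U+0000 to U+00FF (often loosely called "double-byte characters"), which includes Chinese characters as well as fullwidth symbols.

Example code: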

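import re

# Matches any character outside the common Chinese range U+4E00 to U+9FA5
NON_CHINESE = re.compile(r'[^\u4e00-\u9fa5]')

# Check whether every character in the string is a Chinese character
def is_chinese(word):
    # An empty string contains no Chinese characters, so reject it explicitly
    return bool(word) and NON_CHINESE.search(word) is None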
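
The difference between the two character classes mentioned above is easiest to see on a mixed string. The following sketch, using an illustrative sample string and variable names that are not from the original, lists what each pattern matches.

```python
import re

non_chinese = re.compile(r'[^\u4e00-\u9fa5]')  # anything outside U+4E00..U+9FA5
non_latin1 = re.compile(r'[^\x00-\xff]')       # anything outside U+0000..U+00FF

text = '中文，abc'  # two ideographs, a fullwidth comma, three Latin letters

outside_chinese_range = non_chinese.findall(text)
outside_latin1_range = non_latin1.findall(text)

print(outside_chinese_range)  # ['，', 'a', 'b', 'c']
print(outside_latin1_range)   # ['中', '文', '，']
```

The fullwidth comma appears in both lists, which is why [^\x00-\xff] alone is not a reliable test for Chinese characters.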

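4. Using a Chinese character set

A Chinese character set can also be used. GB2312 and GBK encode each Chinese character as two bytes, so a character can be tested by encoding it and checking whether the resulting bytes fall within the character set's range for Chinese characters.

Example code: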

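# Check whether a single character is a Chinese character in GB2312
def is_chinese(word):
    try:
        encoded = word.encode('gb2312')
    except UnicodeEncodeError:
        # The character cannot be represented in GB2312 at all
        return False
    # GB2312 Chinese characters span 0xB0A1 to 0xF7FE (level 1 ends at 0xD7F9)
    return len(encoded) == 2 and b'\xb0\xa1' <= encoded <= b'\xf7\xfe'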
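
To make the byte comparison above more concrete, this sketch shows the bytes that Python's 'gb2312' codec produces for a few characters. The helper `gb2312_bytes` and the sample characters are hypothetical, chosen only for illustration. Characters that GB2312 cannot represent raise UnicodeEncodeError, which the helper turns into None.

```python
def gb2312_bytes(char):
    """Return the GB2312 encoding of char, or None if it cannot be encoded."""
    try:
        return char.encode('gb2312')
    except UnicodeEncodeError:
        return None

first_hanzi = gb2312_bytes('啊')  # the first Chinese character in GB2312
ascii_letter = gb2312_bytes('a')  # ASCII stays a single byte
fullwidth = gb2312_bytes('，')    # fullwidth punctuation is two bytes, but below 0xB0A1
hangul = gb2312_bytes('한')       # Korean Hangul is not part of GB2312

for label, value in [('啊', first_hanzi), ('a', ascii_letter),
                     ('，', fullwidth), ('한', hangul)]:
    print(label, value)
```

Because fullwidth punctuation is also encoded as two bytes, the range check on the byte values is what separates Chinese characters from symbols, not the byte length alone.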

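5. Using a third-party library

Third-party libraries can also help. For example, the xpinyin library converts the Chinese characters in a string to pinyin, and the result of that conversion can be used to infer whether the input was Chinese.

Example code: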

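from xpinyin import Pinyin

# Check whether a single character is a Chinese character.
# Assumption: get_pinyin() returns characters it cannot convert unchanged,
# so a changed result means the character was converted to pinyin.
def is_chinese(word):
    pinyin = Pinyin()
    return pinyin.get_pinyin(word, '') != word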

Supplement: checking whether a string contains Chinese characters in Python

A one-line implementation:

# One-liner to check for Chinese characters; 'ddd' is a placeholder default for the string to check
f = lambda x='ddd': any('\u4e00' <= i <= '\u9fff' for i in x)

f('444')
False

f('ddddd的')
True

# The expression also works inline, where x is the string
any('\u4e00' <= i <= '\u9fff' for i in x)

any('\u4e00' <= i <= '\u9fff' for i in 'dd哈')
True

def is_chinese(string):
    """
    Check whether the string contains any Chinese character.
    :param string: the string to check
    :return: bool
    """
    for ch in string:
        if '\u4e00' <= ch <= '\u9fff':
            return True

    return False

ret1 = is_chinese("a哦哦哈aaa")
print(ret1)  # True

ret2 = is_chinese("123")
print(ret2)  # False
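
The examples in this article answer two slightly different questions: whether a string contains at least one Chinese character, and whether it consists only of Chinese characters. The sketch below separates the two and adds a small extraction helper. The function names (`is_cjk`, `contains_chinese`, `all_chinese`, `extract_chinese`) are illustrative and not from the original, and the range is the same U+4E00 to U+9FFF block used above.

```python
def is_cjk(ch):
    """Return True if a single character lies in U+4E00..U+9FFF."""
    return '\u4e00' <= ch <= '\u9fff'

def contains_chinese(text):
    # True as soon as one character is in range
    return any(is_cjk(ch) for ch in text)

def all_chinese(text):
    # all() is True for an empty iterable, so reject empty strings explicitly
    return bool(text) and all(is_cjk(ch) for ch in text)

def extract_chinese(text):
    # Keep only the characters that are in range, in their original order
    return ''.join(ch for ch in text if is_cjk(ch))

print(contains_chinese('a哦哦哈aaa'))  # True
print(all_chinese('a哦哦哈aaa'))       # False
print(extract_chinese('a哦哦哈aaa'))   # 哦哦哈
```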