PHP截取中文字符串的問題
更新時間:2006年07月12日 00:00:00 作者:
以下代碼試用于GB2312編碼,截取中文字符串是PHP中一個頭疼的問題,解決方法是根據(jù)值是否大于等于128來判斷是否是雙字節(jié)字符,以避免出現(xiàn)亂碼的情況。但中英文混合、特殊符號等問題總是存在,現(xiàn)在寫一個比較全面的,僅供參考:
程序說明:
1. len 參數(shù)以中文字符為標(biāo)準(zhǔn),1len等于2個英文字符,為了形式上好看些
2. 如果將magic參數(shù)設(shè)為false,則中文和英文同等看待,取絕對的字符數(shù)
3. 特別適用于用htmlspecialchars()進行過編碼的字符串
4. 能正確處理GB2312中實體字符模式()
程序代碼:
function FSubstr($title,$start,$len="",$magic=true)
{
/**
* powered by Smartpig
* mailto:d.einstein@263.net
*/
$length = 0;
if($len == "") $len = strlen($title);
//判斷起始為不正確位置
if($start > 0)
{
$cnum = 0;
for($i=0;$i<$start;$i++)
{
if(ord(substr($title,$i,1)) >= 128) $cnum ++;
}
if($cnum%2 != 0) $start--;
unset($cnum);
}
if(strlen($title)<=$len) return substr($title,$start,$len);
$alen = 0;
$blen = 0;
$realnum = 0;
for($i=$start;$i<strlen($title);$i++)
{
$ctype = 0;
$cstep = 0;
$cur = substr($title,$i,1);
if($cur == "&")
{
if(substr($title,$i,4) == "<")
{
$cstep = 4;
$length += 4;
$i += 3;
$realnum ++;
if($magic)
{
$alen ++;
}
}
else if(substr($title,$i,4) == ">")
{
$cstep = 4;
$length += 4;
$i += 3;
$realnum ++;
if($magic)
{
$alen ++;
}
}
else if(substr($title,$i,5) == "&")
{
$cstep = 5;
$length += 5;
$i += 4;
$realnum ++;
if($magic)
{
$alen ++;
}
}
else if(substr($title,$i,6) == """)
{
$cstep = 6;
$length += 6;
$i += 5;
$realnum ++;
if($magic)
{
$alen ++;
}
}
else if(substr($title,$i,6) == "'")
{
$cstep = 6;
$length += 6;
$i += 5;
$realnum ++;
if($magic)
{
$alen ++;
}
}
else if(preg_match("/&#(\d+);/i",substr($title,$i,8),$match))
{
$cstep = strlen($match[0]);
$length += strlen($match[0]);
$i += strlen($match[0])-1;
$realnum ++;
if($magic)
{
$blen ++;
$ctype = 1;
}
}
}else{
if(ord($cur)>=128)
{
$cstep = 2;
$length += 2;
$i += 1;
$realnum ++;
if($magic)
{
$blen ++;
$ctype = 1;
}
}else{
$cstep = 1;
$length +=1;
$realnum ++;
if($magic)
{
$alen++;
}
}
}
if($magic)
{
if(($blen*2+$alen) == ($len*2)) break;
if(($blen*2+$alen) == ($len*2+1))
{
if($ctype == 1)
{
$length -= $cstep;
break;
}else{
break;
}
}
}else{
if($realnum == $len) break;
}
}
unset($cur);
unset($alen);
unset($blen);
unset($realnum);
unset($ctype);
unset($cstep);
return substr($title,$start,$length);
}
程序說明:
1. len 參數(shù)以中文字符為標(biāo)準(zhǔn),1len等于2個英文字符,為了形式上好看些
2. 如果將magic參數(shù)設(shè)為false,則中文和英文同等看待,取絕對的字符數(shù)
3. 特別適用于用htmlspecialchars()進行過編碼的字符串
4. 能正確處理GB2312中實體字符模式()
程序代碼:
function FSubstr($title,$start,$len="",$magic=true)
{
/**
* powered by Smartpig
* mailto:d.einstein@263.net
*/
$length = 0;
if($len == "") $len = strlen($title);
//判斷起始為不正確位置
if($start > 0)
{
$cnum = 0;
for($i=0;$i<$start;$i++)
{
if(ord(substr($title,$i,1)) >= 128) $cnum ++;
}
if($cnum%2 != 0) $start--;
unset($cnum);
}
if(strlen($title)<=$len) return substr($title,$start,$len);
$alen = 0;
$blen = 0;
$realnum = 0;
for($i=$start;$i<strlen($title);$i++)
{
$ctype = 0;
$cstep = 0;
$cur = substr($title,$i,1);
if($cur == "&")
{
if(substr($title,$i,4) == "<")
{
$cstep = 4;
$length += 4;
$i += 3;
$realnum ++;
if($magic)
{
$alen ++;
}
}
else if(substr($title,$i,4) == ">")
{
$cstep = 4;
$length += 4;
$i += 3;
$realnum ++;
if($magic)
{
$alen ++;
}
}
else if(substr($title,$i,5) == "&")
{
$cstep = 5;
$length += 5;
$i += 4;
$realnum ++;
if($magic)
{
$alen ++;
}
}
else if(substr($title,$i,6) == """)
{
$cstep = 6;
$length += 6;
$i += 5;
$realnum ++;
if($magic)
{
$alen ++;
}
}
else if(substr($title,$i,6) == "'")
{
$cstep = 6;
$length += 6;
$i += 5;
$realnum ++;
if($magic)
{
$alen ++;
}
}
else if(preg_match("/&#(\d+);/i",substr($title,$i,8),$match))
{
$cstep = strlen($match[0]);
$length += strlen($match[0]);
$i += strlen($match[0])-1;
$realnum ++;
if($magic)
{
$blen ++;
$ctype = 1;
}
}
}else{
if(ord($cur)>=128)
{
$cstep = 2;
$length += 2;
$i += 1;
$realnum ++;
if($magic)
{
$blen ++;
$ctype = 1;
}
}else{
$cstep = 1;
$length +=1;
$realnum ++;
if($magic)
{
$alen++;
}
}
}
if($magic)
{
if(($blen*2+$alen) == ($len*2)) break;
if(($blen*2+$alen) == ($len*2+1))
{
if($ctype == 1)
{
$length -= $cstep;
break;
}else{
break;
}
}
}else{
if($realnum == $len) break;
}
}
unset($cur);
unset($alen);
unset($blen);
unset($realnum);
unset($ctype);
unset($cstep);
return substr($title,$start,$length);
}
相關(guān)文章
php網(wǎng)絡(luò)安全session利用的小思路
這篇文章主要為大家介紹了關(guān)于php網(wǎng)絡(luò)安全中session利用的小思路示例詳解,有需要的朋友可以借鑒參考下,希望能夠有所幫助,祝大家多多進步2022-02-02轉(zhuǎn)換中文為unicode 轉(zhuǎn)換unicode到正常文本
轉(zhuǎn)換中文為unicode 轉(zhuǎn)換unicode到正常文本...2006-09-09Linux服務(wù)器配置PHP文件下載,中文亂碼問題,下載出錯如何解決
這篇文章主要介紹了Linux服務(wù)器配置PHP文件下載,中文亂碼問題,下載出錯如何解決,感興趣的小伙伴可以看下2021-07-07