PHP屏蔽蜘蛛訪問代碼及常用搜索引擎的HTTP_USER_AGENT
更新時間:2013年03月06日 09:00:57 投稿:whsnow
屏蔽蜘蛛相信每一位站長都不希望這樣做吧,因為蜘蛛的訪問就沒有用戶的瀏覽,直接會給我們帶來一定損失,不過也有例外,某些網(wǎng)站就不希望被蜘蛛爬行,接下來為你介紹屏蔽蜘蛛的php代碼
PHP屏蔽蜘蛛訪問代碼代碼:

常用搜索引擎名與 HTTP_USER_AGENT對應值
百度baiduspider
谷歌googlebot
搜狗sogou
騰訊SOSOsosospider
雅虎slurp
有道youdaobot
Bingbingbot
MSNmsnbot
Alexais_archiver
function is_crawler() {
$userAgent = strtolower($_SERVER['HTTP_USER_AGENT']);
$spiders = array(
'Googlebot', // Google 爬蟲
'Baiduspider', // 百度爬蟲
'Yahoo! Slurp', // 雅虎爬蟲
'YodaoBot', // 有道爬蟲
'msnbot' // Bing爬蟲
// 更多爬蟲關鍵字
);
foreach ($spiders as $spider) {
$spider = strtolower($spider);
if (strpos($userAgent, $spider) !== false) {
return true;
}
}
return false;
}
下面的php代碼附帶了更多的蜘蛛標識
function isCrawler() {
echo $agent= strtolower($_SERVER['HTTP_USER_AGENT']);
if (!empty($agent)) {
$spiderSite= array(
"TencentTraveler",
"Baiduspider+",
"BaiduGame",
"Googlebot",
"msnbot",
"Sosospider+",
"Sogou web spider",
"ia_archiver",
"Yahoo! Slurp",
"YoudaoBot",
"Yahoo Slurp",
"MSNBot",
"Java (Often spam bot)",
"BaiDuSpider",
"Voila",
"Yandex bot",
"BSpider",
"twiceler",
"Sogou Spider",
"Speedy Spider",
"Google AdSense",
"Heritrix",
"Python-urllib",
"Alexa (IA Archiver)",
"Ask",
"Exabot",
"Custo",
"OutfoxBot/YodaoBot",
"yacy",
"SurveyBot",
"legs",
"lwp-trivial",
"Nutch",
"StackRambler",
"The web archive (IA Archiver)",
"Perl tool",
"MJ12bot",
"Netcraft",
"MSIECrawler",
"WGet tools",
"larbin",
"Fish search",
);
foreach($spiderSite as $val) {
$str = strtolower($val);
if (strpos($agent, $str) !== false) {
return true;
}
}
} else {
return false;
}
}
if (isCrawler()){
echo "你好蜘蛛精!";
}
else{
echo "你不是蜘蛛精啊!";
}
使用PHP實現(xiàn)蜘蛛訪問日志統(tǒng)計
$useragent = addslashes(strtolower($_SERVER['HTTP_USER_AGENT']));
if (strpos($useragent, 'googlebot')!== false){$bot = 'Google';}
elseif (strpos($useragent,'mediapartners-google') !== false){$bot = 'Google Adsense';}
elseif (strpos($useragent,'baiduspider') !== false){$bot = 'Baidu';}
elseif (strpos($useragent,'sogou spider') !== false){$bot = 'Sogou';}
elseif (strpos($useragent,'sogou web') !== false){$bot = 'Sogou web';}
elseif (strpos($useragent,'sosospider') !== false){$bot = 'SOSO';}
elseif (strpos($useragent,'360spider') !== false){$bot = '360Spider';}
elseif (strpos($useragent,'yahoo') !== false){$bot = 'Yahoo';}
elseif (strpos($useragent,'msn') !== false){$bot = 'MSN';}
elseif (strpos($useragent,'msnbot') !== false){$bot = 'msnbot';}
elseif (strpos($useragent,'sohu') !== false){$bot = 'Sohu';}
elseif (strpos($useragent,'yodaoBot') !== false){$bot = 'Yodao';}
elseif (strpos($useragent,'twiceler') !== false){$bot = 'Twiceler';}
elseif (strpos($useragent,'ia_archiver') !== false){$bot = 'Alexa_';}
elseif (strpos($useragent,'iaarchiver') !== false){$bot = 'Alexa';}
elseif (strpos($useragent,'slurp') !== false){$bot = '雅虎';}
elseif (strpos($useragent,'bot') !== false){$bot = '其它蜘蛛';}
if(isset($bot)){
$fp = @fopen('bot.txt','a');
fwrite($fp,date('Y-m-d H:i:s')."\t".$_SERVER["REMOTE_ADDR"]."\t".$bot."\t".'http://'.$_SERVER['SERVER_NAME'].$_SERVER["REQUEST_URI"]."\r\n");
fclose($fp);
}
相關文章
學習php設計模式 php實現(xiàn)原型模式(prototype)
這篇文章主要介紹了php設計模式中的原型模式,使用php實現(xiàn)原型模式,感興趣的小伙伴們可以參考一下2015-12-12
php常用字符串查找函數(shù)strstr()與strpos()實例分析
這篇文章主要介紹了php常用字符串查找函數(shù)strstr()與strpos(),結合具體實例形式分析了php字符串查找函數(shù)strstr()與strpos()的具體功能、用法、區(qū)別及相關操作注意事項,需要的朋友可以參考下2019-06-06
PHP使用zlib擴展實現(xiàn)GZIP壓縮輸出的方法詳解
這篇文章主要介紹了PHP使用zlib擴展實現(xiàn)GZIP壓縮輸出的方法,結合實例形式詳細分析了php gzip配置及壓縮輸出的相關操作技巧,需要的朋友可以參考下2018-04-04

