Example Code for Scraping Desktop Wallpapers with Python

Updated: 2022-02-14 08:46:48  Author: 密發漸消

Hello everyone. This article presents example code for scraping desktop wallpapers with Python. If you are interested, read on, and bookmark it if you find it helpful.

Preface

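They say good programming habits start with writing articles and typing out code, so here is a simple walkthrough of scraping images with Python. It is simple enough that I am almost embarrassed to write it up, but the main point is to go through the scraping process step by step. This article is for technical exchange only; please do not use it for commercial purposes.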

I. Tools Used

For this scraper I used the tools most people use when learning Python: PyCharm and Chrome. Chrome is just my personal habit; any other browser works too. Besides these two, I also used the browser that ships with Windows.

II. Scraping Steps and Process

1. Libraries Used

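To scrape the images I mainly used two things every beginner learns: the requests module for sending HTTP requests, and XPath (through lxml) for parsing. XPath is used here only because it makes locating the image links and paths convenient; the re module or Beautiful Soup would work as well. The time module adds a delay between downloads so as not to overload the site, and the pinyin module is needed because one step of the program converts Chinese input to pinyin.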

import time
import pinyin
import requests
from lxml import etree    # etree provides XPath support

2.解析代碼

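First, the user enters the image category and the page number. A category contains far more images than fit on one page, so the listing is split across several pages and you need to choose which one to scrape.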

type=input("Enter the image category (in Chinese): ")
type=pinyin.get(type,format="strip")   # convert the Chinese input to pinyin; format="strip" drops the tone marks
m=input("Enter the page number: ")        # page number within that category's listing
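The page number typed by the user goes straight into the URL, so checking that it is a positive integer first can save a wasted request. A minimal sketch, assuming a hypothetical helper `build_list_url` (not part of the original script) that simply mirrors the URL pattern used in this article; `"fengjing"` is only a sample category value:

```python
def build_list_url(category: str, page: str) -> str:
    """Build the listing-page URL from a pinyin category and a page number.

    Hypothetical helper: it only mirrors the URL pattern used in this article.
    """
    page = page.strip()
    # Reject anything that is not a plain positive integer.
    if not page.isdecimal() or int(page) < 1:
        raise ValueError(f"page must be a positive integer, got {page!r}")
    return f"https://pic.netbian.com/4k{category}/index_{int(page)}.html"


print(build_list_url("fengjing", "2"))
```

Validating before the request means a typo such as `2a` fails immediately with a clear message instead of producing a URL that the site cannot serve.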

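Next, request the page: fetch the page source, then use XPath to extract each image's link and name. Note that these are thumbnail links. Why thumbnails? Because a listing page has to show many images at once; at full size, a page could hold only one.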

url=f"https://pic.netbian.com/4k{type}/index_{m}.html"
header={
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.82 Safari/537.36"
}
resp=requests.get(url=url,headers=header)
resp.encoding="gbk"
tree=etree.HTML(resp.text)
tu_links=tree.xpath('//*[@id="main"]/div[3]/ul/li/a/@href')    # relative link behind each thumbnail
tu_names=tree.xpath('//*[@id="main"]/div[3]/ul/li/a/b/text()')  # name of each image
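The two XPath queries return two separate lists that are later matched up by position. An alternative is to select each `<a>` element once and read both the link and the name from it, so the two can never drift apart. The sketch below illustrates the idea on a made-up HTML fragment (the markup and paths are assumptions, not the site's real source), and it uses the standard library's `xml.etree.ElementTree`, which supports only a subset of XPath, instead of lxml so that it runs without extra packages:

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for the listing page's markup.
SAMPLE = """
<div id="main">
  <div>header</div>
  <div>filters</div>
  <div>
    <ul>
      <li><a href="/tupian/1.html"><img src="/small1.jpg"/><b>Sample A</b></a></li>
      <li><a href="/tupian/2.html"><img src="/small2.jpg"/><b>Sample B</b></a></li>
    </ul>
  </div>
</div>
"""

root = ET.fromstring(SAMPLE.strip())
# Same path shape as the article's XPath, taken relative to the #main element.
anchors = root.findall("./div[3]/ul/li/a")
# Read the link and the name from the same element so they stay paired.
pairs = [(a.get("href"), a.findtext("b")) for a in anchors]
print(pairs)
```

With lxml the same approach would select the `a` elements first and then query each one; the pairing logic is the part that matters here.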

After that, follow the links one level at a time to find each image's page and then its download address; this is done in a loop.

for tu_link in tu_links:
    child_url = url.split(f'/4k{type}/')[0] + tu_link       # join the site root with the relative link of the detail page
    resp1 = requests.get(url=child_url, headers=header)     # open the detail page behind the thumbnail
    resp1.encoding = 'gbk'
    tree1 = etree.HTML(resp1.text)
    child_tu_link= tree1.xpath('/html/body/div[2]/div[1]/div[2]/div[1]/div[2]/a/img/@src')     # relative link of the full-size image
    child_tu_link=child_tu_link[0]
    child_tu_link_over=url.split(f'/4k{type}/')[0]+child_tu_link        # join the site root with the full-size image link
    resp2=requests.get(child_tu_link_over,headers=header)       # fetch the full-size image
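The code above builds absolute URLs by splitting off the site root and concatenating the relative link. The standard library's `urllib.parse.urljoin` does the same job and also copes with links that do not start with a slash. A small sketch; the URLs below are sample values for illustration, since the real paths come from the XPath results:

```python
from urllib.parse import urljoin

# Sample values for illustration only.
list_url = "https://pic.netbian.com/4kfengjing/index_2.html"
detail_path = "/tupian/12345.html"

# A path starting with "/" is resolved against the site root.
detail_url = urljoin(list_url, detail_path)
print(detail_url)

# A path without a leading "/" is resolved against the current directory.
print(urljoin(list_url, "12345.html"))
```

In the scraper, this would mean passing the page URL and the extracted `href` or `src` value to `urljoin` rather than calling `url.split(...)` twice per image.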

Finally, download the image.

    with open(f"壁紙圖片/{name}.jpg",mode="wb") as f:           # save the image; the 壁紙圖片 folder must already exist
        f.write(resp2.content)
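Two things can make this step fail: the output folder may not exist yet, and an image name taken from the page may contain characters that are not allowed in file names. The sketch below shows one way to guard against both. The helpers `safe_filename` and `save_image` are hypothetical names introduced here, not part of the original script:

```python
import re
from pathlib import Path


def safe_filename(name: str) -> str:
    """Replace characters that Windows forbids in file names with underscores."""
    cleaned = re.sub(r'[\\/:*?"<>|]', "_", name).strip()
    # Fall back to a placeholder if nothing usable is left.
    return cleaned or "untitled"


def save_image(data: bytes, name: str, folder: str = "壁紙圖片") -> Path:
    """Write image bytes to folder/name.jpg, creating the folder if needed."""
    target_dir = Path(folder)
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / f"{safe_filename(name)}.jpg"
    target.write_bytes(data)
    return target
```

Inside the loop, the call would look like `save_image(resp2.content, name)`, replacing the `with open(...)` block.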

Let's take a look at the final result.

[Figure: screenshot of the final result]

3. The Complete Code

That is all there is to this simple image scraper. It uses nothing complicated at all. Nice and simple, just what you deserve, haha.

import os
import time
import pinyin
import requests
from lxml import etree    # etree provides XPath support
type=input("Enter the image category (in Chinese): ")
type=pinyin.get(type,format="strip")   # convert the Chinese input to pinyin; format="strip" drops the tone marks
m=input("Enter the page number: ")        # page number within that category's listing
url=f"https://pic.netbian.com/4k{type}/index_{m}.html"
header={
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.82 Safari/537.36"
}
resp=requests.get(url=url,headers=header)
resp.encoding="gbk"
tree=etree.HTML(resp.text)
tu_links=tree.xpath('//*[@id="main"]/div[3]/ul/li/a/@href')    # relative link behind each thumbnail
tu_names=tree.xpath('//*[@id="main"]/div[3]/ul/li/a/b/text()')  # name of each image
os.makedirs("壁紙圖片",exist_ok=True)     # make sure the output folder exists before writing to it
base_url=url.split(f'/4k{type}/')[0]    # site root, used to complete the relative links
n=0
for tu_link in tu_links:
    child_url = base_url + tu_link       # join the site root with the relative link of the detail page
    resp1 = requests.get(url=child_url, headers=header)     # open the detail page behind the thumbnail
    resp1.encoding = 'gbk'
    tree1 = etree.HTML(resp1.text)
    child_tu_link= tree1.xpath('/html/body/div[2]/div[1]/div[2]/div[1]/div[2]/a/img/@src')     # relative link of the full-size image
    child_tu_link=child_tu_link[0]
    child_tu_link_over=base_url+child_tu_link        # join the site root with the full-size image link
    resp2=requests.get(child_tu_link_over,headers=header)       # fetch the full-size image
    name=tu_names[n]
    with open(f"壁紙圖片/{name}.jpg",mode="wb") as f:           # write the image bytes to disk
        f.write(resp2.content)
    print(f"{name}          download complete!")
    n+=1        # move on to the next image name
    resp1.close()
    resp2.close()       # close both responses opened in this iteration
    time.sleep(1.5)     # pause between downloads to go easy on the site
print("All downloads complete!")
resp.close()        # finally, close the response for the listing page as well

Note: this article is for technical exchange only. Please do not use it commercially; I accept no responsibility for any violations!