Python: Sending HTTP Requests Through a Specified Network Interface
Requirement: a machine has several network interfaces. How do you make sure that requests to a given URL are sent through one particular interface?
$ curl --interface eth0 www.baidu.com # curl's --interface option selects the network interface
Reading the source of urllib.py, the call chain can be traced as open_http -> httplib.HTTP -> httplib.HTTP._connection_class = HTTPConnection.
HTTPConnection accepts a source_address when it is created.
When HTTPConnection.connect runs, it calls HTTPConnection._create_connection, which is socket.create_connection.
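The chain above is from Python 2.7. As a minimal sketch, assuming Python 3 (where httplib became http.client), the same source_address parameter can be passed to HTTPConnection directly. The handler class, the fetch_via helper, and the throwaway local server are hypothetical names introduced only for this illustration, and the loopback address stands in for a real interface IP so that nothing leaves the machine.

```python
import http.client
import http.server
import threading


class EchoClientHandler(http.server.BaseHTTPRequestHandler):
    """Hypothetical test handler: replies with the IP address the client connected from."""

    def do_GET(self):
        body = self.client_address[0].encode("ascii")
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        # Keep the example quiet.
        pass


def fetch_via(source_ip):
    """Send one GET to a local server, binding the client socket to (source_ip, 0)."""
    server = http.server.HTTPServer(("127.0.0.1", 0), EchoClientHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = http.client.HTTPConnection(
            "127.0.0.1",
            server.server_port,
            timeout=5,
            source_address=(source_ip, 0),  # port 0: let the OS pick the local port
        )
        conn.request("GET", "/")
        seen_by_server = conn.getresponse().read().decode("ascii")
        conn.close()
        return seen_by_server
    finally:
        server.shutdown()
        server.server_close()
```

On a multi-homed machine, replacing the loopback address with the IP of en0 or en1 and pointing the connection at a public host would follow the same pattern.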
# First, look at the local network interfaces
$ ifconfig
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 16384
    options=3<RXCSUM,TXCSUM>
    inet6 ::1 prefixlen 128
    inet 127.0.0.1 netmask 0xff000000
    inet6 fe80::1%lo0 prefixlen 64 scopeid 0x1
    nd6 options=1<PERFORMNUD>
en0: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
    ether c8:e0:eb:17:3a:73
    inet6 fe80::cae0:ebff:fe17:3a73%en0 prefixlen 64 scopeid 0x4
    inet 192.168.20.2 netmask 0xffffff00 broadcast 192.168.20.255
    nd6 options=1<PERFORMNUD>
    media: autoselect
    status: active
en1: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
    options=4<VLAN_MTU>
    ether 0c:5b:8f:27:9a:64
    inet6 fe80::e5b:8fff:fe27:9a64%en8 prefixlen 64 scopeid 0xa
    inet 192.168.8.100 netmask 0xffffff00 broadcast 192.168.8.255
    nd6 options=1<PERFORMNUD>
    media: autoselect (100baseTX <full-duplex>)
    status: active
There are two interfaces, en0 and en1, and both can reach the public internet. lo0 is the local loopback.
For a quick test, modify socket.py directly.
def create_connection(address, timeout=_GLOBAL_DEFAULT_TIMEOUT,
                      source_address=None):
    """If *source_address* is set it must be a tuple of (host, port)
    for the socket to bind as a source address before making the connection.
    An host of '' or port 0 tells the OS to use the default.

    If source_address is set, it must be passed as a tuple (host, port);
    the default is ("", 0).
    """
    host, port = address
    err = None
    for res in getaddrinfo(host, port, 0, SOCK_STREAM):
        af, socktype, proto, canonname, sa = res
        sock = None
        try:
            sock = socket(af, socktype, proto)
            # Uncomment one of these lines per test run:
            # sock.bind(("192.168.20.2", 0)) # en0
            # sock.bind(("192.168.8.100", 0)) # en1
            # sock.bind(("127.0.0.1", 0)) # lo0
            if timeout is not _GLOBAL_DEFAULT_TIMEOUT:
                sock.settimeout(timeout)
            if source_address:
                # Wrap the tuple so that %s formats it as a single value.
                print "socket bind source_address: %s" % (source_address,)
                sock.bind(source_address)
            sock.connect(sa)
            return sock

        except error as _:
            err = _
            if sock is not None:
                sock.close()

    if err is not None:
        raise err
    else:
        raise error("getaddrinfo returns an empty list")
Following the docstring, run the test three times, each time binding to the IP address of a different interface, with the port set to 0.
# Test en0
$ python -c 'import urllib as u;print u.urlopen("http://ip.haschek.at").read()'
.148.245.16

# Test en1
$ python -c 'import urllib as u;print u.urlopen("http://ip.haschek.at").read()'
.94.115.227

# Test lo0
$ python -c 'import urllib as u;print u.urlopen("http://ip.haschek.at").read()'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib.py", line 87, in urlopen
    return opener.open(url)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib.py", line 213, in open
    return getattr(self, name)(url)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib.py", line 350, in open_http
    h.endheaders(data)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 1049, in endheaders
    self._send_output(message_body)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 893, in _send_output
    self.send(msg)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 855, in send
    self.connect()
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 832, in connect
    self.timeout, self.source_address)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 578, in create_connection
    raise err
IOError: [Errno socket error] [Errno 49] Can't assign requested address
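The test behaves as expected: en0 and en1 each report a different public IP, while lo0 fails because the loopback address cannot reach a public host. So on a machine with several interfaces, it is enough to bind the socket to the IP of the chosen interface when the socket is created, as long as the port is set to 0. If the port is not 0, the second request raises an exception because the port is already in use: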
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib.py", line 87, in urlopen
    return opener.open(url)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib.py", line 213, in open
    return getattr(self, name)(url)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib.py", line 350, in open_http
    h.endheaders(data)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 1049, in endheaders
    self._send_output(message_body)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 893, in _send_output
    self.send(msg)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 855, in send
    self.connect()
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 832, in connect
    self.timeout, self.source_address)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 577, in create_connection
    raise err
IOError: [Errno socket error] [Errno 48] Address already in use
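The role of port 0 can be illustrated in isolation. The following is a rough sketch, assuming Python 3 and using only the loopback address. Both helper names are hypothetical. It is also a simpler stand-in for the situation above: it keeps two sockets open at once rather than reproducing a second request after an earlier connection.

```python
import errno
import socket


def bind_ephemeral(ip="127.0.0.1"):
    """Bind to port 0 and return the local port the OS actually assigned."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((ip, 0))
        return sock.getsockname()[1]


def second_bind_errno(ip="127.0.0.1"):
    """Bind two sockets to the same fixed port; return the errno of the second attempt."""
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        first.bind((ip, 0))                  # let the OS choose a free port ...
        fixed_port = first.getsockname()[1]
        try:
            second.bind((ip, fixed_port))    # ... then ask for that same port explicitly
        except OSError as exc:
            return exc.errno
        return None
    finally:
        first.close()
        second.close()
```

With port 0 every socket gets its own local port, so repeated requests never compete for one; a fixed port can be held by only one such socket at a time.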
In a real project, all that is needed is to set the source_address parameter of socket.create_connection to (IP, 0) for the chosen interface.
# test-interface_urllib.py
import socket
import urllib

# Keep a reference to the original function before patching it.
_create_socket = socket.create_connection

SOURCE_ADDRESS = ("127.0.0.1", 0)
#SOURCE_ADDRESS = ("172.28.153.121", 0)
#SOURCE_ADDRESS = ("172.16.30.41", 0)

def create_connection(*args, **kwargs):
    # source_address is the third positional parameter of socket.create_connection.
    # Overwrite it if it was passed positionally, otherwise set it as a keyword.
    in_args = False
    if len(args) >= 3:
        args = list(args)
        args[2] = SOURCE_ADDRESS
        args = tuple(args)
        in_args = True
    if not in_args:
        kwargs["source_address"] = SOURCE_ADDRESS
    print "args", args
    print "kwargs", str(kwargs)
    return _create_socket(*args, **kwargs)

# Replace the module-level function so that httplib picks up the wrapper.
socket.create_connection = create_connection

print urllib.urlopen("http://ip.haschek.at").read()
Testing confirms that data is now sent through the specified interface, and that the reported IP address matches the one assigned to that interface.
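For comparison, here is a sketch of the same idea without replacing socket.create_connection globally. It assumes Python 3, where urllib.request handlers forward extra keyword arguments from do_open to the connection class. SourceAddressHandler and build_bound_opener are hypothetical names introduced for this illustration, not part of the original approach.

```python
import http.client
import urllib.request


class SourceAddressHandler(urllib.request.HTTPHandler):
    """Hypothetical handler that binds every plain-HTTP connection to one local address."""

    def __init__(self, source_address):
        super().__init__()
        self.source_address = source_address

    def http_open(self, req):
        # do_open passes extra keyword arguments on to the HTTPConnection constructor.
        return self.do_open(
            http.client.HTTPConnection, req, source_address=self.source_address
        )


def build_bound_opener(source_ip):
    """Return an opener whose HTTP requests leave from (source_ip, 0)."""
    return urllib.request.build_opener(
        urllib.request.ProxyHandler({}),  # ignore proxy settings from the environment
        SourceAddressHandler((source_ip, 0)),
    )
```

Usage would look like build_bound_opener("192.168.20.2").open("http://ip.haschek.at").read(). The binding then applies only to requests made through that opener, and HTTPS would need an analogous handler.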
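One question remains: crawlers often use requests, so does the same trick work there? Testing shows that requests does not go through the built-in socket.create_connection, so the patch above has no effect on it.
So how does requests create its socket connection? The way to find out is the same as for urllib: read the source and follow the calls. The details are omitted here.
Since I am using Python 2.7, the trail leads to the create_connection function in urllib3.util.connection.
The change is much the same as the one for urllib.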
import urllib3.connection

_create_socket = urllib3.connection.connection.create_connection
# pass
urllib3.connection.connection.create_connection = create_connection
# pass
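Running this may raise an exception: requests.exceptions.ConnectionError: Max retries exceeded with .. Invalid argument
This exception does not occur every time and depends on the IP range. It appears to be caused by too many levels of redirect recursion, and removing socket_options from kwargs is enough to avoid it. With 127.0.0.1 the exception always occurs.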
import socket
import urllib
import urllib2
import urllib3.connection
import requests as req

# Keep references to both original functions before patching.
_default_create_socket = socket.create_connection
_urllib3_create_socket = urllib3.connection.connection.create_connection

SOURCE_ADDRESS = ("127.0.0.1", 0)
#SOURCE_ADDRESS = ("172.28.153.121", 0)
#SOURCE_ADDRESS = ("172.16.30.41", 0)

def default_create_connection(*args, **kwargs):
    # urllib3 passes socket_options, which socket.create_connection does not accept.
    kwargs.pop("socket_options", None)
    in_args = False
    if len(args) >= 3:
        args = list(args)
        args[2] = SOURCE_ADDRESS
        args = tuple(args)
        in_args = True
    if not in_args:
        kwargs["source_address"] = SOURCE_ADDRESS
    print "args", args
    print "kwargs", str(kwargs)
    return _default_create_socket(*args, **kwargs)

def urllib3_create_connection(*args, **kwargs):
    in_args = False
    if len(args) >= 3:
        args = list(args)
        args[2] = SOURCE_ADDRESS
        in_args = True
        args = tuple(args)
    if not in_args:
        kwargs["source_address"] = SOURCE_ADDRESS
    print "args", args
    print "kwargs", str(kwargs)
    return _urllib3_create_socket(*args, **kwargs)

socket.create_connection = default_create_connection
# The urllib3 wrapper occasionally fails, so use the default socket.create_connection wrapper instead.
# urllib3.connection.connection.create_connection = urllib3_create_connection
urllib3.connection.connection.create_connection = default_create_connection

print " *** test requests: " + req.get("http://ip.haschek.at").content
print " *** test urllib: " + urllib.urlopen("http://ip.haschek.at").read()
print " *** test urllib2: " + urllib2.urlopen("http://ip.haschek.at").read()
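Note: patching through the name urllib3.utils.connection did not seem to work (the module in urllib3 is named urllib3.util.connection, which urllib3.connection imports as connection).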
One small refinement is to look up the IP automatically from the interface name.
import random
import subprocess

def get_all_net_devices():
    # List the interface names, e.g. ['eth0', 'eth1', 'lo'] (Linux only).
    output = subprocess.check_output("ls /sys/class/net", shell=True)
    net_devices = output.strip().splitlines()
    # A simple filter on the interface names; adjust it to your needs.
    net_devices = [i for i in net_devices if "ppp" in i]
    return net_devices

ALL_DEVICES = get_all_net_devices()

def get_local_ip(device_name):
    # Query the named device and take the address from its "inet " line.
    cmd = "/sbin/ifconfig %s | grep 'inet ' | awk '{print $2}'" % device_name
    ip = subprocess.check_output(cmd, shell=True).strip()
    return ip

def random_local_ip():
    return get_local_ip(random.choice(ALL_DEVICES))

# code ...
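Then, in args[2] = SOURCE_ADDRESS and kwargs["source_address"] = SOURCE_ADDRESS, replace SOURCE_ADDRESS with (random_local_ip(), 0) or (get_local_ip("eth0"), 0). The value must still be an (IP, port) tuple, since both functions return only the IP string.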
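The address extraction can also be done in Python rather than with grep and awk. This is a small sketch, assuming Python 3 and that the text comes from running ifconfig on a single device. The function names and the regular expression are illustrative, and the sample in the usage note is the en0 entry shown earlier.

```python
import re

# Matches "inet 192.168.20.2" (BSD/macOS and newer Linux output)
# as well as "inet addr:192.168.20.2" (older Linux net-tools output).
_INET_RE = re.compile(
    r"^\s*inet\s+(?:addr:)?(\d{1,3}(?:\.\d{1,3}){3})\b", re.MULTILINE
)


def parse_ipv4(ifconfig_output):
    """Return the first IPv4 address found in ifconfig output, or None."""
    match = _INET_RE.search(ifconfig_output)
    return match.group(1) if match else None


def source_address_for(ifconfig_output):
    """Build the (ip, 0) tuple expected by the source_address parameter."""
    ip = parse_ipv4(ifconfig_output)
    if ip is None:
        raise ValueError("no IPv4 address found in ifconfig output")
    return (ip, 0)
```

Given the en0 block from the ifconfig listing, source_address_for returns ("192.168.20.2", 0); the inet6 line is skipped because the pattern requires whitespace right after "inet".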
As for what this is useful for, that is left to your imagination.