Implementing file uploads with Python requests, step by step
Official documentation: https://2.python-requests.org//en/master/
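At work I needed to upload an attachment to an interface with the following parameters:
Submit the attachment via HTTP POST in multipart/form-data format, url: http://test.com/flow/upload
Field list:
md5: // MD5 hash of (random value_current timestamp)
filesize: // file size
file: // file content (must include the file name)
Return value: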
{"success":true,"uploadName":"tmp.xml","uploadPath":"uploads\/201311\/758e875fb7c7a508feef6b5036119b9f"}
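As a small aside, the sample return value is ordinary JSON in which forward slashes are escaped as \/. A minimal sketch of reading it with the standard library follows; the variable names are illustrative only and not part of the interface.

```python
import json

# Sample return value copied from the interface description above.
raw = r'{"success":true,"uploadName":"tmp.xml","uploadPath":"uploads\/201311\/758e875fb7c7a508feef6b5036119b9f"}'

result = json.loads(raw)

# "\/" is a valid JSON escape for "/", so the decoded path uses plain slashes.
print(result["success"])     # True
print(result["uploadName"])  # tmp.xml
print(result["uploadPath"])  # uploads/201311/758e875fb7c7a508feef6b5036119b9f
```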
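Since I mainly use Python at work and the project already uses the requests library in places, I planned to implement this with requests. I expected it to be a simple little feature, but it ended up taking a lot of time. The example in the official requests documentation only covers uploading a file, with no extra parameters: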
https://2.python-requests.org//en/master/user/quickstart/#post-a-multipart-encoded-file
>>> url = 'https://httpbin.org/post'
>>> files = {'file': ('report.xls', open('report.xls', 'rb'), 'application/vnd.ms-excel', {'Expires': '0'})}
>>> r = requests.post(url, files=files)
>>> r.text
{
  ...
  "files": {
    "file": "<censored...binary...data>"
  },
  ...
}
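When extra parameters have to be passed as well, two arguments of requests come into play: data and files. Pass the file to upload in files and the other parameters in data; the requests library merges the two into a single multipart body and sends it to the server.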
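This can be checked without touching the network. The sketch below, assuming placeholder values for the md5 and file fields, prepares a request offline and inspects the resulting body; it is only an illustration, not the production code.

```python
import requests

# Hypothetical values for illustration; nothing is sent over the network.
fields = {'md5': '0123456789abcdef0123456789abcdef', 'filesize': 6}
files = {'file': ('tmp.xml', b'<xml/>')}

# Request.prepare() builds the headers and body that would be sent,
# which lets us inspect them offline.
prepared = requests.Request(
    'POST', 'http://test.com/flow/upload', data=fields, files=files
).prepare()

print(prepared.headers['Content-Type'])  # multipart/form-data; boundary=...
# Both the ordinary fields and the file end up as parts of one body.
print(b'name="md5"' in prepared.body)
print(b'name="filesize"' in prepared.body)
print(b'name="file"; filename="tmp.xml"' in prepared.body)
```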
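The final code looks like this:
import hashlib
import time
import requests

# file_name and request_url are defined elsewhere in the project
with open(file_name, 'rb') as f:
    content = f.read()
request_data = {
    # MD5 of "<random value>_<current timestamp>"; the random value is fixed to 0 here
    'md5': hashlib.md5(('%d_%d' % (0, int(time.time()))).encode('utf-8')).hexdigest(),
    'filesize': len(content),
}
# Reuse the bytes already read, so no second file handle is left open
files = {'file': (file_name, content)}
MyLogger().getlogger().info('url:%s' % (request_url))
resp = requests.post(request_url, data=request_data, files=files)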
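The code above hard-codes 0 as the random value in the md5 field. A hedged sketch of a helper that draws an actual random value is shown below; the function name build_upload_fields and the range of the random number are assumptions, since the interface description does not specify them.

```python
import hashlib
import random
import time


def build_upload_fields(content):
    """Return the md5 and filesize form fields for the given file bytes."""
    # The interface asks for md5(<random value>_<current timestamp>).
    # The range of the random value is an assumption; the spec does not say.
    token = '%d_%d' % (random.randint(0, 999999), int(time.time()))
    return {
        'md5': hashlib.md5(token.encode('utf-8')).hexdigest(),
        'filesize': len(content),
    }


fields = build_upload_fields(b'<xml/>')
print(fields['filesize'])  # 6
print(len(fields['md5']))  # 32 hexadecimal characters
```

The returned dict can be passed directly as the data argument of requests.post.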
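The final code may look simple, but it took me quite some effort to confirm that it is correct, and along the way I read through the requests source. The steps below record that walk through the source:
First, find the implementation of the post method, in requests/api.py: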
def post(url, data=None, json=None, **kwargs):
    r"""Sends a POST request.

    :param url: URL for the new :class:`Request` object.
    :param data: (optional) Dictionary, list of tuples, bytes, or file-like
        object to send in the body of the :class:`Request`.
    :param json: (optional) json data to send in the body of the :class:`Request`.
    :param \*\*kwargs: Optional arguments that ``request`` takes.
    :return: :class:`Response <Response>` object
    :rtype: requests.Response
    """

    return request('post', url, data=data, json=json, **kwargs)
Here we can see that it calls the request method. Let's follow it into request, also in requests/api.py:
def request(method, url, **kwargs):
    """Constructs and sends a :class:`Request <Request>`.

    :param method: method for the new :class:`Request` object: ``GET``, ``OPTIONS``, ``HEAD``, ``POST``, ``PUT``, ``PATCH``, or ``DELETE``.
    :param url: URL for the new :class:`Request` object.
    :param params: (optional) Dictionary, list of tuples or bytes to send
        in the query string for the :class:`Request`.
    :param data: (optional) Dictionary, list of tuples, bytes, or file-like
        object to send in the body of the :class:`Request`.
    :param json: (optional) A JSON serializable Python object to send in the body of the :class:`Request`.
    :param headers: (optional) Dictionary of HTTP Headers to send with the :class:`Request`.
    :param cookies: (optional) Dict or CookieJar object to send with the :class:`Request`.
    :param files: (optional) Dictionary of ``'name': file-like-objects`` (or ``{'name': file-tuple}``) for multipart encoding upload.
        ``file-tuple`` can be a 2-tuple ``('filename', fileobj)``, 3-tuple ``('filename', fileobj, 'content_type')``
        or a 4-tuple ``('filename', fileobj, 'content_type', custom_headers)``, where ``'content-type'`` is a string
        defining the content type of the given file and ``custom_headers`` a dict-like object containing additional headers
        to add for the file.
    :param auth: (optional) Auth tuple to enable Basic/Digest/Custom HTTP Auth.
    :param timeout: (optional) How many seconds to wait for the server to send data
        before giving up, as a float, or a :ref:`(connect timeout, read
        timeout) <timeouts>` tuple.
    :type timeout: float or tuple
    :param allow_redirects: (optional) Boolean. Enable/disable GET/OPTIONS/POST/PUT/PATCH/DELETE/HEAD redirection. Defaults to ``True``.
    :type allow_redirects: bool
    :param proxies: (optional) Dictionary mapping protocol to the URL of the proxy.
    :param verify: (optional) Either a boolean, in which case it controls whether we verify
            the server's TLS certificate, or a string, in which case it must be a path
            to a CA bundle to use. Defaults to ``True``.
    :param stream: (optional) if ``False``, the response content will be immediately downloaded.
    :param cert: (optional) if String, path to ssl client cert file (.pem). If Tuple, ('cert', 'key') pair.
    :return: :class:`Response <Response>` object
    :rtype: requests.Response

    Usage::

      >>> import requests
      >>> req = requests.request('GET', 'https://httpbin.org/get')
      <Response [200]>
    """

    # By using the 'with' statement we are sure the session is closed, thus we
    # avoid leaving sockets open which can trigger a ResourceWarning in some
    # cases, and look like a memory leak in others.
    with sessions.Session() as session:
        return session.request(method=method, url=url, **kwargs)
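The docstring above describes three file-tuple forms for the files argument. A minimal offline sketch of how each form changes the file part is shown below; the file name, content and the text/xml type are made-up sample values.

```python
import requests

url = 'http://test.com/flow/upload'  # URL from the interface description

# 2-tuple: (filename, fileobj or bytes)
two = {'file': ('tmp.xml', b'<xml/>')}
# 3-tuple: adds an explicit content type for the part
three = {'file': ('tmp.xml', b'<xml/>', 'text/xml')}
# 4-tuple: adds extra per-part headers as well
four = {'file': ('tmp.xml', b'<xml/>', 'text/xml', {'Expires': '0'})}

bodies = {}
for name, files in (('2-tuple', two), ('3-tuple', three), ('4-tuple', four)):
    # prepare() builds the multipart body without sending anything.
    bodies[name] = requests.Request('POST', url, files=files).prepare().body
    print(name,
          b'Content-Type: text/xml' in bodies[name],
          b'Expires: 0' in bodies[name])
```

Only the 3-tuple and 4-tuple forms write a Content-Type header into the file part, and only the 4-tuple form carries the extra Expires header.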
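This method has a long docstring. From it we can already see that the files parameter is used to send files, but it still does not tell us what to do when parameters and a file have to be sent together. Continue into session.request, in requests/sessions.py: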
def request(self, method, url,
        params=None, data=None, headers=None, cookies=None, files=None,
        auth=None, timeout=None, allow_redirects=True, proxies=None,
        hooks=None, stream=None, verify=None, cert=None, json=None):
    """Constructs a :class:`Request <Request>`, prepares it and sends it.
    Returns :class:`Response <Response>` object.

    :param method: method for the new :class:`Request` object.
    :param url: URL for the new :class:`Request` object.
    :param params: (optional) Dictionary or bytes to be sent in the query
        string for the :class:`Request`.
    :param data: (optional) Dictionary, list of tuples, bytes, or file-like
        object to send in the body of the :class:`Request`.
    :param json: (optional) json to send in the body of the
        :class:`Request`.
    :param headers: (optional) Dictionary of HTTP Headers to send with the
        :class:`Request`.
    :param cookies: (optional) Dict or CookieJar object to send with the
        :class:`Request`.
    :param files: (optional) Dictionary of ``'filename': file-like-objects``
        for multipart encoding upload.
    :param auth: (optional) Auth tuple or callable to enable
        Basic/Digest/Custom HTTP Auth.
    :param timeout: (optional) How long to wait for the server to send
        data before giving up, as a float, or a :ref:`(connect timeout,
        read timeout) <timeouts>` tuple.
    :type timeout: float or tuple
    :param allow_redirects: (optional) Set to True by default.
    :type allow_redirects: bool
    :param proxies: (optional) Dictionary mapping protocol or protocol and
        hostname to the URL of the proxy.
    :param stream: (optional) whether to immediately download the response
        content. Defaults to ``False``.
    :param verify: (optional) Either a boolean, in which case it controls whether we verify
        the server's TLS certificate, or a string, in which case it must be a path
        to a CA bundle to use. Defaults to ``True``.
    :param cert: (optional) if String, path to ssl client cert file (.pem).
        If Tuple, ('cert', 'key') pair.
    :rtype: requests.Response
    """
    # Create the Request.
    req = Request(
        method=method.upper(),
        url=url,
        headers=headers,
        files=files,
        data=data or {},
        json=json,
        params=params or {},
        auth=auth,
        cookies=cookies,
        hooks=hooks,
    )
    prep = self.prepare_request(req)

    proxies = proxies or {}

    settings = self.merge_environment_settings(
        prep.url, proxies, stream, verify, cert
    )

    # Send the request.
    send_kwargs = {
        'timeout': timeout,
        'allow_redirects': allow_redirects,
    }
    send_kwargs.update(settings)
    resp = self.send(prep, **send_kwargs)

    return resp
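The same create, prepare, send sequence can be reproduced by hand with the public API. The sketch below, which uses made-up field values, stops before the send step so that it runs without a server.

```python
import requests

with requests.Session() as session:
    # Step 1: describe the request (the same arguments session.request receives).
    req = requests.Request(
        method='POST',
        url='http://test.com/flow/upload',
        data={'filesize': 6},
        files={'file': ('tmp.xml', b'<xml/>')},
    )
    # Step 2: turn it into a PreparedRequest; headers and body are final here.
    prep = session.prepare_request(req)
    print(prep.method, prep.url)
    print(prep.headers['Content-Type'].split(';')[0])  # multipart/form-data
    print(prep.headers['Content-Length'] == str(len(prep.body)))
    # Step 3 would be session.send(prep, timeout=...), which is skipped
    # here because it needs a reachable server.
```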
Taking a quick look at this method: it first prepares the request, and its last step calls send, which presumably sends the request. So the method to follow is prepare_request, in requests/sessions.py:
def prepare_request(self, request):
    """Constructs a :class:`PreparedRequest <PreparedRequest>` for
    transmission and returns it. The :class:`PreparedRequest` has settings
    merged from the :class:`Request <Request>` instance and those of the
    :class:`Session`.

    :param request: :class:`Request` instance to prepare with this
        session's settings.
    :rtype: requests.PreparedRequest
    """
    cookies = request.cookies or {}

    # Bootstrap CookieJar.
    if not isinstance(cookies, cookielib.CookieJar):
        cookies = cookiejar_from_dict(cookies)

    # Merge with session cookies
    merged_cookies = merge_cookies(
        merge_cookies(RequestsCookieJar(), self.cookies), cookies)

    # Set environment's basic authentication if not explicitly set.
    auth = request.auth
    if self.trust_env and not auth and not self.auth:
        auth = get_netrc_auth(request.url)

    p = PreparedRequest()
    p.prepare(
        method=request.method.upper(),
        url=request.url,
        files=request.files,
        data=request.data,
        json=request.json,
        headers=merge_setting(request.headers, self.headers, dict_class=CaseInsensitiveDict),
        params=merge_setting(request.params, self.params),
        auth=merge_setting(auth, self.auth),
        cookies=merged_cookies,
        hooks=merge_hooks(request.hooks, self.hooks),
    )
    return p
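Besides passing files and data through unchanged, prepare_request merges session-level settings into the request. A small sketch of the header merge follows; the X-Token and X-Request-Id headers are hypothetical and not required by the upload interface.

```python
import requests

session = requests.Session()
# Hypothetical session-wide header, e.g. a token every upload must carry.
session.headers.update({'X-Token': 'secret'})

req = requests.Request(
    'POST',
    'http://test.com/flow/upload',
    headers={'X-Request-Id': '42'},  # hypothetical per-request header
    files={'file': ('tmp.xml', b'<xml/>')},
)
prep = session.prepare_request(req)

# Headers from both the session and the request are present after merging.
print(prep.headers['X-Token'])       # secret
print(prep.headers['X-Request-Id'])  # 42
session.close()
```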
prepare_request creates a PreparedRequest object and calls its prepare method. Follow prepare, in requests/models.py:
def prepare(self,
        method=None, url=None, headers=None, files=None, data=None,
        params=None, auth=None, cookies=None, hooks=None, json=None):
    """Prepares the entire request with the given parameters."""

    self.prepare_method(method)
    self.prepare_url(url, params)
    self.prepare_headers(headers)
    self.prepare_cookies(cookies)
    self.prepare_body(data, files, json)
    self.prepare_auth(auth, url)

    # Note that prepare_auth must be last to enable authentication schemes
    # such as OAuth to work on a fully prepared request.

    # This MUST go after prepare_auth. Authenticators could add a hook
    self.prepare_hooks(hooks)
This calls a number of prepare_xx methods. We only care about the one that handles data, files and json, so follow prepare_body, in requests/models.py:
def prepare_body(self, data, files, json=None):
    """Prepares the given HTTP body data."""

    # Check if file, fo, generator, iterator.
    # If not, run through normal process.

    # Nottin' on you.
    body = None
    content_type = None

    if not data and json is not None:
        # urllib3 requires a bytes-like body. Python 2's json.dumps
        # provides this natively, but Python 3 gives a Unicode string.
        content_type = 'application/json'
        body = complexjson.dumps(json)
        if not isinstance(body, bytes):
            body = body.encode('utf-8')

    is_stream = all([
        hasattr(data, '__iter__'),
        not isinstance(data, (basestring, list, tuple, Mapping))
    ])

    try:
        length = super_len(data)
    except (TypeError, AttributeError, UnsupportedOperation):
        length = None

    if is_stream:
        body = data

        if getattr(body, 'tell', None) is not None:
            # Record the current file position before reading.
            # This will allow us to rewind a file in the event
            # of a redirect.
            try:
                self._body_position = body.tell()
            except (IOError, OSError):
                # This differentiates from None, allowing us to catch
                # a failed `tell()` later when trying to rewind the body
                self._body_position = object()

        if files:
            raise NotImplementedError('Streamed bodies and files are mutually exclusive.')

        if length:
            self.headers['Content-Length'] = builtin_str(length)
        else:
            self.headers['Transfer-Encoding'] = 'chunked'
    else:
        # Multi-part file uploads.
        if files:
            (body, content_type) = self._encode_files(files, data)
        else:
            if data:
                body = self._encode_params(data)
                if isinstance(data, basestring) or hasattr(data, 'read'):
                    content_type = None
                else:
                    content_type = 'application/x-www-form-urlencoded'

        self.prepare_content_length(body)

        # Add content-type if it wasn't explicitly provided.
        if content_type and ('content-type' not in self.headers):
            self.headers['Content-Type'] = content_type

    self.body = body
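The branches of prepare_body can be observed from the outside through the Content-Type header that each one sets. The following sketch prepares three requests offline; the helper name content_type and the field values are illustrative assumptions.

```python
import requests

url = 'http://test.com/flow/upload'


def content_type(**kwargs):
    """Prepare a POST offline and return the media type that was chosen."""
    prepared = requests.Request('POST', url, **kwargs).prepare()
    return prepared.headers['Content-Type'].split(';')[0]


json_only = content_type(json={'filesize': 6})
data_only = content_type(data={'filesize': 6})
data_and_files = content_type(data={'filesize': 6},
                              files={'file': ('tmp.xml', b'<xml/>')})

print(json_only)       # application/json
print(data_only)       # application/x-www-form-urlencoded
print(data_and_files)  # multipart/form-data
```

As soon as files is non-empty, the form fields in data are no longer URL-encoded but handed to _encode_files together with the files.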
This function is fairly long. The part to focus on is the "if files:" branch under the "Multi-part file uploads" comment, which calls the _encode_files method. Let's follow that method:
def _encode_files(files, data):
    """Build the body for a multipart/form-data request.

    Will successfully encode files when passed as a dict or a list of
    tuples. Order is retained if data is a list of tuples but arbitrary
    if parameters are supplied as a dict.
    The tuples may be 2-tuples (filename, fileobj), 3-tuples (filename, fileobj, contentype)
    or 4-tuples (filename, fileobj, contentype, custom_headers).
    """
    if (not files):
        raise ValueError("Files must be provided.")
    elif isinstance(data, basestring):
        raise ValueError("Data must not be a string.")

    new_fields = []
    fields = to_key_val_list(data or {})
    files = to_key_val_list(files or {})

    for field, val in fields:
        if isinstance(val, basestring) or not hasattr(val, '__iter__'):
            val = [val]
        for v in val:
            if v is not None:
                # Don't call str() on bytestrings: in Py3 it all goes wrong.
                if not isinstance(v, bytes):
                    v = str(v)

                new_fields.append(
                    (field.decode('utf-8') if isinstance(field, bytes) else field,
                     v.encode('utf-8') if isinstance(v, str) else v))

    for (k, v) in files:
        # support for explicit filename
        ft = None
        fh = None
        if isinstance(v, (tuple, list)):
            if len(v) == 2:
                fn, fp = v
            elif len(v) == 3:
                fn, fp, ft = v
            else:
                fn, fp, ft, fh = v
        else:
            fn = guess_filename(v) or k
            fp = v

        if isinstance(fp, (str, bytes, bytearray)):
            fdata = fp
        elif hasattr(fp, 'read'):
            fdata = fp.read()
        elif fp is None:
            continue
        else:
            fdata = fp

        rf = RequestField(name=k, data=fdata, filename=fn, headers=fh)
        rf.make_multipart(content_type=ft)
        new_fields.append(rf)

    body, content_type = encode_multipart_formdata(new_fields)

    return body, content_type
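Stripped of the argument handling, the merge performed by _encode_files comes down to collecting form fields and file parts in one list and encoding that list once. A minimal sketch using urllib3 directly is shown below, with made-up field values; it is a simplification for illustration, not the code requests actually runs.

```python
from urllib3.fields import RequestField
from urllib3.filepost import encode_multipart_formdata

# Ordinary form fields become (name, bytes) pairs ...
parts = [('md5', b'0123456789abcdef0123456789abcdef'), ('filesize', b'6')]

# ... and each file becomes a RequestField carrying a filename.
file_part = RequestField(name='file', data=b'<xml/>', filename='tmp.xml')
file_part.make_multipart(content_type='text/xml')
parts.append(file_part)

# One call encodes everything into a single body plus its Content-Type.
body, content_type = encode_multipart_formdata(parts)
print(content_type.split(';')[0])                     # multipart/form-data
print(body.count(b'Content-Disposition: form-data'))  # 3
```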
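OK, that is the end of the trail. After reading this code carefully, the roles of the data and files arguments passed to requests.post become clear: requests merges the two here and uses the result as the body of the POST.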