Submitting JSON Parameters via POST in Java
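The website I was crawling requires a CAPTCHA. Using the browser's developer tools (F12) and an online HTTP POST/GET API testing tool (http://coolaf.com/), I found that the CAPTCHA check can be skipped when the request carries the right request headers.
The site also accepts only POST requests, and the submitted parameters must be in JSON format; otherwise the request fails.
The method for submitting JSON parameters via POST is recorded below: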
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URLDecoder;
import java.util.ArrayList;
import java.util.List;
import org.apache.http.HttpEntity;
import org.apache.http.HttpResponse;
import org.apache.http.client.HttpClient;
import org.apache.http.client.config.RequestConfig;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.client.methods.HttpRequestBase;
import org.apache.http.client.utils.URIBuilder;
import org.apache.http.entity.StringEntity;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClientBuilder;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.util.EntityUtils;
import com.alibaba.fastjson.JSONArray;
import com.alibaba.fastjson.JSONObject;
/**
* <p>@PostJsonParamsTest.java</p>
* @version 1.0
* @author zxk
* @Date 2018-3-3
*/
public class PostJsonParamsTest {
// Timeout in milliseconds
private static final int RUN_TIME = 10000;
// Page number currently being crawled
private String page;
public static void main(String[] args) throws Exception {
PostJsonParamsTest crawl = new PostJsonParamsTest();
// URL of the request
String url ="http://www.gzcredit.gov.cn/Service/CreditService.asmx/searchOrgWithPage";
// Set the starting page number
crawl.setPage("1");
// Status returned by crawlList: "爬取" (page crawled) or "重跑" (retry the page)
String isStop = "";
// Create the request
HttpRequestBase request = new HttpPost(url);
try {
// Configure the timeouts
RequestConfig requestConfig = RequestConfig.custom()
.setSocketTimeout(RUN_TIME)
.setConnectTimeout(RUN_TIME)
.setConnectionRequestTimeout(RUN_TIME)
.build();
request.setConfig(requestConfig);
// POST parameters in JSON format (every key must be quoted for the body to be valid JSON)
String postParams ="{\"condition\":{\"qymc\":\"%%%%\",\"cydw\":\"\"},\"pageNo\":"+crawl.getPage()+",\"pageSize\":100,\"count\":2709846}";
System.out.println(postParams);
// Encode the body as UTF-8 so that it matches the Content-Type header set below
HttpEntity httpEntity = new StringEntity(postParams, "UTF-8");
((HttpPost) request).setEntity(httpEntity);
// Add the request headers; with these the CAPTCHA check is skipped
request.addHeader("Accept","application/json, text/javascript, */*");
request.addHeader("Accept-Encoding","gzip, deflate");
request.addHeader("Accept-Language", "zh-CN,zh;q=0.8");
request.addHeader("Connection", "keep-alive");
request.addHeader("Host", "www.gzcredit.gov.cn");
request.addHeader("Content-Type", "application/json; charset=UTF-8");
URIBuilder builder = new URIBuilder(url);
URI uri = builder.build();
uri = new URI(URLDecoder.decode(uri.toString(), "UTF-8"));
request.setURI(uri);
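// crawlList never returns "停止" (stop), so the loop ends only through the break below
while(!"停止".equals(isStop)){
isStop = crawl.crawlList(request);
// "爬取" means the page was crawled successfully, so advance to the next page.
// Note that the request body was built once before the loop; it has to be
// rebuilt with the new page number before more than one page can be crawled.
if("爬取".equals(isStop)){
crawl.setPage(String.valueOf(Integer.parseInt(crawl.getPage())+1));
}
// if("2713".equals(crawl.getPage())) break;
// Stop after the first page for this test run
if("2".equals(crawl.getPage())){
break;
}
}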
} catch (NumberFormatException e) {
e.printStackTrace();
throw new NumberFormatException("Invalid number format");
} catch (UnsupportedEncodingException e) {
e.printStackTrace();
throw new UnsupportedEncodingException("Unsupported character encoding");
}
}
/**
* Crawls one page of the search list.
* @param request the prepared POST request
* @return "爬取" if the page was crawled, "重跑" if it should be retried
*/
private String crawlList(HttpRequestBase request){
// Either of these two calls creates a client, which is like opening a browser
CloseableHttpClient httpClient = HttpClients.createDefault();
// HttpClient httpClient = HttpClientBuilder.create().build();
HttpEntity httpEntity = null;
HttpResponse response = null;
try {
try {
response = httpClient.execute(request);
} catch (Exception e){
e.printStackTrace();
return "重跑";
}
// Response entity; fetched before the status check so that finally can release it
httpEntity = response.getEntity();
// Check the status code
int statusCode = response.getStatusLine().getStatusCode();
if(statusCode!=200){
return "重跑";
}
String searchListStr = EntityUtils.toString(httpEntity,"GBK").replaceAll("\\\\米", "米");
String allData = (String) JSONObject.parseObject(searchListStr).get("d");
// Work around double quotes embedded inside string values: turn the structural
// quotes into single quotes, drop the remaining ones, then restore the double quotes
String s = allData.replaceAll("\\{\"","{'")
.replaceAll("\":\"", "':'")
.replaceAll("\",\"", "','")
.replaceAll("\":", "':")
.replaceAll(",\"", ",'")
.replaceAll("\"\\}", "'}")
.replaceAll("\"", "")
.replaceAll("'", "\"")
.replaceAll("<br />", "")
.replaceAll("\t", "")
.replaceAll("\\\\", "?");
JSONObject jsonData = JSONObject.parseObject(s);
JSONArray jsonContent = jsonData.getJSONArray("orgList");
if (jsonContent==null || jsonContent.size()<1) {
return "重跑";
}
System.out.println(jsonContent.toJSONString());
return "爬取";
} catch (Exception e) {
e.printStackTrace();
return "重跑";
} finally{
// Release the entity first, then close the client so its connections are not leaked
EntityUtils.consumeQuietly(httpEntity);
try {
httpClient.close();
} catch (java.io.IOException ignored) {
// Nothing useful can be done if closing fails
}
}
}
private String getPage() {
return page;
}
private void setPage(String page) {
this.page = page;
}
}
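The request body in the program above is assembled by hand with string concatenation, which makes it easy to miss a quote or forget to escape a value. As a minimal sketch, assuming the same field names as that body, the helper below builds the string with explicit escaping. The class `JsonBodySketch` and its methods are hypothetical and not part of the original program. Since the program already depends on fastjson, building the body through that library would be the more natural choice in practice; this sketch uses only the JDK so that it runs on its own.

```java
/** Hypothetical helper for illustration; not part of the original program. */
final class JsonBodySketch {

    /** Escapes a value so it can be placed inside a JSON string literal. */
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters must be written as \\uXXXX
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    /** Builds a search body with the field names used by the request above. */
    static String buildSearchBody(String qymc, String cydw, int pageNo, int pageSize, long count) {
        return "{\"condition\":{\"qymc\":\"" + escape(qymc) + "\",\"cydw\":\"" + escape(cydw) + "\"}"
                + ",\"pageNo\":" + pageNo
                + ",\"pageSize\":" + pageSize
                + ",\"count\":" + count + "}";
    }

    public static void main(String[] args) {
        // Same values as the request in the crawler
        System.out.println(buildSearchBody("%%%%", "", 1, 100, 2709846L));
    }
}
```

Calling `buildSearchBody` again with the next page number gives a fresh body for each page, instead of reusing the string built for the first page.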
Supplementary note: sending a POST request with HttpClient in Java, with the request data placed in the body
Without further ado, here is the code:
/**
 * POST request with the request data placed in the body.
 * getHttpPost, getHttpClient, getResult and Encoding are helpers defined
 * elsewhere and are not shown in this snippet.
 * @param url request URL
 * @param bodyData request body
 * @return the response body as a string
 * @author wangyj
 * @date 2019-04-20
 */
public static String doPostBodyData(String url, String bodyData) throws Exception {
    String result = "";
    CloseableHttpClient httpClient = null;
    CloseableHttpResponse response = null;
    try {
        HttpPost httpPost = getHttpPost(url, null); // request URL
        httpPost.setEntity(new StringEntity(bodyData, Encoding));
        httpClient = getHttpClient();
        // Execute the request and read the response
        response = httpClient.execute(httpPost);
        HttpEntity entity = response.getEntity();
        result = getResult(entity, Encoding);
    } finally {
        try {
            // Close the response first, while the client is still open
            if (null != response) {
                EntityUtils.consumeQuietly(response.getEntity()); // releases the connection
                response.close();
            }
        } finally {
            // Then close the httpClient
            if (null != httpClient) {
                httpClient.close();
            }
        }
    }
    return result;
}
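For comparison, the same idea of putting the request data in the body can be sketched with the HTTP client built into the JDK. This assumes Java 11 or later and swaps Apache HttpClient for `java.net.http.HttpClient`. The class `JdkPostSketch`, the 10-second timeouts, and the example URL are assumptions made for illustration, not part of the original code.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.time.Duration;

/** Hypothetical JDK-only sketch; assumes Java 11 or later. */
final class JdkPostSketch {

    /** Builds a POST request that carries bodyData as a UTF-8 JSON body. */
    static HttpRequest buildJsonPost(String url, String bodyData) {
        return HttpRequest.newBuilder(URI.create(url))
                .timeout(Duration.ofSeconds(10))
                .header("Content-Type", "application/json; charset=UTF-8")
                .POST(HttpRequest.BodyPublishers.ofString(bodyData, StandardCharsets.UTF_8))
                .build();
    }

    /** Sends the request and returns the response body decoded as UTF-8. */
    static String doPostBodyData(String url, String bodyData) throws Exception {
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(10))
                .build();
        HttpResponse<String> response = client.send(
                buildJsonPost(url, bodyData),
                HttpResponse.BodyHandlers.ofString(StandardCharsets.UTF_8));
        return response.body();
    }

    public static void main(String[] args) {
        // Only builds the request, so nothing is sent over the network here
        HttpRequest request = buildJsonPost("http://localhost:8080/echo", "{\"pageNo\":1}");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Building the request is kept separate from sending it, so the headers and body can be inspected without a server.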