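A Detailed Guide to Implementing Elasticsearch Queries in Java
The following is excerpted for my own reference; I did not write it.
This article is based on Elasticsearch 7.13.2. ES changed substantially from 7.0 onward: TransportClient has been deprecated since 7.0, and the recommended replacement is the Java High Level REST Client.
01 Sample test data
The test data comes from MySQL. Each MySQL row is stored in ES as one document: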
{
  "_index" : "person",
  "_type" : "_doc",
  "_id" : "4",
  "_score" : 1.0,
  "_source" : {
    "address" : "峨眉山",
    "modifyTime" : "2021-06-29 19:46:25",
    "createTime" : "2021-05-14 11:37:07",
    "sect" : "峨嵋派",
    "sex" : "男",
    "skill" : "降龍十八掌",
    "name" : "宋青書",
    "id" : 4,
    "power" : 50,
    "age" : 21
  }
}
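The ES Java API hierarchy is only outlined briefly here; interested readers can study the source code themselves.
Next, a dozen or so examples will get us up to speed on ES queries. Each example provides the SQL statement, the ES query, and the Java code.
02 Term-level queries
In a term-level query, ES does not analyze (tokenize) the query input. A document matches only when an indexed term is exactly equal to the query string.
2.1 Equality query - term
An equality query selects all records in which a field equals a specific value.
SQL:
select * from persons where name = '張無忌';
The ES query looks quite different (note the keyword suffix on the field name):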
GET /person/_search
{
  "query": {
    "term": {
      "name.keyword": { "value": "張無忌", "boost": 1.0 }
    }
  }
}
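Elasticsearch 5.0 removed the string type and split it into two new field types: text, which is analyzed and used for full-text search, and keyword, which is stored unanalyzed and used for exact-value matching.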
Query result:
{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {              // shard information
    "total" : 1,             // total number of shards
    "successful" : 1,        // shards queried successfully
    "skipped" : 0,           // shards skipped
    "failed" : 0             // shards on which the query failed
  },
  "hits" : {                 // matched results
    "total" : {
      "value" : 1,           // number of hits
      "relation" : "eq"      // relation: equals
    },
    "max_score" : 2.8526313, // highest score
    "hits" : [
      {
        "_index" : "person", // index
        "_type" : "_doc",    // type
        "_id" : "1",
        "_score" : 2.8526313,
        "_source" : {
          "address" : "光明頂",
          "modifyTime" : "2021-06-29 16:48:56",
          "createTime" : "2021-05-14 16:50:33",
          "sect" : "明教",
          "sex" : "男",
          "skill" : "九陽神功",
          "name" : "張無忌",
          "id" : 1,
          "power" : 99,
          "age" : 18
        }
      }
    ]
  }
}
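Building the ES request in Java (later examples keep only the statements that build the SearchSourceBuilder):
import java.io.IOException;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.builder.SearchSourceBuilder;
// Also needed: Spring's @Autowired, JUnit's @Test, and a JSONObject class (presumably fastjson's).

@Autowired
private RestHighLevelClient client;

/**
 * Exact-match term query.
 *
 * @throws IOException if the request to ES fails
 */
@Test
public void queryTerm() throws IOException {
    // Create a search request for the index
    SearchRequest searchRequest = new SearchRequest("person");
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    // Build the query
    searchSourceBuilder.query(QueryBuilders.termQuery("name.keyword", "張無忌"));
    System.out.println("searchSourceBuilder=====================" + searchSourceBuilder);
    searchRequest.source(searchSourceBuilder);
    SearchResponse response = client.search(searchRequest, RequestOptions.DEFAULT);
    System.out.println(JSONObject.toJSON(response));
}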
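Look closely at the result and you will see that every hit carries a _score: ES scores each result by how well it matches. Scoring costs performance, so if you know your query does not need scores, turn scoring off in the query: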
GET /person/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "term": {
          "sect.keyword": { "value": "明教", "boost": 1.0 }
        }
      },
      "boost": 1.0
    }
  }
}
Building the query in Java:
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
// A query built this way skips score calculation, which makes it more efficient
searchSourceBuilder.query(QueryBuilders.constantScoreQuery(QueryBuilders.termQuery("sect.keyword", "明教")));
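To see why the examples in this section target name.keyword rather than name, here is a minimal plain-Java sketch of how a term query compares its input against indexed terms. It is a simplified model, not Elasticsearch code: the class and method names are made up for illustration, and it only models the CJK case, assuming the default standard analyzer, which indexes a CJK string in a text field as one term per character, while a keyword field keeps the whole value as a single term. The name is written with Unicode escapes so the source stays ASCII-only.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class TermMatchSketch {

    // Hypothetical model of a text field under the standard analyzer:
    // each CJK character becomes its own indexed term.
    static List<String> indexAsText(String value) {
        List<String> terms = new ArrayList<>();
        value.codePoints().forEach(cp -> terms.add(new String(Character.toChars(cp))));
        return terms;
    }

    // Hypothetical model of a keyword field: the whole value is one term.
    static List<String> indexAsKeyword(String value) {
        return Collections.singletonList(value);
    }

    // A term query does not analyze its input; it matches only if some
    // indexed term is exactly equal to the query string.
    static boolean termQueryMatches(List<String> indexedTerms, String query) {
        return indexedTerms.contains(query);
    }

    public static void main(String[] args) {
        // The three-character name queried in section 2.1 (Zhang Wuji).
        String name = "\u5f35\u7121\u5fcc";
        System.out.println(termQueryMatches(indexAsText(name), name));     // false
        System.out.println(termQueryMatches(indexAsKeyword(name), name));  // true
        System.out.println(termQueryMatches(indexAsText(name), "\u5f35")); // true
    }
}
```

Under these assumptions, the full name never appears as a single term in the text field, so only the keyword sub-field can satisfy an exact term match on it.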
2.2 Multi-value query - terms
A multi-value query works like an IN query in MySQL, for example:
select * from persons where sect in ('明教','武當派');
ES query:
GET /person/_search
{
  "query": {
    "terms": {
      "sect.keyword": [ "明教", "武當派" ],
      "boost": 1.0
    }
  }
}
Java implementation:
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
// Build the query
searchSourceBuilder.query(QueryBuilders.termsQuery("sect.keyword", Arrays.asList("明教", "武當派")));
2.3 Range query - range
A range query selects the records in which a field falls within a given interval.
SQL:
select * from persons where age between 18 and 22;
ES query:
GET /person/_search
{
  "query": {
    "range": {
      "age": {
        "from": 18,
        "to": 22,
        "include_lower": true,
        "include_upper": true,
        "boost": 1.0
      }
    }
  }
}
Building the query in Java:
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
// Build the query; gte and lte make both bounds inclusive, like SQL BETWEEN
searchSourceBuilder.query(QueryBuilders.rangeQuery("age").gte(18).lte(22));
2.4 Prefix query - prefix
A prefix query is similar to a LIKE pattern with a trailing wildcard in SQL.
SQL:
select * from persons where sect like '武當%';
ES query:
{
  "query": {
    "prefix": {
      "sect.keyword": { "value": "武當", "boost": 1.0 }
    }
  }
}
Building the query in Java:
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
// Build the query
searchSourceBuilder.query(QueryBuilders.prefixQuery("sect.keyword", "武當"));
2.5 Wildcard query - wildcard
A wildcard query, like a prefix query, is a form of pattern matching, but it is more flexible because the wildcard can appear anywhere in the pattern.
SQL:
select * from persons where name like '張%忌';
ES query:
{
  "query": {
    "wildcard": {
      "name.keyword": { "wildcard": "張*忌", "boost": 1.0 }
    }
  }
}
Building the query in Java:
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
// Build the query
searchSourceBuilder.query(QueryBuilders.wildcardQuery("name.keyword", "張*忌"));
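The matching rule behind a wildcard pattern can be sketched in plain Java by translating the pattern into a regular expression: * stands for any run of characters (like % in SQL LIKE) and ? for exactly one character (like _). This is an illustration only, not how Elasticsearch implements it. The class and method names are hypothetical, escaping of wildcard characters is ignored, and romanized stand-ins replace the article's sample names to keep the source ASCII-only.

```java
import java.util.regex.Pattern;

public class WildcardSketch {

    // Translates a wildcard pattern into a regex: '*' becomes ".*",
    // '?' becomes ".", and every other character is matched literally.
    static Pattern toRegex(String wildcard) {
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < wildcard.length(); i++) {
            char c = wildcard.charAt(i);
            if (c == '*') {
                regex.append(".*");
            } else if (c == '?') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString(), Pattern.DOTALL);
    }

    // The pattern has to cover the whole term, not just a part of it.
    static boolean matches(String wildcard, String term) {
        return toRegex(wildcard).matcher(term).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches("Zhang*ji", "Zhang Wuji"));    // true
        System.out.println(matches("Zhang*ji", "Zhang Sanfeng")); // false
        System.out.println(matches("Zhang?", "Zhang"));           // false: '?' needs one character
    }
}
```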
03 Compound queries
The previous examples all used a single condition. In practice we often need to filter on several values or fields. Start with a simple example:
select * from persons where sex = '女' and sect = '明教';
A multi-condition equality query like this calls for a compound (bool) query:
{
  "query": {
    "bool": {
      "must": [
        { "term": { "sex": { "value": "女", "boost": 1.0 } } },
        { "term": { "sect.keyword": { "value": "明教", "boost": 1.0 } } }
      ],
      "adjust_pure_negative": true,
      "boost": 1.0
    }
  }
}
Building the query in Java:
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
// Build the query
searchSourceBuilder.query(QueryBuilders.boolQuery()
        .must(QueryBuilders.termQuery("sex", "女"))
        .must(QueryBuilders.termQuery("sect.keyword", "明教"))
);
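3.1 Bool query
The bool query is one kind of compound query: it accepts other queries as arguments and combines them into arbitrary boolean (logical) combinations.
A bool query can contain four kinds of clauses, in any combination. filter is a special case and is covered later.
{
  "bool" : {
    "must" : [],
    "should" : [],
    "must_not" : []
  }
}
- must: every clause must match; equivalent to AND.
- must_not: no clause may match; equivalent to NOT ('!=' or not in).
- should: at least n clauses must match, where n is set by a parameter; equivalent to OR when n is 1.
Controlling precision:
All must clauses have to match and all must_not clauses must not match, but how many should clauses have to match? By default, none. There is one exception: when the bool query has no must or filter clause, at least one should clause has to match.
The minimum_should_match parameter controls how many should clauses must match. It can be either an absolute number or a percentage: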
GET /person/_search
{
  "query": {
    "bool": {
      "must": [
        { "term": { "sex": { "value": "女", "boost": 1.0 } } }
      ],
      "should": [
        { "term": { "address.keyword": { "value": "峨眉山", "boost": 1.0 } } },
        { "term": { "sect.keyword": { "value": "明教", "boost": 1.0 } } }
      ],
      "adjust_pure_negative": true,
      "minimum_should_match": "1",
      "boost": 1.0
    }
  }
}
Building the query in Java:
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
// Build the query
searchSourceBuilder.query(QueryBuilders.boolQuery()
        .must(QueryBuilders.termQuery("sex", "女"))
        .should(QueryBuilders.termQuery("address.keyword", "峨眉山"))
        .should(QueryBuilders.termQuery("sect.keyword", "明教"))
        .minimumShouldMatch(1)
);
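How must, must_not, should, and minimum_should_match combine can be sketched as a small in-memory evaluator. This is a hedged illustration of the matching logic only: it ignores scoring, every name in it is hypothetical rather than part of the Elasticsearch client, each clause is modeled as a {field, value} equality pair, and romanized stand-ins replace the article's sample data.

```java
import java.util.HashMap;
import java.util.Map;

public class BoolQuerySketch {

    // Hypothetical stand-in for a term clause: {field, value} must be equal.
    static boolean termMatches(Map<String, String> doc, String[] clause) {
        return clause[1].equals(doc.get(clause[0]));
    }

    // Hypothetical evaluator for a bool query over one in-memory document.
    static boolean matches(Map<String, String> doc, String[][] must,
                           String[][] mustNot, String[][] should, int minimumShouldMatch) {
        for (String[] clause : must) {
            if (!termMatches(doc, clause)) return false;   // every must clause has to match
        }
        for (String[] clause : mustNot) {
            if (termMatches(doc, clause)) return false;    // no must_not clause may match
        }
        int matched = 0;
        for (String[] clause : should) {
            if (termMatches(doc, clause)) matched++;
        }
        return matched >= minimumShouldMatch;              // enough should clauses matched
    }

    // The query from this section: sex must be female, and at least one of
    // the two should clauses (address or sect) has to match.
    static boolean exampleQuery(Map<String, String> doc) {
        return matches(doc,
                new String[][] {{"sex", "female"}},
                new String[][] {},
                new String[][] {{"address", "Emei Mountain"}, {"sect", "Mingjiao"}},
                1);
    }

    // Builds a document from alternating field names and values.
    static Map<String, String> doc(String... fieldsAndValues) {
        Map<String, String> doc = new HashMap<>();
        for (int i = 0; i + 1 < fieldsAndValues.length; i += 2) {
            doc.put(fieldsAndValues[i], fieldsAndValues[i + 1]);
        }
        return doc;
    }

    public static void main(String[] args) {
        System.out.println(exampleQuery(doc("sex", "female", "address", "Emei Mountain", "sect", "Emei")));       // true
        System.out.println(exampleQuery(doc("sex", "female", "address", "Guangming Peak", "sect", "Mingjiao")));  // true
        System.out.println(exampleQuery(doc("sex", "male", "address", "Guangming Peak", "sect", "Mingjiao")));    // false
        System.out.println(exampleQuery(doc("sex", "female", "address", "Wudang Mountain", "sect", "Wudang")));   // false
    }
}
```

Raising minimumShouldMatch to 2 in exampleQuery would require both the address and the sect to match, which turns the should group from an OR into an AND.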
Finally, here is a more complex example that combines the bool clauses:
select * from persons where sex = '女' and age between 30 and 40 and sect != '明教' and (address = '峨眉山' OR skill = '暗器')
The same SQL expressed as an Elasticsearch query:
GET /person/_search
{
  "query": {
    "bool": {
      "must": [
        { "term": { "sex": { "value": "女", "boost": 1.0 } } },
        { "range": { "age": { "from": 30, "to": 40, "include_lower": true, "include_upper": true, "boost": 1.0 } } }
      ],
      "must_not": [
        { "term": { "sect.keyword": { "value": "明教", "boost": 1.0 } } }
      ],
      "should": [
        { "term": { "address.keyword": { "value": "峨眉山", "boost": 1.0 } } },
        { "term": { "skill.keyword": { "value": "暗器", "boost": 1.0 } } }
      ],
      "adjust_pure_negative": true,
      "minimum_should_match": "1",
      "boost": 1.0
    }
  }
}
Building this query in Java:
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
// Build the query
BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery()
        .must(QueryBuilders.termQuery("sex", "女"))
        .must(QueryBuilders.rangeQuery("age").gte(30).lte(40))
        .mustNot(QueryBuilders.termQuery("sect.keyword", "明教"))
        .should(QueryBuilders.termQuery("address.keyword", "峨眉山"))
        .should(QueryBuilders.termQuery("skill.keyword", "暗器"))
        .minimumShouldMatch(1); // how many should clauses must match
// Attach the BoolQueryBuilder to the SearchSourceBuilder
searchSourceBuilder.query(boolQueryBuilder);
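3.2 Filter query
The difference between query and filter: in query context, ES checks whether each document matches and also computes a relevance score before returning the results. In filter context, ES only decides whether each document matches. It computes no score and can cache the results of frequently used filters, so filtering is more efficient.
filter can be used in several ways; the examples below demonstrate them.
Option 1, used on its own:
{
  "query": {
    "bool": {
      "filter": [
        { "term": { "sex": { "value": "男", "boost": 1.0 } } }
      ],
      "adjust_pure_negative": true,
      "boost": 1.0
    }
  }
}
Used on its own, filter behaves much like must, except that filter computes no score and is therefore more efficient.
Building the query in Java:
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
// Build the query
searchSourceBuilder.query(QueryBuilders.boolQuery()
        .filter(QueryBuilders.termQuery("sex", "男"))
);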
Option 2, at the same level as must and must_not, which is comparable to a subquery:
select * from (select * from persons where sect = '明教') a where sex = '女';
ES query:
{
  "query": {
    "bool": {
      "must": [
        { "term": { "sect.keyword": { "value": "明教", "boost": 1.0 } } }
      ],
      "filter": [
        { "term": { "sex": { "value": "女", "boost": 1.0 } } }
      ],
      "adjust_pure_negative": true,
      "boost": 1.0
    }
  }
}
Java:
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
// Build the query
searchSourceBuilder.query(QueryBuilders.boolQuery()
        .must(QueryBuilders.termQuery("sect.keyword", "明教"))
        .filter(QueryBuilders.termQuery("sex", "女"))
);
Option 3, with must and must_not nested inside filter. This is the most commonly used form:
{
  "query": {
    "bool": {
      "filter": [
        {
          "bool": {
            "must": [
              { "term": { "sect.keyword": { "value": "明教", "boost": 1.0 } } },
              { "range": { "age": { "from": 20, "to": 35, "include_lower": true, "include_upper": true, "boost": 1.0 } } }
            ],
            "must_not": [
              { "term": { "sex.keyword": { "value": "女", "boost": 1.0 } } }
            ],
            "adjust_pure_negative": true,
            "boost": 1.0
          }
        }
      ],
      "adjust_pure_negative": true,
      "boost": 1.0
    }
  }
}
Java:
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
// Build the query
searchSourceBuilder.query(QueryBuilders.boolQuery()
        .filter(QueryBuilders.boolQuery()
                .must(QueryBuilders.termQuery("sect.keyword", "明教"))
                .must(QueryBuilders.rangeQuery("age").gte(20).lte(35))
                .mustNot(QueryBuilders.termQuery("sex.keyword", "女")))
);
04 Aggregation queries
Next, a few examples demonstrate ES aggregation queries.
4.1 Max/min, average, and sum
Example: query the maximum, minimum, and average age.
SQL:
select max(age) from persons;
ES:
GET /person/_search
{
  "aggregations": {
    "max_age": {
      "max": { "field": "age" }
    }
  }
}
Java:
@Autowired
private RestHighLevelClient client;

@Test
public void maxQueryTest() throws IOException {
    // Aggregation definition
    AggregationBuilder aggBuilder = AggregationBuilders.max("max_age").field("age");
    SearchRequest searchRequest = new SearchRequest("person");
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    // Attach the aggregation to the SearchSourceBuilder
    searchSourceBuilder.aggregation(aggBuilder);
    System.out.println("searchSourceBuilder----->" + searchSourceBuilder);
    searchRequest.source(searchSourceBuilder);
    // Run the query and get the SearchResponse
    SearchResponse response = client.search(searchRequest, RequestOptions.DEFAULT);
    System.out.println(JSONObject.toJSON(response));
}
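An aggregation request still returns matching documents alongside the aggregation result, 10 by default (although what we care about here is the aggregation, not the documents). The number of documents returned can be set explicitly:
GET /person/_search
{
  "size": 20,
  "aggregations": {
    "max_age": {
      "max": { "field": "age" }
    }
  }
}
In Java, this takes just one extra statement:
searchSourceBuilder.size(20);
The other metric aggregations are as simple as max. Give each one its own name, because aggregation names must be unique within a request:
AggregationBuilder minBuilder = AggregationBuilders.min("min_age").field("age");
AggregationBuilder avgBuilder = AggregationBuilders.avg("avg_age").field("age");
AggregationBuilder sumBuilder = AggregationBuilders.sum("sum_age").field("age");
AggregationBuilder countBuilder = AggregationBuilders.count("count_age").field("age");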
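As a rough picture of what these five metric aggregations compute, the plain-Java sketch below derives the same values from an in-memory list of ages. The ages and the class name are made up for illustration; nothing here talks to Elasticsearch.

```java
import java.util.IntSummaryStatistics;
import java.util.stream.IntStream;

public class AgeMetricsSketch {

    // Computes max, min, average, sum, and count over the given ages in one pass.
    static IntSummaryStatistics summarize(int... ages) {
        return IntStream.of(ages).summaryStatistics();
    }

    public static void main(String[] args) {
        IntSummaryStatistics stats = summarize(18, 21, 50); // made-up ages
        System.out.println("max_age   = " + stats.getMax());     // 50
        System.out.println("min_age   = " + stats.getMin());     // 18
        System.out.println("avg_age   = " + stats.getAverage());
        System.out.println("sum_age   = " + stats.getSum());     // 89
        System.out.println("count_age = " + stats.getCount());   // 3
    }
}
```

Elasticsearch also offers a stats aggregation that returns all five values in a single aggregation, which may be more convenient than five separate builders.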
4.2 Distinct count
Example: find out how many sects there are in total.
SQL:
select count(distinct sect) from persons;
ES:
{
  "aggregations": {
    "sect_count": {
      "cardinality": { "field": "sect.keyword" }
    }
  }
}
Java:
@Test
public void cardinalityQueryTest() throws IOException {
    // Create a request for the index
    SearchRequest searchRequest = new SearchRequest("person");
    // Query definition
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    // Aggregation definition
    AggregationBuilder aggBuilder = AggregationBuilders.cardinality("sect_count").field("sect.keyword");
    // Return no documents, only the aggregation result
    searchSourceBuilder.size(0);
    // Attach the aggregation to the query definition
    searchSourceBuilder.aggregation(aggBuilder);
    System.out.println("searchSourceBuilder----->" + searchSourceBuilder);
    searchRequest.source(searchSourceBuilder);
    // Run the query and get the result
    SearchResponse response = client.search(searchRequest, RequestOptions.DEFAULT);
    System.out.println(JSONObject.toJSON(response));
}
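For comparison, an exact distinct count over in-memory values is a one-liner in plain Java. The sketch below is only a reference point with made-up, romanized sect names and a hypothetical class name. Keep in mind that the cardinality aggregation is an approximate count (it is based on the HyperLogLog++ algorithm), so on large, high-cardinality fields it can differ slightly from an exact count like this one.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;

public class DistinctSectSketch {

    // Exact equivalent of count(distinct ...): put the values in a set and count them.
    static int countDistinct(List<String> values) {
        return new HashSet<>(values).size();
    }

    public static void main(String[] args) {
        List<String> sects = Arrays.asList("Mingjiao", "Wudang", "Mingjiao", "Emei", "Wudang");
        System.out.println(countDistinct(sects)); // 3
    }
}
```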
4.3 Grouped aggregation
4.3.1 Grouping by one field
Example: count the members of each sect.
SQL:
select sect,count(id) from mytest.persons group by sect;
ES:
{
  "size": 0,
  "aggregations": {
    "sect_count": {
      "terms": {
        "field": "sect.keyword",
        "size": 10,
        "min_doc_count": 1,
        "shard_min_doc_count": 0,
        "show_term_doc_count_error": false,
        "order": [
          { "_count": "desc" },
          { "_key": "asc" }
        ]
      }
    }
  }
}
Java:
SearchRequest searchRequest = new SearchRequest("person");
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
searchSourceBuilder.size(0);
// Group by sect
AggregationBuilder aggBuilder = AggregationBuilders.terms("sect_count").field("sect.keyword");
searchSourceBuilder.aggregation(aggBuilder);
4.3.2 Grouping by several fields
Example: count the men and women in each sect.
SQL:
select sect,sex,count(id) from mytest.persons group by sect,sex;
ES:
{
  "aggregations": {
    "sect_count": {
      "terms": { "field": "sect.keyword", "size": 10 },
      "aggregations": {
        "sex_count": {
          "terms": { "field": "sex.keyword", "size": 10 }
        }
      }
    }
  }
}
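The nested shape of this result, one bucket per sect with one sub-bucket per sex inside it, can be pictured with a plain-Java sketch that groups in-memory rows the same way. The class name and the romanized sample rows are assumptions made for illustration. With the High Level REST Client, the corresponding request would presumably be built by attaching the inner terms aggregation to the outer one through subAggregation.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class SectSexGroupingSketch {

    // Groups {sect, sex} rows into sect -> (sex -> count), mirroring a terms
    // aggregation on sect with a nested terms aggregation on sex.
    static Map<String, Map<String, Long>> countBySectAndSex(List<String[]> people) {
        Map<String, Map<String, Long>> result = new TreeMap<>();
        for (String[] person : people) {
            result.computeIfAbsent(person[0], sect -> new TreeMap<>())
                  .merge(person[1], 1L, Long::sum);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String[]> people = Arrays.asList(
                new String[] {"Mingjiao", "male"},
                new String[] {"Mingjiao", "female"},
                new String[] {"Mingjiao", "male"},
                new String[] {"Wudang", "male"});
        // Prints {Mingjiao={female=1, male=2}, Wudang={male=1}}
        System.out.println(countBySectAndSex(people));
    }
}
```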
4.4 Filtered aggregation
All the aggregation examples so far omitted query, so each request consisted of an aggregation only and ran over all the data. In practice we often aggregate over a specific subset of the data, as in the next example.
Example: find the maximum age within 明教. This combines an aggregation with a conditional query.
SQL:
select max(age) from mytest.persons where sect = '明教';
ES:
GET /person/_search
{
  "query": {
    "term": {
      "sect.keyword": { "value": "明教", "boost": 1.0 }
    }
  },
  "aggregations": {
    "max_age": {
      "max": { "field": "age" }
    }
  }
}
Java:
SearchRequest searchRequest = new SearchRequest("person");
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
// Aggregation definition
AggregationBuilder maxBuilder = AggregationBuilders.max("max_age").field("age");
// Equality query
searchSourceBuilder.query(QueryBuilders.termQuery("sect.keyword", "明教"));
searchSourceBuilder.aggregation(maxBuilder);
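There are also more complex examples.
Example: count how many people fall into each of the age ranges 0-20, 21-40, 41-60, and 61 and above.
SQL:
select sum(case when age<=20 then 1 else 0 end) ageGroup1, sum(case when age >20 and age <=40 then 1 else 0 end) ageGroup2, sum(case when age >40 and age <=60 then 1 else 0 end) ageGroup3, sum(case when age >60 and age <=200 then 1 else 0 end) ageGroup4 from mytest.persons;
ES:
{
  "size": 0,
  "aggregations": {
    "age_avg": {
      "range": {
        "field": "age",
        "ranges": [
          { "from": 0.0, "to": 20.0 },
          { "from": 21.0, "to": 40.0 },
          { "from": 41.0, "to": 60.0 },
          { "from": 61.0, "to": 200.0 }
        ],
        "keyed": false
      }
    }
  }
}
Note that in a range aggregation, from is inclusive and to is exclusive. With the ranges as written, a person aged exactly 20, 40, or 60 lands in no bucket; ranges of 0-21, 21-41, 41-61, and 61-201 would match the SQL exactly.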
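The bucket boundaries of a range aggregation can be checked with a small plain-Java sketch, assuming the documented convention that each range includes its from value and excludes its to value. The class name, the helper, and the ages are made up for illustration. The sketch counts the same ages once with the ranges used in this example and once with each upper bound shifted up by one.

```java
import java.util.Arrays;

public class AgeRangeBucketsSketch {

    // Counts ages per range, treating each {from, to} pair as half-open:
    // from is inclusive, to is exclusive. Hypothetical helper, not an ES API.
    static long[] countPerRange(int[] ages, double[][] ranges) {
        long[] counts = new long[ranges.length];
        for (int age : ages) {
            for (int i = 0; i < ranges.length; i++) {
                if (age >= ranges[i][0] && age < ranges[i][1]) {
                    counts[i]++;
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] ages = {18, 20, 21, 40, 45, 60, 70}; // made-up ages
        double[][] asWritten = {{0, 20}, {21, 40}, {41, 60}, {61, 200}};
        double[][] shifted = {{0, 21}, {21, 41}, {41, 61}, {61, 201}};
        System.out.println(Arrays.toString(countPerRange(ages, asWritten))); // [1, 1, 1, 1]
        System.out.println(Arrays.toString(countPerRange(ages, shifted)));   // [2, 2, 2, 1]
    }
}
```

With the first set of ranges, the ages 20, 40, and 60 are counted nowhere; with the shifted set, all seven ages are counted.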
Query result:
"aggregations" : {
  "age_avg" : {
    "buckets" : [
      { "key" : "0.0-20.0", "from" : 0.0, "to" : 20.0, "doc_count" : 3 },
      { "key" : "21.0-40.0", "from" : 21.0, "to" : 40.0, "doc_count" : 13 },
      { "key" : "41.0-60.0", "from" : 41.0, "to" : 60.0, "doc_count" : 4 },
      { "key" : "61.0-200.0", "from" : 61.0, "to" : 200.0, "doc_count" : 1 }
    ]
  }
}