You know, for search! Official website guide, Java API, Chinese word segmentation (GitHub), Linux ES installation notes
ES data format
- Index -> similar to a database
- Document (_doc) -> the basic unit of stored data, for example one user's record stored in the user index
- Type (_type) -> the document type; for user data, the type would be user
- /class/student/1 -> the class index, the student document type (in 7.x the default type is _doc), and id 1 for student No. 1
- Document-oriented: data objects are stored directly. Indexing a student into the class index stores the whole student object as one document
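To make the /index/type/id addressing above concrete, here is a minimal plain-Java sketch that splits such a path into its three parts. It does not use the Elasticsearch client; the class name `DocPathSketch` and its `parse` method are made up for illustration.

```java
// Toy illustration only: splits a document path of the form /index/type/id.
// DocPathSketch is a hypothetical helper, not part of any Elasticsearch API.
final class DocPathSketch {
    final String index;
    final String type;
    final String id;

    DocPathSketch(String index, String type, String id) {
        this.index = index;
        this.type = type;
        this.id = id;
    }

    static DocPathSketch parse(String path) {
        // "/class/_doc/1" splits into ["", "class", "_doc", "1"]
        String[] parts = path.split("/");
        if (parts.length != 4 || !parts[0].isEmpty()) {
            throw new IllegalArgumentException("expected /index/type/id, got: " + path);
        }
        return new DocPathSketch(parts[1], parts[2], parts[3]);
    }

    public static void main(String[] args) {
        DocPathSketch p = parse("/class/_doc/1");
        System.out.println("index=" + p.index + ", type=" + p.type + ", id=" + p.id);
    }
}
```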
Data types - introduction link
- String: text is generally used for full-text search, and the field value is segmented (analyzed); keyword is indexed as a single term, and the field value is not segmented
- Number: long, integer, short, byte, double, float, half_float, scaled_float
- Boolean: true, false
- Binary: a Base64-encoded string
- Range: long_range, double_range, date_range, and ip_range
- Geo: geo_point stores a latitude/longitude pair
- IP: stores IP addresses
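The difference between text and keyword in the list above can be sketched in plain Java. This is a toy, not Elasticsearch code: `analyzeText` is a hypothetical stand-in for an analyzer (lowercase, then split on non-alphanumeric characters), and real analyzers such as ik_max_word do considerably more.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Toy illustration only: shows which terms would end up in the index.
final class FieldTypeSketch {

    // Hypothetical stand-in for a text analyzer: lowercase, split on non-alphanumerics.
    static List<String> analyzeText(String value) {
        List<String> terms = new ArrayList<>();
        for (String token : value.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                terms.add(token);
            }
        }
        return terms;
    }

    // A keyword field is indexed as one single, unchanged term.
    static List<String> analyzeKeyword(String value) {
        return Collections.singletonList(value);
    }

    public static void main(String[] args) {
        System.out.println(analyzeText("Doing sports with code"));
        System.out.println(analyzeKeyword("Doing sports with code"));
    }
}
```

With this split, a search for "code" can match the text version of the field, while the keyword version only matches the complete original string.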
// Create an index and specify its data structure
PUT /class
{
  "settings": {
    "number_of_shards": 1,      // Number of primary shards
    "number_of_replicas": 1,    // Number of replicas
    "index.analysis.analyzer.default.type": "ik_max_word"  // Default analyzer
  },
  "mappings": {                 // Mapping of document fields to their properties
    "properties": {
      "name": {                 // name field
        "type": "text",             // text fields are segmented and their terms go into the inverted index
        "analyzer": "ik_max_word",  // Chinese IK analyzer, finest-grained split
        "index": true,              // false means the field cannot be used as a search condition
        "store": false              // Whether to store the field value separately from _source
      },
      "english_name": {
        "type": "text",
        "analyzer": "english"       // English analyzer
      },
      "sex": {
        "type": "keyword"           // keyword type, not segmented
      },
      "age": {
        "type": "integer"           // integer type
      },
      "birthday": {
        "type": "date",             // date type
        "format": "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd"
      },
      "interests": {            // Hobbies; array data is stored with the same mapping
        "type": "text",
        "analyzer": "ik_smart"      // Chinese IK analyzer, coarsest-grained split
      }
    }
  }
}
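The birthday mapping above lists two formats separated by `||`, meaning a value may match either one. The idea of trying each pattern in turn can be sketched with java.time. This is only an illustration of the fallback logic under that assumption, not how Elasticsearch parses dates internally; `BirthdayFormatSketch` and `parseBirthday` are hypothetical names.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

// Toy illustration only: try "yyyy-MM-dd HH:mm:ss" first, then fall back to "yyyy-MM-dd".
final class BirthdayFormatSketch {
    private static final DateTimeFormatter DATE_TIME = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");
    private static final DateTimeFormatter DATE_ONLY = DateTimeFormatter.ofPattern("yyyy-MM-dd");

    static LocalDateTime parseBirthday(String value) {
        try {
            return LocalDateTime.parse(value, DATE_TIME);
        } catch (DateTimeParseException e) {
            // Second pattern: a date without a time is taken as midnight.
            return LocalDate.parse(value, DATE_ONLY).atStartOfDay();
        }
    }

    public static void main(String[] args) {
        System.out.println(parseBirthday("2003-05-11"));
        System.out.println(parseBirthday("2003-05-11 08:30:00"));
    }
}
```

A value that matches neither pattern throws a DateTimeParseException here; with the mapping above, Elasticsearch would likewise reject a document whose birthday fits neither format.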
Query index
GET /megacorp
Basic description
- Elasticsearch is written in Java and uses Lucene internally for indexing and search. Its goal is to make full-text search simple: it hides Lucene's complexity behind a simple, consistent RESTful API
- Data is operated on through requests, and the Java client for ES wraps each kind of request in its own class
- For example, creating an index:
// CreateIndexRequest -> the request to create an index
CreateIndexRequest request = new CreateIndexRequest(index);
// CreateIndexResponse -> the response to that request
CreateIndexResponse response = restHighLevelClient.indices().create(request, RequestOptions.DEFAULT);
flag = response.isAcknowledged();
Features include:
- Distributed real-time document storage in which every field can be indexed and searched
- A distributed real-time analytics and search engine
- Scales out to hundreds of server nodes and supports petabytes of structured or unstructured data
CRUD of data
Add data 👇
// In 7.x the default document type is _doc
PUT /class/_doc/1
{
  "name" : "He decimal",
  "english_name" : "hexiaoshu",
  "sex": "male",
  "age" : 18,
  "birthday" : "2003-05-11",
  "interests": "Doing sports with code"
}
When no mapping has been defined in advance, you can index a document directly: the index is created automatically and the field types are mapped dynamically, e.g. 👇
// megacorp = index name, employee = type name, 1 = employee id; field types are mapped dynamically
// (custom type names such as employee are deprecated in 7.x)
PUT /megacorp/employee/1
{
  "first_name" : "John",
  "last_name" : "Smith",
  "age" : 25,
  "about" : "I love to go rock climbing",
  "interests": [ "sports", "music" ]
}
Modify data 👇
// Modify the data of student No. 1 and change the age to 20
// (PUT with an existing id replaces the whole document)
PUT /class/_doc/1
{
  "name" : "He decimal",
  "english_name" : "hexiaoshu",
  "sex": "male",
  "age" : 20,
  "birthday" : "2003-05-11",
  "interests": "Doing sports with code"
}
// Response
{
  "_index" : "class",      // Index name
  "_type" : "_doc",        // Document type
  "_id" : "1",             // Document id
  "_version" : 2,          // Document version, incremented on every write
  "result" : "updated",    // Outcome of the operation
  "_shards" : {            // Shard copies involved in the write
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 1,
  "_primary_term" : 1
}
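The `_version` and `result` fields in the response above follow a simple rule: the first write to an id creates the document at version 1, and every later write to the same id replaces it and bumps the version. A minimal in-memory sketch of that rule follows. It is a toy model, not Elasticsearch code; `VersionedStoreSketch` and `PutResult` are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

// Toy illustration only: models "created"/"updated" and the version counter for PUT by id.
final class VersionedStoreSketch {

    static final class PutResult {
        final String result;
        final long version;

        PutResult(String result, long version) {
            this.result = result;
            this.version = version;
        }
    }

    private final Map<String, String> docs = new HashMap<>();
    private final Map<String, Long> versions = new HashMap<>();

    // Stores the whole document under the id, replacing any previous content.
    PutResult put(String id, String json) {
        String result = docs.containsKey(id) ? "updated" : "created";
        long next = versions.getOrDefault(id, 0L) + 1;
        docs.put(id, json);
        versions.put(id, next);
        return new PutResult(result, next);
    }

    public static void main(String[] args) {
        VersionedStoreSketch store = new VersionedStoreSketch();
        PutResult first = store.put("1", "{\"age\": 18}");
        PutResult second = store.put("1", "{\"age\": 20}");
        System.out.println(first.result + " v" + first.version);
        System.out.println(second.result + " v" + second.version);
    }
}
```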
Query data: 👇
// Query all documents with _search
GET /class/_search
// Response
{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "class",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "name" : "He decimal",
          "english_name" : "hexiaoshu",
          "sex" : "male",
          "age" : 20,
          "birthday" : "2003-05-11",
          "interests" : "Doing sports with code"
        }
      },
      {
        "_index" : "class",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 1.0,
        "_source" : {
          "name" : "Thousands of",
          "english_name" : "qianqian",
          "sex" : "female",
          "age" : 18,
          "birthday" : "2005-05-11",
          "interests" : "Painting and eating fruit"
        }
      }
    ]
  }
}
// Query the single document with id 1
GET /class/_doc/1
// Response
{
  "_index" : "class",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 2,
  "_seq_no" : 1,
  "_primary_term" : 1,
  "found" : true,
  "_source" : {
    "name" : "He decimal",
    "english_name" : "hexiaoshu",
    "sex" : "male",
    "age" : 20,
    "birthday" : "2003-05-11",
    "interests" : "Doing sports with code"
  }
}
// Single-condition match query
GET /class/_search
{
  "query": {
    "match": {
      "sex": "male"
    }
  }
}
// If the queried field is of type text, the query string is segmented with the analyzer configured for that field
GET /class/_search
{
  "query": {
    "match": {
      "interests": "Code"
    }
  }
}
// Multi-condition query: sex is male and age is at least 18. The range filter supports gt/gte and lt/lte
// should can be used in place of must; note that when a must or filter clause is present,
// should clauses are optional and only affect scoring unless minimum_should_match is set
GET /class/_search
{
  "query": {
    "bool": {
      "must": {
        "match": {
          "sex": "male"
        }
      },
      "filter": {          // Filter clauses do not affect the score and can be cached
        "range": {
          "age": {
            "gte": 18
          }
        }
      }
    }
  }
}
Filter context and query context
- A filter (non-scoring) query only checks whether a document is included or excluded, which makes it very fast. The results of frequently used non-scoring queries are cached in memory for quick reuse
- A scoring query has to find the matching documents and also compute how relevant each one is, which makes it more expensive than a non-scoring query. Its results are also not cached
- Thanks to the inverted index, a simple scoring query that matches only a few documents may perform as well as, or even better than, a filter that covers millions of documents. In general, though, a filter outperforms a scoring query, and its performance is consistent from run to run
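The two contexts described above can be contrasted with a toy sketch in plain Java: the filter only answers yes or no per document, while the scoring query also computes a number per match. The score used here (how often the term occurs) is a deliberate simplification chosen for illustration; Elasticsearch's default relevance model is BM25. `ContextSketch` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Toy illustration only: filter context (include/exclude) vs. query context (scored matches).
final class ContextSketch {

    // Filter context: a document either passes or not, no score is computed.
    static List<Integer> filterByMinAge(int[] ages, int minAge) {
        List<Integer> ids = new ArrayList<>();
        for (int i = 0; i < ages.length; i++) {
            if (ages[i] >= minAge) {
                ids.add(i);
            }
        }
        return ids;
    }

    // Query context: every match gets a score. Simplified score = occurrences of the term.
    static Map<Integer, Integer> scoreByTerm(String[] texts, String term) {
        Map<Integer, Integer> scores = new LinkedHashMap<>();
        String needle = term.toLowerCase(Locale.ROOT);
        for (int i = 0; i < texts.length; i++) {
            int count = 0;
            for (String token : texts[i].toLowerCase(Locale.ROOT).split("\\s+")) {
                if (token.equals(needle)) {
                    count++;
                }
            }
            if (count > 0) {
                scores.put(i, count);
            }
        }
        return scores;
    }

    public static void main(String[] args) {
        System.out.println(filterByMinAge(new int[] {20, 18, 16}, 18));
        System.out.println(scoreByTerm(new String[] {"code code sports", "painting"}, "Code"));
    }
}
```

Because the filter result is a plain set of document ids, it is easy to cache and reuse; the scored result depends on the query text and is recomputed each time.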
// multi_match runs the same query against several fields
GET /class/_search
{
  "query": {
    "multi_match": {
      "query": "Small",
      "fields": ["interests", "name"]
    }
  }
}
// Date of birth is May 11, 2003
GET /class/_search
{
  "query": {
    "term": {
      "birthday": "2003-05-11"
    }
  }
}
// exists matches documents in which the field has a non-null value;
// for the opposite (field is null or missing), put exists under must_not
GET /class/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "exists": {
            "field": "sex"
          }
        }
      ]
    }
  }
}
Results are sorted in the order you specify. By default they are ranked by _score
// A query followed by sort: sorted by birthday
GET /class/_search
{
  "query": {
    "bool": {
      "filter": {
        "range": {
          "age": {
            "gte": 18
          }
        }
      }
    }
  },
  "sort": [
    { "birthday": { "order": "desc" } }
  ]
}
// Multi-level sorting: by date first, then by _score for equal dates
"sort": [
  { "date":   { "order": "desc" } },
  { "_score": { "order": "desc" } }
]
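Multi-level sorting works like a chained comparator: the second key is only consulted when the first key is equal. A minimal plain-Java sketch of that ordering follows, using birthday as the first key and a score as the second. The `Hit` class and the sample values are made up for illustration and are not the client's SearchHit.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Toy illustration only: sort by birthday descending, then by score descending.
final class SortSketch {

    static final class Hit {
        final String id;
        final String birthday; // yyyy-MM-dd, so string order equals date order
        final double score;

        Hit(String id, String birthday, double score) {
            this.id = id;
            this.birthday = birthday;
            this.score = score;
        }
    }

    static List<String> sortedIds(List<Hit> hits) {
        Comparator<Hit> byBirthdayDesc = Comparator.comparing((Hit h) -> h.birthday).reversed();
        Comparator<Hit> byScoreDesc = Comparator.comparingDouble((Hit h) -> h.score).reversed();

        List<Hit> copy = new ArrayList<>(hits);
        copy.sort(byBirthdayDesc.thenComparing(byScoreDesc));

        List<String> ids = new ArrayList<>();
        for (Hit hit : copy) {
            ids.add(hit.id);
        }
        return ids;
    }

    public static void main(String[] args) {
        List<Hit> hits = Arrays.asList(
                new Hit("1", "2003-05-11", 1.0),
                new Hit("2", "2005-05-11", 0.5),
                new Hit("3", "2005-05-11", 2.0));
        System.out.println(sortedIds(hits));
    }
}
```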
Paging query
// Append from and size to the query. Similar to LIMIT in MySQL
GET /class/_search
{
  "query": {
    "match": {
      "interests": "Code"
    }
  },
  "from" : 0,
  "size" : 10
}
// Deep paging with scroll: scroll=3m keeps the search context of this query alive for 3 minutes
// -- The scroll works on a snapshot of the results taken at the initial request. Each following batch of
// -- size hits is fetched with the returned _scroll_id instead of from, which avoids the growing cost of deep from/size paging
GET /class/_search?scroll=3m
{
  "query": {
    "match": {
      "interests": "Code"
    }
  },
  "size" : 10
}
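Since from is an offset rather than a page number, a page-based API has to convert one into the other. A minimal sketch of that conversion follows, assuming 1-based page numbers. `PagingSketch` is a hypothetical helper; the 10,000 limit is the default value of the index.max_result_window setting, which from + size may not exceed.

```java
// Toy illustration only: converts a 1-based page number into the from offset.
final class PagingSketch {

    // Default of index.max_result_window: from + size may not exceed it.
    static final int DEFAULT_MAX_RESULT_WINDOW = 10_000;

    static int fromOffset(int page, int size) {
        if (page < 1 || size < 1) {
            throw new IllegalArgumentException("page and size must be at least 1");
        }
        return (page - 1) * size;
    }

    static boolean withinDefaultWindow(int from, int size) {
        return from + size <= DEFAULT_MAX_RESULT_WINDOW;
    }

    public static void main(String[] args) {
        int size = 10;
        int from = fromOffset(3, size);
        System.out.println("from=" + from + ", size=" + size
                + ", allowed=" + withinDefaultWindow(from, size));
    }
}
```

Requests that fall outside the window are the case where scroll (or a similar deep-paging mechanism) is the better fit.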
Integrating ES 7.10 with Spring Boot
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>2.4.1</version>
        <relativePath/> <!-- lookup parent from repository -->
    </parent>
    <groupId>com</groupId>
    <artifactId>es</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <name>es</name>
    <description>Demo project for Spring Boot</description>

    <properties>
        <java.version>1.8</java.version>
        <elasticsearch.version>7.10.0</elasticsearch.version>
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.elasticsearch.client</groupId>
            <artifactId>elasticsearch-rest-high-level-client</artifactId>
            <version>7.10.0</version>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-configuration-processor</artifactId>
            <optional>true</optional>
        </dependency>
        <dependency>
            <groupId>org.projectlombok</groupId>
            <artifactId>lombok</artifactId>
            <optional>true</optional>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
            <scope>test</scope>
        </dependency>
        <!-- Web page parsing -->
        <dependency>
            <groupId>org.jsoup</groupId>
            <artifactId>jsoup</artifactId>
            <version>1.10.2</version>
        </dependency>
        <dependency>
            <groupId>com.alibaba</groupId>
            <artifactId>fastjson</artifactId>
            <version>1.2.75</version>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
                <configuration>
                    <excludes>
                        <exclude>
                            <groupId>org.projectlombok</groupId>
                            <artifactId>lombok</artifactId>
                        </exclude>
                        <exclude>
                            <groupId>org.elasticsearch</groupId>
                            <artifactId>elasticsearch</artifactId>
                            <version>7.9.3</version>
                        </exclude>
                    </excludes>
                </configuration>
            </plugin>
        </plugins>
    </build>
</project>
import org.apache.http.HttpHost;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

/**
 * @Description
 * @Author Hexiaoshu
 * @Date 2021/1/5
 * @modify
 */
@Configuration
public class ElasticSearchConfig {

    @Bean
    public RestHighLevelClient restHighLevelClient() {
        return new RestHighLevelClient(
                RestClient.builder(
                        // Replace the placeholder with the address of your ES service
                        new HttpHost("es Service address", 9200, "http")
                        //, new HttpHost("localhost", 9201, "http")
                ));
    }
}
EsUtil: common operations wrapped for ES 7.10
import lombok.extern.slf4j.Slf4j;
import org.elasticsearch.action.admin.indices.alias.get.GetAliasesRequest;
import org.elasticsearch.client.GetAliasesResponse;
import org.elasticsearch.client.indices.*;
import org.elasticsearch.action.admin.indices.delete.DeleteIndexRequest;
import org.elasticsearch.action.bulk.BulkRequest;
import org.elasticsearch.action.bulk.BulkResponse;
import org.elasticsearch.action.delete.DeleteRequest;
import org.elasticsearch.action.delete.DeleteResponse;
import org.elasticsearch.action.get.GetRequest;
import org.elasticsearch.action.get.GetResponse;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.action.index.IndexResponse;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.action.update.UpdateRequest;
import org.elasticsearch.action.update.UpdateResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.cluster.metadata.AliasMetadata;
import org.elasticsearch.cluster.metadata.MappingMetadata;
import org.elasticsearch.common.text.Text;
import org.elasticsearch.common.xcontent.XContentBuilder;
import org.elasticsearch.common.xcontent.XContentFactory;
import org.elasticsearch.common.xcontent.XContentType;
import org.elasticsearch.index.query.AbstractQueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.elasticsearch.search.fetch.subphase.highlight.HighlightBuilder;
import org.elasticsearch.search.fetch.subphase.highlight.HighlightField;
import org.elasticsearch.search.sort.SortOrder;
import org.springframework.stereotype.Component;

import javax.annotation.Resource;
import java.io.IOException;
import java.util.*;

/**
 * @Description ElasticSearch 7.10.0 tools
 * @Author Hexiaoshu
 * @Date 2021/1/5
 * @modify
 *
 * Note: JsonUtil is the author's own JSON helper and is not shown in this post.
 */
@Slf4j
@Component
public class EsUtil {

    @Resource
    private RestHighLevelClient restHighLevelClient;

    /**
     * Query the field mappings of an index
     * @param index Index name
     * @return Map<String, MappingMetadata>
     */
    public Map<String, MappingMetadata> mappingIndex(String index) {
        Map<String, MappingMetadata> mappings = null;
        try {
            GetMappingsRequest request = new GetMappingsRequest().indices(index);
            GetMappingsResponse response = restHighLevelClient.indices().getMapping(request, RequestOptions.DEFAULT);
            mappings = response.mappings();
        } catch (IOException e) {
            log.error("ES connection error", e);
        }
        return mappings;
    }

    /**
     * Query all indexes and their aliases
     * Map<index name, alias metadata>
     * @return Map<String, Set<AliasMetadata>>
     */
    public Map<String, Set<AliasMetadata>> searchIndex() {
        Map<String, Set<AliasMetadata>> aliases = null;
        try {
            GetAliasesRequest request = new GetAliasesRequest();
            GetAliasesResponse response = restHighLevelClient.indices().getAlias(request, RequestOptions.DEFAULT);
            aliases = response.getAliases();
        } catch (IOException e) {
            log.error("ES connection error", e);
        }
        return aliases;
    }

    /**
     * Create an index; field properties are mapped dynamically
     * @param index Index name
     * @return Boolean
     */
    public Boolean createIndex(String index) {
        boolean flag = false;
        try {
            CreateIndexRequest request = new CreateIndexRequest(index);
            CreateIndexResponse response = restHighLevelClient.indices().create(request, RequestOptions.DEFAULT);
            flag = response.isAcknowledged();
        } catch (IOException e) {
            log.error("ES connection error", e);
        }
        return flag;
    }

    /**
     * Create an index and specify its field properties
     * @param index  Index name
     * @param source Field properties, i.e. the content of mappings.properties
     * @return Boolean
     */
    public Boolean createIndex(String index, Map<String, Map<String, Object>> source) {
        boolean flag = false;
        try {
            CreateIndexRequest request = new CreateIndexRequest(index);
            XContentBuilder builder = getxContentBuilder(source);
            request.mapping(builder);
            CreateIndexResponse response = restHighLevelClient.indices().create(request, RequestOptions.DEFAULT);
            flag = response.isAcknowledged();
        } catch (IOException e) {
            log.error("ES connection error", e);
        }
        return flag;
    }

    /**
     * Build the index field properties
     * @param source Field properties
     * @return XContentBuilder
     * Request JSON example
     * {
     *   "name":         {"type": "text", "analyzer": "ik_max_word"},
     *   "english_name": {"type": "text", "analyzer": "english"},
     *   "age":          {"type": "integer"},
     *   "height":       {"type": "double"},
     *   "hobby":        {"type": "text", "analyzer": "ik_max_word", "index": true, "store": false},
     *   "major":        {"type": "keyword"},
     *   "birthday":     {"type": "date", "format": "yyyy-MM-dd HH:mm:ss"},
     *   "sex":          {"type": "keyword"}
     * }
     * @throws IOException if the builder cannot write the content
     */
    private XContentBuilder getxContentBuilder(Map<String, Map<String, Object>> source) throws IOException {
        XContentBuilder builder = XContentFactory.jsonBuilder();
        builder.startObject();
        builder.startObject("properties");
        // Plain loops instead of lambdas, so that an IOException propagates instead of being swallowed
        for (Map.Entry<String, Map<String, Object>> field : source.entrySet()) {
            builder.startObject(field.getKey());
            for (Map.Entry<String, Object> attribute : field.getValue().entrySet()) {
                builder.field(attribute.getKey(), attribute.getValue());
            }
            builder.endObject();
        }
        builder.endObject();
        builder.endObject();
        return builder;
    }

    /**
     * Check whether an index exists
     * @param index Index name
     * @return Boolean
     */
    public Boolean existIndex(String index) {
        boolean flag = false;
        try {
            GetIndexRequest request = new GetIndexRequest(index);
            flag = restHighLevelClient.indices().exists(request, RequestOptions.DEFAULT);
        } catch (IOException e) {
            log.error("ES connection error", e);
        }
        return flag;
    }

    /**
     * Delete an index
     * @param index Index name
     * @return Boolean
     */
    public Boolean delIndex(String index) {
        boolean flag = false;
        try {
            DeleteIndexRequest request = new DeleteIndexRequest(index);
            flag = restHighLevelClient.indices().delete(request, RequestOptions.DEFAULT).isAcknowledged();
        } catch (IOException e) {
            log.error("ES connection error", e);
        }
        return flag;
    }

    /**
     * Add a document
     * @param index Index name
     * @param o     Object to store
     * @param id    Document id
     * @return true if the document was created
     */
    public Boolean addDoc(String index, Object o, String id) {
        boolean flag = false;
        IndexRequest request = new IndexRequest(index);
        request.id(id);
        request.source(JsonUtil.toStr(o), XContentType.JSON);
        try {
            IndexResponse response = restHighLevelClient.index(request, RequestOptions.DEFAULT);
            // 201 = created; writing over an existing id returns 200 and therefore false here
            flag = 201 == response.status().getStatus();
        } catch (IOException e) {
            log.error("ES connection error", e);
        }
        return flag;
    }

    /**
     * Add documents in bulk
     * @param index Index name
     * @param list  Objects to store
     * @return Boolean
     */
    public <T> Boolean addDocs(String index, List<T> list) {
        // Return false instead of null, so callers can safely unbox the result
        if (list == null) {
            return false;
        }
        boolean flag = false;
        BulkRequest request = new BulkRequest();
        list.forEach(e -> request.add(new IndexRequest(index).source(JsonUtil.toStr(e), XContentType.JSON)));
        try {
            BulkResponse responses = restHighLevelClient.bulk(request, RequestOptions.DEFAULT);
            flag = !responses.hasFailures();
        } catch (IOException e) {
            log.error("ES connection error", e);
        }
        return flag;
    }

    /**
     * Update a document
     * @param index Index name
     * @param o     Object holding the new field values
     * @param id    Document id
     * @return Boolean
     */
    public Boolean updateDoc(String index, Object o, String id) {
        boolean flag = false;
        UpdateRequest request = new UpdateRequest(index, id).doc(JsonUtil.toStr(o), XContentType.JSON);
        try {
            UpdateResponse response = restHighLevelClient.update(request, RequestOptions.DEFAULT);
            flag = 200 == response.status().getStatus();
        } catch (IOException e) {
            log.error("ES connection error", e);
        }
        return flag;
    }

    /**
     * Delete a document
     * @param index Index name
     * @param docId Document id
     * @return Boolean
     */
    public Boolean delDoc(String index, String docId) {
        boolean flag = false;
        DeleteRequest request = new DeleteRequest(index, docId);
        try {
            DeleteResponse response = restHighLevelClient.delete(request, RequestOptions.DEFAULT);
            flag = 200 == response.status().getStatus();
        } catch (IOException e) {
            log.error("ES connection error", e);
        }
        return flag;
    }

    /**
     * Get a document by id
     * @param index Index name
     * @param docId Document id
     * @return GetResponse
     */
    public GetResponse getDocById(String index, String docId) {
        GetResponse response = null;
        GetRequest request = new GetRequest(index, docId);
        try {
            response = restHighLevelClient.get(request, RequestOptions.DEFAULT);
        } catch (IOException e) {
            log.error("ES connection error", e);
        }
        return response;
    }

    /**
     * Exact (term) query on one field, with highlighting
     * @param index     Index name
     * @param field     Field to query
     * @param value     Value to match
     * @param sortField Field to sort by, or null
     * @param sort      "desc" for descending, anything else for ascending
     * @param page      Current page (1-based)
     * @param size      Number of hits per page
     * @return List<Map<String, Object>>
     */
    public List<Map<String, Object>> searchDocTerm(String index, String field, String value, String sortField, String sort, Integer page, Integer size) {
        HighlightBuilder highlightBuilder = new HighlightBuilder();
        highlightBuilder.field(field);
        highlightBuilder.requireFieldMatch(false);
        highlightBuilder.preTags("<span style='color:red'>");
        highlightBuilder.postTags("</span>");
        SearchRequest request = setQuery(index, QueryBuilders.termQuery(field, value), highlightBuilder, sortField, sort, page, size);
        return getData(request, field);
    }

    /**
     * Query all documents
     * @param index Index name
     * @param page  Current page (1-based)
     * @param size  Number of hits per page
     * @return List<Map<String, Object>>
     */
    public List<Map<String, Object>> searchDocAll(String index, Integer page, Integer size) {
        SearchRequest request = setQuery(index, QueryBuilders.matchAllQuery(), null, null, null, page, size);
        return getData(request, null);
    }

    /**
     * Run the search and collect the _source maps
     * @param request       SearchRequest
     * @param highlightName Highlighted field, or null
     * @return List<Map<String, Object>>
     */
    private List<Map<String, Object>> getData(SearchRequest request, String highlightName) {
        List<Map<String, Object>> list = new LinkedList<>();
        try {
            SearchResponse response = restHighLevelClient.search(request, RequestOptions.DEFAULT);
            for (SearchHit hit : response.getHits().getHits()) {
                Map<String, Object> sourceAsMap = hit.getSourceAsMap();
                if (highlightName != null) {
                    HighlightField highlightField = hit.getHighlightFields().get(highlightName);
                    if (highlightField != null) {
                        // Start from an empty string; starting from the field name would prefix it to the text
                        StringBuilder highlight = new StringBuilder();
                        for (Text fragment : highlightField.fragments()) {
                            highlight.append(fragment.string());
                        }
                        sourceAsMap.put(highlightName, highlight.toString());
                    }
                }
                list.add(sourceAsMap);
            }
        } catch (IOException e) {
            log.error("ES connection error", e);
        }
        return list;
    }

    /**
     * Build the search request
     * @param index        Index name
     * @param queryBuilder Query to run
     * @return SearchRequest
     */
    private SearchRequest setQuery(String index, AbstractQueryBuilder<?> queryBuilder, HighlightBuilder highlightBuilder, String sortField, String sort, Integer page, Integer size) {
        SearchRequest request = new SearchRequest(index);
        SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
        int pageSize = size == null ? 10 : size;
        int pageNo = (page == null || page < 1) ? 1 : page;
        // from is an offset, not a page number
        sourceBuilder.from((pageNo - 1) * pageSize);
        sourceBuilder.size(pageSize);
        sourceBuilder.query(queryBuilder);
        if (sortField != null) {
            sourceBuilder.sort(sortField, "desc".equals(sort) ? SortOrder.DESC : SortOrder.ASC);
        }
        if (highlightBuilder != null) {
            sourceBuilder.highlighter(highlightBuilder);
        }
        request.source(sourceBuilder);
        return request;
    }
}
@Resource
private EsUtil esUtil;

@PostMapping("/create_index")
public Result test(String index) {
    Boolean isSuccess = esUtil.createIndex(index);
    return Result.ok(isSuccess);
}
Next up is the main topic, search. For now, I'm slacking off!