Foreword
This article explains Elasticsearch (ES) data queries from 5 aspects:
1. Basic query (similar to where field1 = xxx and field2 like "%xxx%" in MySQL)
2. _source filtering/result filtering (similar to select field1, field2 from table_name in MySQL)
3. Advanced query (similar to where field1 = xxx and field2 like "%xxx%" in MySQL)
4. Filtering the matched rows (similar to where field1 = xxx and field2 like "%xxx%" in MySQL)
5. Sorting (similar to order by field desc/asc in MySQL)
Summary: start with the basic queries (match), then bool, range, and fuzzy, then filter, then _source filtering, and finally sort.
All es operation statements in this article: https://www.syjshare.com/res/5W547A7Z
1. Basic query
Basic syntax
GET /<index name>/_search { "query":{ "<query type>":{ "<query condition>":"<condition value>" } } }
Here query represents the query object, which can hold different kinds of queries.
- Query type: for example match_all, match, term, range, and so on.
- Query condition: written differently depending on the query type, as explained in detail below.
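As a rough illustration of this shape, the request body can be assembled as a plain Python dictionary. The helper name `build_query` is an assumption made for this sketch and is not part of any Elasticsearch client; it only produces the JSON body that would be sent to `GET /<index name>/_search`.

```python
import json


def build_query(query_type, condition):
    """Return a search body of the form {"query": {query_type: condition}}."""
    return {"query": {query_type: condition}}


# match_all takes an empty condition object
match_all_body = build_query("match_all", {})
# match takes {field: text}
match_body = build_query("match", {"title": "Xiaomi TV"})

print(json.dumps(match_all_body))
print(json.dumps(match_body))
```

Every query in the rest of this article follows the same nesting; only the query type and the condition object change.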
Sample data
1.1 Query all (match_all)
Example:
GET /myindex/_search { "query":{ "match_all": {} } }
- query: the query object
- match_all: matches all documents
result:
- took: time spent on the query, in milliseconds
- timed_out: whether the query timed out
- _shards: shard information
- hits: overview object for the search results
  - total: total number of documents matched
  - max_score: highest document score among all results
  - hits: array of matched documents, one element per document
    - _index: index the document belongs to
    - _type: document type
    - _id: document id
    - _score: document score
    - _source: source data of the document
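To make the nesting concrete, here is a minimal Python sketch that walks a response shaped like the ones shown in this article. The `summarize` helper is hypothetical, and the sketch assumes `hits.total` is a plain integer as in the responses printed here (newer Elasticsearch versions return an object for it instead).

```python
response = {
    "took": 2,
    "timed_out": False,
    "_shards": {"total": 3, "successful": 3, "skipped": 0, "failed": 0},
    "hits": {
        "total": 1,
        "max_score": 0.5753642,
        "hits": [
            {"_index": "heima", "_type": "goods", "_id": "3",
             "_score": 0.5753642,
             "_source": {"title": "Mi TV 4 A", "price": 3899}},
        ],
    },
}


def summarize(resp):
    """Pull the commonly used fields out of a search response."""
    overview = resp["hits"]  # outer hits: the overview object
    docs = [(h["_id"], h["_score"], h["_source"]) for h in overview["hits"]]
    return {
        "took_ms": resp["took"],
        "timed_out": resp["timed_out"],
        "total": overview["total"],
        "docs": docs,
    }


summary = summarize(response)
print(summary["total"], summary["docs"][0][2]["title"])
```

Note the two levels named hits: the outer one carries the totals, the inner one is the list of documents.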
1.2 Match query (match)
1. Insert test data
Let's add a piece of data first for testing:
PUT /myindex/goods/3 { "title":"Mi TV 4 A", "price":3899.00 } GET /myindex/_search { "query":{ "match_all": {} } }
The index now contains 2 phones and 1 TV.
2. match uses the or relationship by default
A match query first analyzes the query text (splits it into terms) and then searches; the resulting terms are combined with or.
GET /myindex/_search { "query":{ "match":{ "title":"Xiaomi TV" } } } or GET /myindex/_search { "query":{ "match":{ "title":{ "query": "Xiaomi TV" } } } } or GET /myindex/_search { "query":{ "match":{ "title":{ "query": "Xiaomi TV", "operator": "or" } } } }
result:
In the example above, a document matches as long as it contains any one of the terms produced from "Xiaomi TV". The terms are therefore combined with or.
3. match with operator set to and
In some cases we need more precise results and want the relationship between terms to be and. We can do this:
GET /myindex/_search { "query":{ "match": { "title": { "query": "Xiaomi TV", "operator": "and" } } } }
result:
{ "took": 2, "timed_out": false, "_shards": { "total": 3, "successful": 3, "skipped": 0, "failed": 0 }, "hits": { "total": 1, "max_score": 0.5753642, "hits": [ { "_index": "heima", "_type": "goods", "_id": "3", "_score": 0.5753642, "_source": { "title": "Mi TV 4 A", "price": 3899 } } ] } }
In this example, only documents that contain both terms, Xiaomi and TV, are returned.
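The difference between the two operators can be sketched in a few lines of Python. This is only a toy model: `tokenize` stands in for the real analyzer by lowercasing and splitting on whitespace, and the titles are hypothetical sample data rather than the contents of the index above.

```python
def tokenize(text):
    """Stand-in for an analyzer: lowercase and split on whitespace."""
    return text.lower().split()


def match(title, query, operator="or"):
    """Emulate match: any term must hit for "or", every term for "and"."""
    doc_terms = set(tokenize(title))
    hits = [term in doc_terms for term in tokenize(query)]
    return all(hits) if operator == "and" else any(hits)


titles = ["Xiaomi phone", "Xiaomi TV 4A"]  # hypothetical documents
or_result = [t for t in titles if match(t, "Xiaomi TV", "or")]
and_result = [t for t in titles if match(t, "Xiaomi TV", "and")]

print(or_result)
print(and_result)
```

With or, both titles qualify because each contains at least one query term; with and, only the title containing both terms remains.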
4. Between or and and
Choosing between or and and is a bit too black and white. Suppose the user's query is split into 5 terms and we want to find documents that contain only 4 of them. Setting operator to and would exclude such documents.
Sometimes that is exactly what we want, but in most full-text search use cases we want to include documents that are potentially relevant while excluding those that are less relevant. In other words, we want something in between.
The match query supports the minimum_should_match parameter, which specifies how many terms must match for a document to count as relevant. It can be set to a specific number, but a percentage is more common, since we cannot control how many words the user enters:
GET /myindex/_search { "query":{ "match":{ "title":{ "query":"xiaomi curved tv", "minimum_should_match": "75%" } } } }
In this example, the search text is split into 3 terms. With the and relationship, a document would have to contain all 3 terms. Here minimum_should_match is set to 75%, so a document only needs to match 75% of the terms; 3 * 75% = 2.25, which is rounded down to 2. A document that contains any 2 of the terms therefore satisfies the condition.
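The arithmetic can be checked with a small Python sketch. It handles only positive percentages such as "75%" and uses whitespace splitting in place of a real analyzer; the function names are assumptions made for illustration.

```python
def required_terms(term_count, minimum_should_match):
    """Terms that must match for a positive percentage, rounded down."""
    percent = int(minimum_should_match.rstrip("%"))
    return term_count * percent // 100


def is_relevant(title, query, minimum_should_match):
    """True if enough query terms appear in the title."""
    doc_terms = set(title.lower().split())
    query_terms = query.lower().split()
    matched = sum(1 for term in query_terms if term in doc_terms)
    return matched >= required_terms(len(query_terms), minimum_should_match)


print(required_terms(3, "75%"))  # 3 * 75% = 2.25, rounded down
print(is_relevant("Xiaomi TV 4A", "xiaomi curved tv", "75%"))
print(is_relevant("Xiaomi phone", "xiaomi curved tv", "75%"))
```

A title with two of the three terms passes, while a title with only one of them does not.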
1.3 Multi-field query (multi_match)
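multi_match is similar to match, except that it can search in multiple fields at once.
GET /myindex/_search { "query":{ "multi_match": { "query": "Xiaomi", "fields": [ "title", "price" ] } } }
In this example, the term Xiaomi is searched in both the title field and the price field. Because price is a numeric field, a text query such as Xiaomi may be rejected for that field unless the lenient option is enabled.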
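The idea of "match in any of several fields" can be sketched as follows. This toy version converts every field value to text and splits on whitespace, which is an assumption of the sketch and not how Elasticsearch treats numeric fields.

```python
def multi_match(doc, query, fields):
    """True if any query term appears in any of the listed fields."""
    query_terms = query.lower().split()
    for field in fields:
        field_terms = str(doc.get(field, "")).lower().split()
        if any(term in field_terms for term in query_terms):
            return True
    return False


phone = {"title": "Xiaomi phone", "price": 2699}
apple = {"title": "apple cell phone", "price": 6899}

print(multi_match(phone, "Xiaomi", ["title", "price"]))
print(multi_match(apple, "Xiaomi", ["title", "price"]))
```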
1.4 Exact value matching (term)
The term query matches exact values, which may be numbers, dates, booleans, or strings that are not analyzed (not split into terms).
GET /myindex/_search { "query":{ "term":{ "price":2699.00 } } }
result:
{ "took": 2, "timed_out": false, "_shards": { "total": 3, "successful": 3, "skipped": 0, "failed": 0 }, "hits": { "total": 1, "max_score": 1, "hits": [ { "_index": "heima", "_type": "goods", "_id": "r9c1KGMBIhaxtY5rlRKv", "_score": 1, "_source": { "title": "Xiaomi phone", "price": 2699 } } ] } }
1.5 Multi-term exact match (terms)
The terms query is the same as the term query, but it allows you to specify multiple values to match against. If the field contains any of the specified values, then the document satisfies the condition:
GET /myindex/_search { "query":{ "terms":{ "price":[2699.00,3899.00] } } }
result:
{ "took" : 1, "timed_out" : false, "_shards" : { "total" : 5, "successful" : 5, "skipped" : 0, "failed" : 0 }, "hits" : { "total" : 2, "max_score" : 1.0, "hits" : [ { "_index" : "myindex", "_type" : "goods", "_id" : "2", "_score" : 1.0, "_source" : { "title" : "Xiaomi phone", "price" : 2699 } }, { "_index" : "myindex", "_type" : "goods", "_id" : "3", "_score" : 1.0, "_source" : { "title" : "Mi TV 4 A", "price" : 3899.0 } } ] } }
One more example, as follows:
GET /myindex/_search { "query":{ "terms":{ "price":[2699.00,2899.00,3899.00] } } }
result:
{ "took": 4, "timed_out": false, "_shards": { "total": 3, "successful": 3, "skipped": 0, "failed": 0 }, "hits": { "total": 3, "max_score": 1, "hits": [ { "_index": "heima", "_type": "goods", "_id": "2", "_score": 1, "_source": { "title": "rice phone", "price": 2899 } }, { "_index": "heima", "_type": "goods", "_id": "r9c1KGMBIhaxtY5rlRKv", "_score": 1, "_source": { "title": "Xiaomi phone", "price": 2699 } }, { "_index": "heima", "_type": "goods", "_id": "3", "_score": 1, "_source": { "title": "Mi TV 4 A", "price": 3899 } } ] } }
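Conceptually, term and terms behave like an equality test and a set-membership test on the stored value. The sketch below replays the three documents from the result above in plain Python; the `term` and `terms` helpers are illustrative stand-ins, not client calls.

```python
docs = [
    {"_id": "2", "title": "rice phone", "price": 2899},
    {"_id": "r9c1KGMBIhaxtY5rlRKv", "title": "Xiaomi phone", "price": 2699},
    {"_id": "3", "title": "Mi TV 4 A", "price": 3899},
]


def terms(documents, field, values):
    """Keep documents whose field equals any of the given exact values."""
    wanted = set(values)
    return [d for d in documents if d.get(field) in wanted]


def term(documents, field, value):
    """A term query is the single-value case of terms."""
    return terms(documents, field, [value])


one = term(docs, "price", 2699.00)
two = terms(docs, "price", [2699.00, 3899.00])
three = terms(docs, "price", [2699.00, 2899.00, 3899.00])
print([d["title"] for d in two])
```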
2. Result filtering (_source): return only the fields you need
By default, Elasticsearch returns all fields stored in _source for every document in the search results.
If we only want some of the fields, we can add _source filtering.
2.1 Using the _source keyword
Example:
GET /myindex/_search { "_source": ["title","price"], "query": { "term": { "price": 2699 } } }
Returned result:
{ "took": 12, "timed_out": false, "_shards": { "total": 3, "successful": 3, "skipped": 0, "failed": 0 }, "hits": { "total": 1, "max_score": 1, "hits": [ { "_index": "heima", "_type": "goods", "_id": "r9c1KGMBIhaxtY5rlRKv", "_score": 1, "_source": { "price": 2699, "title": "Xiaomi phone" } } ] } }
2.2 Using includes and excludes
We can also use:
- includes: the fields to return
- excludes: the fields to leave out
Both are optional.
Example:
GET /myindex/_search { "_source": { "includes":["title"] }, "query": { "term": { "price": 2699 } } }
The result will be the same as:
GET /myindex/_search { "_source": { "excludes": ["price"] }, "query": { "term": { "price": 2699 } } }
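Why the two requests return the same result can be seen from a small Python model of _source filtering. The `filter_source` helper is hypothetical and supports only exact field names, not wildcard patterns.

```python
def filter_source(source, includes=None, excludes=None):
    """Apply includes first, then drop anything listed in excludes."""
    if includes is not None:
        source = {k: v for k, v in source.items() if k in includes}
    if excludes is not None:
        source = {k: v for k, v in source.items() if k not in excludes}
    return source


doc = {"title": "Xiaomi phone", "price": 2699}
only_title = filter_source(doc, includes=["title"])
without_price = filter_source(doc, excludes=["price"])

print(only_title)
print(without_price)
```

Since the document has only the two fields title and price, including title and excluding price leave the same single field.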
3. Advanced query
3.1 Boolean combination (bool)
A bool query combines other queries through must (and), must_not (not), and should (or):
GET /myindex/_search { "query":{ "bool":{ "must": { "match": { "title": "rice" }}, "must_not": { "match": { "title": "television" }}, "should": { "match": { "title": "cell phone" }} } } }
result:
{ "took": 10, "timed_out": false, "_shards": { "total": 3, "successful": 3, "skipped": 0, "failed": 0 }, "hits": { "total": 1, "max_score": 0.5753642, "hits": [ { "_index": "heima", "_type": "goods", "_id": "2", "_score": 0.5753642, "_source": { "title": "rice phone", "price": 2899 } } ] } }
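The roles of the three clauses can be modeled roughly in Python. In this sketch every must word is required, any must_not word excludes the document, and should words are optional and only raise a toy score; the whitespace tokenization and the score itself are simplifying assumptions, not Elasticsearch's relevance calculation.

```python
def contains(title, word):
    """True if word is one of the whitespace-separated terms of title."""
    return word.lower() in title.lower().split()


def bool_query(titles, must=(), must_not=(), should=()):
    """Return (title, should_hits) pairs, highest should_hits first."""
    results = []
    for title in titles:
        if not all(contains(title, w) for w in must):
            continue  # every must clause is required
        if any(contains(title, w) for w in must_not):
            continue  # any must_not clause excludes the document
        should_hits = sum(contains(title, w) for w in should)
        results.append((title, should_hits))
    return sorted(results, key=lambda r: r[1], reverse=True)


titles = ["rice phone", "Xiaomi phone", "Mi TV 4 A"]
result = bool_query(titles, must=["rice"], must_not=["tv"], should=["phone"])
print(result)
```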
3.2 Range query (range)
The range query finds documents whose number or date values fall within the specified range.
GET /myindex/_search { "query":{ "range": { "price": { "gte": 1000.0, "lt": 2800.00 } } } }
The range query accepts the following operators:
operator | description |
---|---|
gt | greater than |
gte | greater than or equal to |
lt | less than |
lte | less than or equal to |
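The four operators map directly onto ordinary comparisons, which the following Python sketch makes explicit. The `in_range` helper and the price list are assumptions for illustration; the bounds are the ones from the range query above.

```python
import operator

# Each range operator corresponds to one comparison function.
OPS = {
    "gt": operator.gt,
    "gte": operator.ge,
    "lt": operator.lt,
    "lte": operator.le,
}


def in_range(value, bounds):
    """True if value satisfies every bound, e.g. {"gte": 1000, "lt": 2800}."""
    return all(OPS[name](value, limit) for name, limit in bounds.items())


prices = [2699, 2899, 3899, 6899]
matched = [p for p in prices if in_range(p, {"gte": 1000.0, "lt": 2800.00})]
print(matched)
```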
3.3 Fuzzy query (fuzzy)
We add a new product:
POST /myindex/goods/4 { "title":"apple cell phone", "price":6899.00 }
The fuzzy query is the fuzzy equivalent of the term query. It lets the search term differ in spelling from the indexed term, as long as the edit distance between them is at most 2:
GET /myindex/_search { "query": { "fuzzy": { "title": "appla" } } }
The query above also finds the apple cell phone document.
We can set the maximum allowed edit distance with fuzziness:
GET /myindex/_search { "query": { "fuzzy": { "title": { "value":"appla", "fuzziness":1 } } } }
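To see why appla still matches apple with a fuzziness of 1, the edit distance can be computed by hand. The sketch below uses plain Levenshtein distance (insertions, deletions, substitutions); it does not treat a swap of two adjacent characters as a single edit, which is a simplification compared with Elasticsearch's default behavior.

```python
def edit_distance(a, b):
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete a character from a
                curr[j - 1] + 1,     # insert a character into a
                prev[j - 1] + cost,  # substitute (or keep) a character
            ))
        prev = curr
    return prev[-1]


def fuzzy_match(indexed_term, query_term, fuzziness=2):
    """True if the two terms are within the allowed edit distance."""
    return edit_distance(indexed_term, query_term) <= fuzziness


print(edit_distance("appla", "apple"))
print(fuzzy_match("apple", "appla", fuzziness=1))
```

Only the last character differs, so one substitution is enough and the term is within the limit.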
4. Filtering query results (filter)
4.1 Filter inside a conditional query
All queries affect document scoring and ranking. If we need to narrow down the results without affecting the score, the condition should not be written as a query; use filter instead:
GET /myindex/_search { "query":{ "bool":{ "must":{ "match": { "title": "Xiaomi phone" }}, "filter":{ "range":{"price":{"gt":2000.00,"lt":3800.00}} } } } }
Note: a filter can itself contain a bool query to combine several filter conditions.
Summary: the query first matches the terms of "Xiaomi phone", then filters the matches by price, and finally returns the result.
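The split between scoring and filtering can be sketched like this. The score here is just the number of matched query terms, a deliberately crude stand-in for real relevance scoring, and the documents are the sample products used in this article; the point is only that the price filter removes documents without changing any score.

```python
docs = [
    {"title": "Xiaomi phone", "price": 2699},
    {"title": "rice phone", "price": 2899},
    {"title": "Mi TV 4 A", "price": 3899},
    {"title": "apple cell phone", "price": 6899},
]


def search(documents, query, price_filter):
    """Score by matched terms (must), then keep only filtered documents."""
    query_terms = query.lower().split()
    results = []
    for d in documents:
        title_terms = d["title"].lower().split()
        score = sum(term in title_terms for term in query_terms)
        if score == 0:
            continue  # must clause not satisfied
        if not price_filter(d["price"]):
            continue  # filter is a yes/no test; the score is untouched
        results.append((d["title"], score))
    return results


result = search(docs, "Xiaomi phone", lambda p: 2000.00 < p < 3800.00)
print(result)
```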
4.2 No query conditions, direct filtering
If a request only filters, has no query conditions, and does not need scoring, we can use constant_score in place of a bool query that contains only a filter clause. Performance is the same, but the query is shorter and clearer.
GET /myindex/_search { "query":{ "constant_score": { "filter": { "range":{"price":{"gt":2000.00,"lt":3000.00}} } } } }
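A minimal Python model of this behavior: every document that passes the filter receives the same fixed score. The helper name and the `boost` default of 1.0 are assumptions of this sketch.

```python
def constant_score(documents, predicate, boost=1.0):
    """Keep documents passing the filter and give each the same score."""
    return [dict(d, _score=boost) for d in documents if predicate(d)]


docs = [
    {"title": "Xiaomi phone", "price": 2699},
    {"title": "rice phone", "price": 2899},
    {"title": "Mi TV 4 A", "price": 3899},
]

kept = constant_score(docs, lambda d: 2000.00 < d["price"] < 3000.00)
print(kept)
```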
5. Sorting
5.1 Single field sorting
sort lets us sort by different fields and specify the direction with order:
GET /myindex/_search { "query": { "match": { "title": "Xiaomi phone" } }, "sort": [ { "price": { "order": "desc" } } ] }
Summary: the query matches the terms of "Xiaomi phone" and then sorts the results by price in descending order.
5.2 Multi-field sorting
Suppose we want to sort by a combination of price and _score, so that matching results are ordered first by price and then by relevance score:
GET /_search { "query":{ "bool":{ "must":{ "match": { "title": "Xiaomi phone" }} } }, "sort": [ { "price": { "order": "desc" }}, { "_score": { "order": "desc" }} ] }
GET /myindex/_search { "query":{ "bool":{ "must":{ "match": { "title": "Xiaomi phone" }}, "filter":{ "range":{"price":{"gt":2000.00,"lt":3000.00}} } } }, "sort": [ { "price": { "order": "desc" }}, { "_score": { "order": "desc" }} ] }
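The effect of listing two sort keys can be reproduced with Python's `sorted` and a tuple key. The hits and their scores below are made up for illustration; the second key only matters when two documents share the same price.

```python
hits = [
    {"title": "Xiaomi phone", "price": 2699, "_score": 0.9},
    {"title": "rice phone", "price": 2899, "_score": 0.3},
    {"title": "other phone", "price": 2899, "_score": 0.6},  # hypothetical tie on price
]

# Negate both keys to get descending order: price first, then _score.
ordered = sorted(hits, key=lambda h: (-h["price"], -h["_score"]))
print([h["title"] for h in ordered])
```

The two documents priced at 2899 come first, and between them the one with the higher score wins.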
Summary
This article explained Elasticsearch (ES) data queries from 5 aspects:
1. Basic query (similar to where field1 = xxx and field2 like "%xxx%" in MySQL)
2. _source filtering/result filtering (similar to select field1, field2 from table_name in MySQL)
3. Advanced query (similar to where field1 = xxx and field2 like "%xxx%" in MySQL)
4. Filtering the matched rows (similar to where field1 = xxx and field2 like "%xxx%" in MySQL)
5. Sorting (similar to order by field desc/asc in MySQL)
In short: start with the basic queries (match), then bool, range, and fuzzy, then filter, then _source filtering, and finally sort.