How can a terms aggregation return an "other" bucket in its results?

Elasticsearch | Author: fantuan | Posted 2019-06-18 | Views: 1623

Mapping: the customer index has a nested address field, and address in turn has a nested properties field.
curl -XPUT 'localhost:9200/customer/' -H 'Content-Type: application/json' -d '
{
"mappings": {
"doc": {
"properties": {
"address": {
"type": "nested",
"dynamic": "false",
"properties": {
"city": {
"properties": {
"name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
},
"country": {
"properties": {
"name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
},
"county": {
"properties": {
"name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
},
"properties": {
"type": "nested",
"properties": {
"isPrimary": {
"type": "keyword"
},
"type": {
"type": "keyword"
}
}
},
"province": {
"properties": {
"name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
}
}
}
}
}
}
}
'
Data:
curl -XPUT 'localhost:9200/customer/doc/1' -H 'Content-Type: application/json' -d '
{
"address": [
{
"country": {
"name": "中国"
},
"province": {
"name": "上海市"
},
"properties": [
{
"type": "BILLING",
"isPrimary": "Y"
}
]
},
{
"country": {
"name": "中国"
},
"province": {
"name": "北京市"
},
"properties": [
{
"type": "BILLING",
"isPrimary": "N"
}
]
}
]
}'

curl -XPUT 'localhost:9200/customer/doc/2' -H 'Content-Type: application/json' -d '
{
"address": null
}'

curl -XPUT 'localhost:9200/customer/doc/3' -H 'Content-Type: application/json' -d '
{
"address": [
{
"country": {
"name": "中国"
},
"province": {
"name": "上海市"
},
"properties": [
{
"type": "SHIPPING_TO",
"isPrimary": "Y"
}
]
}
]
}'

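Aggregation query. Expected: run a terms aggregation by province over the default (primary) shipping addresses, and put the addresses that do not match the filter into an '其他' ("other") bucket.
curl 'localhost:9200/customer/_search' -H 'Content-Type: application/json' -d '
{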
"size": 0,
"aggs": {
"address": {
"nested": {
"path": "address"
},
"aggs": {
"shipping_to_address": {
"aggs": {
"province": {
"terms": {
"field": "address.province.name.keyword",
"size": 10,
"missing": "其他"
}
}
},
"filter": {
"bool": {
"must": [
{
"nested": {
"path": "address.properties",
"query": {
"bool": {
"filter": [
{
"term": {
"address.properties.isPrimary": "Y"
}
},
{
"term": {
"address.properties.type": "SHIPPING_TO"
}
}
]
}
}
}
}
]
}
}
}
}
}
}
}'

Actual result: the addresses that do not match the filter are not counted in the '其他' bucket.
{
"took": 2,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 3,
"max_score": 0,
"hits": []
},
"aggregations": {
"address": {
"doc_count": 3,
"shipping_to_address": {
"doc_count": 1,
"province": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "上海市",
"doc_count": 1
}
]
}
}
}
}
}
How can I get a count for the addresses that do not match the filter, as in the buckets below?
buckets:[
{
  "key":"上海市",
  "doc_count":1
},
{
  "key":"其他",
  "doc_count":2
}]
 
alvin - post-90s IT guy

Upvoted by: fantuan

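top_hits returns the top N hits for each bucket after the documents have been grouped. In the original poster's query, the filter has already removed the addresses that do not match, and missing only counts documents that have no value for the field being aggregated, so neither of them produces an "other" bucket here. The simplest, most brute-force way to get the result the poster wants is to aggregate twice: once over the addresses that match the condition and once over those that do not.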
curl -XPOST "http://localhost:9200/customer/doc/_search" -H 'Content-Type: application/json' -d'
{
"size": 0,
"aggs": {
"address": {
"nested": {
"path": "address"
},
"aggs": {
"shipping_to_address": {
"aggs": {
"province": {
"terms": {
"field": "address.province.name.keyword"
}
}
},
"filter": {
"bool": {
"must": [
{
"nested": {
"path": "address.properties",
"query": {
"bool": {
"filter": [
{
"term": {
"address.properties.isPrimary": "Y"
}
},
{
"term": {
"address.properties.type": "SHIPPING_TO"
}
}
]
}
}
}
}
]
}
}
},
"not_shipping_to_address": {
"aggs": {
"province": {
"meta": {
"province": "其他"
},
"terms": {
"field": "address.province.name.keyword"
}
}
},
"filter": {
"bool": {
"must_not": [
{
"nested": {
"path": "address.properties",
"query": {
"bool": {
"filter": [
{
"term": {
"address.properties.isPrimary": "Y"
}
},
{
"term": {
"address.properties.type": "SHIPPING_TO"
}
}
]
}
}
}
}
]
}
}
}
}
}
}
}'
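
The query above returns two sibling filter aggregations, so the single bucket list asked for in the question still has to be assembled on the client. Below is a minimal Python sketch of that step. It assumes the response keeps the aggregation names used above (address, shipping_to_address, not_shipping_to_address, province). The names fold_into_other, OTHER_KEY and sample are hypothetical, and the sample response is hand-written from the shape of the response shown in the question, not real server output.

```python
from typing import Any, Dict, List

OTHER_KEY = "其他"  # label the question uses for the "other" bucket


def fold_into_other(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Merge the two filter aggregations into a single bucket list."""
    address = response["aggregations"]["address"]
    # Province buckets for the addresses that matched the filter.
    buckets = [
        {"key": b["key"], "doc_count": b["doc_count"]}
        for b in address["shipping_to_address"]["province"]["buckets"]
    ]
    # All addresses that failed the filter collapse into one "other" bucket.
    other_count = address["not_shipping_to_address"]["doc_count"]
    if other_count > 0:
        buckets.append({"key": OTHER_KEY, "doc_count": other_count})
    return buckets


# Hand-written sample (assumed values, derived from the three test documents).
sample = {
    "aggregations": {
        "address": {
            "doc_count": 3,
            "shipping_to_address": {
                "doc_count": 1,
                "province": {"buckets": [{"key": "上海市", "doc_count": 1}]},
            },
            "not_shipping_to_address": {
                "doc_count": 2,
                "province": {
                    "buckets": [
                        {"key": "上海市", "doc_count": 1},
                        {"key": "北京市", "doc_count": 1},
                    ]
                },
            },
        }
    }
}

merged = fold_into_other(sample)
print(merged)
```

Only doc_count of not_shipping_to_address is needed for the "other" bucket; the province sub-aggregation under it is useful only if you also want to see which provinces ended up in "other".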

laoyang360 - Author of 《一本书讲透Elasticsearch》, Elastic Certified Engineer. [死磕Elasticsearch] Knowledge Planet: http://t.cn/RmwM3N9; WeChat official account: 铭毅天下; blog: https://elastic.blog.csdn.net

A similar question was discussed before; it was solved by combining the query with a top_hits aggregation.