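We currently have 900 million documents spread across 45 indices, and each document is about 2 KB.
The query first has to sort by time and then perform three grouping operations. Could the experts here take a look at how this can be optimized?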
{
  "from" : 0,
  "size" : 0,
  "query" : {
    "bool" : {
      "must" : [ {
        "terms" : {
          "keyword1" : [ "AAA","BBB",............"CCC" ]
        }
      }, {
        "range" : {
          "keyword2" : {
            "from" : "2018-10-01 17:15:20",
            "to" : "2018-10-30 17:15:20",
            "format" : "yyyy-MM-dd HH:mm:ss",
            "include_lower" : false,
            "include_upper" : true
          }
        }
      } ],
      "must_not" : [ {
        "terms" : {
          "keyword3" : [ "0" ]
        }
      }, {
        "terms" : {
          "keyword3" : [ "XX", "XX1" ]
        }
      } ]
    }
  },
  "_source" : {
    "includes" : [ "keyword3" ],
    "excludes" : [ ]
  },
  "aggregations" : {
    "Agg1" : {
      "terms" : {
        "field" : "keyword4",
        "size" : 24,
        "order" : {
          "_count" : "desc"
        }
      },
      "aggregations" : {
        "Agg2" : {
          "terms" : {
            "field" : "keyword5",
            "size" : 2147483647,
            "order" : {
              "_count" : "desc"
            }
          },
          "aggregations" : {
            "orderCount" : {
              "bucket_selector" : {
                "script" : {
                  "inline" : "params.orderCount>=10"
                },
                "buckets_path" : {
                  "orderCount" : "_count"
                }
              }
            },
            "maxTime" : {
              "max" : {
                "field" : "keyword6",
                "format" : "yyyy-MM-dd HH:mm:ss"
              }
            }
          }
        }
      }
    }
  }
}
2 replies
kennywu76 - Wood
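Upvoted by: wudingmei1024, rochy, laoyang360, ridethewind
Following @rochy's suggestion, adding "min_doc_count": 10 to Agg2 should greatly reduce the number of buckets. Keep in mind that if the number of buckets with count >= 10 is still very large, the aggregation will still be slow. The memory cost of returning a huge number of buckets cannot be avoided, and because so much memory is consumed, it can easily put heavy GC pressure on the nodes or even cause an OOM. If fetching the result in a single aggregation is difficult, consider splitting the aggregation into several batches and fetching them separately. See: https://www.elastic.co/guide/e ... tions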
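One way to fetch the buckets in batches can be sketched as follows. The link above is truncated, so this is an assumption about what it points to: the sketch uses the terms aggregation's "include" option with "partition" and "num_partitions", which needs an Elasticsearch version that supports terms partitioning. The helper name partitioned_requests, the trimmed BASE_BODY, the Agg2 size of 10000 and the choice of 20 partitions are all hypothetical; the helper only builds request bodies and does not talk to a cluster.

```python
import copy

def partitioned_requests(body, num_partitions):
    """Yield one copy of `body` per partition of the Agg2 terms aggregation."""
    for partition in range(num_partitions):
        request = copy.deepcopy(body)  # leave the caller's body untouched
        agg2_terms = request["aggregations"]["Agg1"]["aggregations"]["Agg2"]["terms"]
        # Each request only computes the keyword5 values that hash into this partition.
        agg2_terms["include"] = {
            "partition": partition,
            "num_partitions": num_partitions,
        }
        yield request

# Trimmed-down version of the request from the question (query clause omitted).
# The Agg2 size of 10000 is an illustrative assumption, not a recommendation.
BASE_BODY = {
    "size": 0,
    "aggregations": {
        "Agg1": {
            "terms": {"field": "keyword4", "size": 24},
            "aggregations": {
                "Agg2": {
                    "terms": {"field": "keyword5", "size": 10000, "min_doc_count": 10},
                    "aggregations": {"maxTime": {"max": {"field": "keyword6"}}},
                }
            },
        }
    },
}

if __name__ == "__main__":
    for req in partitioned_requests(BASE_BODY, 20):
        # Send `req` with whatever client you use; here we only print the include clause.
        print(req["aggregations"]["Agg1"]["aggregations"]["Agg2"]["terms"]["include"])
```

Each request then returns only a slice of the keyword5 buckets, so the per-request memory footprint is smaller, at the cost of running the query once per partition. The number of partitions would need tuning against the real cardinality of keyword5.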
rochy - rochy_he
Upvoted by: kennywu76, wudingmei1024, ridethewind
Add "min_doc_count": 10 to Agg2 to set a minimum document count per bucket; the orderCount aggregation can then be removed.
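As a rough sketch of that change, the "aggregations" section of the original request could look like the dictionary below. It is written as a Python dict so it can be dumped to JSON and pasted into the request; the function name build_aggs and its min_count parameter are hypothetical, while the field names and other settings are copied from the question.

```python
import json

def build_aggs(min_count=10):
    """Return the aggregations section with min_doc_count replacing orderCount."""
    return {
        "Agg1": {
            "terms": {"field": "keyword4", "size": 24, "order": {"_count": "desc"}},
            "aggregations": {
                "Agg2": {
                    "terms": {
                        "field": "keyword5",
                        "size": 2147483647,
                        # Buckets with fewer documents than this are dropped,
                        # which is what the bucket_selector script used to do.
                        "min_doc_count": min_count,
                        "order": {"_count": "desc"},
                    },
                    "aggregations": {
                        # orderCount is gone; only the max sub-aggregation remains.
                        "maxTime": {
                            "max": {"field": "keyword6", "format": "yyyy-MM-dd HH:mm:ss"}
                        }
                    },
                }
            },
        }
    }

if __name__ == "__main__":
    print(json.dumps(build_aggs(), indent=2))
```

The rest of the request (query, _source, from, size) stays as in the question.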