How is the start date determined when a date_histogram aggregation uses a multi-day interval?

Elasticsearch | By guoxiaoguo | Posted 2018-06-01 | Views: 5791

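The query is below. The data is restricted to the last two weeks, and I want each seven days of data to fall into one bucket. The result, however, contains three buckets.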
{
  "size": 0,
  "query": {
    "range": {
      "timestamp": {
        "time_zone": "+08:00",
        "gte": "now-13d/d",
        "lt": "now/d"
      }
    }
  },
  "aggs": {
    "secondAggs": {
      "date_histogram": {
        "field": "timestamp",
        "interval": "7d",
        "time_zone": "+08:00"
      }
    }
  }
}

Result:
"aggregations" : {
  "secondAggs" : {
    "buckets" : [
      {
        "key_as_string" : "2018-05-17T00:00:00.000+08:00",
        "key" : 1526486400000,
        "doc_count" : 1035155
      },
      {
        "key_as_string" : "2018-05-24T00:00:00.000+08:00",
        "key" : 1527091200000,
        "doc_count" : 1370881
      },
      {
        "key_as_string" : "2018-05-31T00:00:00.000+08:00",
        "key" : 1527696000000,
        "doc_count" : 198188
      }
    ]
  }
}
The data actually covers 2018-05-19 00:00:00 to 2018-06-01 00:00:00. Shouldn't that split neatly into two buckets? Why are there three?

2 replies

strglee

Upvoted by: guoxiaoguo

Here is what is going on.

https://www.elastic.co/guide/e ... nse_3

A multi-bucket aggregation similar to the histogram except it can only be applied on date values. Since dates are represented in Elasticsearch internally as long values, it is possible to use the normal histogram on dates as well, though accuracy will be compromised. The reason for this is in the fact that time based intervals are not fixed (think of leap years and on the number of days in a month). For this reason, we need special support for time based data. From a functionality perspective, this histogram supports the same features as the normal histogram. The main difference is that the interval can be specified by date/time expressions.

The documentation explains up front that ES stores dates internally as long integers, that is, as timestamps.

So how is the bucket_key computed?
https://www.elastic.co/guide/e ... .html

A multi-bucket values source based aggregation that can be applied on numeric values extracted from the documents. It dynamically builds fixed size (a.k.a. interval) buckets over the values.

The formula ES uses to compute the bucket_key of a histogram bucket is:

bucket_key = Math.floor((value - offset) / interval) * interval + offset

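The timestamp of 2018-05-19 00:00:00 (+08:00) is 1526659200 in seconds. Plugging it into the formula with offset = 0 and interval = 7 days expressed in seconds:

math.floor(1526659200/(60*60*24*7)) * (60*60*24*7) = 1526515200

1526515200 corresponds to 2018-05-17 00:00:00 UTC. Because the request sets time_zone to +08:00, the rounding appears to be applied in local time, which gives the key 1526486400000 ms (2018-05-17 00:00:00 +08:00) shown in the result. In other words, 7d buckets are aligned to multiples of 7 days counted from the epoch (1970-01-01, a Thursday), not to the first document. The buckets therefore start on 05-17, 05-24 and 05-31, and data from 05-19 through 05-31 falls into three of them.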
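To make the arithmetic concrete, here is a minimal Python sketch of the formula applied to the date range in the question. The names `bucket_key` and `tz_shift` are hypothetical, and applying the time zone by shifting to local time before rounding and shifting back afterwards is an assumption about how ES behaves. It does reproduce the three keys in the result above.

```python
from datetime import datetime, timedelta, timezone

WEEK = 7 * 24 * 60 * 60    # the "7d" interval, in seconds
TZ_SHIFT = 8 * 60 * 60     # +08:00, in seconds


def bucket_key(ts, interval=WEEK, offset=0, tz_shift=0):
    """Histogram formula, applied in local time (assumed ES behaviour).

    ts, offset and tz_shift are in seconds; the returned key is in UTC seconds.
    """
    local = ts + tz_shift
    # Integer floor division plays the role of Math.floor in the formula.
    key_local = (local - offset) // interval * interval + offset
    return key_local - tz_shift


# One sample timestamp per day, 2018-05-19 .. 2018-05-31 at 00:00 +08:00.
tz = timezone(timedelta(hours=8))
start = datetime(2018, 5, 19, tzinfo=tz)
days = [int((start + timedelta(days=i)).timestamp()) for i in range(13)]

# Distinct bucket keys with no offset: three buckets, starting on Thursdays.
keys = sorted({bucket_key(t, tz_shift=TZ_SHIFT) for t in days})

# The same data with the `offset` term of the formula set to 2 days.
shifted = sorted({bucket_key(t, offset=2 * 86400, tz_shift=TZ_SHIFT) for t in days})

for k in keys:
    print(datetime.fromtimestamp(k, tz).isoformat())
```

In this sketch, setting the `offset` term to 2 days moves the bucket boundaries from Thursday to Saturday, so the same 13 days fall into two buckets starting on 05-19 and 05-26. Whether the `offset` option of date_histogram behaves exactly this way in a given ES version should be checked against the documentation.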

zhongkouwei

Thanks for the explanation. How can this situation be worked around?
