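Scenario: a data report keyed by date, with the fields date_str, shop_code, new_user_count, and active_count. Their meanings are, respectively: the date (day precision), the car dealer ID, the number of new users, and the number of active users.
I need to first query by a time range, group by dealer ID, and take the sum of new_user_count. Then, after the aggregation, I want to keep only the dealer IDs whose summed new_user_count is greater than 30, together with the sum itself.
Roughly this SQL: select shop_code, sum(new_user_count) from A where date_str >= ? and date_str <= ? group by shop_code having sum(new_user_count) > 30;
PS: I don't know whether ES supports aggregating first and then filtering on the result. I hope the experts here can clear this up. Thanks.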
3 replies
daodaodt
https://www.elastic.co/guide/e ... .html
I used it to implement the following SQL:
select company_id, count(1) from xxx_table where company_id between A and B group by company_id having count(1) > 1;
{
  "size": 0,
  "query": {
    "constant_score": {
      "filter": {
        "range": {
          "company_id": {
            "gte": A,
            "lte": B
          }
        }
      }
    }
  },
  "aggs": {
    "count_by_company_id": {
      "terms": {
        "field": "company_id"
      },
      "aggs": {
        "id_bucket_filter": {
          "bucket_selector": {
            "buckets_path": {
              "value": "_count"
            },
            "script": "params.value > 1"
          }
        }
      }
    }
  }
}
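
Applied to the original question, the same bucket_selector pattern might look like the sketch below. It is only an illustration under a few assumptions: date_str is mapped so that a range query works on it (a date field, or a keyword in a sortable format such as yyyy-MM-dd), shop_code is a keyword field, and the two dates are placeholder values. The aggregation names by_shop, new_user_sum, and having_sum_gt_30 are made up for this example. The "size": 1000 on the terms aggregation is also an assumption: bucket_selector only sees the buckets that the terms aggregation actually returns, and the default is the top 10, so the size should be at least the number of dealers you expect.

```json
{
  "size": 0,
  "query": {
    "bool": {
      "filter": [
        { "range": { "date_str": { "gte": "2020-01-01", "lte": "2020-01-31" } } }
      ]
    }
  },
  "aggs": {
    "by_shop": {
      "terms": { "field": "shop_code", "size": 1000 },
      "aggs": {
        "new_user_sum": { "sum": { "field": "new_user_count" } },
        "having_sum_gt_30": {
          "bucket_selector": {
            "buckets_path": { "total": "new_user_sum" },
            "script": "params.total > 30"
          }
        }
      }
    }
  }
}
```

Each bucket left in the response corresponds to one row of the SQL result: the bucket key is the shop_code and new_user_sum.value is sum(new_user_count).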