
In ES, how do I first DISTINCT on some fields and then GROUP BY a subset of them?

Elasticsearch | Author: hjxisking | Posted 2021-08-23 | Views: 2064

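Background: I have data in an AWS Redshift data warehouse that I want to move back into a self-hosted Elasticsearch cluster, so some SQL previously written in PostgreSQL syntax needs to be converted to DSL.
The SQL, simplified (WHERE conditions and so on omitted), is: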
with cte as (
SELECT distinct user_id,countymd,countdate,price,action_type,createdate FROM xxx_table
)
select countymd, action_type, count(distinct user_id) as uu, sum(price) as total from cte group by countymd, action_type
My plan was to rely on the idea that distinct A, B is equivalent to group by A, B: first group by the distinct fields, then group that result set again by countymd, action_type. But the request fails with:
[composite] aggregation cannot be used with a parent aggregation
The DSL is:
{
  "size": 0,
  "query": {
    "bool": {
      "filter": {
        "range": {"countdate": {"gt": "2020-08-06 11:00:00", "lt": "2031-08-06 11:59:59"}}
      }
    }
  },
  "_source": false,
  "aggs": {
    "distinct": {
      "composite": {
        "sources": [
          {"user_id": {"terms": {"field": "user_id"}}},
          {"countymd": {"terms": {"field": "countymd"}}},
          {"countdate": {"terms": {"field": "countdate"}}},
          {"price": {"terms": {"field": "price"}}},
          {"action_type": {"terms": {"field": "action_type"}}},
          {"createdate": {"terms": {"field": "createdate"}}}
        ]
      },
      "aggs": {
        "groupby": {
          "composite": {
            "sources": [
              {"countymd": {"terms": {"field": "countymd"}}},
              {"action_type": {"terms": {"field": "action_type"}}}
            ]
          },
          "aggs": {
            "uu": {
              "cardinality": {"field": "user_id"}
            },
            "total": {
              "stats": {"field": "price"}
            }
          }
        }
      }
    }
  }
}

How can I solve this kind of "distinct first, then group by" query?

lijianghu - timesaving


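A composite aggregation already produces distinct buckets, so the first aggregation level can be removed. Keep only the second and third levels of aggs and give that a try!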
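
A minimal sketch of what that suggestion might look like, reusing the field names and range filter from the question. It makes "groupby" the top-level composite aggregation, which avoids the parent-aggregation error, and nests "uu" and "total" under it. Two things here are assumptions rather than part of the answer: "sum" is swapped in for "stats" so that "total" returns a single number like sum(price) in the SQL, and the index is assumed to contain no duplicate rows. If it does, "sum" adds price once per matching document, so the result would differ from the SQL that applies DISTINCT first. Note also that "cardinality" returns an approximate distinct count.

```json
{
  "size": 0,
  "query": {
    "bool": {
      "filter": {
        "range": {"countdate": {"gt": "2020-08-06 11:00:00", "lt": "2031-08-06 11:59:59"}}
      }
    }
  },
  "aggs": {
    "groupby": {
      "composite": {
        "sources": [
          {"countymd": {"terms": {"field": "countymd"}}},
          {"action_type": {"terms": {"field": "action_type"}}}
        ]
      },
      "aggs": {
        "uu": {"cardinality": {"field": "user_id"}},
        "total": {"sum": {"field": "price"}}
      }
    }
  }
}
```

A composite aggregation returns its buckets in pages (10 per page unless "size" is set inside "composite"), so to read every countymd/action_type group, repeat the request passing the "after_key" from the previous response as "after".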
