
Terms Aggregation splits the aggregated field into separate terms

Elasticsearch | Author: overfight | Posted: 2015-05-26 | Views: 10819

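I want to count logs grouped by a device-name field whose values contain special characters (for example dcn-r2-s-gdjm-gs). With a Terms Aggregation the field gets split up: dcn-r2-s-gdjm-gs is broken into dcn, r2, s, gdjm, gs. Is there a way to keep the field value whole?
Query: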
curl -XGET 'http://localhost:9200/nmssyslog-2015-05/_search?search_type=count' -d '{
  "aggs": {
    "logcount": {
      "terms": {
        "field": "systemname",
        "size": 20
      }
    }
  }
}'
Query result:
{
  "took": 11,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 82722,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "products": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        { "key": "dcn", "doc_count": 82612 },
        { "key": "r", "doc_count": 82612 },
        { "key": "2", "doc_count": 78802 },
        { "key": "gdjm", "doc_count": 73430 },
        { "key": "s", "doc_count": 73430 },
        { "key": "gs", "doc_count": 73421 },
        { "key": "dcn-r2-s-gdjm-gs", "doc_count": 70115 },
        { "key": "c", "doc_count": 9182 },
        { "key": "gdgz", "doc_count": 9182 },
        { "key": "dcn-r2-c-gdgz-yj", "doc_count": 8683 },
        { "key": "yj", "doc_count": 8683 },
        { "key": "1", "doc_count": 3810 },
        { "key": "dcn-r1-s-gdjm-gs", "doc_count": 3306 },
        { "key": "dcn-r1-c-gdgz-kxc", "doc_count": 499 },
        { "key": "kxc", "doc_count": 499 },
        { "key": "unknow", "doc_count": 110 },
        { "key": "jh", "doc_count": 9 },
        { "key": "dcn-r1-s-gdjm-jh,dcn-r1-s-gdjm-jh", "doc_count": 5 },
        { "key": "dcn-r2-s-gdjm-jh", "doc_count": 4 }
      ]
    }
  }
}
Analyzer configuration:
################################## Analyzer ################################
index :
  analysis :
    analyzer :
      default_index:
        type : custom
        tokenizer: whitespace
        filter: [word_delimiter, lowercase]
      default_search:
        type : custom
        tokenizer: whitespace
        filter: [word_delimiter, lowercase]
    filter:
      word_delimiter:
        type : word_delimiter
        preserve_original : true
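
To see why the buckets look the way they do, here is a rough Python imitation of the analysis chain configured above (whitespace tokenizer, then word_delimiter with preserve_original, then lowercase). It is only a sketch: the function names are made up for illustration, and the real word_delimiter filter has more rules (splitting on case changes, for instance) that are left out here.

```python
import re


def word_delimiter(token, preserve_original=True):
    """Simplified stand-in for the word_delimiter filter (an approximation).

    Splits on non-alphanumeric characters and on letter/digit boundaries.
    Case-change splitting and the other real options are not modeled.
    """
    parts = re.findall(r"[A-Za-z]+|[0-9]+", token)
    tokens = [token] if preserve_original else []
    tokens.extend(p for p in parts if p != token)
    return tokens


def analyze(text):
    """Approximate the chain: whitespace tokenizer -> word_delimiter -> lowercase."""
    tokens = []
    for raw in text.split():                # whitespace tokenizer
        for tok in word_delimiter(raw):     # word_delimiter, preserve_original: true
            tokens.append(tok.lower())      # lowercase filter
    return tokens


print(analyze("dcn-r2-s-gdjm-gs"))
# ['dcn-r2-s-gdjm-gs', 'dcn', 'r', '2', 's', 'gdjm', 'gs']
```

Each of those tokens is indexed as its own term, and the terms aggregation buckets by indexed term. That is why fragments such as "dcn", "r" and "2" show up next to the full device name in the result.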

wx7614140 - just another coder


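It looks like your field's mapping runs the value through an analyzer, so it gets tokenized.
I suggest adding a second field that holds the same content as the analyzed field but is not analyzed, and running the aggs on that new field.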
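
A minimal sketch of that idea, written as Python dicts that serialize to the JSON bodies you would send with curl. It assumes the string-field syntax of the Elasticsearch 1.x line from the time of this thread; the mapping type name "syslog" and the field name "systemname_raw" are placeholders I made up, not names from the original post. A new field in the mapping only affects documents indexed after it is added.

```python
import json

# Assumed names: mapping type "syslog" and field "systemname_raw" are hypothetical.
mapping = {
    "syslog": {
        "properties": {
            # Analyzed field: still tokenized, still usable for full-text queries.
            # copy_to feeds the same value into the second field at index time.
            "systemname": {"type": "string", "copy_to": "systemname_raw"},
            # Unanalyzed copy: the whole device name is stored as a single term.
            "systemname_raw": {"type": "string", "index": "not_analyzed"},
        }
    }
}

# Same aggregation as in the question, pointed at the unanalyzed copy.
aggregation = {
    "aggs": {
        "logcount": {
            "terms": {"field": "systemname_raw", "size": 20}
        }
    }
}

print(json.dumps(mapping, indent=2))
print(json.dumps(aggregation, indent=2))
```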

jingkyks - fruit, pencil, 2B eraser


Set this field up as a multi-field. Use the analyzed sub-field for queries and the not-analyzed sub-field for aggregations.
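
Roughly what the multi-field setup could look like, again as Python dicts that map one-to-one onto the JSON bodies. The sub-field name "raw" is just a common convention and an assumption here, as are the helper function names; the mapping syntax is the 1.x-era string type with "fields".

```python
import json


def multi_field_mapping(field, raw_name="raw"):
    """Property mapping: analyzed main field plus a not_analyzed sub-field."""
    return {
        field: {
            "type": "string",
            "fields": {
                raw_name: {"type": "string", "index": "not_analyzed"}
            },
        }
    }


def search_body(field, text, raw_name="raw", size=20):
    """Query the analyzed field, aggregate on the not_analyzed sub-field."""
    return {
        "query": {"match": {field: text}},
        "aggs": {
            "logcount": {
                "terms": {"field": f"{field}.{raw_name}", "size": size}
            }
        },
    }


print(json.dumps(multi_field_mapping("systemname"), indent=2))
print(json.dumps(search_body("systemname", "gdjm"), indent=2))
```

With this layout a match query on systemname can still hit fragments such as "gdjm", while the buckets come back keyed by the whole device name from systemname.raw.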

toxic_07


Just don't analyze the field at all.
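
A small sketch of what "not analyzed" means in the mapping. The helper name is hypothetical. In the string-type syntax of this thread's era the switch is "index": "not_analyzed"; from Elasticsearch 5.0 onward the equivalent is the keyword type. The trade-off is that the field can then only be matched by its exact value, not by fragments.

```python
import json


def exact_value_mapping(es_major_version):
    """Field mapping that keeps the whole value as one term (hypothetical helper)."""
    if es_major_version >= 5:
        # 5.0 and later: string was split into text (analyzed) and keyword (exact).
        return {"type": "keyword"}
    # Before 5.0: string type with analysis switched off.
    return {"type": "string", "index": "not_analyzed"}


properties = {"systemname": exact_value_mapping(1)}
print(json.dumps(properties, indent=2))
```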
