Question: "Fielddata is disabled on text" error when running an aggregation on a Chinese-analyzed field in ES
Elasticsearch | Author: a2615381 | Posted: 2019-01-05 | Views: 6778
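I have a requirement similar to GROUP BY: deduplicate the values of a Chinese-analyzed field and get the count for each distinct value.
I am using AggregationBuilder. On non-analyzed fields it works fine.
On a Chinese-analyzed field, however, it fails with the following error: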
{"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"Fielddata is disabled on text
fields by default. Set fielddata=true on [name] in order to load fielddata in memory by uninverting the
inverted index. Note that this can however use significant memory. Alternatively use a keyword field
instead."}],"type":"search_phase_execution_exception","reason":"all shards failed","phase":"query",
"grouped":true,"failed_shards":[{"shard":0,"index":"school","node":"H7VIRoOwS8mws78T-0Ce-Q","reason":{
"type":"illegal_argument_exception","reason":"Fielddata is disabled on text fields by default. Set
fielddata=true on [name] in order to load fielddata in memory by uninverting the inverted index.Note that
this can however use significant memory. Alternatively use a keyword field instead."}}]},"status":400}
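After some searching I found that the cause is the analysis (tokenization) of the field. Two fixes are suggested online:
1. Set fielddata to true on the field being sorted or aggregated on (region in the example I found). This is strongly discouraged because it uses a lot of memory.
2. Append .keyword to the field name in the query, as follows: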
AggregationBuilder aggregationBuilder =
    AggregationBuilders.terms("nameAgg").field("name.keyword").size(Integer.MAX_VALUE)
        .subAggregation(AggregationBuilders.terms("jobAgg").field("job.keyword").size(Integer.MAX_VALUE)
            .subAggregation(AggregationBuilders.avg("ageAgg").field("age"))
            .subAggregation(AggregationBuilders.count("totalNum").field("name.keyword")));
searchSourceBuilder.aggregation(aggregationBuilder);
After adding .keyword the error went away, but the result is empty. How can I fix this?
4 replies
rochy - rochy_he
Upvoted by: a2615381
Anonymous user
Upvoted by:
"properties": {
  "brandName": {
    "search_analyzer": "query_ansj",
    "analyzer": "index_ansj",
    "type": "text"
  },
  "color": {
    "search_analyzer": "query_ansj",
    "analyzer": "index_ansj",
    "type": "text"
  },
  "typeFourId": {
    "type": "integer"
  },
  "modelAttr": {
    "index": false,
    "type": "text"
  },
  "modelId": {
    "type": "long"
  },
  "seriesName": {
    "search_analyzer": "query_ansj",
    "analyzer": "index_ansj",
    "type": "text"
  },
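That is roughly what the mapping looks like.
laoyang360 - Author of 《一本书讲透Elasticsearch》, Elastic Certified Engineer. [死磕Elasitcsearch] Knowledge Planet: http://t.cn/RmwM3N9; WeChat official account: 铭毅天下; blog: https://elastic.blog.csdn.net
Upvoted by:
a2615381
Upvoted by: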
Change the mapping of the queried field and add a keyword multi-field (sub-field) to it:
curl -XPUT localhost:9200/my_index/my_type/_mapping -d '
{
  "my_type": {
    "properties": {
      "fName": {
        "type": "text",
        "search_analyzer": "query_ansj",
        "analyzer": "index_ansj",
        "fields": {
          "raw": {
            "type": "keyword"
          }
        }
      }
    }
  }
}'
Query code:
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
TermsAggregationBuilder aggHouse = AggregationBuilders.terms("modelNameAgg").field("fName.raw").size(10);
searchSourceBuilder.aggregation(aggHouse);
Search search = new Search.Builder(searchSourceBuilder.toString())
    .addIndex(es.getEsIndex())
    .addType(es.getEsType())
    .build();
SearchResult searchResult = js.execute(search);
List<TermsAggregation.Entry> nameAgg = searchResult.getAggregations().getTermsAggregation("modelNameAgg").getBuckets();
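Note that this only works for documents added or updated after the mapping change. Old documents cannot be aggregated, presumably because the new sub-field was never indexed for them, which makes sense.
Re-indexing all the data once fixes it.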
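One way to re-index the old documents in place, without reloading them from the source system, is Elasticsearch's _update_by_query API. When called with no script it re-indexes every document as-is, so a newly added multi-field such as fName.raw is populated. The snippet below is only a minimal sketch: it assumes Java 11+ (for java.net.http), a cluster reachable at http://localhost:9200, and the my_index index from the curl command above. The class and method names are made up for illustration.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper, not part of any Elasticsearch client library.
class UpdateByQuerySketch {

    // Builds the _update_by_query URI for one index.
    // conflicts=proceed tells Elasticsearch to continue past version conflicts
    // instead of aborting the whole run.
    public static URI buildUri(String baseUrl, String index) {
        return URI.create(baseUrl + "/" + index + "/_update_by_query?conflicts=proceed");
    }

    // A POST with an empty body: no query means "all documents",
    // no script means "re-index each document unchanged".
    public static HttpRequest buildRequest(String baseUrl, String index) {
        return HttpRequest.newBuilder(buildUri(baseUrl, index))
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();
    }

    public static void main(String[] args) throws Exception {
        // Assumed address and index name; adjust to your cluster.
        HttpRequest request = buildRequest("http://localhost:9200", "my_index");
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        // The JSON response reports how many documents were updated.
        System.out.println(response.statusCode() + " " + response.body());
    }
}
```

This also explains the empty result in the original question: if the mapping defines name as text only, with no keyword sub-field, then name.keyword does not exist, and a terms aggregation on a non-existent field returns no buckets rather than an error.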