[ES performance issue] match query is very slow: what could be the cause?

Anonymous | Posted 2017-09-16 | Views: 11228

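Background: ES 5.4.0, a 3-node cluster, ES heap_size = 31GB. Every configuration optimization that applies has already been set according to the official documentation.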

1) Searching for the keyword "梯子" ("ladder") is very slow.
The response timing is:
{
"took": 7778,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 92,
"max_score": 15.208916,

...
2) The search is a full-text match query on the content field. The documents are very large; the largest one in the returned results is about 2.5MB.

3) Heap usage on the three nodes: 10% (master node), 4% (routing node), 58% (data node).

I would like to find out what is actually causing this and how to optimize the query.

———————————————— Divider ————————————————
The profile output for the query follows.
"profile": {
"shards": [
{
"id": "[BosI3A-bSJaVM0LHULDC9A][baike_index][0]",
"searches": [
{
"query": [
{
"type": "TermQuery",
"description": "content:梯子",
"time": "0.5679210000ms",
"time_in_nanos": 567921,
"breakdown": {
"score": 4818,
"build_scorer_count": 23,
"match_count": 0,
"create_weight": 490622,
"next_doc": 5347,
"match": 0,
"create_weight_count": 1,
"next_doc_count": 21,
"score_count": 18,
"build_scorer": 67071,
"advance": 0,
"advance_count": 0
}
}
],
"rewrite_time": 379627,
"collector": [
{
"name": "CancellableCollector",
"reason": "search_cancelled",
"time": "0.03252600000ms",
"time_in_nanos": 32526,
"children": [
{
"name": "SimpleTopScoreDocCollector",
"reason": "search_top_hits",
"time": "0.01902000000ms",
"time_in_nanos": 19020
}
]
}
]
}
],
"aggregations": []
},
{
"id": "[BosI3A-bSJaVM0LHULDC9A][baike_index][1]",
"searches": [
{
"query": [
{
"type": "TermQuery",
"description": "content:梯子",
"time": "0.8072510000ms",
"time_in_nanos": 807251,
"breakdown": {
"score": 5896,
"build_scorer_count": 29,
"match_count": 0,
"create_weight": 745418,
"next_doc": 7775,
"match": 0,
"create_weight_count": 1,
"next_doc_count": 16,
"score_count": 13,
"build_scorer": 48103,
"advance": 0,
"advance_count": 0
}
}
],
"rewrite_time": 508630,
"collector": [
{
"name": "CancellableCollector",
"reason": "search_cancelled",
"time": "0.04867000000ms",
"time_in_nanos": 48670,
"children": [
{
"name": "SimpleTopScoreDocCollector",
"reason": "search_top_hits",
"time": "0.02597900000ms",
"time_in_nanos": 25979
}
]
}
]
}
],
"aggregations": []
},
{
"id": "[BosI3A-bSJaVM0LHULDC9A][baike_index][2]",
"searches": [
{
"query": [
{
"type": "TermQuery",
"description": "content:梯子",
"time": "0.8369420000ms",
"time_in_nanos": 836942,
"breakdown": {
"score": 8774,
"build_scorer_count": 31,
"match_count": 0,
"create_weight": 744262,
"next_doc": 11751,
"match": 0,
"create_weight_count": 1,
"next_doc_count": 29,
"score_count": 25,
"build_scorer": 72069,
"advance": 0,
"advance_count": 0
}
}
],
"rewrite_time": 458680,
"collector": [
{
"name": "CancellableCollector",
"reason": "search_cancelled",
"time": "0.06351200000ms",
"time_in_nanos": 63512,
"children": [
{
"name": "SimpleTopScoreDocCollector",
"reason": "search_top_hits",
"time": "0.03681100000ms",
"time_in_nanos": 36811
}
]
}
]
}
],
"aggregations": []
},
{
"id": "[BosI3A-bSJaVM0LHULDC9A][baike_index][4]",
"searches": [
{
"query": [
{
"type": "TermQuery",
"description": "content:梯子",
"time": "0.9213890000ms",
"time_in_nanos": 921389,
"breakdown": {
"score": 8899,
"build_scorer_count": 28,
"match_count": 0,
"create_weight": 745401,
"next_doc": 11755,
"match": 0,
"create_weight_count": 1,
"next_doc_count": 29,
"score_count": 22,
"build_scorer": 155254,
"advance": 0,
"advance_count": 0
}
}
],
"rewrite_time": 508615,
"collector": [
{
"name": "CancellableCollector",
"reason": "search_cancelled",
"time": "0.07178900000ms",
"time_in_nanos": 71789,
"children": [
{
"name": "SimpleTopScoreDocCollector",
"reason": "search_top_hits",
"time": "0.04696100000ms",
"time_in_nanos": 46961
}
]
}
]
}
],
"aggregations": []
},
{
"id": "[DbUASJSzTDWaLQocTYuKpA][baike_index][3]",
"searches": [
{
"query": [
{
"type": "TermQuery",
"description": "content:梯子",
"time": "0.3358920000ms",
"time_in_nanos": 335892,
"breakdown": {
"score": 6083,
"build_scorer_count": 26,
"match_count": 0,
"create_weight": 273080,
"next_doc": 4956,
"match": 0,
"create_weight_count": 1,
"next_doc_count": 19,
"score_count": 14,
"build_scorer": 51713,
"advance": 0,
"advance_count": 0
}
}
],
"rewrite_time": 265629,
"collector": [
{
"name": "CancellableCollector",
"reason": "search_cancelled",
"time": "0.02894000000ms",
"time_in_nanos": 28940,
"children": [
{
"name": "SimpleTopScoreDocCollector",
"reason": "search_top_hits",
"time": "0.01789500000ms",
"time_in_nanos": 17895
}
]
}
]
}
],
"aggregations": []
}
]
}
}
kennywu76 - Wood

Upvoted by: laoyang360 eboy

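The match query itself does not look like the source of the performance problem. The more likely suspect is the highlighter: when the content field is large, the plain highlighter is slow. Try the fast-vector-highlighter first and see whether that eases the problem.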
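
The profile figures in the question point the same way. As a rough check, the sketch below adds up the per-shard time_in_nanos values copied from that profile and compares the sum with the reported took of 7778 ms. The profile API measures the query phase only and does not cover the fetch phase, which is where highlighting runs. The dictionary layout and the function name are illustrative, not anything from Elasticsearch.

```python
# Per-shard timings (nanoseconds) copied from the profile output in the question.
PROFILE_NS = {
    "baike_index[0]": {"query": 567921, "rewrite": 379627, "collector": 32526},
    "baike_index[1]": {"query": 807251, "rewrite": 508630, "collector": 48670},
    "baike_index[2]": {"query": 836942, "rewrite": 458680, "collector": 63512},
    "baike_index[4]": {"query": 921389, "rewrite": 508615, "collector": 71789},
    "baike_index[3]": {"query": 335892, "rewrite": 265629, "collector": 28940},
}
TOOK_MS = 7778  # "took" from the search response in the question


def profiled_ms(shards):
    """Sum query, rewrite and collector time over all shards, in milliseconds."""
    total_ns = sum(sum(parts.values()) for parts in shards.values())
    return total_ns / 1_000_000


accounted_ms = profiled_ms(PROFILE_NS)
unaccounted_ms = TOOK_MS - accounted_ms
print(f"profiled query phase: {accounted_ms:.3f} ms")
print(f"not covered by the profile: {unaccounted_ms:.3f} ms of {TOOK_MS} ms")
```

Shards are searched in parallel, so even this sum overstates the query-phase wall time. Nearly all of the 7778 ms falls outside what the profile measures, which is consistent with the time going to fetching and highlighting large documents.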

laoyang360 - Author of 《一本书讲透Elasticsearch》, Elastic Certified Engineer. [死磕Elasitcsearch] Knowledge Planet: http://t.cn/RmwM3N9; WeChat official account: 铭毅天下; blog: https://elastic.blog.csdn.net

The query:
 {
"from" : 90,
"size" : 10,
"query" : {
"match" : {
"content" : {
"query" : "梯子",
"operator" : "OR",
"prefix_length" : 0,
"max_expansions" : 50,
"fuzzy_transpositions" : true,
"lenient" : false,
"zero_terms_query" : "NONE",
"boost" : 1.0
}
}
},
"_source" : {
"includes" : [
"title",
"content"

],
"excludes" : [ ]
},
"highlight" : {
"pre_tags" : [
"<span style=\"color:red\">"
],
"post_tags" : [
"</span>"
],
"fragment_size" : 100,
"number_of_fragments" : 5,
"require_field_match" : false,
"fields" : {
"content" : { }
}
}
}
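Mapping: the usual text + keyword configuration. The fast-vector-highlighter is not set up, so highlighting uses the plain highlighter. (The mapping does not include it; enabling it means changing the mapping and re-importing the data.)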
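
A minimal sketch of what that change could look like, written as plain Python dictionaries so that no particular client library is assumed. The fast-vector-highlighter needs the field to be indexed with term_vector set to with_positions_offsets, and the highlight request then selects it with type fvh. The type name "doc" is a placeholder, and the analyzer and keyword sub-field of the real index are left out; they would have to be carried over from the existing mapping.

```python
import copy
import json

# Assumed mapping for a new index. "doc" is a placeholder type name; any
# analyzer or sub-field settings of the real content field must be copied over.
FVH_MAPPING = {
    "mappings": {
        "doc": {
            "properties": {
                "content": {
                    "type": "text",
                    "term_vector": "with_positions_offsets",
                }
            }
        }
    }
}


def use_fvh(search_body, field="content"):
    """Return a copy of search_body whose highlight on `field` uses type fvh."""
    body = copy.deepcopy(search_body)
    body["highlight"]["fields"].setdefault(field, {})["type"] = "fvh"
    return body


# Highlight section taken from the query above (other parts omitted for brevity).
original = {
    "highlight": {
        "fragment_size": 100,
        "number_of_fragments": 5,
        "fields": {"content": {}},
    }
}
updated = use_fvh(original)

print(json.dumps(FVH_MAPPING, indent=2))
print(json.dumps(updated, indent=2))
```

Storing term vectors increases the index size, so it is worth testing this on a copy of the index before reloading all of the data.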

白衬衣 - 金桥

"max_expansions" : 50: does the query really need to match that many options?

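
On the max_expansions question: it is probably not a factor here. The profile shows the query running as a single TermQuery on content:梯子, so no term expansion took place. The extra parameters in the dumped query look like default values written out when the request was serialized, and max_expansions should only matter once fuzziness is set; both points are assumptions, not something the thread confirms. The sketch below strips the parameters that match those presumed defaults to show the equivalent minimal match clause. The helper name and the defaults table are illustrative.

```python
import json

# Values exactly as they appear in the dumped query; treating them as
# defaults is an assumption made for this illustration.
PRESUMED_DEFAULTS = {
    "operator": "OR",
    "prefix_length": 0,
    "max_expansions": 50,
    "fuzzy_transpositions": True,
    "lenient": False,
    "zero_terms_query": "NONE",
    "boost": 1.0,
}


def strip_defaults(params, defaults=PRESUMED_DEFAULTS):
    """Drop every parameter whose value equals the presumed default."""
    return {
        key: value
        for key, value in params.items()
        if key not in defaults or defaults[key] != value
    }


dumped = {"query": "梯子", **PRESUMED_DEFAULTS}
minimal = {"match": {"content": strip_defaults(dumped)}}
print(json.dumps(minimal, ensure_ascii=False))
```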