
Same document, same field: why do the termvectors differ between the primary shard and the replica shard?

Elasticsearch | Author: storm | Posted 2017-05-22 | Views: 4545


storm


While running a match query, I noticed that the same query returns different results each time it is run.
 

[Screenshot: 20170522171259.png]



After checking _cat/shards, I confirmed that the document distribution on the primary shards and the replica shards is the same.

[Screenshot: QQ截图20170522171827.png]
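For anyone who wants to repeat the _cat/shards check without comparing the rows by eye, here is a minimal sketch. It assumes the default text output of _cat/shards (columns: index, shard, prirep, state, docs, store, ip, node). The function name doc_count_mismatches and the sample rows are hypothetical and are not taken from the screenshot above.

```python
from collections import defaultdict


def doc_count_mismatches(cat_shards_text):
    """Return {(index, shard): [doc counts]} for shards whose copies disagree."""
    counts = defaultdict(set)
    for line in cat_shards_text.strip().splitlines():
        cols = line.split()
        # Assumed default columns: index shard prirep state docs store ip node.
        # Copies that are not STARTED (e.g. UNASSIGNED) have no doc count; skip them.
        if len(cols) < 5 or cols[3] != "STARTED":
            continue
        index, shard, docs = cols[0], cols[1], cols[4]
        counts[(index, shard)].add(int(docs))
    return {key: sorted(vals) for key, vals in counts.items() if len(vals) > 1}


# Hypothetical sample rows, for illustration only.
sample = """
centapost_tj 0 p STARTED 1200 3.1mb 10.0.0.1 node-1
centapost_tj 0 r STARTED 1200 3.1mb 10.0.0.2 node-2
centapost_tj 1 p STARTED 4990 3.0mb 10.0.0.2 node-2
centapost_tj 1 r STARTED 4987 3.0mb 10.0.0.1 node-1
"""

print(doc_count_mismatches(sample))
```

An empty result means every started copy of each shard reports the same number of documents, which is the situation described above.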


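Querying the primary shards and the replica shards separately, each side returns a stable hit count that does not change between runs: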
POST /centapost_tj/post/_search?preference=_primary
{
  "_source": "estatename",
  "from": 0,
  "size": 2000,
  "query": {
    "match": {
      "estatename": {
        "query": "花"
      }
    }
  }
}

POST /centapost_tj/post/_search?preference=_replica
{
  "_source": "estatename",
  "from": 0,
  "size": 2000,
  "query": {
    "match": {
      "estatename": {
        "query": "花"
      }
    }
  }
}



 
Results: the primary shards return 108 hits and the replica shards return 53. Both numbers are fixed and never change.
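To find out which documents account for the gap between the two hit counts, the two search responses can be compared by _id. The sketch below assumes each response has already been parsed into a dict and that "size" is large enough to return every hit (as with size 2000 above). The helper names hit_ids and diff_hits and the sample responses are hypothetical.

```python
def hit_ids(search_response):
    """Collect the _id of every hit in a parsed _search response."""
    return {hit["_id"] for hit in search_response["hits"]["hits"]}


def diff_hits(primary_response, replica_response):
    """Return the _ids that only one of the two responses contains."""
    primary_ids = hit_ids(primary_response)
    replica_ids = hit_ids(replica_response)
    return {
        "only_primary": sorted(primary_ids - replica_ids),
        "only_replica": sorted(replica_ids - primary_ids),
    }


# Hypothetical, heavily trimmed responses for illustration only.
primary_resp = {"hits": {"hits": [{"_id": "a"}, {"_id": "b"}, {"_id": "c"}]}}
replica_resp = {"hits": {"hits": [{"_id": "a"}, {"_id": "d"}]}}

print(diff_hits(primary_resp, replica_resp))
```

Any _id listed under "only_primary" or "only_replica" is a candidate for a closer look with _termvectors.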
 
 
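I picked one _id that is present in the primary shard results but missing from the replica shard results, and requested its _termvectors from the primary shard and from the replica shard separately.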


 

storm


GET  /centapost_tj/post/4c1e65e4-b970-c6fd-4fa0-08d48fadf832/_termvectors?fields=estatename&routing=2282S&preference=_primary

GET /centapost_tj/post/4c1e65e4-b970-c6fd-4fa0-08d48fadf832/_termvectors?fields=estatename&routing=2282S&preference=_replica

The two results differ; the replica shard's output is closer to what I expected.
 

storm


Result from the primary:
{
  "term_vectors": {
    "estatename": {
      "field_statistics": { "sum_doc_freq": 29620, "doc_count": 5014, "sum_ttf": 29992 },
      "terms": {
        "二": { "term_freq": 1, "tokens": [ { "position": 6, "start_offset": 6, "end_offset": 7 } ] },
        "二期": { "term_freq": 1, "tokens": [ { "position": 5, "start_offset": 6, "end_offset": 8 } ] },
        "厦": { "term_freq": 1, "tokens": [ { "position": 2, "start_offset": 1, "end_offset": 2 } ] },
        "期": { "term_freq": 1, "tokens": [ { "position": 7, "start_offset": 7, "end_offset": 8 } ] },
        "水语花城": { "term_freq": 1, "tokens": [ { "position": 3, "start_offset": 2, "end_offset": 6 } ] },
        "花城": { "term_freq": 1, "tokens": [ { "position": 4, "start_offset": 4, "end_offset": 6 } ] },
        "花溪": { "term_freq": 1, "tokens": [ { "position": 8, "start_offset": 8, "end_offset": 10 } ] },
        "苑": { "term_freq": 1, "tokens": [ { "position": 9, "start_offset": 10, "end_offset": 11 } ] },
        "金厦水语花城": { "term_freq": 1, "tokens": [ { "position": 1, "start_offset": 0, "end_offset": 6 } ] },
        "金厦水语花城二期花溪苑": { "term_freq": 1, "tokens": [ { "position": 0, "start_offset": 0, "end_offset": 11 } ] }
      }
    }
  }
}

storm


Result from the replica:
{
  "_index": "tj_toggle",
  "_type": "post",
  "_id": "4c1e65e4-b970-c6fd-4fa0-08d48fadf832",
  "_version": 1,
  "found": true,
  "took": 0,
  "term_vectors": {
    "estatename": {
      "field_statistics": { "sum_doc_freq": 20652, "doc_count": 5014, "sum_ttf": 21077 },
      "terms": {
        "二": { "term_freq": 1, "tokens": [ { "position": 6, "start_offset": 6, "end_offset": 7 } ] },
        "二期": { "term_freq": 1, "tokens": [ { "position": 5, "start_offset": 6, "end_offset": 8 } ] },
        "厦": { "term_freq": 1, "tokens": [ { "position": 1, "start_offset": 1, "end_offset": 2 } ] },
        "期": { "term_freq": 1, "tokens": [ { "position": 7, "start_offset": 7, "end_offset": 8 } ] },
        "水": { "term_freq": 1, "tokens": [ { "position": 2, "start_offset": 2, "end_offset": 3 } ] },
        "溪": { "term_freq": 1, "tokens": [ { "position": 9, "start_offset": 9, "end_offset": 10 } ] },
        "花": { "term_freq": 1, "tokens": [ { "position": 8, "start_offset": 8, "end_offset": 9 } ] },
        "花城": { "term_freq": 1, "tokens": [ { "position": 4, "start_offset": 4, "end_offset": 6 } ] },
        "苑": { "term_freq": 1, "tokens": [ { "position": 10, "start_offset": 10, "end_offset": 11 } ] },
        "语": { "term_freq": 1, "tokens": [ { "position": 3, "start_offset": 3, "end_offset": 4 } ] },
        "金": { "term_freq": 1, "tokens": [ { "position": 0, "start_offset": 0, "end_offset": 1 } ] }
      }
    }
  }
}
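The two responses above are easier to compare if only the term names are kept. The sketch below is one way to do that, assuming each _termvectors response has been parsed into a dict. The function names terms_of and diff_terms are hypothetical; the term lists are copied from the two results above, with the per-term details dropped.

```python
def terms_of(termvectors_response, field):
    """Return the set of term strings reported for one field."""
    return set(termvectors_response["term_vectors"][field]["terms"])


def diff_terms(primary, replica, field):
    """Split the terms into primary-only, replica-only and shared."""
    p, r = terms_of(primary, field), terms_of(replica, field)
    return {
        "only_primary": sorted(p - r),
        "only_replica": sorted(r - p),
        "common": sorted(p & r),
    }


def trimmed(terms):
    # Build a response-shaped dict that carries only the term names.
    return {"term_vectors": {"estatename": {"terms": {t: {} for t in terms}}}}


# Term names copied from the primary and replica results in this thread.
primary = trimmed(["二", "二期", "厦", "期", "水语花城", "花城", "花溪", "苑",
                   "金厦水语花城", "金厦水语花城二期花溪苑"])
replica = trimmed(["二", "二期", "厦", "期", "水", "溪", "花", "花城", "苑", "语", "金"])

result = diff_terms(primary, replica, "estatename")
for key, terms in result.items():
    print(key, terms)
```

On this document the primary reports four compound terms that the replica does not (水语花城, 花溪, 金厦水语花城, 金厦水语花城二期花溪苑), while the replica reports five single-character terms that the primary does not (水, 溪, 花, 语, 金). In particular, the query term 花 appears as a standalone term only in the replica's output.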

storm


IK is the default analyzer, configured with "tokenizer": "ik_max_word". Excerpts from the index settings and the field mappings:
"creation_date": "1495445300345",
"analysis": {
  "filter": {
    "stop": {
      "type": "stop",
      "stopwords_path": "analysis/stopwords.dic"
    },
    "synonym": {
      "type": "synonym",
      "synonyms_path": "analysis/synonym.dic"
    }
  },
  "analyzer": {
    "default": {
      "filter": [
        "synonym"
      ],
      "char_filter": [
        "html_strip"
      ],
      "type": "custom",
      "tokenizer": "ik_max_word"
    }
  }
},


"address": {
  "type": "text"
},
"isonly": {
  "type": "boolean"
},
"estatename": {
  "type": "text"
},
"isexclusive": {
  "type": "boolean"
},
