
Search keyword order problem

Elasticsearch | Author: Xiaoming | Posted on October 25, 2018 | Views: 1638

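I have the following problem: searching for "呼吸睡眠" ("breathing sleep") also matches "睡眠呼吸" ("sleep breathing"). How can I enforce the term order so that "睡眠呼吸" is filtered out?

This is the query I am using. It still returns matches that look like "睡眠呼吸":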
{
  "query": {
    "function_score": {
      "query": {
        "bool": {
          "must": [
            {
              "terms": {
                "status": [1]
              }
            },
            {
              "match_phrase": {
                "title": {
                  "query": "呼吸睡眠",
                  "slop": 0
                }
              }
            }
          ]
        }
      },
      "field_value_factor": {
        "field": "create_time",
        "modifier": "log2p",
        "factor": 2
      }
    }
  },
  "from": 0,
  "size": 10,
  "_source": [
    "id",
    "title"
  ]
}
Mapping:
"mappings": {
  "list": {
    "properties": {
      "activity_id": {
        "type": "keyword"
      },
      "create_time": {
        "type": "integer"
      },
      "id": {
        "type": "keyword",
        "fields": {
          "int": {
            "type": "integer"
          }
        }
      },
      "is_closed": {
        "type": "keyword"
      },
      "is_del": {
        "type": "keyword"
      },
      "status": {
        "type": "keyword"
      },
      "title": {
        "type": "text",
        "analyzer": "ik_max_word"
      },
      "uid": {
        "type": "keyword"
      }
    }
  }
}
Analysis output:
GET my_index/_analyze
{
  "field": "title",
  "text": "呼吸睡眠"
}

{
  "tokens": [
    {
      "token": "呼吸",
      "start_offset": 0,
      "end_offset": 2,
      "type": "CN_WORD",
      "position": 0
    },
    {
      "token": "呼",
      "start_offset": 0,
      "end_offset": 1,
      "type": "CN_WORD",
      "position": 1
    },
    {
      "token": "吸",
      "start_offset": 1,
      "end_offset": 2,
      "type": "CN_WORD",
      "position": 2
    },
    {
      "token": "睡眠",
      "start_offset": 2,
      "end_offset": 4,
      "type": "CN_WORD",
      "position": 3
    },
    {
      "token": "睡",
      "start_offset": 2,
      "end_offset": 3,
      "type": "CN_WORD",
      "position": 4
    },
    {
      "token": "眠",
      "start_offset": 3,
      "end_offset": 4,
      "type": "CN_WORD",
      "position": 5
    }
  ]
}
Result:

WX20181026-095204@2x.png

 

Xiaoming - 80s


The slop setting has no effect.

rockybean - Elastic Certified Engineer, ElasticStack Fans, WeChat official account: ElasticTalk


I tested this and it works as expected. Take a look at what your own analyzer actually produces.
 
PUT test_phrase/doc/1
{
  "name": "呼吸睡眠"
}

PUT test_phrase/doc/2
{
  "name": "睡眠呼吸"
}

GET test_phrase/_search
{
  "query": {
    "match_phrase": {
      "name": {
        "query": "呼吸睡眠",
        "slop": 0
      }
    }
  }
}

GET test_phrase/_analyze
{
  "field": "name",
  "text": "呼吸睡眠"
}

rochy - rochy_he


Could you post the search results?
The query above looks fine to me.
Also, which analyzer are you using?
 
 

rochy - rochy_he


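Your result is matched mainly because of the text "呼吸及睡眠节律紊乱" ("breathing and sleep rhythm disorder").
When ik analyzes "呼吸及睡眠", the word "及" ("and") is removed as a stop word, so the tokens come out as 呼吸/睡眠.
That happens to match your query exactly.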
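
To make the reasoning concrete, here is a minimal Python sketch of exact phrase matching (slop 0) on token positions. It is only a simplified model, not Elasticsearch or Lucene code. The query tokens are copied from the _analyze output in the question. The document token streams are assumptions: the one for "睡眠呼吸" is built by analogy, and the ones for "呼吸及睡眠节律紊乱" assume that "及" is dropped without leaving a position gap and that the trailing tokens look as shown. The function name `phrase_matches` is hypothetical.

```python
# Simplified model of slop=0 phrase matching on token positions.
# NOT Elasticsearch/Lucene code; the document token streams are assumptions.

def phrase_matches(doc_tokens, query_tokens):
    """Return True if all query tokens occur in the document at the same
    relative position offsets as in the query (exact phrase, slop = 0)."""
    if not query_tokens:
        return False
    # Map each term to the set of positions where it occurs in the document.
    positions = {}
    for term, pos in doc_tokens:
        positions.setdefault(term, set()).add(pos)
    first_term, first_pos = query_tokens[0]
    # Try every occurrence of the first query term as the phrase start.
    for start in positions.get(first_term, ()):
        if all((start + (qpos - first_pos)) in positions.get(qterm, ())
               for qterm, qpos in query_tokens):
            return True
    return False

# Query tokens as reported by _analyze in the question.
QUERY = [("呼吸", 0), ("呼", 1), ("吸", 2), ("睡眠", 3), ("睡", 4), ("眠", 5)]

# Assumed token stream for "睡眠呼吸" (same tokens, reversed word order).
REVERSED = [("睡眠", 0), ("睡", 1), ("眠", 2), ("呼吸", 3), ("呼", 4), ("吸", 5)]

# Assumed token stream for "呼吸及睡眠节律紊乱" when "及" is dropped as a stop
# word with no position gap; the tokens after "眠" are illustrative.
WITH_STOPWORD_REMOVED = QUERY + [("节律", 6), ("紊乱", 7)]

# Same text if "及" were kept as a token: later positions shift by one.
WITH_STOPWORD_KEPT = [("呼吸", 0), ("呼", 1), ("吸", 2), ("及", 3),
                      ("睡眠", 4), ("睡", 5), ("眠", 6), ("节律", 7), ("紊乱", 8)]

print(phrase_matches(REVERSED, QUERY))               # False
print(phrase_matches(WITH_STOPWORD_REMOVED, QUERY))  # True
print(phrase_matches(WITH_STOPWORD_KEPT, QUERY))     # False
```

Under these assumptions, the reversed title does not match, while the title with the removed stop word does, which is consistent with what you are seeing. Running _analyze on the full matching title should confirm whether "及" is really being dropped.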
