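I have the following problem: when I search for "呼吸睡眠", documents containing "睡眠呼吸" are matched as well. How can I enforce the term order so that "睡眠呼吸" is filtered out?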
This is the query I am using; it still matches titles such as "睡眠呼吸":
{
  "query": {
    "function_score": {
      "query": {
        "bool": {
          "must": [
            {
              "terms": {
                "status": [
                  1
                ]
              }
            },
            {
              "match_phrase": {
                "title": {
                  "query": "呼吸睡眠",
                  "slop": 0
                }
              }
            }
          ]
        }
      },
      "field_value_factor": {
        "field": "create_time",
        "modifier": "log2p",
        "factor": 2
      }
    }
  },
  "from": 0,
  "size": 10,
  "_source": [
    "id",
    "title"
  ]
}
Mapping:
"mappings": {
  "list": {
    "properties": {
      "activity_id": {
        "type": "keyword"
      },
      "create_time": {
        "type": "integer"
      },
      "id": {
        "type": "keyword",
        "fields": {
          "int": {
            "type": "integer"
          }
        }
      },
      "is_closed": {
        "type": "keyword"
      },
      "is_del": {
        "type": "keyword"
      },
      "status": {
        "type": "keyword"
      },
      "title": {
        "type": "text",
        "analyzer": "ik_max_word"
      },
      "uid": {
        "type": "keyword"
      }
    }
  }
}
Analysis:
GET my_index/_analyze
{
  "field": "title",
  "text": "呼吸睡眠"
}
{
  "tokens": [
    {
      "token": "呼吸",
      "start_offset": 0,
      "end_offset": 2,
      "type": "CN_WORD",
      "position": 0
    },
    {
      "token": "呼",
      "start_offset": 0,
      "end_offset": 1,
      "type": "CN_WORD",
      "position": 1
    },
    {
      "token": "吸",
      "start_offset": 1,
      "end_offset": 2,
      "type": "CN_WORD",
      "position": 2
    },
    {
      "token": "睡眠",
      "start_offset": 2,
      "end_offset": 4,
      "type": "CN_WORD",
      "position": 3
    },
    {
      "token": "睡",
      "start_offset": 2,
      "end_offset": 3,
      "type": "CN_WORD",
      "position": 4
    },
    {
      "token": "眠",
      "start_offset": 3,
      "end_offset": 4,
      "type": "CN_WORD",
      "position": 5
    }
  ]
}
Result:
4 replies
Xiaoming - 80s
rockybean - Elastic Certified Engineer, ElasticStack Fans,公众号:ElasticTalk
rochy - rochy_he
The query above looks fine to me.
Also, which analyzer are you using?
rochy - rochy_he
When ik tokenizes "呼吸及睡眠", the character "及" is removed because it is a stop word, so the resulting tokens are: 呼吸/睡眠
That happens to match your query exactly.
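One way to check this explanation is to rerun the `_analyze` request from the question against the suspected title. This is only a sketch: it assumes the index is still named my_index as in the question and that the matched document's title contains "呼吸及睡眠"; the actual output depends on the ik version and its stop-word dictionary.

```
GET my_index/_analyze
{
  "field": "title",
  "text": "呼吸及睡眠"
}
```

If the response contains no token for "及" and the remaining tokens occupy the same consecutive positions as those shown for "呼吸睡眠", that would be consistent with the phrase query matching this title even with slop set to 0.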