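Suppose there is a document whose title is "北京?". Searching for "北京" hits it as expected, but how can I get a hit on this document by searching for "?" alone? The hanlp tokenizer does not filter out the question mark, so why does the search return nothing? Is it caused by stop words or by encoding? Any help would be appreciated.
Searching for "北京?":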
GET test_index/_search
{
"query": {
"match_phrase": {
"title.text": "北京?"
}
}
}
Result:
"hits" : [
{
"_index" : "test_index",
"_type" : "_doc",
"_id" : "1",
"_score" : 0.47000363,
"_source" : {
"title" : {
"text" : "北京?"
}
}
}
]
Searching for "?" returns "hits" : [ ]
6 replies
devinBing
2. If "北京" can be matched, then the problem is most likely the "?" symbol itself. Check the analyzer configuration.
oooishii - ES newbie
"settings": {
  "index.routing.allocation.require.rack": "cool",
  "number_of_shards": "8",
  "number_of_replicas": "0",
  "analysis": {
    "analyzer": {
      "my_hanlp_analyzer": { "tokenizer": "my_hanlp" }
    },
    "tokenizer": {
      "my_hanlp": {
        "type": "hanlp",
        "enable_index_mode": true,
        "enable_custom_dictionary": true,
        "enable_translated_name_recognize": true,
        "enable_name_recognize": true,
        "enable_stop_dictionary": true,
        "enable_custom_config": true,
        "enable_remote_dict": true
      }
    }
  }
},
"mappings": {
  "properties": {
    "title": { // name given to the resource
      "properties": {
        "text": {
          "type": "text",
          "analyzer": "my_hanlp_analyzer",
          "search_analyzer": "my_hanlp_analyzer",
          "fields": {
            "keyword": { "type": "keyword", "ignore_above": 256 }
          }
        }
      }
    }
  }
}
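One thing that may be worth trying, given that the tokenizer above sets `enable_stop_dictionary` to `true`: create a test index with the stop dictionary switched off and see whether "?" becomes searchable. This is only a sketch. The index name `test_index_nostop` is hypothetical, the other tokenizer options are omitted for brevity, and whether the stop dictionary actually contains "?" is an assumption that has not been confirmed in this thread.

```
# test_index_nostop is a hypothetical index name used only for this experiment
PUT test_index_nostop
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_hanlp_analyzer": { "tokenizer": "my_hanlp" }
      },
      "tokenizer": {
        "my_hanlp": {
          "type": "hanlp",
          "enable_index_mode": true,
          "enable_stop_dictionary": false
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "title": {
        "properties": {
          "text": { "type": "text", "analyzer": "my_hanlp_analyzer" }
        }
      }
    }
  }
}
```

After indexing the same document into this index, the `match_phrase` query for "?" can be repeated to compare the results.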
FFFrp
oooishii - ES newbie
GET _analyze
{
"analyzer":"hanlp",
"text":"北京?"
}
Result:
{
"tokens" : [
{
"token" : "北京",
"start_offset" : 0,
"end_offset" : 2,
"type" : "ns",
"position" : 0
},
{
"token" : "?",
"start_offset" : 2,
"end_offset" : 3,
"type" : "n",
"position" : 1
}
]
}
Analyzing with the Unicode escape "\uff1f":
GET _analyze
{
"analyzer":"hanlp",
"text":"北京\uff1f"
}
The result is the same:
{
"tokens" : [
{
"token" : "北京",
"start_offset" : 0,
"end_offset" : 2,
"type" : "ns",
"position" : 0
},
{
"token" : "?",
"start_offset" : 2,
"end_offset" : 3,
"type" : "n",
"position" : 1
}
]
}
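Note that both `_analyze` calls above name the plugin's `hanlp` analyzer, not the `my_hanlp_analyzer` that the `title.text` field is mapped to, so they may not reflect what happens at index and search time. A minimal check, assuming the index is `test_index` as in the queries, is to let Elasticsearch pick the analyzer from the field mapping:

```
GET test_index/_analyze
{
  "field": "title.text",
  "text": "北京?"
}
```

If the "?" token is missing from this output while it is present in the output above, the difference in tokenizer settings (for example the stop dictionary) is the likely cause rather than the encoding.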
oooishii - ES newbie
GET test_index/_search
{
"query": {
"match_phrase": {
"title.text": "\uff1f"
}
}
}
and searching with
GET test_index/_search
{
"query": {
"match_phrase": {
"title.text": "?"
}
}
}
both return no results:
"hits" : {
"total" : {
"value" : 0,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
}
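Since neither form of the question mark matches, it may help to look at which terms were actually written to the index for this document. A sketch using the term vectors API, assuming the document id is `1` as shown in the earlier search result:

```
GET test_index/_termvectors/1?fields=title.text
```

If the `terms` object in the response contains "北京" but no "?" entry, the question mark was dropped at index time and no query against `title.text` can match it until the analyzer is changed and the document is reindexed.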
Ombres