I first created an index with a nested object field:
# create index mapping
PUT /array-tag
{
  "mappings": {
    "chnTag": {
      "properties": {
        "Tags": {
          "type": "nested",
          "properties": {
            "Tags.tag_name": {
              "type": "text",
              "analyzer": "ik_smart"
            }
          }
        }
      }
    }
  }
}
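One thing worth checking in this mapping is how the inner key "Tags.tag_name" is resolved. Keys inside "properties" are relative to their parent, and Elasticsearch 5.0 and later treat dots in a field name as object paths, so this key may end up registered as Tags.Tags.tag_name rather than Tags.tag_name. The sketch below is plain Python, not an Elasticsearch API: field_paths is a hypothetical helper that mimics that expansion under this assumption, and the "relative" mapping is an illustrative alternative, not something from the original post.

```python
def field_paths(properties, prefix=""):
    """Return {full_field_path: field_definition} for a mapping's properties.

    Assumption: a property name is relative to its parent, and dots in a
    name are expanded into object paths (Elasticsearch 5.0+ behaviour).
    """
    paths = {}
    for name, definition in properties.items():
        full_name = f"{prefix}{name}"
        if "properties" in definition:
            # Object or nested field: recurse into its sub-fields.
            paths.update(field_paths(definition["properties"], full_name + "."))
        else:
            paths[full_name] = definition
    return paths


leaf = {"type": "text", "analyzer": "ik_smart"}

# Mapping as posted: the inner key repeats the parent name.
posted = {"Tags": {"type": "nested", "properties": {"Tags.tag_name": leaf}}}

# Hypothetical alternative: the inner key is relative to its parent.
relative = {"Tags": {"type": "nested", "properties": {"tag_name": leaf}}}

print(list(field_paths(posted)))    # ['Tags.Tags.tag_name']
print(list(field_paths(relative)))  # ['Tags.tag_name']
```

If that is what happened here, Tags.tag_name in an indexed document would be mapped dynamically with the default standard analyzer, which splits Chinese text into single characters. Running GET /array-tag/_mapping after indexing a document should confirm or rule this out.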
Then I inserted one document:
# insert docs
POST /array-tag/chnTag?pretty
{
  "img_id": "img001",
  "Tags": [
    {
      "tag_name": "羽毛球",
      "tag_score": "10",
      "tag_source": "A"
    },
    {
      "tag_name": "李宗伟",
      "tag_score": "9",
      "tag_source": "B"
    },
    {
      "tag_name": "林丹",
      "tag_score": "8",
      "tag_source": "C"
    },
    {
      "tag_name": "世界冠军",
      "tag_score": "8",
      "tag_source": "C"
    }
  ]
}
Finally, I ran a nested query:
# query with nested
GET /array-tag/_search?search_type=dfs_query_then_fetch&explain
{
  "query": {
    "nested": {
      "path": "Tags",
      "query": {
        "match": {
          "Tags.tag_name": "林丹 羽毛球"
        }
      }
    }
  }
}
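However, the explain output shows that the text is still not tokenized correctly: the query is matched on single characters such as 羽 and 毛 instead of whole words. I am sure the ik plugin is configured properly.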
Part of the query result is shown below:
"_explanation": {
  "value": 2.8700664,
  "description": "Score based on 2 child docs in range from 0 to 3, best match:",
  "details": [
    {
      "value": 3.1784883,
      "description": "sum of:",
      "details": [
        {
          "value": 1.059496,
          "description": "weight(Tags.tag_name:羽 in 3) [PerFieldSimilarity], result of:",
          "details": [
            {
              "value": 1.059496,
              "description": "score(doc=3,freq=1.0 = termFreq=1.0\n), product of:",
              "details": [
                {
                  "value": 1.2039728,
                  "description": "idf, computed as log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5)) from:",
                  "details": [
                    {
                      "value": 1,
                      "description": "docFreq",
                      "details": []
                    },
                    {
                      "value": 4,
                      "description": "docCount",
                      "details": []
                    }
                  ]
                },
                {
                  "value": 0.88,
                  "description": "tfNorm, computed as (freq * (k1 + 1)) / (freq + k1 * (1 - b + b * fieldLength / avgFieldLength)) from:",
                  "details": [
                    {
                      "value": 1,
                      "description": "termFreq=1.0",
                      "details": []
                    },
                    {
                      "value": 1.2,
                      "description": "parameter k1",
                      "details": []
                    },
                    {
                      "value": 0.75,
                      "description": "parameter b",
                      "details": []
                    },
                    {
                      "value": 3,
                      "description": "avgFieldLength",
                      "details": []
                    },
                    {
                      "value": 4,
                      "description": "fieldLength",
                      "details": []
                    }
                  ]
                }
              ]
            }
          ]
        },
        {
          "value": 1.059496,
          "description": "weight(Tags.tag_name:毛 in 3) [PerFieldSimilarity], result of:",
          "details": [
            {
              "value": 1.059496,
              "description": "score(doc=3,freq=1.0 = termFreq=1.0\n), product of:",
              "details": [
                {
                  "value": 1.2039728,
                  "description": "idf, computed as log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5)) from:",
                  "details": [
                    {
                      "value": 1,
                      "description": "docFreq",
                      "details": []
                    },
                    {
                      "value": 4,
                      "description": "docCount",
                      "details": []
                    }
                  ]
                },
                {
                  "value": 0.88,
                  "description": "tfNorm, computed as (freq * (k1 + 1)) / (freq + k1 * (1 - b + b * fieldLength / avgFieldLength)) from:",
                  "details": [
                    {
                      "value": 1,
                      "description": "termFreq=1.0",
                      "details": []
                    },
                    {
                      "value": 1.2,
                      "description": "parameter k1",
                      "details": []
                    },
                    {
                      "value": 0.75,
                      "description": "parameter b",
                      "details": []
                    },
                    {
                      "value": 3,
                      "description": "avgFieldLength",
                      "details": []
                    },
                    {
                      "value": 4,
                      "description": "fieldLength",
                      "details": []
                    }
                  ]
                }
              ]
            }
          ]
        },
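Reading the matched terms out of a deeply nested explain tree by eye is error-prone, so a small script can list them. This is a minimal sketch, assuming the "weight(field:term in doc)" description format seen in the excerpt above; matched_terms and WEIGHT_RE are hypothetical names, and the sample input is a trimmed copy of that excerpt with only the descriptions kept.

```python
import re

# Matches descriptions such as
# "weight(Tags.tag_name:羽 in 3) [PerFieldSimilarity], result of:"
WEIGHT_RE = re.compile(r"weight\((?P<field>[^:]+):(?P<term>.+?) in \d+\)")


def matched_terms(explanation):
    """Collect (field, term) pairs from an _explanation dict, depth first."""
    found = []
    match = WEIGHT_RE.match(explanation.get("description", ""))
    if match:
        found.append((match.group("field"), match.group("term")))
    for detail in explanation.get("details", []):
        found.extend(matched_terms(detail))
    return found


# Trimmed copy of the excerpt above, keeping only the descriptions.
excerpt = {
    "description": "sum of:",
    "details": [
        {"description": "weight(Tags.tag_name:羽 in 3) [PerFieldSimilarity], result of:", "details": []},
        {"description": "weight(Tags.tag_name:毛 in 3) [PerFieldSimilarity], result of:", "details": []},
    ],
}

print(matched_terms(excerpt))  # [('Tags.tag_name', '羽'), ('Tags.tag_name', '毛')]
```

Single-character terms like these are what a per-character tokenizer would produce for Chinese text, which suggests the queried field is not being analyzed with ik_smart.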
Test of the ik configuration:
{
  "tokens": [
    {
      "token": "羽毛球",
      "start_offset": 0,
      "end_offset": 3,
      "type": "CN_WORD",
      "position": 0
    },
    {
      "token": "林丹",
      "start_offset": 3,
      "end_offset": 5,
      "type": "CN_WORD",
      "position": 1
    },
    {
      "token": "世界冠军",
      "start_offset": 5,
      "end_offset": 9,
      "type": "CN_WORD",
      "position": 2
    }
  ]
}
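The output above shows that ik works when it is applied, but it does not show which analyzer the queried field actually uses. The _analyze API also accepts a "field" parameter on an index, which runs the analyzer mapped to that field. The sketch below only builds such a request and sends nothing; analyze_request is a hypothetical helper, and the sample text is an assumption reconstructed from the token offsets above, since the original test request is not shown.

```python
import json


def analyze_request(index, text, field=None, analyzer=None):
    """Build the path and JSON body for an _analyze call (no request is sent)."""
    if (field is None) == (analyzer is None):
        raise ValueError("pass exactly one of field or analyzer")
    body = {"text": text}
    if field is not None:
        body["field"] = field        # use the analyzer mapped to this field
    else:
        body["analyzer"] = analyzer  # use the named analyzer directly
    return f"/{index}/_analyze", json.dumps(body, ensure_ascii=False)


# Sample text assumed from the offsets 0-3, 3-5 and 5-9 in the output above.
path, body = analyze_request("array-tag", "羽毛球林丹世界冠军", field="Tags.tag_name")
print("GET", path)
print(body)
```

If the field-based call returns single characters while the analyzer-based call returns whole words, the problem is in the mapping of Tags.tag_name rather than in the ik installation.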
Could anyone tell me how to make the nested query tokenize the text correctly?
1 reply
aillycs