
The ik analyzer and Elasticsearch query conditions

Elasticsearch | Author: Frank007 | Posted 2016-12-01 | Views: 6092

title:百度张亚勤:ABC时代来了,迎战云计算“马拉松”
When creating the mapping, I set the analyzer of the title field to ik. The analysis result for the title above is:
{
  "tokens": [
    { "token": "百度", "start_offset": 0, "end_offset": 2, "type": "CN_WORD", "position": 0 },
    { "token": "百", "start_offset": 0, "end_offset": 1, "type": "TYPE_CNUM", "position": 1 },
    { "token": "度", "start_offset": 1, "end_offset": 2, "type": "COUNT", "position": 2 },
    { "token": "张", "start_offset": 2, "end_offset": 3, "type": "CN_CHAR", "position": 3 },
    { "token": "亚", "start_offset": 3, "end_offset": 4, "type": "CN_WORD", "position": 4 },
    { "token": "勤", "start_offset": 4, "end_offset": 5, "type": "CN_WORD", "position": 5 },
    { "token": "abc", "start_offset": 6, "end_offset": 9, "type": "ENGLISH", "position": 6 },
    { "token": "时代", "start_offset": 9, "end_offset": 11, "type": "CN_WORD", "position": 7 },
    { "token": "来了", "start_offset": 11, "end_offset": 13, "type": "CN_WORD", "position": 8 },
    { "token": "迎战", "start_offset": 14, "end_offset": 16, "type": "CN_WORD", "position": 9 },
    { "token": "战云", "start_offset": 15, "end_offset": 17, "type": "CN_WORD", "position": 10 },
    { "token": "云", "start_offset": 16, "end_offset": 17, "type": "CN_WORD", "position": 11 },
    { "token": "计算", "start_offset": 17, "end_offset": 19, "type": "CN_WORD", "position": 12 },
    { "token": "马拉松", "start_offset": 20, "end_offset": 23, "type": "CN_WORD", "position": 13 },
    { "token": "马拉", "start_offset": 20, "end_offset": 22, "type": "CN_WORD", "position": 14 },
    { "token": "松", "start_offset": 22, "end_offset": 23, "type": "CN_WORD", "position": 15 }
  ]
}
Searching for 张亚勤 or 云计算 returns no results.
The query is:
{
  "query": {
    "term": {
      "title": "张亚勤"
    }
  }
}
The Java code is:

response = client.prepareSearch("blog")
        .setTypes("article")
        .setQuery(QueryBuilders.termQuery("title", "张亚勤"))
        .setFrom(0)
        .setSize(60)
        .setExplain(true)
        .execute()
        .actionGet();
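Question: most of the tokens ik produces are two characters long, so a three-character term does not match. How can I change the ik configuration, or the query, so that the search returns the expected results?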

ybtsdst - focus on lucene & es

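1. Try ik_smart.
2. Replace term with match. A term query looks up the query text as a single, un-analyzed token, and the index contains no token 张亚勤.
3. Add words such as 张亚勤 and 云计算 to the ik dictionary.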
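
To make suggestions 2 and 3 concrete, here is a small self-contained sketch of why the term query misses and why a match query, or a richer dictionary, would hit. It does not use Elasticsearch or ik: `analyze` is a hypothetical toy tokenizer (greedy longest match against a dictionary, falling back to single characters), and `TermVsMatchDemo`, `termMatches` and `matchMatches` are names made up for illustration. In the real client code, suggestion 2 should amount to swapping `QueryBuilders.termQuery` for `QueryBuilders.matchQuery`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class TermVsMatchDemo {

    // Toy stand-in for an analyzer (NOT ik itself): greedy longest match
    // against a dictionary, falling back to single characters.
    static List<String> analyze(String text, Set<String> dictionary) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            int end = i + 1; // fallback: emit one character
            for (int j = text.length(); j > i + 1; j--) {
                if (dictionary.contains(text.substring(i, j))) {
                    end = j; // longest dictionary word starting at i
                    break;
                }
            }
            tokens.add(text.substring(i, end));
            i = end;
        }
        return tokens;
    }

    // term query: the query text is looked up as one un-analyzed token.
    static boolean termMatches(Set<String> indexedTokens, String queryText) {
        return indexedTokens.contains(queryText);
    }

    // match query (default OR operator): the query text is analyzed first,
    // and the document matches if any resulting token is indexed.
    static boolean matchMatches(Set<String> indexedTokens, String queryText,
                                Set<String> dictionary) {
        for (String token : analyze(queryText, dictionary)) {
            if (indexedTokens.contains(token)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String title = "张亚勤迎战云计算"; // shortened title, for illustration only

        // Dictionary without the person's name: the name is split into characters.
        Set<String> dictionary = new HashSet<>(Arrays.asList("迎战", "计算"));
        Set<String> indexed = new HashSet<>(analyze(title, dictionary));
        System.out.println("term:  " + termMatches(indexed, "张亚勤"));
        System.out.println("match: " + matchMatches(indexed, "张亚勤", dictionary));

        // Suggestion 3: add the word to the dictionary and re-index.
        dictionary.add("张亚勤");
        Set<String> reindexed = new HashSet<>(analyze(title, dictionary));
        System.out.println("term after dictionary update: "
                + termMatches(reindexed, "张亚勤"));
    }
}
```

Under these assumptions the first term lookup fails because only 张, 亚 and 勤 were indexed, while the match lookup succeeds because the query is split the same way. Note that with OR semantics a match query can also hit documents that contain only one of those characters, and that a dictionary change only helps documents indexed after the change.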
