Why does my Elasticsearch fuzzy search return nothing? I am already using ngram

Elasticsearch | Author: yuechen323 | Posted October 26, 2016 | Views: 6808

I want to implement a fuzzy (substring) search, the equivalent of SQL like '%keyword%', so I created the following index:

PUT myidx1
{
  "_all": {
    "enabled": false
  },
  "settings": {
    "analysis": {
      "tokenizer": {
        "my_ngram": {
          "type": "nGram",
          "min_gram": "1",
          "max_gram": "20",
          "token_chars": [
            "letter",
            "digit"
          ]
        }
      },
      "analyzer": {
        "mylike": {
          "tokenizer": "my_ngram",
          "filter": [
            "lowercase"
          ]
        }
      }
    }
  },
  "mapping": {
    "mytype": {
      "dynamic": false,
      "properties": {
        "name": {
          "type": "string",
          "analyzer": "mylike"
        }
      }
    }
  }
}

Then I tested the analyzer:

POST myidx1/_analyze
{
    "analyzer": "mylike",
    "text": "文档3-aaa111"
}

The result looks fine:

{
   "tokens": [
      { "token": "文", "start_offset": 0, "end_offset": 1, "type": "word", "position": 0 },
      { "token": "文档", "start_offset": 0, "end_offset": 2, "type": "word", "position": 1 },
      { "token": "文档3", "start_offset": 0, "end_offset": 3, "type": "word", "position": 2 },
      { "token": "档", "start_offset": 1, "end_offset": 2, "type": "word", "position": 3 },
      { "token": "档3", "start_offset": 1, "end_offset": 3, "type": "word", "position": 4 },
      { "token": "3", "start_offset": 2, "end_offset": 3, "type": "word", "position": 5 },
      { "token": "a", "start_offset": 4, "end_offset": 5, "type": "word", "position": 6 },
      { "token": "aa", "start_offset": 4, "end_offset": 6, "type": "word", "position": 7 },
      { "token": "aaa", "start_offset": 4, "end_offset": 7, "type": "word", "position": 8 },
      { "token": "aaa1", "start_offset": 4, "end_offset": 8, "type": "word", "position": 9 },
      { "token": "aaa11", "start_offset": 4, "end_offset": 9, "type": "word", "position": 10 },
      { "token": "aaa111", "start_offset": 4, "end_offset": 10, "type": "word", "position": 11 },
      { "token": "a", "start_offset": 5, "end_offset": 6, "type": "word", "position": 12 },
      { "token": "aa", "start_offset": 5, "end_offset": 7, "type": "word", "position": 13 },
      { "token": "aa1", "start_offset": 5, "end_offset": 8, "type": "word", "position": 14 },
      { "token": "aa11", "start_offset": 5, "end_offset": 9, "type": "word", "position": 15 },
      { "token": "aa111", "start_offset": 5, "end_offset": 10, "type": "word", "position": 16 },
      { "token": "a", "start_offset": 6, "end_offset": 7, "type": "word", "position": 17 },
      { "token": "a1", "start_offset": 6, "end_offset": 8, "type": "word", "position": 18 },
      { "token": "a11", "start_offset": 6, "end_offset": 9, "type": "word", "position": 19 },
      { "token": "a111", "start_offset": 6, "end_offset": 10, "type": "word", "position": 20 },
      { "token": "1", "start_offset": 7, "end_offset": 8, "type": "word", "position": 21 },
      { "token": "11", "start_offset": 7, "end_offset": 9, "type": "word", "position": 22 },
      { "token": "111", "start_offset": 7, "end_offset": 10, "type": "word", "position": 23 },
      { "token": "1", "start_offset": 8, "end_offset": 9, "type": "word", "position": 24 },
      { "token": "11", "start_offset": 8, "end_offset": 10, "type": "word", "position": 25 },
      { "token": "1", "start_offset": 9, "end_offset": 10, "type": "word", "position": 26 }
   ]
}

All the n-grams come out as expected. Then I indexed a few documents:

POST myidx1/mytype/_bulk
{ "index": { "_id": 4 }}
{ "name": "文档3-aaa111" }
{ "index": { "_id": 5 }}
{ "name": "yyy111" }
{ "index": { "_id": 6 }}
{ "name": "yyy111" }

Then I tested the query, but it returns no hits, and it is driving me crazy:

GET myidx1/mytype/_search
{
    "query": {
        "match": {
            "name": "1"
        }
    }
}

Am I really stuck with a wildcard query? I refuse to accept that.
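
One way to narrow this down, sketched here on the assumption that the cluster is Elasticsearch 2.x (suggested by the string field type), is to run the analyze request against the field instead of naming the analyzer. The body below would be sent to GET myidx1/_analyze; it uses only the index, field, and sample text from this post.

```json
{
  "field": "name",
  "text": "文档3-aaa111"
}
```

If the tokens returned are not the n-gram list shown earlier (for example, there is no standalone "1"), then the name field is not being analyzed with mylike, which would explain why a match query for "1" finds nothing.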

1 reply

medcl - Hunting tigers tonight.

Upvoted by: yuechen323

Run

GET myidx1/_mapping

and take a look at what it returns.
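
A likely explanation, offered as an assumption since the thread does not confirm it: the create-index request uses the key "mapping" where Elasticsearch expects "mappings", and puts "_all" at the top level instead of inside the type mapping. On Elasticsearch 2.x (assumed from the string field type) those unrecognized keys would probably be ignored, so name ends up dynamically mapped with the default analyzer, and GET myidx1/_mapping would show no mylike analyzer on it. A corrected body for PUT myidx1, to be sent after removing the old index with DELETE myidx1, might look like this:

```json
{
  "settings": {
    "analysis": {
      "tokenizer": {
        "my_ngram": {
          "type": "nGram",
          "min_gram": "1",
          "max_gram": "20",
          "token_chars": ["letter", "digit"]
        }
      },
      "analyzer": {
        "mylike": {
          "tokenizer": "my_ngram",
          "filter": ["lowercase"]
        }
      }
    }
  },
  "mappings": {
    "mytype": {
      "_all": { "enabled": false },
      "dynamic": false,
      "properties": {
        "name": { "type": "string", "analyzer": "mylike" }
      }
    }
  }
}
```

After recreating the index this way and re-running the _bulk request, the match query for "1" should be able to hit documents 4, 5, and 6, since each name value produces a "1" n-gram.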
