Elasticsearch

A problem with IK custom dictionary words and stop words; alternatively, how can a script filter out data I don't want returned?

a1667499668 asked a question • 1 follower • 0 replies • 3703 views • 2023-04-26 18:16 • from related topics

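I use ES for search. For example, when a user types 柠檬 (lemon), results such as lemon soda and lemon-flavored toothpaste rank first, while the lemon the user actually wants, the fruit, ranks lower. I have already added 柠檬 to the Chinese analyzer's dictionary, but it still does not help.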

YuLiGod replied • 36 followers • 17 replies • 17265 views • 2023-04-24 11:00 • from related topics

Problem inserting data into a newer ES version using old-version commands!

FFFrp replied • 3 followers • 2 replies • 3335 views • 2023-04-23 14:51 • from related topics

In ES, can Painless convert a JSON string into an array or list?

Ombres replied • 3 followers • 2 replies • 4713 views • 2023-04-23 10:51 • from related topics

Can ES be configured to retry internally?

Charele replied • 4 followers • 3 replies • 4982 views • 2023-04-22 16:30 • from related topics

With ngram tokenization, an AND search does not return the results I expect; could someone help take a look?

YuLiGod replied • 4 followers • 4 replies • 1988 views • 2023-04-14 08:42 • from related topics

A Douban Movie Top 250 search demo built with Web Scraper + Elasticsearch + Kibana + SearchKit

森 published an article • 0 comments • 6149 views • 2023-04-09 10:56 • from related topics

Author: 小森同学

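Disclaimer: the movie data comes from Douban Movies (豆瓣电影). If it infringes any rights, please contact me and it will be removed.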

Web Scraper

{
  "_id": "top250",
  "startUrl": ["https://movie.douban.com/top250?start=[0-225:25]&filter="],
  "selectors": [
    { "id": "container", "multiple": true, "parentSelectors": ["_root"], "selector": ".grid_view li", "type": "SelectorElement" },
    { "id": "name", "multiple": false, "parentSelectors": ["container"], "regex": "", "selector": "span.title:nth-of-type(1)", "type": "SelectorText" },
    { "id": "number", "multiple": false, "parentSelectors": ["container"], "regex": "", "selector": "em", "type": "SelectorText" },
    { "id": "score", "multiple": false, "parentSelectors": ["container"], "regex": "", "selector": "span.rating_num", "type": "SelectorText" },
    { "id": "review", "multiple": false, "parentSelectors": ["container"], "regex": "", "selector": "span.inq", "type": "SelectorText" },
    { "id": "year", "multiple": false, "parentSelectors": ["container"], "regex": "\\d{4}", "selector": "p:nth-of-type(1)", "type": "SelectorText" },
    { "id": "tour_guide", "multiple": false, "parentSelectors": ["container"], "regex": "^导演: \\S*", "selector": "p:nth-of-type(1)", "type": "SelectorText" },
    { "id": "type", "multiple": false, "parentSelectors": ["container"], "regex": "[^/]+$", "selector": "p:nth-of-type(1)", "type": "SelectorText" },
    { "id": "area", "multiple": false, "parentSelectors": ["container"], "regex": "[^\\/]+(?=\\/[^\\/]*$)", "selector": "p:nth-of-type(1)", "type": "SelectorText" },
    { "id": "detail_link", "multiple": false, "parentSelectors": ["container"], "selector": ".hd a", "type": "SelectorLink" },
    { "id": "director", "multiple": false, "parentSelectors": ["detail_link"], "regex": "", "selector": "span:nth-of-type(1) .attrs a", "type": "SelectorText" },
    { "id": "screenwriter", "multiple": false, "parentSelectors": ["detail_link"], "regex": "(?<=编剧: )[\\u4e00-\\u9fa5A-Za-z0-9/()\\·\\s]+(?=主演)", "selector": "div#info", "type": "SelectorText" },
    { "id": "film_length", "multiple": false, "parentSelectors": ["detail_link"], "regex": "\\d+", "selector": "span[property='v:runtime']", "type": "SelectorText" },
    { "id": "IMDb", "multiple": false, "parentSelectors": ["detail_link"], "regex": "(?<=[IMDb:\\s+])\\S*(?=\\d*$)", "selector": "div#info", "type": "SelectorText" },
    { "id": "language", "multiple": false, "parentSelectors": ["detail_link"], "regex": "(?<=语言: )\\S+", "selector": "div#info", "type": "SelectorText" },
    { "id": "alias", "multiple": false, "parentSelectors": ["detail_link"], "regex": "(?<=又名: )[\\u4e00-\\u9fa5A-Za-z0-9/()\\s]+(?=IMDb)", "selector": "div#info", "type": "SelectorText" },
    { "id": "pic", "multiple": false, "parentSelectors": ["container"], "selector": "img", "type": "SelectorImage" }
  ]
}

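To make the sitemap easier to read, here is a minimal Python sketch of two things it relies on: the [0-225:25] range in startUrl, which stands for pages 0, 25, ..., 225, and the year, area and type regexes, which all read the same paragraph. The helper expand_range_url and the sample line are made up for illustration, and the sketch assumes the first regex match is the value that gets kept; it only mimics the behavior and is not Web Scraper's own code.

```python
import re

START_URL = "https://movie.douban.com/top250?start=[0-225:25]&filter="


def expand_range_url(url: str) -> list[str]:
    """Expand one [start-end:step] range (step optional) into concrete URLs.

    Hypothetical helper: it ignores zero-padded ranges and handles only
    the first range found in the URL.
    """
    match = re.search(r"\[(\d+)-(\d+)(?::(\d+))?\]", url)
    if match is None:
        return [url]
    start, end = int(match.group(1)), int(match.group(2))
    step = int(match.group(3)) if match.group(3) else 1
    prefix, suffix = url[:match.start()], url[match.end():]
    # The end of the range is inclusive, hence end + 1.
    return [f"{prefix}{n}{suffix}" for n in range(start, end + 1, step)]


urls = expand_range_url(START_URL)
print(len(urls), urls[0], urls[-1])

# Made-up sample of the "year / area / type" tail of the list-page paragraph.
sample = "1994 / 美国 / 犯罪 剧情"

# The same patterns as in the sitemap, written as Python raw strings.
patterns = {
    "year": r"\d{4}",
    "area": r"[^\/]+(?=\/[^\/]*$)",
    "type": r"[^/]+$",
}
extracted = {
    field: re.search(pattern, sample).group(0).strip()
    for field, pattern in patterns.items()
}
print(extracted)
```

The type value still holds several genres separated by spaces, which is why it is split into an array later at indexing time.
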
Elasticsearch

{
  "mappings": {
    "properties": {
      "IMDb": { "type": "keyword", "copy_to": ["all"] },
      "alias": { "type": "text", "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } }, "copy_to": ["all"], "analyzer": "ik_max_word", "search_analyzer": "ik_smart" },
      "all": { "type": "text", "analyzer": "ik_max_word", "search_analyzer": "ik_smart" },
      "area": { "type": "text", "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } }, "copy_to": ["all"], "analyzer": "ik_max_word", "search_analyzer": "ik_smart" },
      "director": { "type": "text", "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } }, "copy_to": ["all"], "analyzer": "ik_max_word", "search_analyzer": "ik_smart" },
      "film_length": { "type": "long" },
      "id": { "type": "keyword" },
      "language": { "type": "text", "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } }, "copy_to": ["all"], "analyzer": "ik_max_word", "search_analyzer": "ik_smart" },
      "link": { "type": "keyword" },
      "name": { "type": "text", "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } }, "copy_to": ["all"], "analyzer": "ik_max_word", "search_analyzer": "ik_smart" },
      "number": { "type": "long" },
      "photo": { "type": "keyword" },
      "review": { "type": "text", "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } }, "copy_to": ["all"], "analyzer": "ik_max_word", "search_analyzer": "ik_smart" },
      "score": { "type": "double" },
      "screenwriter": { "type": "text", "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } }, "copy_to": ["all"], "analyzer": "ik_max_word", "search_analyzer": "ik_smart" },
      "type": { "type": "text", "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } }, "copy_to": ["all"], "analyzer": "ik_max_word", "search_analyzer": "ik_smart" },
      "year": { "type": "long" }
    }
  }
}
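
Most text fields in this mapping share one definition: ik_max_word at index time, ik_smart at search time, a keyword sub-field, and copy_to into the all field. As a rough sketch, the same mapping could be generated in Python rather than written by hand. The names ik_text_field and build_mapping are hypothetical, and the result is only meant as the request body for creating an index on a cluster with the IK analysis plugin installed.

```python
import json

# Text fields that share the same IK definition in the mapping above.
TEXT_FIELDS = [
    "alias", "area", "director", "language",
    "name", "review", "screenwriter", "type",
]


def ik_text_field() -> dict:
    """One analyzed text field with a keyword sub-field, copied into "all"."""
    return {
        "type": "text",
        "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
        "copy_to": ["all"],
        "analyzer": "ik_max_word",     # fine-grained tokens at index time
        "search_analyzer": "ik_smart",  # coarser tokens at search time
    }


def build_mapping() -> dict:
    """Rebuild the same mapping as above from a few field lists."""
    properties = {name: ik_text_field() for name in TEXT_FIELDS}
    # Catch-all field that receives every copy_to value.
    properties["all"] = {
        "type": "text",
        "analyzer": "ik_max_word",
        "search_analyzer": "ik_smart",
    }
    properties["IMDb"] = {"type": "keyword", "copy_to": ["all"]}
    for name in ("id", "link", "photo"):
        properties[name] = {"type": "keyword"}
    for name in ("film_length", "number", "year"):
        properties[name] = {"type": "long"}
    properties["score"] = {"type": "double"}
    return {"mappings": {"properties": properties}}


mapping = build_mapping()
print(json.dumps(mapping, ensure_ascii=False, indent=2))
```

Searching the single all field then covers names, directors, aliases and the other copied fields in one query.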

Kibana

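An ingest pipeline is needed to process the fields before indexing, for example splitting type on spaces into an array. See the official documentation or other blog posts for details.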
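
As an illustration of that step, here is a minimal sketch of such a pipeline, assuming the built-in trim and split processors are enough for the type field. The pipeline id douban_top250 and the helper approximate_split are made-up names; the helper only approximates locally what the two processors would do and is no substitute for testing against a real cluster.

```python
import json
import re

PIPELINE_ID = "douban_top250"  # hypothetical pipeline id

# Request body for PUT _ingest/pipeline/<id>: trim the value first,
# then split it on runs of whitespace.
pipeline = {
    "description": "Split the space-separated type field into an array",
    "processors": [
        {"trim": {"field": "type", "ignore_missing": True}},
        {"split": {"field": "type", "separator": "\\s+", "ignore_missing": True}},
    ],
}


def approximate_split(doc: dict) -> dict:
    """Rough local approximation of the two processors, for illustration only."""
    result = dict(doc)
    if "type" in result:
        result["type"] = re.split(r"\s+", result["type"].strip())
    return result


print(f"PUT _ingest/pipeline/{PIPELINE_ID}")
print(json.dumps(pipeline, ensure_ascii=False, indent=2))

# Made-up sample document with a leading space, as scraped text often has.
print(approximate_split({"name": "example", "type": " 犯罪 剧情"}))
```

With type stored as an array, each genre can be filtered or aggregated on its own through the type.keyword sub-field.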

Building the dashboard is omitted here; please look it up yourself.

SearchKit

See https://github.com/searchkit/searchkit-starter-app