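Building a Stanford NLP tokenizer plugin for ES

Stanford NLP?

The Stanford tokenizer is an open-source tokenizer maintained by the NLP group at Stanford University. It supports a range of languages, including Chinese and English, and beyond tokenization it also offers features such as part-of-speech tagging and sentiment analysis. The original post links to the project's two home pages here.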

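Why Stanford CoreNLP?

There are plenty of well-known open-source tokenizers, such as IK and Jieba, plus open-source and commercial tokenizers from other teams and companies, each with its own strengths and weaknesses. After comparing a large batch of tokenization cases across them, we found that Stanford NLP seemed to fit our current needs best, because we need more than tokenization: we also need features such as sentiment analysis.

Our company does search and recommendation over financial data. After comparing the available tokenizers, our boss concluded that Stanford NLP gave the best results. Coming from an algorithms background, though, he built a very heavyweight service for tokenization, ranking, and search on top of it.

That service works well for searching material such as research reports and financial statements, but it is very cumbersome for searching economic news.

Against this background, I set out to bring the Stanford tokenizer my boss favors into ES. The goal is to meet his search and tokenization requirements while bypassing his heavyweight tokenization-and-ranking service, so that the large volume of economic and financial news in our system can be tokenized and indexed in ES, with tokenization and retrieval results similar to those of his own service.

Why this project

I searched and asked around on Baidu, Google, GitHub, and the Elastic Chinese community (Elastic中文社区), but there did not seem to be a tokenizer plugin that works out of the box. That left me one option: build my own plugin that wraps this tokenizer.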

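How

From ES's point of view, a plugin has two main parts:

- The part that ES can see (class extends Plugin)
- The part that does the actual work (functional part)

plugin

For ES to load our plugin, the plugin class first has to extend Plugin. Because this is an analysis plugin, it also has to implement the AnalysisPlugin interface.
Anyone who has read the ES source or the source of another tokenizer plugin will know that an analysis plugin overrides two methods: one provides the tokenizer and the other provides the analyzer.
- The override returning Map<String, AnalysisModule.AnalysisProvider<TokenizerFactory>> registers the tokenizer under a name.
- The override returning Map<String, AnalysisModule.AnalysisProvider<AnalyzerProvider<? extends Analyzer>>> registers the analyzer under a name.
Either one can then be referenced by that name at index time or at search time. In this plugin, the Tokenizer does most of the tokenization work.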

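functional class

A tokenizer, and the Tokenizer class in particular, does its work by overriding three methods:

- incrementToken: advances to the next token and emits each word (or character) together with its position, length, and so on
- reset: resets the tokenizer state
- end: tells ES that tokenization of this piece of text has finished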
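
The call sequence behind these three methods can be illustrated with a minimal sketch in plain Java. It is not the plugin's code and does not use the real Lucene Tokenizer API (which reads from a Reader and exposes terms and offsets through attributes). The class WhitespaceTokenizerSketch and its tokenize helper are hypothetical, and whitespace splitting is only an assumed stand-in for the Stanford segmenter. What it does show is the order a consumer follows: reset first, incrementToken until it returns false, then end.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for a Lucene-style Tokenizer.
// It mirrors only the reset / incrementToken / end call sequence;
// it is NOT the real Lucene or Elasticsearch API.
class WhitespaceTokenizerSketch {
    private final String input;
    private int cursor;        // index of the next character to inspect
    private String term;       // text of the current token
    private int startOffset;   // inclusive start of the current token
    private int endOffset;     // exclusive end of the current token
    private int finalOffset;   // set by end(): offset just past the last character

    WhitespaceTokenizerSketch(String input) {
        this.input = input;
    }

    // Clears all per-stream state so tokenization starts from the beginning.
    void reset() {
        cursor = 0;
        term = null;
        startOffset = 0;
        endOffset = 0;
        finalOffset = 0;
    }

    // Advances to the next token. Returns false once the input is exhausted.
    // Whitespace splitting is an assumption standing in for the Stanford segmenter.
    boolean incrementToken() {
        while (cursor < input.length() && Character.isWhitespace(input.charAt(cursor))) {
            cursor++;
        }
        if (cursor >= input.length()) {
            return false;
        }
        startOffset = cursor;
        while (cursor < input.length() && !Character.isWhitespace(input.charAt(cursor))) {
            cursor++;
        }
        endOffset = cursor;
        term = input.substring(startOffset, endOffset);
        return true;
    }

    // Signals that the text is finished and records the final offset.
    void end() {
        finalOffset = input.length();
    }

    int getFinalOffset() {
        return finalOffset;
    }

    // Drives the tokenizer the way a consumer would: reset, loop, end.
    static List<String> tokenize(String text) {
        WhitespaceTokenizerSketch t = new WhitespaceTokenizerSketch(text);
        List<String> out = new ArrayList<>();
        t.reset();
        while (t.incrementToken()) {
            out.add(t.term + "[" + t.startOffset + "," + t.endOffset + ")");
        }
        t.end();
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("stock price  rises"));
    }
}
```

In a real plugin the loop body of incrementToken would pull the next segment from the Stanford segmenter instead of scanning for whitespace, but the state handling in reset and the final-offset bookkeeping in end would play the same roles.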

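So these three methods are the main things to override. For the tokenizer to work correctly, some configuration and initialization code is also needed. I will not paste the code here; see my Git repository. Two pitfalls are worth mentioning:

- ES finds the plugin class through the class path given in the descriptor file, and then finds the tokenizer by matching the key in the configuration against the key used in the code described above, so take care not to mistype either one:
  #plugin-descriptor.properties: classname=org.elasticsearch.plugin.analysis.AnalysisSDPlugin
  #plugin-descriptor.properties: name=stanford-core-nlp
- Because of the Java security manager, calls into the external jars we depend on, and the loading of those jars, have to go through AccessController.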
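
The AccessController pitfall can be sketched as follows. This is a minimal illustration, not the plugin's actual code: the class PrivilegedLoadSketch and the method runPrivileged are hypothetical names, and reading a system property merely stands in for loading the Stanford jars and models. A plugin that needs extra permissions typically also has to declare them in its plugin-security.policy file; the wrapper alone does not grant anything.

```java
import java.security.AccessController;
import java.security.PrivilegedAction;

// Hypothetical helper: the class and method names are illustrative only.
class PrivilegedLoadSketch {

    // Runs the given action with the privileges of this code's own protection
    // domain, so less-privileged callers further up the stack do not block it.
    // AccessController is deprecated on recent JDKs, hence the suppression.
    @SuppressWarnings({"removal", "deprecation"})
    static <T> T runPrivileged(PrivilegedAction<T> action) {
        return AccessController.doPrivileged(action);
    }

    public static void main(String[] args) {
        // Stand-in for loading the Stanford models: read a system property,
        // an operation that a security manager would normally check.
        String version = runPrivileged(() -> System.getProperty("java.version"));
        System.out.println("Loaded under doPrivileged on Java " + version);
    }
}
```

Passing the action as a typed PrivilegedAction keeps the call unambiguous, since doPrivileged is also overloaded for PrivilegedExceptionAction.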

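odds and ends

- The Stanford toolkit contains many features; so far I only use the tokenization part.
- The tokenizer ships with its own dictionary files. Changing the dictionary probably means unpacking the jar, editing the files, and repacking it.
- A large set of punctuation marks is currently hardcoded; I may tidy up that logic later.
- Features still to be done include sentiment analysis and others.

also see

GitHub repository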

Community Daily No. 517 (2019-01-22)

1. Collecting CSV logs with Logstash.
http://t.cn/E5Ml4lv
2. Log monitoring and analysis: ELK, Splunk, and Graylog compared.
http://t.cn/E5MlcsH
3. From 10 seconds to 2 seconds: Elasticsearch performance tuning.
http://t.cn/E59fgLI
Editor: 叮咚光军
Archive:
Subscribe: https://tinyletter.com/elastic-daily

Community Daily No. 516 (2019-01-21)

1. An optimization pass on Logstash throughput
http://t.cn/E5X40JT

2. Opbeat is dead; use Elastic APM
http://t.cn/EyhRQRJ

3. The road to an ELK cluster serving hundreds of millions of page views
http://t.cn/RnvPElX

Editor: cyberdak
Archive: https://elasticsearch.cn/article/6339
Subscribe: https://tinyletter.com/elastic-daily

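人民在线 is hiring an ES search R&D engineer

Job description
1. Design, develop, operate, and maintain the Elasticsearch-related services of the 众云 big data platform;
2. Continuously tune Elasticsearch performance and extend its features to support search, aggregation, and other scenarios across business lines;
3. Operate and maintain the ELK platform of the 众云 business unit;
Requirements
1. Full-time bachelor's degree or above in a computer-related field, with at least three years of development experience;
2. Proficient in Java and Linux, with strong coding and troubleshooting skills;
3. In-depth knowledge of open-source search engines such as Elasticsearch and Solr; familiarity with the Lucene or Elasticsearch source code is a plus;
4. Expert knowledge of search engine architecture, ranking algorithms, index processing and tokenization algorithms, and index data structures;
5. Solid grasp of common SQL database internals, database design, and query writing and optimization;
6. Experience developing foundational frameworks, middleware, or core libraries is a plus;
7. Experience with large-scale search engine or public-opinion monitoring projects is a plus;
8. In-depth work in any of Linux kernel, storage, file systems, or distributed systems is a plus;
9. Strong logical and analytical skills, good communication, team spirit, and the ability to learn quickly;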

Community Daily No. 515 (2019-01-20)

1. Importing Java application logs into ELK.
http://t.cn/E5GiA1T
2. Distributing Cassandra data using tokens.
http://t.cn/E5GIOd5
3. (Proxy required) Why is it so hard to make computers talk like humans?
http://t.cn/EqFOf04

Editor: 至尊宝
Archive: https://elasticsearch.cn/article/6337
Subscribe: https://tinyletter.com/elastic-daily

Community Daily No. 514 (2019-01-19)

1. Writing data from Flink to Elasticsearch. http://t.cn/E5L88Q7

2. Series on ES distributed consistency principles: nodes, meta, and data (proxy required). http://t.cn/E5LGg4i

http://t.cn/E5LqqbD

http://t.cn/E5L57t0

Weekly highlight: the recently viral short film 《啥是佩奇》. http://t.cn/E5vkFQc

Editor: bsll
Archive: https://elasticsearch.cn/article/6336
Subscribe: https://tinyletter.com/elastic-daily

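How to change Kibana's default home page

Before version 6.0, logging in to Kibana routed you by default to the Discover app under app/kibana. From version 6.3 on, there is a new home path, /app/kibana#/home?_g=h@44136fa, and visiting the root path / redirects straight to it.

Those who want to do more custom development on Kibana may need users to land on their own page after logging in.

To do that, just add one line to Kibana's configuration file:

server.defaultRoute: /app/system_portal

In the example above, Kibana jumps straight to my own app plugin, system_portal, after login.

The file that builds the default route is src/server/http/get_default_route.js: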

import _ from 'lodash';

export default _.once(function (kbnServer) {
  const {
    config
  } = kbnServer;
  // Root basePath followed by defaultRoute
  return `${config.get('server.basePath')}${config.get('server.defaultRoute')}`;
});

The default route is defined by server.defaultRoute, whose default value is /app/kibana, as can be seen in src/server/config/schema.js:

import Joi from 'joi';
import { constants as cryptoConstants } from 'crypto';
import os from 'os';

import { fromRoot } from '../../utils';
import { getData } from '../path';

export default async () => Joi.object({
  pkg: Joi.object({
    version: Joi.string().default(Joi.ref('$version')),
    branch: Joi.string().default(Joi.ref('$branch')),
    buildNum: Joi.number().default(Joi.ref('$buildNum')),
    buildSha: Joi.string().default(Joi.ref('$buildSha')),
  }).default(),

  env: Joi.object({
    name: Joi.string().default(Joi.ref('$env')),
    dev: Joi.boolean().default(Joi.ref('$dev')),
    prod: Joi.boolean().default(Joi.ref('$prod'))
  }).default(),

  dev: Joi.object({
    basePathProxyTarget: Joi.number().default(5603),
  }).default(),

  pid: Joi.object({
    file: Joi.string(),
    exclusive: Joi.boolean().default(false)
  }).default(),

  cpu: Joi.object({
    cgroup: Joi.object({
      path: Joi.object({
        override: Joi.string().default()
      })
    })
  }),

  cpuacct: Joi.object({
    cgroup: Joi.object({
      path: Joi.object({
        override: Joi.string().default()
      })
    })
  }),

  server: Joi.object({
    uuid: Joi.string().guid().default(),
    name: Joi.string().default(os.hostname()),
    host: Joi.string().hostname().default('localhost'),
    port: Joi.number().default(5601),
    maxPayloadBytes: Joi.number().default(1048576),
    autoListen: Joi.boolean().default(true),
    defaultRoute: Joi.string().default('/app/kibana').regex(/^\//, `start with a slash`),
    basePath: Joi.string().default('').allow('').regex(/(^$|^\/.*[^\/]$)/, `start with a slash, don't end with one`),
    rewriteBasePath: Joi.boolean().when('basePath', {
      is: '',
      then: Joi.default(false).valid(false),
      otherwise: Joi.default(false),
    }),
    customResponseHeaders: Joi.object().unknown(true).default({}),
    ssl: Joi.object({
      enabled: Joi.boolean().default(false),
      redirectHttpFromPort: Joi.number(),
      certificate: Joi.string().when('enabled', {
        is: true,
        then: Joi.required(),
      }),
      key: Joi.string().when('enabled', {
        is: true,
        then: Joi.required()
      }),
      keyPassphrase: Joi.string(),
      certificateAuthorities: Joi.array().single().items(Joi.string()).default(),
      supportedProtocols: Joi.array().items(Joi.string().valid('TLSv1', 'TLSv1.1', 'TLSv1.2')),
      cipherSuites: Joi.array().items(Joi.string()).default(cryptoConstants.defaultCoreCipherList.split(':'))
    }).default(),
    cors: Joi.when('$dev', {
      is: true,
      then: Joi.object().default({
        origin: ['*://localhost:9876'] // karma test server
      }),
      otherwise: Joi.boolean().default(false)
    }),
    xsrf: Joi.object({
      disableProtection: Joi.boolean().default(false),
      whitelist: Joi.array().items(
        Joi.string().regex(/^\//, 'start with a slash')
      ).default(),
      token: Joi.string().optional().notes('Deprecated')
    }).default(),
  }).default(),

  logging: Joi.object().keys({
    silent: Joi.boolean().default(false),

    quiet: Joi.boolean()
      .when('silent', {
        is: true,
        then: Joi.default(true).valid(true),
        otherwise: Joi.default(false)
      }),

    verbose: Joi.boolean()
      .when('quiet', {
        is: true,
        then: Joi.valid(false).default(false),
        otherwise: Joi.default(false)
      }),

    events: Joi.any().default({}),
    dest: Joi.string().default('stdout'),
    filter: Joi.any().default({}),
    json: Joi.boolean()
      .when('dest', {
        is: 'stdout',
        then: Joi.default(!process.stdout.isTTY),
        otherwise: Joi.default(true)
      }),

    useUTC: Joi.boolean().default(true),
  })
    .default(),

  ops: Joi.object({
    interval: Joi.number().default(5000),
  }).default(),

  plugins: Joi.object({
    paths: Joi.array().items(Joi.string()).default(),
    scanDirs: Joi.array().items(Joi.string()).default(),
    initialize: Joi.boolean().default(true)
  }).default(),

  path: Joi.object({
    data: Joi.string().default(getData())
  }).default(),

  optimize: Joi.object({
    enabled: Joi.boolean().default(true),
    bundleFilter: Joi.string().default('!tests'),
    bundleDir: Joi.string().default(fromRoot('optimize/bundles')),
    viewCaching: Joi.boolean().default(Joi.ref('$prod')),
    watch: Joi.boolean().default(false),
    watchPort: Joi.number().default(5602),
    watchHost: Joi.string().hostname().default('localhost'),
    watchPrebuild: Joi.boolean().default(false),
    watchProxyTimeout: Joi.number().default(5 * 60000),
    useBundleCache: Joi.boolean().default(Joi.ref('$prod')),
    unsafeCache: Joi.when('$prod', {
      is: true,
      then: Joi.boolean().valid(false),
      otherwise: Joi
        .alternatives()
        .try(
          Joi.boolean(),
          Joi.string().regex(/^\/.+\/$/)
        )
        .default(true),
    }),
    sourceMaps: Joi.when('$prod', {
      is: true,
      then: Joi.boolean().valid(false),
      otherwise: Joi
        .alternatives()
        .try(
          Joi.string().required(),
          Joi.boolean()
        )
        .default('#cheap-source-map'),
    }),
    profile: Joi.boolean().default(false)
  }).default(),
  status: Joi.object({
    allowAnonymous: Joi.boolean().default(false)
  }).default(),
  map: Joi.object({
    manifestServiceUrl: Joi.string().default(' https://catalogue.maps.elastic.co/v2/manifest'),
    emsLandingPageUrl: Joi.string().default('https://maps.elastic.co/v2'),
    includeElasticMapsService: Joi.boolean().default(true)
  }).default(),
  tilemap: Joi.object({
    url: Joi.string(),
    options: Joi.object({
      attribution: Joi.string(),
      minZoom: Joi.number().min(0, 'Must be 0 or higher').default(0),
      maxZoom: Joi.number().default(10),
      tileSize: Joi.number(),
      subdomains: Joi.array().items(Joi.string()).single(),
      errorTileUrl: Joi.string().uri(),
      tms: Joi.boolean(),
      reuseTiles: Joi.boolean(),
      bounds: Joi.array().items(Joi.array().items(Joi.number()).min(2).required()).min(2)
    }).default()
  }).default(),
  regionmap: Joi.object({
    includeElasticMapsService: Joi.boolean().default(true),
    layers: Joi.array().items(Joi.object({
      url: Joi.string(),
      format: Joi.object({
        type: Joi.string().default('geojson')
      }).default({
        type: 'geojson'
      }),
      meta: Joi.object({
        feature_collection_path: Joi.string().default('data')
      }).default({
        feature_collection_path: 'data'
      }),
      attribution: Joi.string(),
      name: Joi.string(),
      fields: Joi.array().items(Joi.object({
        name: Joi.string(),
        description: Joi.string()
      }))
    }))
  }).default(),

  i18n: Joi.object({
    defaultLocale: Joi.string().default('en'),
  }).default(),

  // This is a configuration node that is specifically handled by the config system
  // in the new platform, and that the current platform doesn't need to handle at all.
  __newPlatform: Joi.any(),

}).default();

1. Recommended Elasticsearch management GUIs.
http://t.cn/Eql0exu
2. (Proxy required) Dejavu 3.0: the Elasticsearch web UI you need.
http://t.cn/EqlOPrD
3. From Lucene to Elasticsearch: full-text search in practice.
http://t.cn/EqlOLdL

Editor: 叮咚光军
Archive: https://elasticsearch.cn/article/6331
Subscribe: https://tinyletter.com/elastic-daily

3. Weekly highlight: GitHub now offers free private repos http://t.cn/EGmbUQX

Editor: bsll
Archive: https://elasticsearch.cn/article/6328
Subscribe: https://tinyletter.com/elastic-daily

Community Daily No. 506 (2019-1-11)

1. Using Elasticsearch in Spring Boot 2.0
http://t.cn/EqLtnXp
2. Curator, a powerful tool for Elasticsearch index management, explained in depth
http://t.cn/EqL9Tdq
3. ES shard allocation strategies
http://t.cn/EqLc7XK

Editor: 铭毅天下
Archive: https://elasticsearch.cn/article/6327
Subscribe: https://tinyletter.com/elastic-daily
