
Write throughput drops while deleting indices

Elasticsearch | Author: shjdwxy | Posted: 2018-07-15 | Views: 1980

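We recently ran into a problem in production: a batch of indices has to be deleted one after another, and while an index is being deleted, write throughput drops noticeably. What is the underlying cause of this behavior?

Thanks.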

zqc0512 - andy zhou


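Each index is probably quite large. Try turning rebalancing off and test again.
Also check whether the cluster is currently under heavy load; cerebro will show you.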
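
For reference, a minimal sketch of how the rebalance suggestion could be tried from Java 11+. It only builds the request for the cluster settings API with `cluster.routing.rebalance.enable` set to `none`. The class name and the `http://localhost:9200` address are assumptions for illustration, not something taken from the original post.

```java
import java.net.URI;
import java.net.http.HttpRequest;

final class DisableRebalanceRequest {
    // Assumed HTTP address of one node; adjust for your cluster.
    static final String CLUSTER_URL = "http://localhost:9200";

    // Builds the JSON body for a transient cluster setting.
    // Valid modes for this setting are "all", "primaries", "replicas" and "none".
    static String settingsBody(String mode) {
        return "{\"transient\":{\"cluster.routing.rebalance.enable\":\"" + mode + "\"}}";
    }

    // Builds (but does not send) the PUT _cluster/settings request.
    static HttpRequest build(String mode) {
        return HttpRequest.newBuilder(URI.create(CLUSTER_URL + "/_cluster/settings"))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(settingsBody(mode)))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = build("none");
        System.out.println(request.method() + " " + request.uri());
        System.out.println(settingsBody("none"));
        // To actually send it, pass the request to
        // java.net.http.HttpClient.newHttpClient().send(...).
    }
}
```

Once the test is done, sending the same request with `"all"` turns rebalancing back on.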

hapjin


1. The thread pool angle
The Action behind an index operation is org.elasticsearch.action.index.TransportIndexAction. Its Javadoc says that index operations are now expressed through TransportBulkAction:


TransportIndexAction Deprecated.  use TransportBulkAction with a single item instead


TransportBulkAction holds an instance field of type TransportShardBulkAction, and bulk operations are in fact delegated to TransportShardBulkAction:


Groups bulk request items by shard, optionally creating non-existent indices and delegates to TransportShardBulkAction for shard-level bulk execution


The bulk operations handled by TransportShardBulkAction run on the thread pool ThreadPool.Names.WRITE:
@Inject
public TransportShardBulkAction(Settings settings, TransportService transportService, ClusterService clusterService,
                                IndicesService indicesService, ThreadPool threadPool, ShardStateAction shardStateAction,
                                MappingUpdatedAction mappingUpdatedAction, UpdateHelper updateHelper, ActionFilters actionFilters,
                                IndexNameExpressionResolver indexNameExpressionResolver) {
    super(settings, ACTION_NAME, transportService, clusterService, indicesService, threadPool, shardStateAction, actionFilters,
        indexNameExpressionResolver, BulkShardRequest::new, BulkShardRequest::new, ThreadPool.Names.WRITE);
    this.updateHelper = updateHelper;
    this.mappingUpdatedAction = mappingUpdatedAction;
}
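So my guess is that bulk delete and bulk indexing share the same thread pool. If they do, it is easy to see why bulk deletes would slow down bulk indexing (I still need to work through the exact logic). For reference, see org.elasticsearch.threadpool.ThreadPool#ThreadPool: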
builders.put(Names.INDEX, new FixedExecutorBuilder(settings, Names.INDEX, availableProcessors, 200, true));
builders.put(Names.WRITE, new FixedExecutorBuilder(settings, Names.WRITE, "bulk", availableProcessors, 200));
builders.put(Names.GET, new FixedExecutorBuilder(settings, Names.GET, availableProcessors, 1000));
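
The contention described above can be illustrated with a small stand-alone sketch. This is not Elasticsearch code: the class name, the pool size of 2, and the use of a plain `Executors.newFixedThreadPool` (which has an unbounded queue, unlike the bounded queue of 200 in the excerpt) are assumptions chosen only to show the effect. Once "delete" tasks occupy every worker of a shared fixed pool, an "index" task submitted afterwards cannot start until a worker is freed.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

final class SharedWritePoolDemo {
    // Returns true if the index task could not start while
    // delete tasks were holding every worker thread of the shared pool.
    static boolean indexWaitsBehindDeletes(int poolSize) {
        ExecutorService writePool = Executors.newFixedThreadPool(poolSize);
        CountDownLatch deletesRunning = new CountDownLatch(poolSize);
        CountDownLatch releaseDeletes = new CountDownLatch(1);
        CountDownLatch indexStarted = new CountDownLatch(1);
        try {
            // Fill every worker with a long-running "delete" task.
            for (int i = 0; i < poolSize; i++) {
                writePool.execute(() -> {
                    deletesRunning.countDown();
                    try {
                        releaseDeletes.await();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            deletesRunning.await();

            // Submit an "index" task to the same pool; it has to queue.
            Runnable indexTask = indexStarted::countDown;
            Future<?> index = writePool.submit(indexTask);
            boolean startedEarly = indexStarted.await(200, TimeUnit.MILLISECONDS);

            // Let the deletes finish so the index task can finally run.
            releaseDeletes.countDown();
            index.get(5, TimeUnit.SECONDS);
            return !startedEarly;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            writePool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("index waited behind deletes: " + indexWaitsBehindDeletes(2));
    }
}
```

Whether this is what actually happens in the cluster depends on how busy the write pool really is, which the per-node thread pool statistics (active, queue, rejected) would show.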

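2. The translog flush and segment refresh angle
Although org.elasticsearch.index.engine.InternalEngine#delete says that delete-by-id does not create new segments, a delete should still write to and flush the translog. Indexing flushes the translog as well, so the two interfere with each other.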


we don't throttle this when merges fall behind because delete-by-id does not create new segments:


 
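3. The underlying Lucene angle: documents can be deleted through either an IndexReader or an IndexWriter, but an index can have only one open IndexWriter at any time. When documents are deleted through an IndexReader, any open IndexWriter has to be closed first, which slows down Lucene indexing.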


If you are tempted to use IndexReader for deletion, remember that Lucene allows only one writer to be open at once. Confusingly, an IndexReader that's performing deletes counts as a "writer". This means you are forced to close any open IndexWriter before doing deletions with IndexReader and vice versa.


As an additional reference, deleted documents also affect search performance: lucenes-handling-of-deleted-documents
Version: ES 6.3.2
