Index shard allocation fails: [shard creation] trying to lock for [shard creation]
Elasticsearch | Author: ikeky | Posted: 2021-10-08 | Views: 2047
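The ES cluster currently has 14 hot nodes and 10 cold nodes. Indices on the hot nodes are typically 10 shards × 2, and on the cold nodes 10 × 1; indices older than 7 days are frozen.
Recently, shard allocation has frequently been failing for some of the frozen indices. The ES logs on the cold nodes show only: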
[logstash-line8-personal-biz-2021.09.23][9] marking and sending shard failed due to [failed to create shard]
java.io.IOException: failed to obtain in-memory shard lock
at org.elasticsearch.index.IndexService.createShard(IndexService.java:439) ~[elasticsearch-7.4.2.jar:7.4.2]
at org.elasticsearch.indices.IndicesService.createShard(IndicesService.java:654) ~[elasticsearch-7.4.2.jar:7.4.2]
at org.elasticsearch.indices.IndicesService.createShard(IndicesService.java:164) ~[elasticsearch-7.4.2.jar:7.4.2]
at org.elasticsearch.indices.cluster.IndicesClusterStateService.createShard(IndicesClusterStateService.java:610) ~[elasticsearch-7.4.2.jar:7.4.2]
at org.elasticsearch.indices.cluster.IndicesClusterStateService.createOrUpdateShards(IndicesClusterStateService.java:586) ~[elasticsearch-7.4.2.jar:7.4.2]
at org.elasticsearch.indices.cluster.IndicesClusterStateService.applyClusterState(IndicesClusterStateService.java:266) ~[elasticsearch-7.4.2.jar:7.4.2]
at org.elasticsearch.cluster.service.ClusterApplierService.lambda$callClusterStateAppliers$5(ClusterApplierService.java:517) ~[elasticsearch-7.4.2.jar:7.4.2]
at java.lang.Iterable.forEach(Iterable.java:75) [?:1.8.0_221]
at org.elasticsearch.cluster.service.ClusterApplierService.callClusterStateAppliers(ClusterApplierService.java:514) [elasticsearch-7.4.2.jar:7.4.2]
at org.elasticsearch.cluster.service.ClusterApplierService.applyChanges(ClusterApplierService.java:485) [elasticsearch-7.4.2.jar:7.4.2]
at org.elasticsearch.cluster.service.ClusterApplierService.runTask(ClusterApplierService.java:432) [elasticsearch-7.4.2.jar:7.4.2]
and
Caused by: org.elasticsearch.env.ShardLockObtainFailedException: [logstash-line8-personal-biz-2021.09.23][9]: obtaining shard lock timed out after 5000ms, previous lock details: [shard creation] trying to lock for [shard creation]
at org.elasticsearch.env.NodeEnvironment$InternalShardLock.acquire(NodeEnvironment.java:769) ~[elasticsearch-7.4.2.jar:7.4.2]
at org.elasticsearch.env.NodeEnvironment.shardLock(NodeEnvironment.java:684) ~[elasticsearch-7.4.2.jar:7.4.2]
at org.elasticsearch.index.IndexService.createShard(IndexService.java:359) ~[elasticsearch-7.4.2.jar:7.4.2]
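It looks as if the shard is locked in memory and therefore cannot be allocated. The retry_failed command suggested online did not help. I also switched the GC to G1GC and there are no long GC pauses, but shard failures still occur every day.
For now the only way to recover a locked shard is to restart the node and then run retry_failed again, which does not address the root cause.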
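For readers who have not used it, the retry_failed step mentioned above is the `retry_failed=true` option of the cluster reroute API, and the cluster allocation explain API can report why a particular shard copy is still unassigned. Below is a minimal Python sketch of both calls using only the standard library. The address `http://localhost:9200`, the function names, and the `primary` flag are assumptions for illustration; whether the failing copy is a primary or a replica is not stated in this post, and a secured cluster would also need authentication.

```python
import json
import urllib.request

# Assumption: placeholder address; replace with a real node of the cluster.
ES_URL = "http://localhost:9200"


def build_retry_failed_request(base_url=ES_URL):
    """Build POST /_cluster/reroute?retry_failed=true.

    This asks the master to retry shards whose allocation was given up on
    after too many failed attempts.
    """
    return urllib.request.Request(
        f"{base_url}/_cluster/reroute?retry_failed=true",
        method="POST",
    )


def build_allocation_explain_request(index, shard, primary, base_url=ES_URL):
    """Build POST /_cluster/allocation/explain for one shard copy.

    `primary` selects the primary (True) or a replica (False); which one is
    failing in the thread above is unknown, so the caller has to decide.
    """
    body = json.dumps({"index": index, "shard": shard, "primary": primary})
    return urllib.request.Request(
        f"{base_url}/_cluster/allocation/explain",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=10):
    """Send a prepared request and decode the JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

A typical use would be `send(build_allocation_explain_request("logstash-line8-personal-biz-2021.09.23", 9, False))` to see the explanation for the failing shard, followed by `send(build_retry_failed_request())`. As described above, the retry alone did not help in this case until the node had been restarted.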
1 reply
Charele - Cisco4321
previous lock details: [shard creation] trying to lock for [shard creation]
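The previous lock was taken for "shard creation", and this attempt is also for "shard creation",
which suggests the shard was being created twice on this node.
I don't know why that happens, so I'd like to know what operation your "freezing" actually is. Are there any other kinds of errors?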
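To check across many log lines whether the "previous lock" reason and the current reason are always both "shard creation", the exception message can be pulled apart with a regular expression. This is only a sketch: the pattern is written against the message exactly as it appears in the 7.4.2 log quoted in this thread, the function and field names are made up for illustration, and other Elasticsearch versions may word the message differently.

```python
import re

# Pattern modelled on the ShardLockObtainFailedException message quoted in
# this thread (Elasticsearch 7.4.2); other versions may differ.
LOCK_PATTERN = re.compile(
    r"\[(?P<index>[^\]]+)\]\[(?P<shard>\d+)\]: "
    r"obtaining shard lock timed out after (?P<timeout_ms>\d+)ms, "
    r"previous lock details: \[(?P<previous>[^\]]*)\] "
    r"trying to lock for \[(?P<current>[^\]]*)\]"
)


def parse_shard_lock_failure(line):
    """Return the fields of a shard lock timeout message, or None if no match."""
    match = LOCK_PATTERN.search(line)
    if match is None:
        return None
    return {
        "index": match.group("index"),
        "shard": int(match.group("shard")),
        "timeout_ms": int(match.group("timeout_ms")),
        # Reason recorded by whoever held (or last requested) the lock.
        "previous_reason": match.group("previous"),
        # Reason given by the attempt that timed out.
        "current_reason": match.group("current"),
    }


SAMPLE = (
    "Caused by: org.elasticsearch.env.ShardLockObtainFailedException: "
    "[logstash-line8-personal-biz-2021.09.23][9]: obtaining shard lock timed "
    "out after 5000ms, previous lock details: [shard creation] trying to lock "
    "for [shard creation]"
)

info = parse_shard_lock_failure(SAMPLE)
# Both reasons being equal is the pattern pointed out in the reply above.
repeated_creation = info["previous_reason"] == info["current_reason"]
```

Running this over the cold-node logs and grouping by index and shard would show whether the same shards hit the lock timeout repeatedly and whether any other lock reason ever appears.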