
Elasticsearch 无法写入数据 checksum failed (hardware problem?)

Elasticsearch | By Jokers | Posted 2019-12-16 | Views: 5299

After running for a while, my Elasticsearch node can no longer write data. The error log is as follows:
[2019-12-16T15:01:04,418][WARN ][o.e.c.r.a.AllocationService] [node35] failing shard [failed shard, shard [logstash-XXX-2019.12.15][3], node[oEX5ZMqFRYmyaY5EbUeW1g], [P], s[STARTED], a[id=_1NiZQe6R3OrOTfzoyQWaA], message [shard failure, reason [merge failed]], failure [MergeException[org.apache.lucene.index.CorruptIndexException: checksum failed (hardware problem?) : expected=a59ffbbf actual=3f0c5c1b (resource=BufferedChecksumIndexInput(MMapIndexInput(path="/usr/local/elk/elasticsearch-7.4.2/data/nodes/0/indices/UeHIxZrES0iDKgYnFWaaAA/3/index/_4h_Lucene80_0.dvd")))]; nested: CorruptIndexException[checksum failed (hardware problem?) : expected=a59ffbbf actual=3f0c5c1b (resource=BufferedChecksumIndexInput(MMapIndexInput(path="/usr/local/elk/elasticsearch-7.4.2/data/nodes/0/indices/UeHIxZrES0iDKgYnFWaaAA/3/index/_4h_Lucene80_0.dvd")))]; ], markAsStale [true]]
org.apache.lucene.index.MergePolicy$MergeException: org.apache.lucene.index.CorruptIndexException: checksum failed (hardware problem?) : expected=a59ffbbf actual=3f0c5c1b (resource=BufferedChecksumIndexInput(MMapIndexInput(path="/usr/local/elk/elasticsearch-7.4.2/data/nodes/0/indices/UeHIxZrES0iDKgYnFWaaAA/3/index/_4h_Lucene80_0.dvd")))
at org.elasticsearch.index.engine.InternalEngine$EngineMergeScheduler$2.doRun(InternalEngine.java:2389) ~[elasticsearch-7.4.2.jar:7.4.2]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:773) ~[elasticsearch-7.4.2.jar:7.4.2]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-7.4.2.jar:7.4.2]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
at java.lang.Thread.run(Thread.java:830) [?:?]
Caused by: org.apache.lucene.index.CorruptIndexException: checksum failed (hardware problem?) : expected=a59ffbbf actual=3f0c5c1b (resource=BufferedChecksumIndexInput(MMapIndexInput(path="/usr/local/elk/elasticsearch-7.4.2/data/nodes/0/indices/UeHIxZrES0iDKgYnFWaaAA/3/index/_4h_Lucene80_0.dvd")))
at org.apache.lucene.codecs.CodecUtil.checkFooter(CodecUtil.java:419) ~[lucene-core-8.2.0.jar:8.2.0 31d7ec7bbfdcd2c4cc61d9d35e962165410b65fe - ivera - 2019-07-19 15:05:56]
at org.apache.lucene.codecs.CodecUtil.checksumEntireFile(CodecUtil.java:526) ~[lucene-core-8.2.0.jar:8.2.0 31d7ec7bbfdcd2c4cc61d9d35e962165410b65fe - ivera - 2019-07-19 15:05:56]
at org.apache.lucene.codecs.lucene80.Lucene80DocValuesProducer.checkIntegrity(Lucene80DocValuesProducer.java:1367) ~[lucene-core-8.2.0.jar:8.2.0 31d7ec7bbfdcd2c4cc61d9d35e962165410b65fe - ivera - 2019-07-19 15:05:56]
at org.apache.lucene.codecs.perfield.PerFieldDocValuesFormat$FieldsReader.checkIntegrity(PerFieldDocValuesFormat.java:366) ~[lucene-core-8.2.0.jar:8.2.0 31d7ec7bbfdcd2c4cc61d9d35e962165410b65fe - ivera - 2019-07-19 15:05:56]
at org.apache.lucene.codecs.DocValuesConsumer.merge(DocValuesConsumer.java:127) ~[lucene-core-8.2.0.jar:8.2.0 31d7ec7bbfdcd2c4cc61d9d35e962165410b65fe - ivera - 2019-07-19 15:05:56]
at org.apache.lucene.codecs.perfield.PerFieldDocValuesFormat$FieldsWriter.merge(PerFieldDocValuesFormat.java:152) ~[lucene-core-8.2.0.jar:8.2.0 31d7ec7bbfdcd2c4cc61d9d35e962165410b65fe - ivera - 2019-07-19 15:05:56]
at org.apache.lucene.index.SegmentMerger.mergeDocValues(SegmentMerger.java:196) ~[lucene-core-8.2.0.jar:8.2.0 31d7ec7bbfdcd2c4cc61d9d35e962165410b65fe - ivera - 2019-07-19 15:05:56]
at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:151) ~[lucene-core-8.2.0.jar:8.2.0 31d7ec7bbfdcd2c4cc61d9d35e962165410b65fe - ivera - 2019-07-19 15:05:56]
at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4462) ~[lucene-core-8.2.0.jar:8.2.0 31d7ec7bbfdcd2c4cc61d9d35e962165410b65fe - ivera - 2019-07-19 15:05:56]
at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4056) ~[lucene-core-8.2.0.jar:8.2.0 31d7ec7bbfdcd2c4cc61d9d35e962165410b65fe - ivera - 2019-07-19 15:05:56]
at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:625) ~[lucene-core-8.2.0.jar:8.2.0 31d7ec7bbfdcd2c4cc61d9d35e962165410b65fe - ivera - 2019-07-19 15:05:56]
at org.elasticsearch.index.engine.ElasticsearchConcurrentMergeScheduler.doMerge(ElasticsearchConcurrentMergeScheduler.java:101) ~[elasticsearch-7.4.2.jar:7.4.2]
at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:662) ~[lucene-core-8.2.0.jar:8.2.0 31d7ec7bbfdcd2c4cc61d9d35e96
Is my disk really failing? I am running hardware checks on it now.
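For context on what the error actually means: the `expected=a59ffbbf actual=3f0c5c1b` values come from a CRC32 checksum that Lucene stores in each segment file's footer and recomputes when reading the file back (`CodecUtil.checkFooter`). A mismatch means the bytes on disk are no longer what was written, which is why Lucene suggests a hardware problem. A simplified sketch of that mechanism (the real Lucene footer also contains a magic number and algorithm ID, omitted here):

```python
import struct
import zlib


def write_with_footer(path: str, body: bytes) -> None:
    """Write a file whose last 4 bytes hold the CRC32 of the body,
    loosely mimicking how Lucene's CodecUtil footer stores a checksum."""
    crc = zlib.crc32(body) & 0xFFFFFFFF
    with open(path, "wb") as f:
        f.write(body)
        f.write(struct.pack(">I", crc))


def check_footer(path: str) -> bool:
    """Recompute the CRC32 over the body and compare it with the stored
    value, as CodecUtil.checkFooter does. A mismatch means the bytes
    changed after the file was written (bad disk, RAM, or controller)."""
    with open(path, "rb") as f:
        data = f.read()
    body, stored = data[:-4], struct.unpack(">I", data[-4:])[0]
    return (zlib.crc32(body) & 0xFFFFFFFF) == stored
```

Flipping even a single bit anywhere in the file body makes `check_footer` return False, which is exactly the situation the merge thread ran into above.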

Jokers - hi


The disk checked out fine, but Elasticsearch still keeps hitting this error.
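If the hardware really is healthy and only one shard copy is corrupted, a common way out is to restore the index from a snapshot, or, accepting data loss for that shard, force-allocate a stale copy as primary via the cluster reroute API (`allocate_stale_primary`). A sketch that builds the request body for `POST /_cluster/reroute`; the index and node names below are placeholders, not values from this thread:

```python
import json


def allocate_stale_primary(index: str, shard: int, node: str) -> str:
    """Build the body for POST /_cluster/reroute that force-allocates a
    stale shard copy as primary. accept_data_loss must be true because
    any documents written after that copy went stale are lost."""
    body = {
        "commands": [
            {
                "allocate_stale_primary": {
                    "index": index,
                    "shard": shard,
                    "node": node,
                    "accept_data_loss": True,
                }
            }
        ]
    }
    return json.dumps(body)
```

This is a last resort for a red shard with no intact copy; if a healthy replica exists elsewhere in the cluster, normal allocation should recover it without this command.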

medcl - Fighting tigers tonight.


v7.4.2? You may well have hit a known bug. This is the second such case I've seen.

bingo919 - bingo


[2020-12-26T01:38:22,612][WARN ][o.e.i.c.IndicesClusterStateService] [bd_lass_elasticsearch-data0] [app_2020.12.23-shrink][0]] marking and sending shard failed due to [shard failure, reason [recovery]]
org.apache.lucene.index.CorruptIndexException: checksum failed (hardware problem?) : expected=f5dvwg actual=1ukjrs4 (resource=name [_1_Lucene50_0.pos], length [88425081], checksum [f5dvwg], writtenBy [7.7.0]) (resource=VerifyingIndexOutput(_1_Lucene50_0.pos))
        at org.elasticsearch.index.store.Store$LuceneVerifyingIndexOutput.readAndCompareChecksum(Store.java:1212) ~[elasticsearch-6.8.0.jar:6.8.0]
        at org.elasticsearch.index.store.Store$LuceneVerifyingIndexOutput.writeByte(Store.java:1190) ~[elasticsearch-6.8.0.jar:6.8.0]
        at org.elasticsearch.index.store.Store$LuceneVerifyingIndexOutput.writeBytes(Store.java:1220) ~[elasticsearch-6.8.0.jar:6.8.0]
        at org.elasticsearch.indices.recovery.MultiFileWriter.innerWriteFileChunk(MultiFileWriter.java:120) ~[elasticsearch-6.8.0.jar:6.8.0]
        at org.elasticsearch.indices.recovery.MultiFileWriter.access$000(MultiFileWriter.java:43) ~[elasticsearch-6.8.0.jar:6.8.0]
        at org.elasticsearch.indices.recovery.MultiFileWriter$FileChunkWriter.writeChunk(MultiFileWriter.java:200) ~[elasticsearch-6.8.0.jar:6.8.0]
        at org.elasticsearch.indices.recovery.MultiFileWriter.writeFileChunk(MultiFileWriter.java:68) ~[elasticsearch-6.8.0.jar:6.8.0]
        at org.elasticsearch.indices.recovery.RecoveryTarget.writeFileChunk(RecoveryTarget.java:459) ~[elasticsearch-6.8.0.jar:6.8.0]
        at org.elasticsearch.indices.recovery.PeerRecoveryTargetService$FileChunkTransportRequestHandler.messageReceived(PeerRecoveryTargetService.java:629) ~[elasticsearch-6.8.0.jar:6.8.0]
        at org.elasticsearch.indices.recovery.PeerRecoveryTargetService$FileChunkTransportRequestHandler.messageReceived(PeerRecoveryTargetService.java:603) ~[elasticsearch-6.8.0.jar:6.8.0]
        at org.elasticsearch.transport.TransportRequestHandler.messageReceived(TransportRequestHandler.java:30) ~[elasticsearch-6.8.0.jar:6.8.0]
        at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:66) ~[elasticsearch-6.8.0.jar:6.8.0]
        at org.elasticsearch.transport.TcpTransport$RequestHandler.doRun(TcpTransport.java:1087) ~[elasticsearch-6.8.0.jar:6.8.0]
        at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:751) ~[elasticsearch-6.8.0.jar:6.8.0]
        at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-6.8.0.jar:6.8.0]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_232]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_232]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_232]
[2020-12-26T22:56:07,517][INFO ][o.e.m.j.JvmGcMonitorService] [bd_lass_elasticsearch-data0] [gc][828469] overhead, spent [348ms] collecting in the last [1s]
 
I hit the same problem, on 6.8.
