Elasticsearch broken after restart?

lassejl · September 1, 2017, 12:35pm

Enonic version: 6.8.1
OS: Ubuntu

Hello. I had a bit of a problem after trying to restart my server. Elasticsearch never seems to come back up. I get the following error message:

12:04:55.679 WARN  o.elasticsearch.cluster.action.shard - [local-node] [search-cms-repo][0] received shard failed for [search-cms-repo][0], node[5JvrH1OCTOqeSrKKd9ncgA], [P], s[INITIALIZING], indexUUID [jgzh4Jm6QIekt7tMSuIPYw], reason [shard failure [failed recovery][IndexShardGatewayRecoveryException[[search-cms-repo][0] failed recovery]; nested: EngineCreationFailureException[[search-cms-repo][0] failed to upgrade 3x segments]; nested: EOFException[read past EOF: NIOFSIndexInput(path="/enonic-xp/home/repo/index/data/mycluster/nodes/0/indices/search-cms-repo/0/index/segments_ec")]; ]]

Anyone experienced this before or know what to do?

lassejl · September 1, 2017, 12:38pm

Here is a bit more information on the error message:

12:09:55.701 WARN org.elasticsearch.indices.cluster - [local-node] [[search-cms-repo][0]] marking and sending shard failed due to [failed recovery]
org.elasticsearch.index.gateway.IndexShardGatewayRecoveryException: [search-cms-repo][0] failed recovery
at org.elasticsearch.index.gateway.IndexShardGatewayService$1.run(IndexShardGatewayService.java:162) [repack-elasticsearch-6.8.1.jar:6.8.1]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_111]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_111]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_111]
Caused by: org.elasticsearch.index.engine.EngineCreationFailureException: [search-cms-repo][0] failed to upgrade 3x segments
at org.elasticsearch.index.engine.InternalEngine.<init>(InternalEngine.java:121) ~[na:na]
at org.elasticsearch.index.engine.InternalEngineFactory.newReadWriteEngine(InternalEngineFactory.java:32) ~[na:na]
at org.elasticsearch.index.shard.IndexShard.newEngine(IndexShard.java:1262) ~[na:na]
at org.elasticsearch.index.shard.IndexShard.createNewEngine(IndexShard.java:1257) ~[na:na]
at org.elasticsearch.index.shard.IndexShard.prepareForTranslogRecovery(IndexShard.java:784) ~[na:na]
at org.elasticsearch.index.gateway.local.LocalIndexShardGateway.recover(LocalIndexShardGateway.java:226) ~[na:na]
at org.elasticsearch.index.gateway.IndexShardGatewayService$1.run(IndexShardGatewayService.java:112) [repack-elasticsearch-6.8.1.jar:6.8.1]
… 3 common frames omitted
Caused by: java.io.EOFException: read past EOF: NIOFSIndexInput(path="/enonic-xp/home/repo/index/data/mycluster/nodes/0/indices/search-cms-repo/0/index/segments_ec")
at org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:336) ~[na:na]
at org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:54) ~[na:na]
at org.apache.lucene.store.DataInput.readInt(DataInput.java:98) ~[na:na]
at org.apache.lucene.store.BufferedIndexInput.readInt(BufferedIndexInput.java:183) ~[na:na]
at org.elasticsearch.common.lucene.Lucene.indexNeeds3xUpgrading(Lucene.java:767) ~[na:na]
at org.elasticsearch.common.lucene.Lucene.upgradeLucene3xSegmentsMetadata(Lucene.java:778) ~[na:na]
at org.elasticsearch.index.engine.InternalEngine.upgrade3xSegments(InternalEngine.java:1084) ~[na:na]
at org.elasticsearch.index.engine.InternalEngine.<init>(InternalEngine.java:119) ~[na:na]
… 9 common frames omitted

lassejl · September 1, 2017, 12:59pm

In an attempt to fix it, I just removed the “/enonic-xp/home/repo/index/data/mycluster/nodes/0/indices/search-cms-repo/0/index/segments_ec” file, and it seems to have worked. However, I have no idea what this file does or whether my data might be gone.

tommytusj · September 1, 2017, 8:07pm

That's scary. Hopefully someone can follow up on this.

rmy · September 4, 2017, 7:49am

Hello.

I have seen something similar only once before. I cannot remember whether it was after a hard shutdown of the server, a full disk, or something like that. In that case a segments file was empty, and the fix was to delete the file. I also found a reference to this as a bug in the version of ES that we are currently using: it caused an issue if the file had been written but didn't contain any data yet. Did you check the size of the file, Lasse?

Was it a normal restart, or did it follow e.g. an OOM or some other problem?
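
For the file-size check asked about above, here is a minimal sketch in Python. It assumes the index data lives under the path shown in the error message and that Lucene commit files are named segments_*; the function names are made up for illustration and are not part of Enonic XP or Elasticsearch. It only reports sizes and does not delete anything.

```python
from pathlib import Path


def find_segments_files(index_root):
    """Return sorted (path, size_in_bytes) pairs for every segments_* file under index_root."""
    root = Path(index_root)
    return sorted(
        (str(p), p.stat().st_size)
        for p in root.rglob("segments_*")
        if p.is_file()
    )


def find_empty_segments_files(index_root):
    """Return the paths of segments_* files that are zero bytes long."""
    return [path for path, size in find_segments_files(index_root) if size == 0]


if __name__ == "__main__":
    # Assumption: path copied from the error message in this thread;
    # adjust it to match your own installation.
    root = "/enonic-xp/home/repo/index/data/mycluster/nodes/0/indices"
    for path, size in find_segments_files(root):
        marker = "  <-- empty" if size == 0 else ""
        print(f"{size:>10}  {path}{marker}")
```

A zero-byte segments_* file in the output would match the empty-file case described above; a non-empty one would point to some other cause.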

So, for the integrity of the data:

This is the search index, which can be recreated at any time by doing a reindex.
You should ALWAYS do regular backups of the index using the snapshot service, e.g. with the https://market.enonic.com/vendors/enonic/snapshotter application. It's an operation with minimal overhead, and it's really quick to restore the index to a working state.
