Data import bug

Enonic version: 6.9.0
OS: Elementary OS Loki (Ubuntu 16.04)


We have updated our dev server to Enonic 6.9.0, and we also imported some data using the Data Toolbox application. Our server went down afterwards. This is the output from the log file:

Jan 24 13:57:24 loki nibio[28953]: 13:57:24.614 WARN o.elasticsearch.cluster.action.shard - [local-node] [search-system-repo][0] received shard failed for [search-system-repo][0], node[FwtEgMPPT0qSqVd3NtZ3GA], [P], s[INITIALIZING], indexUUID [RspF6XLdTQuPrux1GavwAg], reason [shard failure [failed recovery][IndexShardGatewayRecoveryException[[search-system-repo][0] failed recovery]; nested: EngineCreationFailureException[[search-system-repo][0] failed to upgrade 3x segments]; nested: EOFException[read past EOF: NIOFSIndexInput(path="/srv/vhosts/enonic/homes/nibio/repo/index/data/mycluster/nodes/0/indices/search-system-repo/0/index/segments_4")]; ]]
Jan 24 13:57:24 loki nibio[28953]: 13:57:24.622 WARN org.elasticsearch.indices.cluster - [local-node] [[storage-system-repo][0]] marking and sending shard failed due to [failed recovery]
Jan 24 13:57:24 loki nibio[28953]: org.elasticsearch.index.gateway.IndexShardGatewayRecoveryException: [storage-system-repo][0] failed recovery
Jan 24 13:57:24 loki nibio[28953]:     at org.elasticsearch.index.gateway.IndexShardGatewayService$ [repack-elasticsearch-6.9.0.jar:6.9.0]
Jan 24 13:57:24 loki nibio[28953]:     at java.util.concurrent.ThreadPoolExecutor.runWorker( [na:1.8.0_112]
Jan 24 13:57:24 loki nibio[28953]:     at java.util.concurrent.ThreadPoolExecutor$ [na:1.8.0_112]
Jan 24 13:57:24 loki nibio[28953]:     at [na:1.8.0_112]
Jan 24 13:57:24 loki nibio[28953]: Caused by: org.elasticsearch.index.engine.EngineCreationFailureException: [storage-system-repo][0] failed to upgrade 3x segments
Jan 24 13:57:24 loki nibio[28953]:     at org.elasticsearch.index.engine.InternalEngine.( ~[na:na]
Jan 24 13:57:24 loki nibio[28953]:     at org.elasticsearch.index.engine.InternalEngineFactory.newReadWriteEngine( ~[na:na]

It generates a lot of messages like this, so the log file has now grown to more than 1 GB.
Any help or advice would be much appreciated.

Did you make a full data dump before upgrading? It looks like something was either killed while doing a translog write, or perhaps there was a full-disk problem?

The data itself is not a problem, no worries. We did not do a full data import.
This was not caused by the upgrade: we had exactly the same issue on a previous project, where it also appeared after an import.
There was enough disk space, so that is not the issue either.

Ah, ok.

So no shutdowns or anything: you just import data, the server crashes, and then it fails to restart afterwards?

Is it a large import?

Could you also check if this file exists; and the file-size:
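(The file referenced here didn't make it into the post, but the `segments_4` path from the EOFException in the log above is the likely candidate. A sketch of such a check, with the path copied verbatim from the log; adjust it for your own installation:)

```shell
# Check existence and size of the Lucene segments file named in the
# EOFException from the log. The path is an assumption taken from that log.
SEG="/srv/vhosts/enonic/homes/nibio/repo/index/data/mycluster/nodes/0/indices/search-system-repo/0/index/segments_4"
if [ -f "$SEG" ]; then
  # -c '%s ... %n' is GNU stat (as on Ubuntu): size in bytes, then file name
  stat -c '%s bytes  %n' "$SEG"
else
  echo "segments file not found: $SEG"
fi
```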


This was not a big import: something like 10 content objects or fewer, with no large data in them. It was impossible to start the server again until we removed the Elasticsearch indexes.
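(For anyone hitting the same wall, a rough sketch of that kind of recovery, assuming the XP home layout visible in the logged path above. Stop Enonic XP first; moving the data aside rather than deleting it keeps the corrupt files for later inspection, and you will likely need to trigger a reindex afterwards, depending on your XP version:)

```shell
# Assumed XP home, taken from the path in the log; adjust for your install.
XP_HOME="/srv/vhosts/enonic/homes/nibio"
INDEX_DIR="$XP_HOME/repo/index/data"

# Stop Enonic XP before touching the index data, then move it aside so the
# corrupt segments are kept for forensics instead of being deleted outright.
if [ -d "$INDEX_DIR" ]; then
  mv "$INDEX_DIR" "$INDEX_DIR.corrupt.$(date +%s)"
  echo "moved index data aside: $INDEX_DIR"
else
  echo "index data dir not present: $INDEX_DIR"
fi
```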

This file exists, but it’s empty:

Ah, we may have hit an Elasticsearch bug. We are in the process of upgrading it, but not before version 7.

Could you try to delete the file, and see what happens?

We tried removing that file, but a new empty one was created right away.

How is the server set up with regard to memory, by the way? Could it have been an OOM crash?
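(A quick, generic Linux way to rule that out, not Enonic-specific, is to look for the kernel OOM killer's signature in the kernel ring buffer:)

```shell
# The kernel OOM killer logs to the kernel ring buffer; grep for its traces.
# dmesg may require root on some systems; /var/log/syslog or journalctl -k
# are alternative sources for the same messages.
dmesg 2>/dev/null | grep -iE 'out of memory|oom-killer' \
  || echo "no OOM-killer events found in dmesg"
```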

It could not have been an OOM crash.
We imported the whole site (much more data) some time later, and that went fine.
It feels like this happens when Enonic (or Elasticsearch) can't find (or index) some objects that are referenced by the newly imported content but don't exist on the instance.

Ok, we’ll do some more investigation on this.