Data import bug

Enonic version: 6.9.0
OS: Elementary OS Loki (Ubuntu 16.04)


We have updated our dev server to Enonic 6.9.0, and we also imported some data using the Data Toolbox application. Our server went down afterwards. This is the output from the log file:

Jan 24 13:57:24 loki nibio[28953]: 13:57:24.614 WARN o.elasticsearch.cluster.action.shard - [local-node] [search-system-repo][0] received shard failed for [search-system-repo][0], node[FwtEgMPPT0qSqVd3NtZ3GA], [P], s[INITIALIZING], indexUUID [RspF6XLdTQuPrux1GavwAg], reason [shard failure [failed recovery][IndexShardGatewayRecoveryException[[search-system-repo][0] failed recovery]; nested: EngineCreationFailureException[[search-system-repo][0] failed to upgrade 3x segments]; nested: EOFException[read past EOF: NIOFSIndexInput(path="/srv/vhosts/enonic/homes/nibio/repo/index/data/mycluster/nodes/0/indices/search-system-repo/0/index/segments_4")]; ]]
Jan 24 13:57:24 loki nibio[28953]: 13:57:24.622 WARN org.elasticsearch.indices.cluster - [local-node] [[storage-system-repo][0]] marking and sending shard failed due to [failed recovery]
Jan 24 13:57:24 loki nibio[28953]: org.elasticsearch.index.gateway.IndexShardGatewayRecoveryException: [storage-system-repo][0] failed recovery
Jan 24 13:57:24 loki nibio[28953]:     at org.elasticsearch.index.gateway.IndexShardGatewayService$ [repack-elasticsearch-6.9.0.jar:6.9.0]
Jan 24 13:57:24 loki nibio[28953]:     at java.util.concurrent.ThreadPoolExecutor.runWorker( [na:1.8.0_112]
Jan 24 13:57:24 loki nibio[28953]:     at java.util.concurrent.ThreadPoolExecutor$ [na:1.8.0_112]
Jan 24 13:57:24 loki nibio[28953]:     at [na:1.8.0_112]
Jan 24 13:57:24 loki nibio[28953]: Caused by: org.elasticsearch.index.engine.EngineCreationFailureException: [storage-system-repo][0] failed to upgrade 3x segments
Jan 24 13:57:24 loki nibio[28953]:     at org.elasticsearch.index.engine.InternalEngine.( ~[na:na]
Jan 24 13:57:24 loki nibio[28953]:     at org.elasticsearch.index.engine.InternalEngineFactory.newReadWriteEngine( ~[na:na]

It generates a lot of messages like this, so the log file has now grown to more than 1 GB.
Any help or advice would be much appreciated.

Did you make a full data dump before upgrading? It looks like something was either killed while doing a translog write, or perhaps there was a full-disk problem?

The data itself is not a problem, no worries. We did not do a full data import.
This was not caused by the upgrade: we had exactly the same issue on a previous project, where it also appeared after an import.
There was enough disk space, so that is not the issue either.

Ah, ok.

So no shutdowns or anything: you just import data, the server crashes, and then it fails to restart afterwards?

Is it a large import?

Could you also check if this file exists; and the file-size:
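(The file referenced here didn't make it into the post, but the `segments_4` path from the EOFException in the log above is the likely candidate. A sketch of such a check, with the path copied verbatim from the log; adjust it for your own installation:)

```shell
# Check existence and size of the Lucene segments file named in the
# EOFException from the log. The path is an assumption taken from that log.
SEG="/srv/vhosts/enonic/homes/nibio/repo/index/data/mycluster/nodes/0/indices/search-system-repo/0/index/segments_4"
if [ -f "$SEG" ]; then
  # -c '%s ... %n' is GNU stat (as on Ubuntu): size in bytes, then file name
  stat -c '%s bytes  %n' "$SEG"
else
  echo "segments file not found: $SEG"
fi
```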


This was not a big import: something like 10 content objects or fewer, with no large data in them. It was impossible to start the server again until we removed the Elasticsearch indexes.
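(For anyone hitting the same wall, a rough sketch of that kind of recovery, assuming the XP home layout visible in the logged path above. Stop Enonic XP first; moving the data aside rather than deleting it keeps the corrupt files for later inspection, and you will likely need to trigger a reindex afterwards, depending on your XP version:)

```shell
# Assumed XP home, taken from the path in the log; adjust for your install.
XP_HOME="/srv/vhosts/enonic/homes/nibio"
INDEX_DIR="$XP_HOME/repo/index/data"

# Stop Enonic XP before touching the index data, then move it aside so the
# corrupt segments are kept for forensics instead of being deleted outright.
if [ -d "$INDEX_DIR" ]; then
  mv "$INDEX_DIR" "$INDEX_DIR.corrupt.$(date +%s)"
  echo "moved index data aside: $INDEX_DIR"
else
  echo "index data dir not present: $INDEX_DIR"
fi
```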

This file exists, but it’s empty:

Ah, we may have hit an Elasticsearch bug. We are in the process of upgrading it, but not before version 7.

Could you try to delete the file, and see what happens?

We tried removing that file, but a new empty one was created right away.

How is the server set up with regard to memory, by the way? Could it have been an OOM crash?
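(A quick, generic Linux way to rule that out, not Enonic-specific, is to look for the kernel OOM killer's signature in the kernel ring buffer:)

```shell
# The kernel OOM killer logs to the kernel ring buffer; grep for its traces.
# dmesg may require root on some systems; /var/log/syslog or journalctl -k
# are alternative sources for the same messages.
dmesg 2>/dev/null | grep -iE 'out of memory|oom-killer' \
  || echo "no OOM-killer events found in dmesg"
```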

It could not have been an OOM crash.
We imported the whole site (much more data) some time later, and that went fine.
It feels like this happens when Enonic (or Elasticsearch) can't find (or index) some objects that are referenced by the newly imported content but don't exist on the instance.

Ok, we’ll do some more investigation on this.