Trouble with one node after restart of server


#1

Enonic version: 7.2
OS: Windows Server 2019

We have two nodes in a cluster.
Yesterday we patched one node with OS patches. Since then we have been getting this in the logs every second:
11:20:15.428 INFO com.enonic.xp.init.Initializer - Waiting [1s] for system.auditlog repo to be initialized
11:20:16.436 INFO com.enonic.xp.init.Initializer - Waiting [1s] for system.auditlog repo to be initialized
11:20:17.438 INFO com.enonic.xp.init.Initializer - Waiting [1s] for system.auditlog repo to be initialized

The patched node is not reachable now.
We can’t patch the other node until we get this node online again.

I have tried deleting the folder under …\home\repo\index\data\ to see if that helps. The folder is recreated when I start the node, but the website still does not respond.

On the working node I get this in the log:
11:10:04.367 INFO o.e.cluster.routing.allocation - [cmsapp02] Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[search-com.enonic.cms.default][0]] …]).
11:10:25.467 INFO o.e.cluster.routing.allocation - [cmsapp02] Cluster health status changed from [GREEN] to [YELLOW] (reason: [{cmsapp01}{SxAzPIE_TZuHHqNUO1XhrQ}{cmsapp01.npta.no}{192.168.227.51:9300}{local=false, master=true} transport disconnected]).
11:10:25.467 INFO org.elasticsearch.cluster.service - [cmsapp02] removed {{cmsapp01}{SxAzPIE_TZuHHqNUO1XhrQ}{cmsapp01.domain.no}{192.168.227.51:9300}{local=false, master=true},}, reason: zen-disco-node-failed({cmsapp01}{SxAzPIE_TZuHHqNUO1XhrQ}{cmsapp01.npta.no}{192.168.227.51:9300}{local=false, master=true}), reason(transport disconnected)
11:10:25.467 INFO org.elasticsearch.cluster.routing - [cmsapp02] delaying allocation for [8] unassigned shards, next check in [1m]
11:12:52.835 INFO org.elasticsearch.cluster.service - [cmsapp02] added {{cmsapp01}{cfoFS8U3RKCK8Z4xQ0CDzg}{cmsapp01.domain.no}{192.168.227.51:9300}{local=false, master=true},}, reason: zen-disco-join(join from node[{cmsapp01}{cfoFS8U3RKCK8Z4xQ0CDzg}{cmsapp01.domain.no}{192.168.227.51:9300}{local=false, master=true}])
11:13:08.668 INFO o.e.cluster.routing.allocation - [cmsapp02] Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[search-com.enonic.cms.default][0]] …]).

Please help me.

Sondre.


#2

Hi!

https://developer.enonic.com/docs/xp/stable/release/upgrade#v7_2_notes

Was this already at version 7.2 before patching the OS?


#3

Yes it was. We have only patched the OS. No upgrade for Enonic.


#4

This problem happens when a cluster is upgraded from 7.1 to 7.2 without a full cluster restart.


#5

OK, so how do I get this cluster on track again?
What will happen when I restart the other node?

Sondre.


#6

It is hard to tell what is going to happen.

We have two nodes in a cluster.

It is generally a bad idea to have an even number of nodes, especially exactly two.
Follow the documentation for a basic cluster setup:
https://developer.enonic.com/docs/xp/stable/deployment/strategies#basic_cluster
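
To illustrate why exactly two nodes is a poor choice, here is a minimal sketch of the usual majority (quorum) arithmetic for master election. It assumes the common rule of floor(n/2) + 1 master-eligible nodes. The function names are made up for this example and are not part of Enonic XP or Elasticsearch.

```python
def quorum(master_eligible_nodes: int) -> int:
    """Smallest strict majority of the master-eligible nodes."""
    return master_eligible_nodes // 2 + 1


def tolerated_failures(master_eligible_nodes: int) -> int:
    """How many nodes can be lost while a majority still remains."""
    return master_eligible_nodes - quorum(master_eligible_nodes)


if __name__ == "__main__":
    for n in (2, 3, 4, 5):
        print(f"nodes={n} quorum={quorum(n)} tolerated_failures={tolerated_failures(n)}")
```

With two nodes the quorum is also two, so losing either node leaves no majority. Lowering the quorum to one instead would let both nodes elect themselves master during a network glitch (split brain). Three nodes is the smallest setup that survives the loss of one node.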

It is possible that the Windows patch broke network communication for a short while and introduced an inconsistency in your cluster.

I have tried deleting the folder under …\home\repo\index\data\ to see if that helps. The folder is recreated when I start the node, but the website still does not respond.

You have most likely broken your cluster by removing these files.

Generally you need to restore from a backup now.