"Errored websocket ####" during long running job

Enonic version: 7.6.1
OS: Linux

Hi!

We have a service that imports around 22,000 objects from an external API.
The import takes 20-30 minutes, and sometimes the error message below appears. It seems that when the error happens, the import fails to create/update an object. What could cause this? Locally it works without errors.

Thanks!

WARN  c.e.x.admin.event.impl.EventEndpoint - Errored websocket 3554
org.eclipse.jetty.io.EofException: null
	at org.eclipse.jetty.io.ChannelEndPoint.flush(ChannelEndPoint.java:279)
	at org.eclipse.jetty.io.WriteFlusher.flush(WriteFlusher.java:422)
	at org.eclipse.jetty.io.WriteFlusher.write(WriteFlusher.java:277)
	at org.eclipse.jetty.io.AbstractEndPoint.write(AbstractEndPoint.java:381)
	at org.eclipse.jetty.websocket.common.io.FrameFlusher.flush(FrameFlusher.java:264)
	at org.eclipse.jetty.websocket.common.io.FrameFlusher.process(FrameFlusher.java:193)
	at org.eclipse.jetty.util.IteratingCallback.processing(IteratingCallback.java:241)
	at org.eclipse.jetty.util.IteratingCallback.iterate(IteratingCallback.java:223)
	at org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection.outgoingFrame(AbstractWebSocketConnection.java:581)
	at org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection.close(AbstractWebSocketConnection.java:181)
	at org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection.onFillable(AbstractWebSocketConnection.java:510)
	at org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection.onFillable(AbstractWebSocketConnection.java:440)
	at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)
	at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:105)
	at org.eclipse.jetty.io.ChannelEndPoint$1.run(ChannelEndPoint.java:104)
	at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336)
	at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:313)
	at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171)
	at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129)
	at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:375)
	at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:773)
	at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:905)
	at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: java.io.IOException: Broken pipe
	at java.base/sun.nio.ch.FileDispatcherImpl.writev0(Native Method)
	at java.base/sun.nio.ch.SocketDispatcher.writev(SocketDispatcher.java:51)
	at java.base/sun.nio.ch.IOUtil.write(IOUtil.java:182)
	at java.base/sun.nio.ch.IOUtil.write(IOUtil.java:130)
	at java.base/sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:496)
	at java.base/java.nio.channels.SocketChannel.write(SocketChannel.java:507)
	at org.eclipse.jetty.io.ChannelEndPoint.flush(ChannelEndPoint.java:273)
	... 22 common frames omitted

This happens when the browser closes the websocket connection without fully reading the data.
You need to re-establish the websocket connection from the browser if it is closed.

If this is a batch process where you are importing 1000 items at a time, maybe reduce your batch size so that you import e.g. 100 items at a time?

Thanks for the replies!
After that I thought this might happen because the import itself is run by the service, so the browser got a timeout response. So I made it run as a background job with the task lib.
Unfortunately, that did not help either.

It is not really importing 1000 items at a time: it imports one item after another, and then publishes them in batches of around 100 items.
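
A minimal sketch of that publish-in-batches step, written in plain ES5-style JavaScript. The names `chunk`, `publishInBatches` and `publishFn` are hypothetical and not taken from the actual import code; `publishFn` stands in for whatever wraps the real publish call.

```javascript
// Split an array into consecutive chunks of at most `size` items.
function chunk(items, size) {
    if (size < 1) {
        throw new Error('size must be at least 1');
    }
    var chunks = [];
    for (var i = 0; i < items.length; i += size) {
        chunks.push(items.slice(i, i + size));
    }
    return chunks;
}

// Call publishFn once per batch of content keys and return the batch count.
// publishFn is a hypothetical callback; in an XP app it might wrap
// contentLib.publish({ keys: keys, ... }).
function publishInBatches(ids, batchSize, publishFn) {
    var batches = chunk(ids, batchSize);
    batches.forEach(function (keys) {
        publishFn(keys);
    });
    return batches.length;
}
```

With a batch size of 100, a 22,000-item import would result in 220 publish calls.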

Do you have any other ideas? The biggest problem is that it works perfectly locally, while the error happens only in production.

I guess my question would be - why are you using websockets to import content?

Actually, I don’t. This is the weirdest part for me. I thought XP was using websockets during some content create/update/publish process. But after your question this error seems even stranger to me.

The import is run using the context lib. The whole import is basically a callback inside the context. Could some kind of websocket be used inside it? And could this be a problem?

Hmm… How do you trigger the job then? Do you use websockets to show its progress in a client or similar?

The whole flow looks like this:

  1. Service is triggered via URL
  2. Service calls the task lib, which runs an import function
  3. Context lib changes context to admin user
  4. Many requests to the API and calls to contentLib.create/modify/publish happen here
  5. Finished

Websockets are not used in the code at all, progress is not displayed, and the user who triggered the import just gets a response with an empty page after the background job is started.
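
For readers following along, the flow above could look roughly like the sketch below. It is not the actual import code. The libs are passed in as a `libs` object only to keep the example self-contained; in an XP app they would normally come from `require('/lib/xp/task')`, `require('/lib/xp/context')` and `require('/lib/xp/content')`. The function name `startImport`, the `items` argument (standing in for objects already fetched from the external API), the parent path and the content type are all assumptions.

```javascript
// Sketch of steps 2-5. `libs` is expected to hold the task, context and
// content libs; `items` is a hypothetical array of already-fetched API objects.
function startImport(libs, items) {
    // Step 2: hand the work to a background task so the HTTP response
    // can return right away.
    return libs.task.submit({
        description: 'Import objects from external API',
        task: function () {
            // Step 3: run the import with admin privileges on the draft branch.
            libs.context.run({
                branch: 'draft',
                principals: ['role:system.admin']
            }, function () {
                var keys = [];
                // Step 4: create one item after another.
                items.forEach(function (item) {
                    var created = libs.content.create({
                        name: item.name,
                        parentPath: '/imported',          // hypothetical path
                        displayName: item.displayName,
                        contentType: 'base:unstructured', // assumed content type
                        data: item.data
                    });
                    keys.push(created._id);
                });
                // Publish what was created from draft to master.
                libs.content.publish({
                    keys: keys,
                    sourceBranch: 'draft',
                    targetBranch: 'master'
                });
            });
        }
    });
}
```

Nothing in this flow opens a websocket, which matches the description above.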

Looking closely at the error, it does not appear to be directly related to your import. This error comes from the websocket communication between Content Studio and XP.

Thank you very much for pointing this out. I solved the problem, and it is not related to websockets at all. The cause was incorrect data in the database on prod.
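
In case it helps someone with a similar problem: one way to make bad data show up quickly is to catch and log failures per item, so they are not mistaken for an unrelated warning in the log. This is only a sketch of the idea, not the code I actually used; `importAll`, `importOne` and `logError` are hypothetical names, and in an XP app `logError` could simply be `log.error`.

```javascript
// Import every item, collecting failures instead of stopping at the first one.
// importOne is a hypothetical callback doing the create/modify for one item;
// logError is a hypothetical logging callback.
function importAll(items, importOne, logError) {
    var failures = [];
    items.forEach(function (item) {
        try {
            importOne(item);
        } catch (e) {
            var message = String(e && e.message ? e.message : e);
            failures.push({ id: item.id, message: message });
            logError('Import failed for item ' + item.id + ': ' + message);
        }
    });
    return failures;
}
```

The returned list can be logged at the end of the job to see exactly which items could not be created or updated, and why.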

As for the websocket warning: is it anything to worry about? Should we look into it further?

The Jetty team does not consider it a bug, so I guess not. https://github.com/eclipse/jetty.project/issues/4464