Hi guys,
I have a Flink Streaming job that has been running for about a day without any errors, and then I got this in the job manager log:

15:37:49,905 WARN  io.netty.channel.DefaultChannelPipeline - An exceptionCaught() event was fired, and it reached at the tail of the pipeline. It usually means the last handler in the pipeline did not handle the exception.
java.io.IOException: Connection reset by peer
    at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
    at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
    at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
    at sun.nio.ch.IOUtil.read(IOUtil.java:192)
    at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
    at io.netty.buffer.UnpooledUnsafeDirectByteBuf.setBytes(UnpooledUnsafeDirectByteBuf.java:447)
    at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:881)
    at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:242)
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:119)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:112)
    at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
    at java.lang.Thread.run(Thread.java:745)

After this the job did not fail; it keeps running. This happened twice.
Can anyone tell me what might cause this exception?

Cheers,
Gyula
Can you please share the complete logs with me? Uce at apache org ;)
On Saturday, 14 November 2015, Gyula Fóra <[hidden email]> wrote:
> [...]
This exception was not thrown by the data exchange component. This is confirmed by the stack trace you have shared: it shows the DefaultThreadFactory, which we don’t use for the data exchange. Any exception thrown in the data exchange would actually fail the program.

My best guess is that this was thrown by the new web interface. Was it running with your job?

My second best guess is that it was thrown by another component running Netty (maybe a Hadoop client?).

– Ufuk

PS Thanks for sharing the logs with me. :)

> On 14 Nov 2015, at 18:14, Gyula Fóra <[hidden email]> wrote:
> [...]
I used to get a similar exception. [I do not remember whether the stack trace was *exactly* the same, but it came from the web interface and was due to the *connection reset by peer*.] Currently, the web interface does not handle exceptionCaught events cleanly.
One of my PRs addresses this by adding an exception handler at the end of the pipeline: https://github.com/apache/flink/pull/1338

-- Sachin Goel
Computer Science, IIT Delhi
m. +91-9871457685

On Sun, Nov 15, 2015 at 5:48 PM, Ufuk Celebi <[hidden email]> wrote:
> [...]
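For readers wondering where "Connection reset by peer" itself comes from: a read fails this way when the remote side aborts the TCP connection instead of closing it cleanly. The following is only a minimal sketch using plain JDK NIO, not Flink or Netty code; the class and method names are made up for illustration, and forcing the abort with SO_LINGER set to 0 is an assumption chosen to reproduce the condition locally. Whether something similar (for example a browser dropping its connection to the web interface) happened in the case above is a guess.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.StandardSocketOptions;
import java.nio.ByteBuffer;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

// Hypothetical demo class: shows a server-side read failing after the peer aborts.
final class ConnectionResetDemo {

    /** Returns true if the server-side read threw an IOException after the client aborted. */
    static boolean peerResetObserved() {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            // Bind to an ephemeral port on the loopback interface.
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));

            try (SocketChannel client = SocketChannel.open(server.getLocalAddress());
                 SocketChannel accepted = server.accept()) {

                // With SO_LINGER = 0, close() aborts the connection (TCP RST)
                // instead of performing the normal orderly shutdown.
                client.setOption(StandardSocketOptions.SO_LINGER, 0);
                client.close();

                try {
                    // Blocking read on the accepted side; this is where the reset surfaces.
                    accepted.read(ByteBuffer.allocate(16));
                    return false; // data or a clean end-of-stream, no reset seen
                } catch (IOException reset) {
                    // The exact message differs between JDK versions and platforms.
                    return true;
                }
            }
        } catch (IOException setupFailure) {
            throw new UncheckedIOException(setupFailure);
        }
    }

    public static void main(String[] args) {
        System.out.println("read failed after peer abort: " + peerResetObserved());
    }
}
```

In a Netty pipeline the same IOException is passed to exceptionCaught, and the warning in the original post is logged when no handler in the pipeline consumes it.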
Thanks guys,
Yes, I am running with the new web interface (no Hadoop). I will deploy the new jars once your PR is merged, Sachin, and we'll see :)

Cheers,
Gyula

Sachin Goel <[hidden email]> wrote (on Sun, 15 Nov 2015, 13:43):
> [...]