Hello,
tonight i was running a WordCount job with the Python API, and halfway through i got the exception below. the issue did not occur again after ressubmitting the job. DOP=160 taskslots=8 filesize=100GB org.apache.flink.client.program.ProgramInvocationException: The program execution failed: java.lang.RuntimeException: An error occurred while reading the next record: The channel is erroneous. at org.apache.flink.runtime.util.KeyGroupedIterator$ValuesIterator.hasNext(KeyGroupedIterator.java:202) at org.apache.flink.languagebinding.api.java.streaming.Sender.sendRecords(Sender.java:57) at org.apache.flink.languagebinding.api.java.streaming.Streamer.stream(Streamer.java:106) at org.apache.flink.languagebinding.api.java.python.functions.PythonGroupReduce.reduce(PythonGroupReduce.java:77) at org.apache.flink.runtime.operators.GroupReduceDriver.run(GroupReduceDriver.java:108) at org.apache.flink.runtime.operators.RegularPactTask.run(RegularPactTask.java:509) at org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:374) at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:265) at java.lang.Thread.run(Thread.java:745) Caused by: java.io.IOException: The channel is erroneous. at org.apache.flink.runtime.io.disk.iomanager.ChannelAccess.checkErroneous(ChannelAccess.java:132) at org.apache.flink.runtime.io.disk.iomanager.BlockChannelReader.readBlock(BlockChannelReader.java:75) at org.apache.flink.runtime.io.disk.iomanager.ChannelReaderInputView.sendReadRequest(ChannelReaderInputView.java:263) at org.apache.flink.runtime.io.disk.iomanager.ChannelReaderInputView.nextSegment(ChannelReaderInputView.java:226) at org.apache.flink.runtime.memorymanager.AbstractPagedInputView.advance(AbstractPagedInputView.java:159) at org.apache.flink.runtime.memorymanager.AbstractPagedInputView.readByte(AbstractPagedInputView.java:270) at org.apache.flink.runtime.memorymanager.AbstractPagedInputView.readUnsignedByte(AbstractPagedInputView.java:277) at org.apache.flink.runtime.memorymanager.AbstractPagedInputView.readInt(AbstractPagedInputView.java:340) at org.apache.flink.api.common.typeutils.base.IntSerializer.deserialize(IntSerializer.java:69) at org.apache.flink.api.common.typeutils.base.IntSerializer.deserialize(IntSerializer.java:28) at org.apache.flink.api.java.typeutils.runtime.TupleSerializer.deserialize(TupleSerializer.java:115) at org.apache.flink.api.java.typeutils.runtime.TupleSerializer.deserialize(TupleSerializer.java:30) at org.apache.flink.runtime.io.disk.ChannelReaderInputViewIterator.next(ChannelReaderInputViewIterator.java:86) at org.apache.flink.runtime.operators.sort.MergeIterator$HeadStream.nextHead(MergeIterator.java:131) at org.apache.flink.runtime.operators.sort.MergeIterator.next(MergeIterator.java:89) at org.apache.flink.runtime.util.KeyGroupedIterator$ValuesIterator.hasNext(KeyGroupedIterator.java:177) ... 8 more Caused by: java.io.IOException: Input/output error at sun.nio.ch.FileDispatcherImpl.read0(Native Method) at sun.nio.ch.FileDispatcherImpl.read(FileDispatcherImpl.java:46) at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) at sun.nio.ch.IOUtil.read(IOUtil.java:197) at sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:149) at org.apache.flink.runtime.io.disk.iomanager.SegmentReadRequest.read(BlockChannelAccess.java:221) at org.apache.flink.runtime.io.disk.iomanager.IOManager$ReaderThread.run(IOManager.java:551) at org.apache.flink.client.program.Client.run(Client.java:321) at org.apache.flink.client.program.Client.run(Client.java:287) at org.apache.flink.client.program.Client.run(Client.java:281) at org.apache.flink.client.program.ContextEnvironment.execute(ContextEnvironment.java:54) at org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:519) at org.apache.flink.languagebinding.api.java.python.PythonExecutor.main(PythonExecutor.java:119) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:389) at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:307) at org.apache.flink.client.program.Client.run(Client.java:240) at org.apache.flink.client.CliFrontend.executeProgram(CliFrontend.java:332) at org.apache.flink.client.CliFrontend.run(CliFrontend.java:319) at org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:930) at org.apache.flink.client.CliFrontend.main(CliFrontend.java:954) |
The error message is not very specific. I was unable to find any
explanation on this in the web. I would ignore the message for now. I think its not related to the python interface. I'm running a lot of experiments on the cluster and I have not seen this error there before. If we see the issue appearing more frequently, we should start looking into patterns (which machine?, harddisk?, only a specific job, etc.). On Wed, Sep 10, 2014 at 2:57 AM, Chesnay Schepler < [hidden email]> wrote: > Hello, > > tonight i was running a WordCount job with the Python API, and halfway > through i got the exception below. > the issue did not occur again after ressubmitting the job. > DOP=160 > taskslots=8 > filesize=100GB > > org.apache.flink.client.program.ProgramInvocationException: The > program execution failed: java.lang.RuntimeException: An error > occurred while reading the next record: The channel is erroneous. > at > org.apache.flink.runtime.util.KeyGroupedIterator$ > ValuesIterator.hasNext(KeyGroupedIterator.java:202) > at > org.apache.flink.languagebinding.api.java.streaming.Sender.sendRecords( > Sender.java:57) > at > org.apache.flink.languagebinding.api.java.streaming.Streamer.stream( > Streamer.java:106) > at > org.apache.flink.languagebinding.api.java.python.functions. > PythonGroupReduce.reduce(PythonGroupReduce.java:77) > at > org.apache.flink.runtime.operators.GroupReduceDriver. > run(GroupReduceDriver.java:108) > at > org.apache.flink.runtime.operators.RegularPactTask.run( > RegularPactTask.java:509) > at > org.apache.flink.runtime.operators.RegularPactTask. > invoke(RegularPactTask.java:374) > at > org.apache.flink.runtime.execution.RuntimeEnvironment. > run(RuntimeEnvironment.java:265) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.io.IOException: The channel is erroneous. > at > org.apache.flink.runtime.io.disk.iomanager.ChannelAccess. > checkErroneous(ChannelAccess.java:132) > at > org.apache.flink.runtime.io.disk.iomanager. > BlockChannelReader.readBlock(BlockChannelReader.java:75) > at > org.apache.flink.runtime.io.disk.iomanager.ChannelReaderInputView. > sendReadRequest(ChannelReaderInputView.java:263) > at > org.apache.flink.runtime.io.disk.iomanager.ChannelReaderInputView. > nextSegment(ChannelReaderInputView.java:226) > at > org.apache.flink.runtime.memorymanager.AbstractPagedInputView.advance( > AbstractPagedInputView.java:159) > at > org.apache.flink.runtime.memorymanager.AbstractPagedInputView.readByte( > AbstractPagedInputView.java:270) > at > org.apache.flink.runtime.memorymanager.AbstractPagedInputView. > readUnsignedByte(AbstractPagedInputView.java:277) > at > org.apache.flink.runtime.memorymanager.AbstractPagedInputView.readInt( > AbstractPagedInputView.java:340) > at > org.apache.flink.api.common.typeutils.base.IntSerializer. > deserialize(IntSerializer.java:69) > at > org.apache.flink.api.common.typeutils.base.IntSerializer. > deserialize(IntSerializer.java:28) > at > org.apache.flink.api.java.typeutils.runtime. > TupleSerializer.deserialize(TupleSerializer.java:115) > at > org.apache.flink.api.java.typeutils.runtime. > TupleSerializer.deserialize(TupleSerializer.java:30) > at > org.apache.flink.runtime.io.disk.ChannelReaderInputViewIterator.next( > ChannelReaderInputViewIterator.java:86) > at > org.apache.flink.runtime.operators.sort.MergeIterator$ > HeadStream.nextHead(MergeIterator.java:131) > at > org.apache.flink.runtime.operators.sort.MergeIterator. > next(MergeIterator.java:89) > at > org.apache.flink.runtime.util.KeyGroupedIterator$ > ValuesIterator.hasNext(KeyGroupedIterator.java:177) > ... 8 more > Caused by: java.io.IOException: Input/output error > at sun.nio.ch.FileDispatcherImpl.read0(Native Method) > at sun.nio.ch.FileDispatcherImpl.read(FileDispatcherImpl.java:46) > at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) > at sun.nio.ch.IOUtil.read(IOUtil.java:197) > at sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:149) > at > org.apache.flink.runtime.io.disk.iomanager.SegmentReadRequest.read( > BlockChannelAccess.java:221) > at > org.apache.flink.runtime.io.disk.iomanager.IOManager$ > ReaderThread.run(IOManager.java:551) > at org.apache.flink.client.program.Client.run(Client.java:321) > at org.apache.flink.client.program.Client.run(Client.java:287) > at org.apache.flink.client.program.Client.run(Client.java:281) > at > org.apache.flink.client.program.ContextEnvironment. > execute(ContextEnvironment.java:54) > at > org.apache.flink.api.java.ExecutionEnvironment.execute( > ExecutionEnvironment.java:519) > at > org.apache.flink.languagebinding.api.java.python.PythonExecutor.main( > PythonExecutor.java:119) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke( > NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke( > DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.flink.client.program.PackagedProgram.callMainMethod( > PackagedProgram.java:389) > at > org.apache.flink.client.program.PackagedProgram. > invokeInteractiveModeForExecution(PackagedProgram.java:307) > at org.apache.flink.client.program.Client.run(Client.java:240) > at > org.apache.flink.client.CliFrontend.executeProgram( > CliFrontend.java:332) > at org.apache.flink.client.CliFrontend.run(CliFrontend.java:319) > at > org.apache.flink.client.CliFrontend.parseParameters( > CliFrontend.java:930) > at org.apache.flink.client.CliFrontend.main(CliFrontend.java:954) > > |
Looks like a spurious I/O failure during a file read, outside of our code.
I agree with Robert, this looks like a rare OS/disk/java error, and is not for us to worry about. On Wed, Sep 10, 2014 at 10:57 AM, Robert Metzger <[hidden email]> wrote: > The error message is not very specific. I was unable to find any > explanation on this in the web. > > I would ignore the message for now. I think its not related to the python > interface. I'm running a lot of experiments on the cluster and I have not > seen this error there before. > If we see the issue appearing more frequently, we should start looking into > patterns (which machine?, harddisk?, only a specific job, etc.). > > > On Wed, Sep 10, 2014 at 2:57 AM, Chesnay Schepler < > [hidden email]> wrote: > > > Hello, > > > > tonight i was running a WordCount job with the Python API, and halfway > > through i got the exception below. > > the issue did not occur again after ressubmitting the job. > > DOP=160 > > taskslots=8 > > filesize=100GB > > > > org.apache.flink.client.program.ProgramInvocationException: The > > program execution failed: java.lang.RuntimeException: An error > > occurred while reading the next record: The channel is erroneous. > > at > > org.apache.flink.runtime.util.KeyGroupedIterator$ > > ValuesIterator.hasNext(KeyGroupedIterator.java:202) > > at > > > org.apache.flink.languagebinding.api.java.streaming.Sender.sendRecords( > > Sender.java:57) > > at > > org.apache.flink.languagebinding.api.java.streaming.Streamer.stream( > > Streamer.java:106) > > at > > org.apache.flink.languagebinding.api.java.python.functions. > > PythonGroupReduce.reduce(PythonGroupReduce.java:77) > > at > > org.apache.flink.runtime.operators.GroupReduceDriver. > > run(GroupReduceDriver.java:108) > > at > > org.apache.flink.runtime.operators.RegularPactTask.run( > > RegularPactTask.java:509) > > at > > org.apache.flink.runtime.operators.RegularPactTask. > > invoke(RegularPactTask.java:374) > > at > > org.apache.flink.runtime.execution.RuntimeEnvironment. > > run(RuntimeEnvironment.java:265) > > at java.lang.Thread.run(Thread.java:745) > > Caused by: java.io.IOException: The channel is erroneous. > > at > > org.apache.flink.runtime.io.disk.iomanager.ChannelAccess. > > checkErroneous(ChannelAccess.java:132) > > at > > org.apache.flink.runtime.io.disk.iomanager. > > BlockChannelReader.readBlock(BlockChannelReader.java:75) > > at > > org.apache.flink.runtime.io.disk.iomanager.ChannelReaderInputView. > > sendReadRequest(ChannelReaderInputView.java:263) > > at > > org.apache.flink.runtime.io.disk.iomanager.ChannelReaderInputView. > > nextSegment(ChannelReaderInputView.java:226) > > at > > org.apache.flink.runtime.memorymanager.AbstractPagedInputView.advance( > > AbstractPagedInputView.java:159) > > at > > > org.apache.flink.runtime.memorymanager.AbstractPagedInputView.readByte( > > AbstractPagedInputView.java:270) > > at > > org.apache.flink.runtime.memorymanager.AbstractPagedInputView. > > readUnsignedByte(AbstractPagedInputView.java:277) > > at > > org.apache.flink.runtime.memorymanager.AbstractPagedInputView.readInt( > > AbstractPagedInputView.java:340) > > at > > org.apache.flink.api.common.typeutils.base.IntSerializer. > > deserialize(IntSerializer.java:69) > > at > > org.apache.flink.api.common.typeutils.base.IntSerializer. > > deserialize(IntSerializer.java:28) > > at > > org.apache.flink.api.java.typeutils.runtime. > > TupleSerializer.deserialize(TupleSerializer.java:115) > > at > > org.apache.flink.api.java.typeutils.runtime. > > TupleSerializer.deserialize(TupleSerializer.java:30) > > at > > org.apache.flink.runtime.io.disk.ChannelReaderInputViewIterator.next( > > ChannelReaderInputViewIterator.java:86) > > at > > org.apache.flink.runtime.operators.sort.MergeIterator$ > > HeadStream.nextHead(MergeIterator.java:131) > > at > > org.apache.flink.runtime.operators.sort.MergeIterator. > > next(MergeIterator.java:89) > > at > > org.apache.flink.runtime.util.KeyGroupedIterator$ > > ValuesIterator.hasNext(KeyGroupedIterator.java:177) > > ... 8 more > > Caused by: java.io.IOException: Input/output error > > at sun.nio.ch.FileDispatcherImpl.read0(Native Method) > > at sun.nio.ch.FileDispatcherImpl.read(FileDispatcherImpl.java:46) > > at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) > > at sun.nio.ch.IOUtil.read(IOUtil.java:197) > > at sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:149) > > at > > org.apache.flink.runtime.io.disk.iomanager.SegmentReadRequest.read( > > BlockChannelAccess.java:221) > > at > > org.apache.flink.runtime.io.disk.iomanager.IOManager$ > > ReaderThread.run(IOManager.java:551) > > at org.apache.flink.client.program.Client.run(Client.java:321) > > at org.apache.flink.client.program.Client.run(Client.java:287) > > at org.apache.flink.client.program.Client.run(Client.java:281) > > at > > org.apache.flink.client.program.ContextEnvironment. > > execute(ContextEnvironment.java:54) > > at > > org.apache.flink.api.java.ExecutionEnvironment.execute( > > ExecutionEnvironment.java:519) > > at > > org.apache.flink.languagebinding.api.java.python.PythonExecutor.main( > > PythonExecutor.java:119) > > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > > at > > sun.reflect.NativeMethodAccessorImpl.invoke( > > NativeMethodAccessorImpl.java:57) > > at > > sun.reflect.DelegatingMethodAccessorImpl.invoke( > > DelegatingMethodAccessorImpl.java:43) > > at java.lang.reflect.Method.invoke(Method.java:606) > > at > > org.apache.flink.client.program.PackagedProgram.callMainMethod( > > PackagedProgram.java:389) > > at > > org.apache.flink.client.program.PackagedProgram. > > invokeInteractiveModeForExecution(PackagedProgram.java:307) > > at org.apache.flink.client.program.Client.run(Client.java:240) > > at > > org.apache.flink.client.CliFrontend.executeProgram( > > CliFrontend.java:332) > > at org.apache.flink.client.CliFrontend.run(CliFrontend.java:319) > > at > > org.apache.flink.client.CliFrontend.parseParameters( > > CliFrontend.java:930) > > at org.apache.flink.client.CliFrontend.main(CliFrontend.java:954) > > > > > |
Free forum by Nabble | Edit this page |