[jira] [Resolved] (FLINK-652) Runtime Exception when shuffling large Records

Shang Yuanchun (Jira)

     [ https://issues.apache.org/jira/browse/FLINK-652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Metzger resolved FLINK-652.
----------------------------------

       Resolution: Fixed
    Fix Version/s: pre-apache-0.5.1

The issue has been addressed by the new Netty-based network stack, which is included starting with 0.5.1.

> Runtime Exception when shuffling large Records
> ----------------------------------------------
>
>                 Key: FLINK-652
>                 URL: https://issues.apache.org/jira/browse/FLINK-652
>             Project: Flink
>          Issue Type: Bug
>            Reporter: GitHub Import
>              Labels: github-import
>             Fix For: pre-apache, pre-apache-0.5.1
>
>
> Hi,
> I have a job that reads (sometimes) large web documents from HBase and performs entity extraction in those docs.
> Between the HBase source and the NER Mapper I randomly shuffle the records to different mappers to compensate for skew in the data.
> ```Java
> mapper.setParameter("INPUT_SHIP_STRATEGY", "SHIP_REPARTITION");
> ```
> Look here for details: https://groups.google.com/forum/#!topic/stratosphere-users/pIw0eyMkLd8
> I'm fairly sure the issue is caused by the records being large and by shuffling them over the network in the first place, because:
> 1. the problem didn't occur when the source and NER mapper were chained
> 2. the problem didn't occur when I enabled shuffling but accidentally omitted the document content, which made for very small records
> 3. the exception says so, duh!
> ```
> eu.stratosphere.client.program.ProgramInvocationException: The program execution failed: java.io.IOException: An error occurred in the channel: Expected data packet 82 but received 84
>         at eu.stratosphere.nephele.io.channels.bytebuffered.AbstractByteBufferedInputChannel.isClosed(AbstractByteBufferedInputChannel.java:149)
>         at eu.stratosphere.nephele.io.channels.bytebuffered.AbstractByteBufferedInputChannel.readRecord(AbstractByteBufferedInputChannel.java:92)
>         at eu.stratosphere.nephele.io.RuntimeInputGate.readRecord(RuntimeInputGate.java:192)
>         at eu.stratosphere.nephele.io.MutableRecordReader.next(MutableRecordReader.java:78)
>         at eu.stratosphere.pact.runtime.task.util.RecordReaderIterator.next(RecordReaderIterator.java:41)
>         at eu.stratosphere.pact.runtime.task.util.RecordReaderIterator.next(RecordReaderIterator.java:25)
>         at eu.stratosphere.pact.runtime.task.MapDriver.run(MapDriver.java:79)
>         at eu.stratosphere.pact.runtime.task.RegularPactTask.run(RegularPactTask.java:403)
>         at eu.stratosphere.pact.runtime.task.RegularPactTask.invoke(RegularPactTask.java:294)
>         at eu.stratosphere.nephele.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:342)
>         at java.lang.Thread.run(Thread.java:744)
> Caused by: java.io.IOException: Expected data packet 82 but received 84
>         at eu.stratosphere.nephele.taskmanager.runtime.RuntimeInputChannelContext.queueTransferEnvelope(RuntimeInputChannelContext.java:147)
>         at eu.stratosphere.nephele.taskmanager.bytebuffered.ByteBufferedChannelManager.processEnvelopeWithBuffer(ByteBufferedChannelManager.java:363)
>         at eu.stratosphere.nephele.taskmanager.bytebuffered.ByteBufferedChannelManager.processEnvelope(ByteBufferedChannelManager.java:329)
>         at eu.stratosphere.nephele.taskmanager.bytebuffered.ByteBufferedChannelManager.processEnvelopeFromNetwork(ByteBufferedChannelManager.java:636)
>         at eu.stratosphere.nephele.taskmanager.bytebuffered.IncomingConnection.read(IncomingConnection.java:98)
>         at eu.stratosphere.nephele.taskmanager.bytebuffered.IncomingConnectionThread.doRead(IncomingConnectionThread.java:185)
>         at eu.stratosphere.nephele.taskmanager.bytebuffered.IncomingConnectionThread.run(IncomingConnectionThread.java:124)
>         at eu.stratosphere.client.program.Client.run(Client.java:338)
>         at eu.stratosphere.client.program.Client.run(Client.java:282)
>         at eu.stratosphere.client.program.Client.run(Client.java:250)
>         at eu.stratosphere.client.RemoteExecutor.executePlan(RemoteExecutor.java:84)
>         at org.okkam.dopa.ner.ImrNerTaskDriver.runNerTask(ImrNerTaskDriver.java:65)
>         at org.okkam.dopa.ner.ImrNerTaskDriver.main(ImrNerTaskDriver.java:150)
> ```
> ---------------- Imported from GitHub ----------------
> Url: https://github.com/stratosphere/stratosphere/issues/652
> Created by: [mleich|https://github.com/mleich]
> Labels: bug, runtime,
> Created at: Wed Apr 02 15:54:51 CEST 2014
> State: open
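
For readers unfamiliar with the message "Expected data packet 82 but received 84": the stack trace suggests the receiving side tracks a sequence number per input channel and rejects an envelope that arrives out of order or after a gap. The following is only a minimal sketch of that kind of check, not the actual Nephele code; the class name `SequenceCheckingChannel` and the helper `firstError` are hypothetical and introduced purely for illustration.

```java
import java.io.IOException;

// Hypothetical illustration of a per-channel sequence check.
// This is NOT the Stratosphere/Nephele implementation.
final class SequenceCheckingChannel {

    // Sequence number the next incoming envelope must carry.
    private int expectedSequenceNumber = 0;

    // Accepts an envelope only if it carries the next expected sequence number;
    // otherwise fails with a message shaped like the one in the report.
    void accept(int sequenceNumber) throws IOException {
        if (sequenceNumber != expectedSequenceNumber) {
            throw new IOException("Expected data packet " + expectedSequenceNumber
                    + " but received " + sequenceNumber);
        }
        expectedSequenceNumber++;
    }

    // Feeds the given sequence numbers into a fresh channel and returns the
    // first error message, or null if all of them arrived in order.
    static String firstError(int... sequenceNumbers) {
        SequenceCheckingChannel channel = new SequenceCheckingChannel();
        try {
            for (int n : sequenceNumbers) {
                channel.accept(n);
            }
            return null;
        } catch (IOException e) {
            return e.getMessage();
        }
    }
}
```

With this sketch, `SequenceCheckingChannel.firstError(0, 1, 2)` returns `null`, while `SequenceCheckingChannel.firstError(0, 1, 3)` returns `"Expected data packet 2 but received 3"`. Under that reading, the reported error would mean that envelopes 82 and 83 were lost or reordered on the way to the receiver.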



--
This message was sent by Atlassian JIRA
(v6.2#6252)