JobClientActorSubmissionTimeoutException

JobClientActorSubmissionTimeoutException

Hilmi Yildirim
Hi,

I get the following exception when I execute code similar to the
ALSITSuite. I train an ALS model, and when the following code is executed
I get the error:

     val predictions = als
       .predict(testData)
       .collect()

Exception in thread "main" org.apache.flink.runtime.client.JobExecutionException: Communication with JobManager failed: Job submission to the JobManager timed out.
     at org.apache.flink.runtime.client.JobClient.submitJobAndWait(JobClient.java:141)
     at org.apache.flink.runtime.minicluster.FlinkMiniCluster.submitJobAndWait(FlinkMiniCluster.scala:408)
     at org.apache.flink.runtime.minicluster.FlinkMiniCluster.submitJobAndWait(FlinkMiniCluster.scala:394)
     at org.apache.flink.runtime.minicluster.FlinkMiniCluster.submitJobAndWait(FlinkMiniCluster.scala:386)
     at org.apache.flink.client.LocalExecutor.executePlan(LocalExecutor.java:190)
     at org.apache.flink.api.java.LocalEnvironment.execute(LocalEnvironment.java:87)
     at org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:803)
     at org.apache.flink.api.scala.ExecutionEnvironment.execute(ExecutionEnvironment.scala:591)
     at org.apache.flink.api.scala.DataSet.collect(DataSet.scala:544)
     at org.apache.flink.ml.recommendation.ALSTest$.main(ALSTest.scala:138)
     at org.apache.flink.ml.recommendation.ALSTest.main(ALSTest.scala)
     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
     at java.lang.reflect.Method.invoke(Method.java:497)
     at com.intellij.rt.execution.application.AppMain.main(AppMain.java:140)
Caused by: org.apache.flink.runtime.client.JobClientActorSubmissionTimeoutException: Job submission to the JobManager timed out.
     at org.apache.flink.runtime.client.JobClientActor.handleMessage(JobClientActor.java:256)
     at org.apache.flink.runtime.akka.FlinkUntypedActor.handleLeaderSessionID(FlinkUntypedActor.java:88)
     at org.apache.flink.runtime.akka.FlinkUntypedActor.onReceive(FlinkUntypedActor.java:62)
     at akka.actor.UntypedActor$$anonfun$receive$1.applyOrElse(UntypedActor.scala:167)
     at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
     at akka.actor.UntypedActor.aroundReceive(UntypedActor.scala:97)
     at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
     at akka.actor.ActorCell.invoke(ActorCell.scala:487)
     at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
     at akka.dispatch.Mailbox.run(Mailbox.scala:221)
     at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
     at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
     at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
     at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346)
     at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
     at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)


Is this the YARN concurrency bug
(https://issues.apache.org/jira/browse/FLINK-3300?jql=project%20%3D%20FLINK)?

Best Regards,
Hilmi

--
==================================================================
Hilmi Yildirim, M.Sc.
Researcher

DFKI GmbH
Intelligente Analytik für Massendaten
DFKI Projektbüro Berlin
Alt-Moabit 91c
D-10559 Berlin
Phone: +49 30 23895 1814

E-Mail: [hidden email]

-------------------------------------------------------------
Deutsches Forschungszentrum fuer Kuenstliche Intelligenz GmbH
Firmensitz: Trippstadter Strasse 122, D-67663 Kaiserslautern

Geschaeftsfuehrung:
Prof. Dr. Dr. h.c. mult. Wolfgang Wahlster (Vorsitzender)
Dr. Walter Olthoff

Vorsitzender des Aufsichtsrats:
Prof. Dr. h.c. Hans A. Aukes

Amtsgericht Kaiserslautern, HRB 2313
-------------------------------------------------------------

Re: JobClientActorSubmissionTimeoutException

Till Rohrmann
Hi Hilmi,

could you check what happened on the JobManager side?

Cheers,
Till


On Mon, Feb 1, 2016 at 2:39 PM, Hilmi Yildirim <[hidden email]>
wrote:

> Hi,
>
> I get the following exception when I execute code similar to the
> ALSITSuite. I train an ALS model, and when the following code is executed I
> get the error:
>
>     val predictions = als
>       .predict(testData)
>       .collect()

Re: JobClientActorSubmissionTimeoutException

Hilmi Yildirim
Hi,

I started the code with ExecutionEnvironment.createLocalEnvironment()

Are there logs of the jobmanager in local mode?

Best Regards,
Hilmi

On 01.02.2016 at 14:50, Till Rohrmann wrote:

> Hi Hilmi,
>
> could you check what happened on the JobManager side?
>
> Cheers,
> Till

Re: JobClientActorSubmissionTimeoutException

Till Rohrmann
Yes, if you provide a log4j.properties file on your classpath that sets
the log level to INFO and adds a console logger, you should see the output
in the IntelliJ console.

E.g.

log4j.rootLogger=INFO, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{HH:mm:ss,SSS} %-5p %-60c %x - %m%n

Cheers,
Till
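
To confirm that such a file is actually picked up, one can check whether it is visible on the classpath before running the job. The following is a minimal sketch in Scala, not taken from the thread: the names Log4jClasspathCheck and locate are hypothetical, and it assumes the file sits at the classpath root (for example under src/main/resources in a Maven or sbt layout).

```scala
object Log4jClasspathCheck {

  /** Returns the URL of the named resource if the class loader can see it. */
  def locate(resource: String): Option[java.net.URL] =
    Option(getClass.getClassLoader.getResource(resource))

  def main(args: Array[String]): Unit =
    locate("log4j.properties") match {
      case Some(url) => println(s"log4j.properties found at $url")
      case None      => println("log4j.properties is not on the classpath")
    }
}
```

If the second message is printed, the logging configuration above is probably not being read, which would explain missing JobManager output in the console.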


On Mon, Feb 1, 2016 at 2:53 PM, Hilmi Yildirim <[hidden email]>
wrote:

> Hi,
>
> I started the code with ExecutionEnvironment.createLocalEnvironment()
>
> Are there logs of the jobmanager in local mode?
>
> Best Regards,
> Hilmi
>
>
> Am 01.02.2016 um 14:50 schrieb Till Rohrmann:
>
>> Hi Hilmi,
>>
>> could you check what happened on the JobManager side?
>>
>> Cheers,
>> Till
>> ​
>>
>> On Mon, Feb 1, 2016 at 2:39 PM, Hilmi Yildirim <[hidden email]>
>> wrote:
>>
>> Hi,
>>>
>>> I get the following exception when I execute  a code similar to the
>>> ALSITSuite. I train a ALS model and when the following code is executed I
>>> get the error:
>>>
>>>      val predictions = als
>>>        .predict(testData)
>>>        .collect()
>>>
>>> Exception in thread "main"
>>> org.apache.flink.runtime.client.JobExecutionException: Communication with
>>> JobManager failed: Job submission to the JobManager timed out.
>>>      at
>>>
>>> org.apache.flink.runtime.client.JobClient.submitJobAndWait(JobClient.java:141)
>>>      at
>>>
>>> org.apache.flink.runtime.minicluster.FlinkMiniCluster.submitJobAndWait(FlinkMiniCluster.scala:408)
>>>      at
>>>
>>> org.apache.flink.runtime.minicluster.FlinkMiniCluster.submitJobAndWait(FlinkMiniCluster.scala:394)
>>>      at
>>>
>>> org.apache.flink.runtime.minicluster.FlinkMiniCluster.submitJobAndWait(FlinkMiniCluster.scala:386)
>>>      at
>>> org.apache.flink.client.LocalExecutor.executePlan(LocalExecutor.java:190)
>>>      at
>>>
>>> org.apache.flink.api.java.LocalEnvironment.execute(LocalEnvironment.java:87)
>>>      at
>>>
>>> org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:803)
>>>      at
>>>
>>> org.apache.flink.api.scala.ExecutionEnvironment.execute(ExecutionEnvironment.scala:591)
>>>      at org.apache.flink.api.scala.DataSet.collect(DataSet.scala:544)
>>>      at org.apache.flink.ml
>>> .recommendation.ALSTest$.main(ALSTest.scala:138)
>>>      at org.apache.flink.ml.recommendation.ALSTest.main(ALSTest.scala)
>>>      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>      at
>>>
>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>>      at
>>>
>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>      at java.lang.reflect.Method.invoke(Method.java:497)
>>>      at
>>> com.intellij.rt.execution.application.AppMain.main(AppMain.java:140)
>>> Caused by:
>>> org.apache.flink.runtime.client.JobClientActorSubmissionTimeoutException:
>>> Job submission to the JobManager timed out.
>>>      at
>>>
>>> org.apache.flink.runtime.client.JobClientActor.handleMessage(JobClientActor.java:256)
>>>      at
>>>
>>> org.apache.flink.runtime.akka.FlinkUntypedActor.handleLeaderSessionID(FlinkUntypedActor.java:88)
>>>      at
>>>
>>> org.apache.flink.runtime.akka.FlinkUntypedActor.onReceive(FlinkUntypedActor.java:62)
>>>      at
>>>
>>> akka.actor.UntypedActor$$anonfun$receive$1.applyOrElse(UntypedActor.scala:167)
>>>      at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
>>>      at akka.actor.UntypedActor.aroundReceive(UntypedActor.scala:97)
>>>      at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
>>>      at akka.actor.ActorCell.invoke(ActorCell.scala:487)
>>>      at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
>>>      at akka.dispatch.Mailbox.run(Mailbox.scala:221)
>>>      at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
>>>      at
>>> scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>>>      at
>>>
>>> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
>>>      at
>>>
>>> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346)
>>>      at
>>> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>>>      at
>>>
>>> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
>>>
>>>
>>> Is this the Yarn Concurrency Bug (
>>>
>>> https://issues.apache.org/jira/browse/FLINK-3300?jql=project%20%3D%20FLINK
>>> )
>>> ?
>>>
>>> Best Regards,
>>> Hilmi
>>>
>>> --
>>> ==================================================================
>>> Hilmi Yildirim, M.Sc.
>>> Researcher
>>>
>>> DFKI GmbH
>>> Intelligente Analytik für Massendaten
>>> DFKI Projektbüro Berlin
>>> Alt-Moabit 91c
>>> D-10559 Berlin
>>> Phone: +49 30 23895 1814
>>>
>>> E-Mail: [hidden email]
>>>
>>> -------------------------------------------------------------
>>> Deutsches Forschungszentrum fuer Kuenstliche Intelligenz GmbH
>>> Firmensitz: Trippstadter Strasse 122, D-67663 Kaiserslautern
>>>
>>> Geschaeftsfuehrung:
>>> Prof. Dr. Dr. h.c. mult. Wolfgang Wahlster (Vorsitzender)
>>> Dr. Walter Olthoff
>>>
>>> Vorsitzender des Aufsichtsrats:
>>> Prof. Dr. h.c. Hans A. Aukes
>>>
>>> Amtsgericht Kaiserslautern, HRB 2313
>>> -------------------------------------------------------------
>>>
>>>
>>>
>
> --
> ==================================================================
> Hilmi Yildirim, M.Sc.
> Researcher
>
> DFKI GmbH
> Intelligente Analytik für Massendaten
> DFKI Projektbüro Berlin
> Alt-Moabit 91c
> D-10559 Berlin
> Phone: +49 30 23895 1814
>
> E-Mail: [hidden email]
>
> -------------------------------------------------------------
> Deutsches Forschungszentrum fuer Kuenstliche Intelligenz GmbH
> Firmensitz: Trippstadter Strasse 122, D-67663 Kaiserslautern
>
> Geschaeftsfuehrung:
> Prof. Dr. Dr. h.c. mult. Wolfgang Wahlster (Vorsitzender)
> Dr. Walter Olthoff
>
> Vorsitzender des Aufsichtsrats:
> Prof. Dr. h.c. Hans A. Aukes
>
> Amtsgericht Kaiserslautern, HRB 2313
> -------------------------------------------------------------
>
>
Reply | Threaded
Open this post in threaded view
|

Re: JobClientActorSubmissionTimeoutException

Hilmi Yildirim
It seems that the exception is caused by an OutOfMemoryError, but this is
only visible in the logs.
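
With ExecutionEnvironment.createLocalEnvironment(), Flink runs inside the same JVM as the calling program, so the memory available to the job is bounded by that JVM's heap. The following is a minimal sketch, not taken from the thread, for printing that limit before the job starts; HeapReport is a hypothetical name, and whether a larger heap (for example via -Xmx in the IDE run configuration) resolves this particular error is an assumption.

```scala
object HeapReport {
  private val MiB: Long = 1024L * 1024L

  /** Maximum heap the current JVM will attempt to use, in MiB. */
  def maxHeapMiB: Long = Runtime.getRuntime.maxMemory / MiB

  def main(args: Array[String]): Unit =
    println(s"Max JVM heap: $maxHeapMiB MiB")
}
```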

On 01.02.2016 at 15:17, Till Rohrmann wrote:

> Yes, if you provide a log4j.properties file on your classpath that sets
> the log level to INFO and adds a console logger, you should see the output
> in the IntelliJ console.

Re: JobClientActorSubmissionTimeoutException

Ufuk Celebi-2

> On 01 Feb 2016, at 16:45, Hilmi Yildirim <[hidden email]> wrote:
>
> It seems that the exception is caused by an OutOfMemoryError, but this is only visible in the logs.

Could you share the relevant parts?

– Ufuk