TM failure when deploying a large number of sources

Gyula Fóra-2
Hey guys,

I am writing a job that creates many different sources to read data from
(in this case 80 sources with a parallelism of 8 each, running locally on
my Mac). Unfortunately, I cannot create fewer.

The problem is that the job fails while deploying the tasks with the
following exception:

java.lang.Exception: Failed to deploy the task to slot SimpleSlot (1)(63) - eea7250ab5b368693e3c4f14fb94f86d @ localhost - 8 slots - URL: akka://flink/user/taskmanager_1 - ALLOCATED/ALIVE: Response was not of type Acknowledge
    at org.apache.flink.runtime.executiongraph.Execution$2.onComplete(Execution.java:392)
    at akka.dispatch.OnComplete.internal(Future.scala:247)
    at akka.dispatch.OnComplete.internal(Future.scala:244)
    at akka.dispatch.japi$CallbackBridge.apply(Future.scala:174)
    at akka.dispatch.japi$CallbackBridge.apply(Future.scala:171)
    at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:32)
    at scala.concurrent.impl.ExecutionContextImpl$anon$3.exec(ExecutionContextImpl.scala:107)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

Any idea what might cause this?

Cheers,
Gyula

Re: TM failure when deploying a large number of sources

Stephan Ewen
Any further information from the log?

If you create so many tasks (8 x 80) on one machine, the JVM often does not
have enough memory reserved for stack space to create enough threads (1-2
threads per task)...
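
To put rough numbers on this, here is a back-of-the-envelope sketch in plain Java (not Flink code). The 1 MB per-thread stack size is an assumption: the real value depends on the JVM, the platform, and the -Xss setting. The class and method names are made up for illustration.

```java
/** Rough estimate of the threads and stack space needed to run all source tasks in one JVM. */
final class ThreadEstimate {

    /** Number of parallel task instances: every source runs `parallelism` subtasks. */
    static long totalTasks(int sources, int parallelism) {
        return (long) sources * parallelism;
    }

    /** Stack space reserved (not necessarily committed) for the given number of threads. */
    static long reservedStackBytes(long threads, long stackBytesPerThread) {
        return threads * stackBytesPerThread;
    }

    public static void main(String[] args) {
        long tasks = totalTasks(80, 8);        // the setup described in this thread
        long assumedStack = 1024L * 1024L;     // ASSUMPTION: 1 MB of stack per thread
        // 1-2 threads per task, as mentioned above
        for (int threadsPerTask = 1; threadsPerTask <= 2; threadsPerTask++) {
            long threads = tasks * threadsPerTask;
            long mb = reservedStackBytes(threads, assumedStack) / (1024 * 1024);
            System.out.println(threads + " threads -> about " + mb + " MB of stack space");
        }
    }
}
```

This only counts the source tasks; any other tasks in the job and the JVM's own threads come on top, and the operating system may also cap the number of threads per process.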

On Wed, Oct 7, 2015 at 2:13 PM, Gyula Fóra <[hidden email]> wrote:

> Any idea what might cause this?

Re: TM failure when deploying a large number of sources

Gyula Fóra
Thanks!

Yes, it was indeed a memory issue:
java.lang.OutOfMemoryError: unable to create new native thread
    at java.lang.Thread.start0(Native Method)
    at java.lang.Thread.start(Thread.java:714)
    at org.apache.flink.runtime.taskmanager.Task.startTaskThread(Task.java:415)
    at org.apache.flink.runtime.taskmanager.TaskManager.submitTask(TaskManager.scala:904)

I will just decrease the parallelism locally :)
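
One way to pick the reduced value, as a sketch only: choose the largest parallelism whose estimated thread count stays under some budget. The budget of 500 threads below is an arbitrary assumption, not a measured limit, and `LocalParallelism` is a hypothetical helper, not part of Flink.

```java
/** Picks a parallelism so that sources * parallelism * threadsPerTask fits a thread budget. */
final class LocalParallelism {

    /**
     * Returns the largest parallelism p >= 1 with sources * p * threadsPerTask <= threadBudget.
     * Returns 1 if even p = 1 exceeds the budget; the job may still fail in that case.
     */
    static int fitToBudget(int sources, int threadsPerTask, int threadBudget) {
        if (sources <= 0 || threadsPerTask <= 0 || threadBudget <= 0) {
            throw new IllegalArgumentException("all arguments must be positive");
        }
        long threadsPerUnitOfParallelism = (long) sources * threadsPerTask;
        long p = threadBudget / threadsPerUnitOfParallelism;
        return (int) Math.max(1L, p);
    }

    public static void main(String[] args) {
        // 80 sources, up to 2 threads per task, ASSUMED budget of 500 threads
        int parallelism = fitToBudget(80, 2, 500);
        System.out.println("parallelism to use locally: " + parallelism);
    }
}
```

The result could then be passed to setParallelism(int) on the execution environment before building the job.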

Cheers,
Gyula

On Wed, Oct 7, 2015 at 14:16, Stephan Ewen <[hidden email]> wrote:

> Any further information from the log?
>
> If you create so many tasks (8 x 80) on one machine, the JVM often does not
> have enough memory reserved for stack space to create enough threads (1-2
> threads per task)...

Re: TM failure when deploying a large number of sources

Stephan Ewen
I think the error message could have been better, though...

This actually warrants a JIRA issue...
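
As a sketch of what a clearer message could look like (this is not Flink's actual code or its eventual fix; `TaskThreadStarter` and `TaskStartException` are hypothetical names), the thread start could be wrapped so that the OutOfMemoryError is reported together with the task name and a hint:

```java
/** Wraps a thread start so that a failure is reported with the task name and a hint. */
final class TaskThreadStarter {

    /** Thrown when a task thread cannot be started; keeps the original error as the cause. */
    static final class TaskStartException extends Exception {
        TaskStartException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /**
     * Runs the given start action (for example thread::start) and converts an
     * OutOfMemoryError into an exception whose message names the task.
     */
    static void start(String taskName, Runnable startAction) throws TaskStartException {
        try {
            startAction.run();
        } catch (OutOfMemoryError e) {
            throw new TaskStartException(
                "Could not start thread for task '" + taskName + "': " + e.getMessage()
                    + ". The JVM or OS may have run out of threads;"
                    + " consider lowering the parallelism.",
                e);
        }
    }

    public static void main(String[] args) throws Exception {
        Thread t = new Thread(() -> System.out.println("task running"), "source-1");
        start("source-1", t::start);
        t.join();
    }
}
```

Taking the start action as a Runnable keeps the sketch testable without actually exhausting threads.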

On Wed, Oct 7, 2015 at 2:44 PM, Gyula Fóra <[hidden email]> wrote:

> Thanks!
>
> Yes, it was indeed a memory issue:
> java.lang.OutOfMemoryError: unable to create new native thread
>
> I will just decrease the parallelism locally :)

Re: TM failure when deploying a large number of sources

Gyula Fóra
Alright, I am creating one.

On Wed, Oct 7, 2015 at 15:44, Stephan Ewen <[hidden email]> wrote:

> I think the error message could have been better, though...
>
> This actually warrants a JIRA issue...