Question about scheduling in 0.7

Question about scheduling in 0.7

Nico Scherer
Hi all,

we’re currently experimenting with scheduling in Flink 0.7. We found that if we run a single job with a certain degree of parallelism (DoP), multiple tasks/vertices are executed within a single TaskManager at the same time, or at least before the previous stage has switched to finished. Furthermore, we noticed that only as many Flink instances are requested as the DoP is set to, and that different stages run in the same slot. We’re wondering how this is implemented. Does a single slot use threading to execute multiple tasks at once?
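For context, here is roughly the kind of job we are testing with (a minimal, made-up sketch; we are writing against the 0.7 Java API as we understand it, so details may be off):

    import org.apache.flink.api.java.ExecutionEnvironment;

    public class SchedulingTest {
        public static void main(String[] args) throws Exception {
            ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
            // With DoP = 4, we observe that only 4 slots are requested,
            // even though the job has more than one stage.
            env.setDegreeOfParallelism(4);
            env.fromElements(1, 2, 3, 4, 5, 6, 7, 8)
               .print(); // sink runs in the same slots as the source
            env.execute("scheduling-test");
        }
    }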


Regards,

Nico




Re: Question about scheduling in 0.7

Fabian Hueske
Hi Nico,

yes, Flink runs tasks in threads. Each TaskManager runs in its own JVM, and
everything within a TaskManager is parallelized using threads. Since a
TaskManager can offer multiple slots, tasks in different slots also run in
the same JVM, just in different threads.
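To make that concrete, here is a toy sketch in plain Java (not Flink's
actual code, just a model of the threading): one JVM stands in for a
TaskManager, and each slot hosts one thread per task assigned to it.

    import java.util.ArrayList;
    import java.util.List;

    // Toy model: one JVM ("TaskManager") with two "slots",
    // each slot running one thread per assigned task.
    public class ToyTaskManager {
        public static void main(String[] args) throws InterruptedException {
            int slots = 2;
            String[] tasks = {"source", "mapper", "reducer", "sink"};
            List<Thread> threads = new ArrayList<>();
            for (int slot = 0; slot < slots; slot++) {
                for (String task : tasks) {
                    Thread t = new Thread(
                        () -> System.out.println(
                            Thread.currentThread().getName() + " running"),
                        "slot-" + slot + "/" + task);
                    threads.add(t);
                    t.start(); // all tasks of all slots run concurrently in this JVM
                }
            }
            for (Thread t : threads) {
                t.join();
            }
        }
    }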

Flink has a pipelined processing model, which means that multiple
sequential tasks can run at the same time, with data being pushed from
task to task in a pipelined fashion. While this has been the only
communication model so far, there are ongoing efforts to also materialize
intermediate datasets (apart from full sorts), which will allow more
flexible data shipping strategies and better fault tolerance.
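As a rough illustration of the pipelined exchange (again a toy sketch, not
Flink's internals): the upstream task pushes records into a bounded buffer
while the downstream task consumes them at the same time, so both run
concurrently and a full buffer naturally throttles the producer.

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;

    // Toy model of pipelined data exchange between two sequential tasks.
    public class ToyPipeline {
        public static void main(String[] args) throws InterruptedException {
            final BlockingQueue<Integer> channel = new ArrayBlockingQueue<>(4);

            Thread upstream = new Thread(() -> {
                try {
                    for (int i = 0; i < 10; i++) {
                        channel.put(i); // blocks while the buffer is full
                    }
                    channel.put(-1); // end-of-stream marker
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }, "upstream-task");

            Thread downstream = new Thread(() -> {
                try {
                    int value;
                    while ((value = channel.take()) != -1) {
                        System.out.println("consumed " + value);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }, "downstream-task");

            upstream.start();
            downstream.start();
            upstream.join();
            downstream.join();
        }
    }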

Let me know if you have further questions about Flink internals.

Best, Fabian


Re: Question about scheduling in 0.7

Stephan Ewen
In addition to what Fabian wrote:

Yes, one slot can run multiple tasks. In the batch API, one slot can
concurrently run one task of each operator (for example, one source, one
mapper, one reducer, and one sink).
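To illustrate with a concrete job (a minimal sketch in the DataSet-style
Java API; the exact class names in 0.7 differ slightly, and the input data
is made up): with parallelism p, each of the p slots concurrently runs one
subtask of the source, the flatMap, the reducer, and the sink.

    import org.apache.flink.api.common.functions.FlatMapFunction;
    import org.apache.flink.api.java.DataSet;
    import org.apache.flink.api.java.ExecutionEnvironment;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.util.Collector;

    public class SlotSharingExample {
        public static void main(String[] args) throws Exception {
            ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

            DataSet<String> lines = env.fromElements("to be", "or not", "to be"); // source

            lines.flatMap(new Tokenizer()) // mapper
                 .groupBy(0)
                 .sum(1)                   // reducer
                 .print();                 // sink
        }

        // Splits each line into (word, 1) pairs.
        public static class Tokenizer
                implements FlatMapFunction<String, Tuple2<String, Integer>> {
            @Override
            public void flatMap(String line, Collector<Tuple2<String, Integer>> out) {
                for (String word : line.split(" ")) {
                    out.collect(new Tuple2<>(word, 1));
                }
            }
        }
    }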
