Scheduling of Flink jobs

Bhupesh Chawda
Hi,

I am running Flink on a cluster of 5 nodes.
Here is my config:



*taskmanager.numberOfTaskSlots: 1*
*parallelism.default: 1*

My Flink dashboard shows the following:

*Task Managers: 5*

*Task Slots: 5*

*Available Task Slots: 5*
I have the following questions:

   1. Why does a job with 8 tasks occupy only 2 task slots (3 slots remain
   free as seen from the UI)? As per my understanding, since the number of
   Task Slots as shown above is just 5, perhaps this job may not get enough
   resources (task slots).
   2. I notice that most of the tasks (operators) in the job run on just
   one of the nodes. The other nodes are idle and free. Is there any way to
   distribute the tasks among other nodes more evenly?

Please advise.

Thanks.

~ Bhupesh

Re: Scheduling of Flink jobs

Stephan Ewen
In the default configuration, a job uses as many slots as the parallelism
of its operators requires. I assume you run with a parallelism of 2, so it
occupies two slots.

If you run 5 TaskManagers with one slot each, you should set the
parallelism to 5 as well.
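
A minimal sketch of the slot arithmetic described above, assuming the job
consists of 4 operators that each run with parallelism 2 (8 parallel subtasks
in total; the exact job shape is not given in the thread). The class and
method names are hypothetical and this is plain Java, not Flink API. It
ignores operator chaining and only models the rule that, with default slot
sharing, one slot can hold one subtask of every operator.

```java
import java.util.Arrays;

public class SlotEstimate {

    // Default slot sharing: a slot may hold one parallel subtask of each
    // operator, so the job needs as many slots as the highest parallelism.
    static int slotsWithSharing(int[] operatorParallelism) {
        return Arrays.stream(operatorParallelism).max().orElse(0);
    }

    // No sharing at all: every parallel subtask occupies its own slot.
    static int slotsWithoutSharing(int[] operatorParallelism) {
        return Arrays.stream(operatorParallelism).sum();
    }

    public static void main(String[] args) {
        // Assumed job: 4 operators, each with parallelism 2.
        int[] job = {2, 2, 2, 2};
        System.out.println("with sharing:    " + slotsWithSharing(job));    // 2
        System.out.println("without sharing: " + slotsWithoutSharing(job)); // 8

        // The same job with parallelism 5 would use all 5 slots.
        int[] rescaled = {5, 5, 5, 5};
        System.out.println("parallelism 5:   " + slotsWithSharing(rescaled)); // 5
    }
}
```

Under these assumptions, raising `parallelism.default` from 1 to 5 in the
configuration shown in the question is what spreads the job across all five
TaskManagers.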

On Mon, Aug 29, 2016 at 4:04 PM, Bhupesh Chawda <[hidden email]> wrote:


Re: Scheduling of Flink jobs

Bhupesh Chawda-2
Thanks Stephan for your reply.

If I understand correctly, if my parallelism is 1, then all of the
operators, no matter how many (say 20), will still run on just one task
manager.
What happens if the resources on that task manager are not sufficient
for all of these operators?

~ Bhupesh


On Thu, Sep 1, 2016 at 3:32 PM, Stephan Ewen <[hidden email]> wrote:


Re: Scheduling of Flink jobs

Stephan Ewen
You can control the resource sharing of tasks in a fairly fine-grained way.

The packing heuristic makes it simpler to initially configure and balance
clusters, because you do not need to do task math to compute the resources.
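
A rough sketch of how fine-grained sharing changes the slot count, assuming
operators are assigned to named slot sharing groups (in the DataStream API
this is, as far as I know, done with `slotSharingGroup("name")` on an
operator). The model below is plain Java with hypothetical names, not Flink
API: operators in the same group share slots, so each group needs as many
slots as its highest parallelism, and the job needs the sum over all groups.

```java
import java.util.HashMap;
import java.util.Map;

public class SlotGroups {

    // groups[i] is the (hypothetical) sharing group of operator i,
    // parallelism[i] is its parallelism.
    static int requiredSlots(String[] groups, int[] parallelism) {
        Map<String, Integer> maxPerGroup = new HashMap<>();
        for (int i = 0; i < groups.length; i++) {
            // Keep the highest parallelism seen in each group.
            maxPerGroup.merge(groups[i], parallelism[i], Integer::max);
        }
        int total = 0;
        for (int slots : maxPerGroup.values()) {
            total += slots;
        }
        return total;
    }

    public static void main(String[] args) {
        // Assumed job: two light operators share the "default" group,
        // one resource-hungry operator is isolated in its own group.
        String[] groups = {"default", "default", "heavy"};
        int[] parallelism = {2, 2, 3};
        System.out.println(requiredSlots(groups, parallelism)); // 2 + 3 = 5
    }
}
```

With these assumed numbers the job would need exactly the 5 slots of the
cluster in the question, and the heavy operator would never share a slot with
the others.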



On Thu, Sep 1, 2016 at 1:27 PM, Bhupesh Chawda <[hidden email]>
wrote:
