Support for controlling slot assignment based on CPU requirements


Support for controlling slot assignment based on CPU requirements

Ken Krugler
Hi all,

I’m running a complex (batch) workflow that has a step where it trains Fasttext models.

This is very CPU-intensive, to the point where it will use all available processing power on a server.

The Flink configuration I’m using is one TaskManager per server, with N slots == available cores.

So what I’d like to do is ensure that if I have N of these training operators running in parallel on N TaskManagers, slot assignment happens such that each TM has one such operator.
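
In rough, simplified form (placeholder names, not the real code), the training step looks something like this, with the operator’s parallelism set to the number of servers:

import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;

public class TrainingSketch {
    public static void main(String[] args) throws Exception {
        final int numServers = 8; // placeholder: number of servers == number of TaskManagers

        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // Stand-in for the real per-model training inputs.
        DataSet<String> corpora = env.fromElements("corpus-1", "corpus-2", "corpus-3");

        corpora
            .map(new MapFunction<String, String>() {
                @Override
                public String map(String corpus) {
                    // The CPU-heavy Fasttext training happens here (elided).
                    return "model-for-" + corpus;
                }
            })
            .setParallelism(numServers) // one training subtask per server is the goal
            .print();                   // stand-in for the real sink; triggers execution
    }
}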

Unfortunately, what typically happens now is that most/all of these operators get assigned to the same TM, which then struggles to stay alive under that load.

I haven’t seen any solution to this, though I can imagine some helicopter stunts that could work around the issue.

Any suggestions?

Thanks,

— Ken

PS - I took a look through the list of FLIPs <https://cwiki.apache.org/confluence/display/FLINK/Flink+Improvement+Proposals>, and didn’t see anything that covered this. I imagine it would need to be something like YARN’s support for per-node vCore capacity and per-task vCore requirements, but on a per-TM/per-operator basis.

--------------------------
Ken Krugler
+1 530-210-6378
http://www.scaleunlimited.com
Custom big data solutions & training
Flink, Solr, Hadoop, Cascading & Cassandra


Re: Support for controlling slot assignment based on CPU requirements

Xintong Song
Hi Ken,

There is a discussion in FLINK-12122
<https://issues.apache.org/jira/browse/FLINK-12122> about a feature related
to what you need. It proposes spreading tasks evenly across TMs. However,
the feature is still in progress, and it spreads all tasks evenly rather
than targeting specific operators.

For the time being, I would suggest having only one slot per TM and using
slot sharing groups
<https://ci.apache.org/projects/flink/flink-docs-release-1.8/concepts/runtime.html#task-slots-and-resources>
to make sure tasks of the same job graph vertex do not go into the same
slot/TM.
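
In the DataStream API, setting the group looks roughly like the sketch below (the source, the map function, and the "training" group name are just placeholders), combined with taskmanager.numberOfTaskSlots: 1 in flink-conf.yaml:

import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class SlotSharingSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        env.fromElements("corpus-1", "corpus-2", "corpus-3") // stand-in for the real source
           .map(new MapFunction<String, String>() {
               @Override
               public String map(String corpus) {
                   // The CPU-heavy training happens here (elided).
                   return "model-for-" + corpus;
               }
           })
           .slotSharingGroup("training") // isolate the heavy operator in its own group
           .print();                     // stand-in for the real sink

        env.execute("slot-sharing-sketch");
    }
}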

Thank you~

Xintong Song

Re: Support for controlling slot assignment based on CPU requirements

Ken Krugler
Hi Xintong Song,

Thanks for the response.

I’d thought that the slotSharingGroup(“name”) method call was only available for streaming jobs - as I’d noted earlier, I’m running a batch workflow.

Or is there a way to get this to work? I see that support for slot sharing groups lives in Flink’s runtime, not just in the streaming API, but I don’t see how batch workflows can set it.

Also, slot sharing is about subtasks sharing the same slot. But if I have one slot per TM, I’d want to be running 32 TMs/server, and then slot sharing groups shouldn’t have any impact, right?

Though using N TMs/server, each with one slot, might help improve the odds of subtasks for my particular operator running on separate servers (since, per FLINK-12122, in 1.8 they tend to clump together on the same TM if it has multiple slots).

Thanks again,

— Ken

--------------------------
Ken Krugler
+1 530-210-6378
http://www.scaleunlimited.com
Custom big data solutions & training
Flink, Solr, Hadoop, Cascading & Cassandra


Re: Support for controlling slot assignment based on CPU requirements

Fabian Hueske
Hi Ken,

You are right. Slot sharing can only be configured for DataStream
applications; the DataSet API does not support it.
I think right now there is no good solution for this problem.
There are several ongoing efforts to improve Flink's batch capabilities,
including better scheduling, failure recovery, and integration with the
DataStream API.
Another related feature is resource descriptions for task managers, which
would allow scheduling tasks with special hardware requirements (like GPUs)
onto the right TMs.

I don't know when to expect these features. Some might already be included
in the next release.

Best, Fabian

