(DEPRECATED) Apache Flink Mailing List archive.

Resource Optimization for Flink Job in AWS EMR Cluster

DEEP NARAYAN Singh

Resource Optimization for Flink Job in AWS EMR Cluster

Hi All,

I am running a Flink streaming job on an EMR cluster with parallelism 21,
processing about 500 records per second, but CPU utilization is only
around 5-8 percent.

Below is the command for the long-running session on the EMR cluster, which
has 3 instances of type c5.2xlarge (8 vCores, 16 GB memory each):

*sudo flink-yarn-session -n 3 -s 7 -jm 4168 -tm 8000 -d*

Can anyone suggest configuration changes to maximize CPU utilization?
Also, what would be a typical CPU utilization for a Flink job when
aiming for minimum latency?

Any leads would be appreciated.

Thanks,
-Deep

DEEP NARAYAN Singh

Re: Resource Optimization for Flink Job in AWS EMR Cluster

Hi Guys,
Sorry to bother you again. Could someone help clarify my doubt?
Any help will be highly appreciated.

Thanks,
-Deep

On Wed, Nov 4, 2020 at 6:26 PM DEEP NARAYAN Singh <[hidden email]>
wrote:


Satyaa Dixit

Re: Resource Optimization for Flink Job in AWS EMR Cluster

In reply to this post by DEEP NARAYAN Singh

Hi Deep,

Thanks for bringing this up. I'm also facing a similar issue with
resource optimization while deploying my Flink job.

Hi Team,

It would be much appreciated if someone could help us here.

Regards,
Satya

On Wed, Nov 4, 2020 at 6:33 PM DEEP NARAYAN Singh <[hidden email]>
wrote:

--
--------------------------
Best Regards
Satya Prakash
(M)+91-9845111913

Prasanna kumar

Re: Resource Optimization for Flink Job in AWS EMR Cluster

Deep,

1) Is it a CPU-, memory-, or I/O-intensive job?

You could allocate resources based on that.

From your question, if the CPU is under-utilised, you could run multiple
containers (TaskManagers) on the same machine.

The following may not match your case exactly, but it should give you an idea.

A few months back I ran jobs on EMR processing 4-8k records per second from
Kafka with a parallelism of 8, doing lightweight transformations, where the
end-to-end latency was well under a second (10-50 ms).

I used slots with 4 GB of memory allocated and 1 GB for the JM. Here
multiple containers ran on the same machine and I got CPU usage of up to 50%.
Earlier it was in single digits, when just a single container ran on each machine.
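
A rough sketch of the packing arithmetic behind this, in Python. It is only an illustration: `containers_per_node` is a hypothetical helper, and the 12288 MB of memory that YARN can hand out per 16 GB node is an assumed figure, not a value from this thread or from EMR's defaults. It also ignores YARN's rounding of container requests and any JobManager container on the node.

```python
def containers_per_node(usable_mb: int, container_mb: int) -> int:
    """Number of equally sized containers that fit into one node's usable memory."""
    if container_mb <= 0:
        raise ValueError("container_mb must be positive")
    return usable_mb // container_mb


# Assumed value: memory YARN may allocate on a 16 GB node (not from the thread).
USABLE_MB = 12288

# 8000 MB is the -tm value from the original command; 4096 MB matches the
# 4 GB figure above; 2048 MB is just a further illustrative size.
for tm_mb in (8000, 4096, 2048):
    count = containers_per_node(USABLE_MB, tm_mb)
    print(f"{tm_mb} MB per TaskManager -> {count} container(s) per node")
```

Under these assumptions, an 8000 MB TaskManager leaves room for only one container per machine, while smaller TaskManagers let several share it.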

Prasanna.

On Thu 5 Nov, 2020, 12:40 Satyaa Dixit, <[hidden email]> wrote:


Till Rohrmann

Re: Resource Optimization for Flink Job in AWS EMR Cluster

Hi Deep,

You can increase the average CPU load by reducing the overall amount of
resources. With fewer slots over which to distribute the work, the usage
of each remaining resource should increase.
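
The effect can be illustrated with simple arithmetic. The Python sketch below uses the 500 records per second and parallelism 21 mentioned in this thread; `records_per_slot` is a hypothetical helper, the other parallelism values are arbitrary, and it assumes records are spread evenly over the parallel subtasks, which a job with key skew would not match.

```python
def records_per_slot(total_records_per_sec: float, parallelism: int) -> float:
    """Average records per second handled by each slot, assuming an even spread."""
    if parallelism <= 0:
        raise ValueError("parallelism must be positive")
    return total_records_per_sec / parallelism


TOTAL_RPS = 500  # input rate reported in the original question

# 21 is the parallelism from the question; 6 and 3 are illustrative alternatives.
for parallelism in (21, 6, 3):
    rate = records_per_slot(TOTAL_RPS, parallelism)
    print(f"parallelism {parallelism:>2}: {rate:.1f} records/s per slot")
```

At parallelism 21 each slot handles only about 24 records per second, so the same input concentrated on fewer slots gives each of them more work to do.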

Cheers,
Till

On Thu, Nov 5, 2020 at 9:03 AM Prasanna kumar <[hidden email]>
wrote:


DEEP NARAYAN Singh

Re: Resource Optimization for Flink Job in AWS EMR Cluster

Thanks, Prasanna and Till, for the quick response.

It looks like my use case is very similar to yours, so I will try running
multiple containers on the same machine and will update you accordingly.

Thanks,
-Deep

On Thu, Nov 5, 2020 at 2:33 PM Till Rohrmann <[hidden email]> wrote:
