Resource Optimization for Flink Job in AWS EMR Cluster

DEEP NARAYAN Singh
Hi All,

I am running a Flink streaming job on an EMR cluster with a parallelism of 21,
processing 500 records per second. However, CPU utilization is only about
5-8 percent.

Below is the command for the long-running session on the EMR cluster, which
has 3 instances of type c5.2xlarge (8 vCores, 16 GB memory):

*sudo flink-yarn-session -n 3 -s 7 -jm 4168 -tm 8000 -d*

Can anyone suggest configuration changes to maximize CPU utilization?
Also, what would be a typical CPU utilization for a Flink job in order to
achieve minimum latency?

Any leads would be appreciated.

Thanks,
-Deep
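
As a rough check on the numbers in this post, the sketch below works out the total slot count, the total vCores, and the per-subtask record rate. All figures are taken from the message above; the even spread of records across subtasks is an assumption, not something the post states.

```python
# Figures taken from the post above.
TASK_MANAGERS = 3         # -n 3
SLOTS_PER_TM = 7          # -s 7
PARALLELISM = 21
RECORDS_PER_SECOND = 500
INSTANCES = 3
VCORES_PER_INSTANCE = 8   # c5.2xlarge


def session_summary():
    """Summarise how the session's slots relate to the job's load."""
    total_slots = TASK_MANAGERS * SLOTS_PER_TM
    total_vcores = INSTANCES * VCORES_PER_INSTANCE
    return {
        "total_slots": total_slots,
        "total_vcores": total_vcores,
        "slots_cover_parallelism": total_slots >= PARALLELISM,
        # ASSUMPTION: records are spread evenly over the parallel subtasks.
        "records_per_second_per_subtask": RECORDS_PER_SECOND / PARALLELISM,
    }


if __name__ == "__main__":
    for key, value in session_summary().items():
        print(f"{key}: {value}")
```

With 21 slots for a parallelism of 21, each subtask handles only about 24 records per second, which is consistent with the low CPU figures reported.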

Re: Resource Optimization for Flink Job in AWS EMR Cluster

DEEP NARAYAN Singh
Hi all,
Sorry to bother you again. Could someone help clarify my question?
Any help would be highly appreciated.

Thanks,
-Deep

On Wed, Nov 4, 2020 at 6:26 PM DEEP NARAYAN Singh <[hidden email]>
wrote:


Re: Resource Optimization for Flink Job in AWS EMR Cluster

Satyaa Dixit
In reply to this post by DEEP NARAYAN Singh
Hi Deep,

Thanks for bringing this up. I'm facing a similar issue with resource
optimization while deploying my Flink job.

Hi Team,

It would be much appreciated if someone could help us here.


Regards,
Satya

On Wed, Nov 4, 2020 at 6:33 PM DEEP NARAYAN Singh <[hidden email]>
wrote:



--
--------------------------
Best Regards
Satya Prakash
(M)+91-9845111913

Re: Resource Optimization for Flink Job in AWS EMR Cluster

Prasanna kumar
Deep,

1) Is it a CPU-, memory-, or I/O-intensive job?

You could allocate resources based on that.

From your question, if the CPU is underutilised, you could run multiple
containers (TMs) on the same machine.

The following may not match your case exactly, but it should give you an idea.

A few months ago I ran jobs on EMR processing 4-8k records per second from
Kafka with a parallelism of 8, doing lightweight transformations, where the
end-to-end latency was well under a second (10-50 ms).

I used slots with 4 GB of memory allocated and 1 GB of JM memory. Here
multiple containers ran on the same machine and I got CPU usage of up to 50%.
Earlier, when just a single container ran on a machine, it was in single digits.

Prasanna.
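
To illustrate why smaller containers allow several of them per machine, here is a minimal sketch of the memory arithmetic. It reads the 4 GB figure above as the TaskManager container size, which is an interpretation, and the 12288 MB available to YARN per node is a hypothetical value; the real limit comes from `yarn.nodemanager.resource.memory-mb` on the cluster.

```python
def task_managers_per_node(yarn_memory_mb: int, tm_memory_mb: int) -> int:
    """Upper bound on TaskManager containers YARN can place on one node.

    Ignores the JobManager container and YARN's rounding of container
    sizes, so the real number may be lower.
    """
    if tm_memory_mb <= 0:
        raise ValueError("tm_memory_mb must be positive")
    return yarn_memory_mb // tm_memory_mb


# ASSUMPTION: 12288 MB of the 16 GB instance is available to YARN containers.
# The real value is set by yarn.nodemanager.resource.memory-mb.
YARN_MEMORY_MB = 12288

if __name__ == "__main__":
    # 8000 MB is the -tm value from the original post; 4096 MB is the
    # smaller container size discussed in this reply.
    for tm_mb in (8000, 4096):
        count = task_managers_per_node(YARN_MEMORY_MB, tm_mb)
        print(f"{tm_mb} MB TaskManagers per node: {count}")
```

Under these assumptions an 8000 MB TaskManager fits only once per node, while 4096 MB TaskManagers fit several times.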


On Thu 5 Nov, 2020, 12:40 Satyaa Dixit, <[hidden email]> wrote:


Re: Resource Optimization for Flink Job in AWS EMR Cluster

Till Rohrmann
Hi Deep,

You can increase the average CPU load by reducing the overall amount of
resources. Having fewer slots over which to distribute the work should
increase resource usage.

Cheers,
Till
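
A back-of-the-envelope sketch of this effect, under a deliberately simple model: the job's total CPU work stays constant and is only spread over fewer vCores. This ignores per-slot overhead and any scheduling effects, and it assumes the reported 5-8 percent is an average over all 24 vCores of the three instances. The function name is made up for illustration.

```python
def projected_utilization(current_pct: float,
                          current_vcores: int,
                          new_vcores: int) -> float:
    """Estimate average CPU utilization after shrinking the cluster.

    ASSUMPTION: total CPU work is constant, so utilization scales with
    current_vcores / new_vcores. The result is capped at 100 percent.
    """
    if current_vcores <= 0 or new_vcores <= 0:
        raise ValueError("vCore counts must be positive")
    return min(100.0, current_pct * current_vcores / new_vcores)


if __name__ == "__main__":
    # 5-8 percent on 3 x 8 vCores, projected onto a single 8-vCore instance.
    for pct in (5.0, 8.0):
        print(f"{pct}% on 24 vCores -> "
              f"{projected_utilization(pct, 24, 8)}% on 8 vCores")
```

Under this model the same job on a single instance would sit at roughly 15-24 percent, so the measured value after resizing is the number to trust, not this estimate.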

On Thu, Nov 5, 2020 at 9:03 AM Prasanna kumar <[hidden email]>
wrote:


Re: Resource Optimization for Flink Job in AWS EMR Cluster

DEEP NARAYAN Singh
Thanks, Prasanna and Till, for the quick response.

It looks like my use case is very similar to yours. I will try running
multiple containers on the same machine and will update you accordingly.

Thanks,
-Deep


On Thu, Nov 5, 2020 at 2:33 PM Till Rohrmann <[hidden email]> wrote:
