[DISCUSS] JMX remote monitoring integration with Flink

Rong Rong
Hi All,

Has anyone tried to manage production Flink applications through JMX remote
monitoring & management[1]?

We have been experimenting with enabling JMXRMI on Flink by default in
production and would like to share some of our thoughts:
* Is there any straightforward way to dynamically allocate JMXRMI remote
  ports?
  - Using a static JMXRMI port is unrealistic in a production environment.
    With a dynamic port, however, we have to go through the logging system
    to get the remote port number printed in the log files, which is very
    inconvenient.
  - It would be very handy if the JMXRMI remote information were shown on
    the JobManager/TaskManager UI or exposed via the REST API. (I am thinking
    of something similar to [2].)

* Is there any performance overhead in enabling JMX for a Flink application?
  - We haven't seen any significant performance impact in our experiments.
    However, the experiments were not thorough, so the observation is
    inconclusive.
  - Would it be a good idea to add a benchmark to the regression tests [3]
    to measure the overhead?
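
For the dynamic port question above, one possible approach is to start the
JMX connector programmatically instead of through the
com.sun.management.jmxremote.* system properties, so that the process itself
knows which port it ended up on. The following is only a minimal sketch of
that idea using standard JDK classes. The class name DynamicJmxPort, the use
of localhost, and the absence of authentication and SSL are assumptions made
for illustration; this is not existing Flink code.

```java
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.net.ServerSocket;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;
import javax.management.MBeanServer;
import javax.management.remote.JMXConnectorServer;
import javax.management.remote.JMXConnectorServerFactory;
import javax.management.remote.JMXServiceURL;

/** Hypothetical helper: exposes the platform MBean server on a free port. */
class DynamicJmxPort implements AutoCloseable {
    private final int port;
    private final Registry registry;
    private final JMXConnectorServer connector;

    DynamicJmxPort() throws IOException {
        // Ask the OS for a free port. Another process could grab it between
        // this probe and createRegistry(), so real code should retry.
        try (ServerSocket probe = new ServerSocket(0)) {
            this.port = probe.getLocalPort();
        }
        // The RMI registry and the JMX RMI server share the same port here.
        this.registry = LocateRegistry.createRegistry(port);
        MBeanServer mbeanServer = ManagementFactory.getPlatformMBeanServer();
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi://localhost:" + port
                        + "/jndi/rmi://localhost:" + port + "/jmxrmi");
        // No environment map: no authentication or SSL (illustration only).
        this.connector =
                JMXConnectorServerFactory.newJMXConnectorServer(url, null, mbeanServer);
        this.connector.start();
    }

    /** The port that was actually allocated; this is what a UI or REST API could report. */
    int getPort() {
        return port;
    }

    boolean isActive() {
        return connector.isActive();
    }

    @Override
    public void close() throws IOException {
        connector.stop();
        // Unexport the registry so its RMI threads do not keep the JVM alive.
        UnicastRemoteObject.unexportObject(registry, true);
    }
}
```

Because the port is held in a field rather than only written to a log line,
it could be attached to whatever status information the process already
publishes.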

It would be highly appreciated if anyone could share their experience or
offer suggestions on how we can improve the JMX remote integration with
Flink.


Thanks,
Rong


[1] https://docs.oracle.com/javase/8/docs/technotes/guides/management/agent.html
[2] https://samza.apache.org/learn/documentation/0.14/jobs/web-ui-rest-api.html
[3] http://codespeed.dak8s.net:8000/

Re: [DISCUSS] JMX remote monitoring integration with Flink

Forward Xu
Hi Rong Rong,
Thank you for bringing up this discussion. It is indeed not appropriate to
occupy additional ports in the production environment to provide JMXRMI
services. I think exposing this information through a REST API, as in [2],
or through the JobManager/TaskManager UI is a good idea.

Best,
Forward

On Fri, Mar 13, 2020 at 8:54 PM Rong Rong <[hidden email]> wrote:


Re: [DISCUSS] JMX remote monitoring integration with Flink

Till Rohrmann
Hi Rong Rong,

You are right that JMX is quite hard to use in production due to the
mentioned problems with discovering the port. There is actually already a
JIRA ticket [1] discussing this problem. It just never gained enough
traction to be tackled.

In general, I agree that it would be nice to have a REST API with which one
can obtain JVM-specific information about a Flink process. This information
could also contain a potentially open JMX port.
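
To make the REST idea above concrete, here is a rough sketch of an HTTP
endpoint that reports JVM information together with an optional JMX port. It
uses only the JDK's built-in com.sun.net.httpserver package. The class name
JvmInfoEndpoint, the /jvm path, and the JSON field names are all hypothetical;
this is neither Flink's REST API nor a proposal for its exact shape.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.lang.management.ManagementFactory;
import java.lang.management.RuntimeMXBean;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

/** Hypothetical endpoint that publishes JVM details and the JMX port, if any. */
class JvmInfoEndpoint {

    /** Builds the response body; a jmxPort of zero or less means "no JMX port open". */
    static String jvmInfoJson(int jmxPort) {
        RuntimeMXBean runtime = ManagementFactory.getRuntimeMXBean();
        String port = jmxPort > 0 ? Integer.toString(jmxPort) : "null";
        // Naive string concatenation without JSON escaping; a real
        // implementation would use a JSON library.
        return "{\"vmName\":\"" + runtime.getVmName() + "\","
                + "\"vmVersion\":\"" + runtime.getVmVersion() + "\","
                + "\"uptimeMillis\":" + runtime.getUptime() + ","
                + "\"jmxPort\":" + port + "}";
    }

    /** Starts an HTTP server on httpPort (0 = any free port) serving GET /jvm. */
    static HttpServer start(int httpPort, int jmxPort) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(httpPort), 0);
        server.createContext("/jvm", exchange -> {
            byte[] body = jvmInfoJson(jmxPort).getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().add("Content-Type", "application/json");
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        return server;
    }
}
```

A client would then only need the already-known HTTP address of the process
to find out where, and whether, JMX is reachable.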

[1] https://issues.apache.org/jira/browse/FLINK-5552

Cheers,
Till

On Fri, Mar 13, 2020 at 2:02 PM Forward Xu <[hidden email]> wrote:


Re: [DISCUSS] JMX remote monitoring integration with Flink

Rong Rong
Thanks @Till for sharing the JIRA information.

I also suspected that this problem is not unique to our situation. We will
continue to follow up on the JIRA ticket.

Best,
Rong

On Fri, Mar 20, 2020 at 7:30 AM Till Rohrmann <[hidden email]> wrote:
