Re: RE: [DISCUSS] Improve Queryable State and introduce a QueryServerProxy component


Jiayi Liao
Hi vino,
Big +1 for this.
Glad to see new progress on this topic! I’ve left some comments on it.


Best Regards,
Jiayi Liao


Original Message
Sender: vino yang [hidden email]
Recipient: Georgi Stoyanov [hidden email]
Cc: dev [hidden email]; user [hidden email]; Stefan Richter [hidden email]; Aljoscha Krettek [hidden email]; [hidden email]; Stephan Ewen [hidden email]; [hidden email]; Tzu-Li (Gordon) Tai [hidden email]
Date: Tuesday, Jul 2, 2019 16:45
Subject: Re: RE: [DISCUSS] Improve Queryable State and introduce a QueryServerProxy component


Hi all,


Recently, I have further refined the design discussed in this thread and written a design document with more detailed diagrams and descriptions, to make discussion easier [1].

Note: The document is not yet complete; for example, the "Implementation" section is missing. It is therefore still open for discussion, and I will fill in the rest while collecting feedback from the community.

More discussion and feedback are welcome and appreciated.



Best,
Vino


[1]: https://docs.google.com/document/d/181qYVIiHQGrc3hCj3QBn1iEHF4bUztdw4XO8VSaf_uI/edit?usp=sharing




yanghua1127 [hidden email] wrote on Fri, Jun 7, 2019 at 11:32 PM:

Hi Georgi,

Thanks for your feedback. I'm glad to hear you are using queryable state.

I agree that option 1 is easier to implement than the others. However, when designing the new architecture we need to consider other aspects as well, e.g. scalability, so option 3 seems more suitable. Some committers, such as Stefan, Gordon and Aljoscha, have already given me feedback and direction.

I am currently writing the design document. Once it is ready to be presented, I will post it to this thread so we can discuss further details.

----
Best,
Vino



On 2019-06-07 19:03 , Georgi Stoyanov Wrote:


Hi Vino,

I was investigating the current architecture and, AFAIK, the first proposal would be a lot easier to implement, because the JM currently has the information about the states (where they are, which ones, etc.) thanks to KvStateLocationRegistry. Correct me if I'm wrong.
We are using the feature, and it's indeed not very cool to iterate through ports, check which TM is the responsible one, etc.
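
To make the pain point concrete, here is a minimal sketch of the kind of client-side probing described above. It is only an illustration under stated assumptions: `ProxyProbeSketch`, `queryFirstResponding` and the "host:port" endpoint strings are hypothetical names, and the `query` function stands in for a real network call. It does not use Flink's actual client API.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

/** Hypothetical illustration of client-side probing; not Flink's actual client API. */
final class ProxyProbeSketch {

    /**
     * Tries each candidate "host:port" endpoint in order and returns the first
     * non-empty answer. The query function stands in for a real network call.
     */
    static <R> Optional<R> queryFirstResponding(
            List<String> endpoints, Function<String, Optional<R>> query) {
        for (String endpoint : endpoints) {
            try {
                Optional<R> result = query.apply(endpoint);
                if (result.isPresent()) {
                    return result;
                }
            } catch (RuntimeException e) {
                // Unreachable endpoint or no proxy listening there: try the next one.
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<String> candidates = List.of("tm-a:9069", "tm-a:9070", "tm-b:9069");
        // Fake lookup (assumption): only the proxy at tm-b:9069 can answer.
        Optional<Long> value = queryFirstResponding(candidates,
                e -> e.equals("tm-b:9069") ? Optional.of(42L) : Optional.<Long>empty());
        System.out.println(value.orElse(-1L));
    }
}
```

Every caller has to carry the list of candidate endpoints and the retry logic, which is the detail a central query server would hide.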

It would be very useful if one of the committers joined the thread and gave us some insight into what's going to happen with that feature.


Kind Regards,
Georgi



From: vino yang [hidden email]
 Sent: Thursday, April 25, 2019 5:18 PM
 To: dev [hidden email]; user [hidden email]
 Cc: Stefan Richter [hidden email]; Aljoscha Krettek [hidden email]; [hidden email]
 Subject: [DISCUSS] Improve Queryable State and introduce a QueryServerProxy component

Hi all,

I want to share my thoughts about improving queryable state and introducing a QueryServerProxy component.

I think the current queryable state client is hard to use, because it requires users to know a TaskManager's address and the proxy's port. In production, some business users do not have good knowledge of Flink's internals or runtime, yet they sometimes need to query the values of states.

IMO, this problem is caused by the queryable state architecture. Currently, queryable state clients interact with query state client proxy components hosted on each TaskManager. This design makes it difficult to encapsulate the points of change and exposes too much detail to the user.

My idea is to introduce a real queryable state server, named e.g. QueryStateProxyServer, which would accept all query state requests, look up its local registry, and then redirect each request to the specific QueryStateClientProxy (which runs on each TaskManager). This server is the only thing users would need to care about, and it would hide the TaskManagers' addresses and the proxies' ports from them. The current QueryStateClientProxy would become QueryStateProxyClient.

Generally speaking, the roles of the QueryStateProxyServer are listed below:

   - act as the proxy for all query clients, receiving all requests and sending responses;
   - act as a router that redirects the actual query requests to the specific proxy client;
   - maintain the route table registry (state <-> TaskManager, TaskManager <-> proxy client address);
   - provide more fine-grained control, such as result caching, ACL, TTL, SLA (rate limiting) and so on.

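
The router and route-table roles listed above could be sketched roughly as follows. This is a simplified illustration, not the proposed implementation: `QueryRouteTableSketch` and its methods are hypothetical names, TaskManagers are identified by plain strings, and each state name maps to a single TaskManager. A real registry would likely need to route per key (or key group) rather than per state name.

```java
import java.net.InetSocketAddress;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of the two-level route table a query proxy server might keep. */
final class QueryRouteTableSketch {

    // Queryable state name -> id of the TaskManager hosting it (simplified assumption).
    private final Map<String, String> stateToTaskManager = new ConcurrentHashMap<>();
    // TaskManager id -> address of the proxy client running on that TaskManager.
    private final Map<String, InetSocketAddress> taskManagerToProxy = new ConcurrentHashMap<>();

    void registerTaskManager(String taskManagerId, InetSocketAddress proxyAddress) {
        taskManagerToProxy.put(taskManagerId, proxyAddress);
    }

    void registerState(String stateName, String taskManagerId) {
        stateToTaskManager.put(stateName, taskManagerId);
    }

    /** Drops a TaskManager and every state entry that pointed at it. */
    void unregisterTaskManager(String taskManagerId) {
        taskManagerToProxy.remove(taskManagerId);
        stateToTaskManager.values().removeIf(taskManagerId::equals);
    }

    /** Resolves a state name to the proxy client address, if both mappings are known. */
    Optional<InetSocketAddress> route(String stateName) {
        return Optional.ofNullable(stateToTaskManager.get(stateName))
                .map(taskManagerToProxy::get);
    }

    public static void main(String[] args) {
        QueryRouteTableSketch table = new QueryRouteTableSketch();
        table.registerTaskManager("tm-1", InetSocketAddress.createUnresolved("10.0.0.1", 9069));
        table.registerState("word-count", "tm-1");
        System.out.println(table.route("word-count"));
    }
}
```

With such a table on the server side, a client only supplies the state name (plus job and key in practice), and the server decides which proxy client receives the request.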
As for the implementation, there are three options:

opt 1:

Let the JobManager act as the query proxy server.
· pros: reuses the existing JM; not introducing a new process reduces complexity;
· cons: puts a heavy burden on the JM and, depending on the query frequency, may affect its stability

[image: Screen Shot 2019-04-25 at 5.12.07 PM.png]



opt 2:

Introduce a new component that runs as a standalone process and acts as the query proxy server:

· pros: reduces the burden on the JM and makes it more stable
· cons: introducing a new component makes the implementation more complex

[image: Screen Shot 2019-04-25 at 5.14.05 PM.png]


opt 3 (suggestion comes from Stefan Richter):

Combining the two options, the query server could run as its own entry point (process) and also integrate with the JobManager.

If we keep it well encapsulated, the only difference would be how we register new TMs with the query server in the different scenarios: in the JM we might already have this information, while in standalone mode, e.g., the TMs would be started with the query server address to register with. This would give the convenience of starting the QS with the JM and the flexibility for power users to reduce load on their JM.
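
One way to picture this encapsulation is a single registration interface with two callers. The sketch below is only an assumption about how it could look: all type and method names (`TaskManagerRegistry`, `QueryServer`, `EmbeddedJobManager`, `StandaloneTaskManager`) are hypothetical, addresses are plain strings, and a direct object reference stands in for the network address a standalone TM would be started with.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch: one registration interface, two deployment scenarios. */
final class QueryServerRegistrationSketch {

    /** The only thing both scenarios need from the query server. */
    interface TaskManagerRegistry {
        void register(String taskManagerId, String proxyAddress);
    }

    /** The query server itself does not care who performs the registration. */
    static final class QueryServer implements TaskManagerRegistry {
        private final Map<String, String> proxies = new ConcurrentHashMap<>();

        @Override
        public void register(String taskManagerId, String proxyAddress) {
            proxies.put(taskManagerId, proxyAddress);
        }

        Optional<String> proxyFor(String taskManagerId) {
            return Optional.ofNullable(proxies.get(taskManagerId));
        }
    }

    /** Embedded mode: the JM already knows its TMs and forwards that information. */
    static final class EmbeddedJobManager {
        static void onTaskManagerConnected(
                TaskManagerRegistry queryServer, String taskManagerId, String proxyAddress) {
            queryServer.register(taskManagerId, proxyAddress);
        }
    }

    /** Standalone mode: each TM is started with the query server and registers itself. */
    static final class StandaloneTaskManager {
        private final String id;
        private final String proxyAddress;
        private final TaskManagerRegistry queryServer;

        StandaloneTaskManager(String id, String proxyAddress, TaskManagerRegistry queryServer) {
            this.id = id;
            this.proxyAddress = proxyAddress;
            this.queryServer = queryServer;
        }

        void start() {
            queryServer.register(id, proxyAddress);
        }
    }

    public static void main(String[] args) {
        QueryServer queryServer = new QueryServer();
        EmbeddedJobManager.onTaskManagerConnected(queryServer, "tm-1", "10.0.0.1:9069");
        new StandaloneTaskManager("tm-2", "10.0.0.2:9069", queryServer).start();
        System.out.println(queryServer.proxyFor("tm-1") + " " + queryServer.proxyFor("tm-2"));
    }
}
```

In both modes the query server ends up with the same registry contents, so the routing logic would not need to know which deployment it runs in.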

IMO, queryable state is a very valuable feature. It lets users query real-time measurement results. I hope it will get the attention of the community.

This is just a rough idea. If it is valuable to the community, I will write a design draft.

What's your opinion? Any feedback and comments are welcome!

Best,
Vino.

vino yang
Hi Jiayi,

Thanks for your comments.

They are valuable. I have incorporated them and refined my design document.

Please take another look when you have time.

Best,
Vino

bupt_ljy <[hidden email]> wrote on Wed, Jul 3, 2019 at 2:48 PM:

> Hi vino,
> Big +1 for this.
>
> Glad to see new progress on this topic! I’ve left some comments on it.
>
>
> Best Regards,
>
> Jiayi Liao

liyu
Hi Vino,

Thanks for the write-up. I've left some comments on the Google doc; please
check. Thanks.

Best Regards,
Yu


On Thu, 4 Jul 2019 at 15:37, vino yang <[hidden email]> wrote:

> Hi Jiayi,
>
> Thanks for your comments.
>
> It's valuable. I have accepted it and refined my design document.
>
> You can have another review when you have time.
>
> Best,
> Vino