Hi,
I'm Rafi, Data Architect at WalkMe. Our Mobile platform generates events coming from the end-users mobile device for different actions that the user does. The use case I wanted to implement using Flink CEP is as follows: We would like to expose a UI where we could define a set of rules. Rules can be statefull, like: User did X 4 times in the last hour AND User did Y and then did Z in a session AND User average session duration is > 60 seconds As the set of rules are met, we would like to trigger an action. Like a REST call, fire event, etc. This sounds like a good fit for Flink CEP, except that currently, I understand that CEP patterns have to be "hard-coded" in my jobs code in order to build the graph. This set of rules may change many times a day. So re-deploying a Flink job is not an option (or is it?). I found this ticket: https://issues.apache.org/jira/browse/FLINK-7129 and was hoping you plan to add this feature soon :) This would make a powerful feature and open up many interesting use-cases. Meanwhile, can you suggest a way of implementing this use-case? Hope this makes sense. Would love to hear your thoughts. Thanks, Rafi |
Hi Rafi,
Even I also wanted this facility from Flink Core. But I think this is already solved by Uber on Flink. https://eng.uber.com/athenax/ Best On Tue, Dec 26, 2017 at 6:21 PM, Rafi Aroch <[hidden email]> wrote: > Hi, > > I'm Rafi, Data Architect at WalkMe. > > Our Mobile platform generates events coming from the end-users mobile > device for different actions that the user does. > > The use case I wanted to implement using Flink CEP is as follows: > > We would like to expose a UI where we could define a set of rules. Rules > can be statefull, like: > > User did X 4 times in the last hour > AND > User did Y and then did Z in a session > AND > User average session duration is > 60 seconds > > As the set of rules are met, we would like to trigger an action. Like a > REST call, fire event, etc. > > This sounds like a good fit for Flink CEP, except that currently, I > understand that CEP patterns have to be "hard-coded" in my jobs code in > order to build the graph. > This set of rules may change many times a day. So re-deploying a Flink job > is not an option (or is it?). > > I found this ticket: https://issues.apache.org/jira/browse/FLINK-7129 and > was hoping you plan to add this feature soon :) > > This would make a powerful feature and open up many interesting use-cases. > > Meanwhile, can you suggest a way of implementing this use-case? > > Hope this makes sense. > Would love to hear your thoughts. > > Thanks, > Rafi > -- Shivam Sharma Data Engineer @ Goibibo Indian Institute Of Information Technology, Design and Manufacturing Jabalpur Mobile No- (+91) 8882114744 Email:- [hidden email] LinkedIn:-*https://www.linkedin.com/in/28shivamsharma <https://www.linkedin.com/in/28shivamsharma>* |
Hey Rafi,
this is indeed a very nice feature to have. :-) I'm afraid that this is currently hard to do manually with CEP. Let me pull in Dawid and Klou (cc'd) who have worked a lot on CEP. They can probably update you on the plan for FLINK-7129. Best, Ufuk On Tue, Dec 26, 2017 at 8:47 PM, Shivam Sharma <[hidden email]> wrote: > Hi Rafi, > > Even I also wanted this facility from Flink Core. But I think this is > already solved by Uber on Flink. > > https://eng.uber.com/athenax/ > > Best > > On Tue, Dec 26, 2017 at 6:21 PM, Rafi Aroch <[hidden email]> wrote: > >> Hi, >> >> I'm Rafi, Data Architect at WalkMe. >> >> Our Mobile platform generates events coming from the end-users mobile >> device for different actions that the user does. >> >> The use case I wanted to implement using Flink CEP is as follows: >> >> We would like to expose a UI where we could define a set of rules. Rules >> can be statefull, like: >> >> User did X 4 times in the last hour >> AND >> User did Y and then did Z in a session >> AND >> User average session duration is > 60 seconds >> >> As the set of rules are met, we would like to trigger an action. Like a >> REST call, fire event, etc. >> >> This sounds like a good fit for Flink CEP, except that currently, I >> understand that CEP patterns have to be "hard-coded" in my jobs code in >> order to build the graph. >> This set of rules may change many times a day. So re-deploying a Flink job >> is not an option (or is it?). >> >> I found this ticket: https://issues.apache.org/jira/browse/FLINK-7129 and >> was hoping you plan to add this feature soon :) >> >> This would make a powerful feature and open up many interesting use-cases. >> >> Meanwhile, can you suggest a way of implementing this use-case? >> >> Hope this makes sense. >> Would love to hear your thoughts. >> >> Thanks, >> Rafi >> > > > > -- > Shivam Sharma > Data Engineer @ Goibibo > Indian Institute Of Information Technology, Design and Manufacturing > Jabalpur > Mobile No- (+91) 8882114744 > Email:- [hidden email] > LinkedIn:-*https://www.linkedin.com/in/28shivamsharma > <https://www.linkedin.com/in/28shivamsharma>* |
Hi Rafi,
Currently this is unfortunately not supported out of the box. To support this, we need 2 features with one having to be added in Flink itself, and the other to the CEP library. The first one is broadcast state and the ability to connect keyed and non-keyed streams. This one is to be added to Flink itself and the good news are that this feature is scheduled to be added to Flink 1.5. The second feature is to modify the CEP operator so that it can support multiple patterns and match incoming events against all of them. For this I have no clear deadline in my mind, but given that there are more and more people asking for it, I think it is going to be added soon. Thanks for raising the issue in the mailing list, Kostas > On Dec 27, 2017, at 2:44 PM, Ufuk Celebi <[hidden email]> wrote: > > Hey Rafi, > > this is indeed a very nice feature to have. :-) I'm afraid that this > is currently hard to do manually with CEP. Let me pull in Dawid and > Klou (cc'd) who have worked a lot on CEP. They can probably update you > on the plan for FLINK-7129. > > Best, > > Ufuk > > > On Tue, Dec 26, 2017 at 8:47 PM, Shivam Sharma <[hidden email]> wrote: >> Hi Rafi, >> >> Even I also wanted this facility from Flink Core. But I think this is >> already solved by Uber on Flink. >> >> https://eng.uber.com/athenax/ >> >> Best >> >> On Tue, Dec 26, 2017 at 6:21 PM, Rafi Aroch <[hidden email]> wrote: >> >>> Hi, >>> >>> I'm Rafi, Data Architect at WalkMe. >>> >>> Our Mobile platform generates events coming from the end-users mobile >>> device for different actions that the user does. >>> >>> The use case I wanted to implement using Flink CEP is as follows: >>> >>> We would like to expose a UI where we could define a set of rules. Rules >>> can be statefull, like: >>> >>> User did X 4 times in the last hour >>> AND >>> User did Y and then did Z in a session >>> AND >>> User average session duration is > 60 seconds >>> >>> As the set of rules are met, we would like to trigger an action. Like a >>> REST call, fire event, etc. >>> >>> This sounds like a good fit for Flink CEP, except that currently, I >>> understand that CEP patterns have to be "hard-coded" in my jobs code in >>> order to build the graph. >>> This set of rules may change many times a day. So re-deploying a Flink job >>> is not an option (or is it?). >>> >>> I found this ticket: https://issues.apache.org/jira/browse/FLINK-7129 and >>> was hoping you plan to add this feature soon :) >>> >>> This would make a powerful feature and open up many interesting use-cases. >>> >>> Meanwhile, can you suggest a way of implementing this use-case? >>> >>> Hope this makes sense. >>> Would love to hear your thoughts. >>> >>> Thanks, >>> Rafi >>> >> >> >> >> -- >> Shivam Sharma >> Data Engineer @ Goibibo >> Indian Institute Of Information Technology, Design and Manufacturing >> Jabalpur >> Mobile No- (+91) 8882114744 >> Email:- [hidden email] >> LinkedIn:-*https://www.linkedin.com/in/28shivamsharma >> <https://www.linkedin.com/in/28shivamsharma>* |
Hi again Rafi,
Coming back to the second part of the question on what you can do right now, I would suggest that you launch your initial job with your initial patterns and in your code you assign UID’s to your sources and the CEP operators, e.g.: CEP.pattern(input, mySuperPattern).select(muSelectFunction).uid(“version1") When you want to update the pattern (delete, add a new one or update an existing one): 1) take a savepoint from your running job and kill it. 2) update your patterns, and on the updated patterns only, change the uids so that the state in the savepoint corresponding to your previous version is ignored. 3) restart your job from the savepoint with the “—ignoreUnmappedState” flag, as described here: https://issues.apache.org/jira/browse/FLINK-4445 <https://issues.apache.org/jira/browse/FLINK-4445> This will allow your job to restart from where it left off (the sources checkpoint their state), unchanged patterns to continue from where they were (as you did not change their uids) and new/updated patterns to start from scratch as they have a new uid. I have not really tried it but I think it can work although it requires some manual stopping/ restarting. I do not know if Ufuk has something to add to this. Hope this helps, Kostas > On Dec 28, 2017, at 2:57 PM, Kostas Kloudas <[hidden email]> wrote: > > Hi Rafi, > > Currently this is unfortunately not supported out of the box. > To support this, we need 2 features with one having to be added in Flink itself, > and the other to the CEP library. > > The first one is broadcast state and the ability to connect keyed and non-keyed > streams. This one is to be added to Flink itself and the good news are that this > feature is scheduled to be added to Flink 1.5. > > The second feature is to modify the CEP operator so that it can support multiple > patterns and match incoming events against all of them. For this I have no clear > deadline in my mind, but given that there are more and more people asking for > it, I think it is going to be added soon. > > Thanks for raising the issue in the mailing list, > Kostas > >> On Dec 27, 2017, at 2:44 PM, Ufuk Celebi <[hidden email] <mailto:[hidden email]>> wrote: >> >> Hey Rafi, >> >> this is indeed a very nice feature to have. :-) I'm afraid that this >> is currently hard to do manually with CEP. Let me pull in Dawid and >> Klou (cc'd) who have worked a lot on CEP. They can probably update you >> on the plan for FLINK-7129. >> >> Best, >> >> Ufuk >> >> >> On Tue, Dec 26, 2017 at 8:47 PM, Shivam Sharma <[hidden email] <mailto:[hidden email]>> wrote: >>> Hi Rafi, >>> >>> Even I also wanted this facility from Flink Core. But I think this is >>> already solved by Uber on Flink. >>> >>> https://eng.uber.com/athenax/ <https://eng.uber.com/athenax/> >>> >>> Best >>> >>> On Tue, Dec 26, 2017 at 6:21 PM, Rafi Aroch <[hidden email]> wrote: >>> >>>> Hi, >>>> >>>> I'm Rafi, Data Architect at WalkMe. >>>> >>>> Our Mobile platform generates events coming from the end-users mobile >>>> device for different actions that the user does. >>>> >>>> The use case I wanted to implement using Flink CEP is as follows: >>>> >>>> We would like to expose a UI where we could define a set of rules. Rules >>>> can be statefull, like: >>>> >>>> User did X 4 times in the last hour >>>> AND >>>> User did Y and then did Z in a session >>>> AND >>>> User average session duration is > 60 seconds >>>> >>>> As the set of rules are met, we would like to trigger an action. Like a >>>> REST call, fire event, etc. >>>> >>>> This sounds like a good fit for Flink CEP, except that currently, I >>>> understand that CEP patterns have to be "hard-coded" in my jobs code in >>>> order to build the graph. >>>> This set of rules may change many times a day. So re-deploying a Flink job >>>> is not an option (or is it?). >>>> >>>> I found this ticket: https://issues.apache.org/jira/browse/FLINK-7129 and >>>> was hoping you plan to add this feature soon :) >>>> >>>> This would make a powerful feature and open up many interesting use-cases. >>>> >>>> Meanwhile, can you suggest a way of implementing this use-case? >>>> >>>> Hope this makes sense. >>>> Would love to hear your thoughts. >>>> >>>> Thanks, >>>> Rafi >>>> >>> >>> >>> >>> -- >>> Shivam Sharma >>> Data Engineer @ Goibibo >>> Indian Institute Of Information Technology, Design and Manufacturing >>> Jabalpur >>> Mobile No- (+91) 8882114744 >>> Email:- [hidden email] >>> LinkedIn:-*https://www.linkedin.com/in/28shivamsharma >>> <https://www.linkedin.com/in/28shivamsharma>* > |
Looks good to me, Klou. I would only add the following two things:
1) UIDs need to be unique for the complete program, so you would need more specific ones, e.g. `filter-1-version-1` instead of only `version-1`. These are used to map state from the savepoint back to your program. We want to keep the state of those patterns that are not changed, but want to get rid of patterns that are updated with this approach. Note that new patterns will start from scratch whereas kept patterns will continue from whey they left off (e.g. if things already partially matched, this will be kept). 2) The flag has been renamed to ``` --allowNonRestoredState `` – Ufuk On Thu, Dec 28, 2017 at 3:48 PM, Kostas Kloudas <[hidden email]> wrote: > Hi again Rafi, > > Coming back to the second part of the question on what you can do right now, > I would suggest that you launch your initial job with your initial patterns > and > in your code you assign UID’s to your sources and the CEP operators, e.g.: > > CEP.pattern(input, mySuperPattern).select(muSelectFunction).uid(“version1") > > When you want to update the pattern (delete, add a new one or update an > existing one): > > 1) take a savepoint from your running job and kill it. > 2) update your patterns, and on the updated patterns only, change the uids > so that the > state in the savepoint corresponding to your previous version is > ignored. > 3) restart your job from the savepoint with the “—ignoreUnmappedState” > flag, as described here: > https://issues.apache.org/jira/browse/FLINK-4445 > > This will allow your job to restart from where it left off (the sources > checkpoint their state), > unchanged patterns to continue from where they were (as you did not change > their uids) > and new/updated patterns to start from scratch as they have a new uid. > > I have not really tried it but I think it can work although it requires some > manual stopping/ > restarting. > > I do not know if Ufuk has something to add to this. > > Hope this helps, > Kostas > > > On Dec 28, 2017, at 2:57 PM, Kostas Kloudas <[hidden email]> > wrote: > > Hi Rafi, > > Currently this is unfortunately not supported out of the box. > To support this, we need 2 features with one having to be added in Flink > itself, > and the other to the CEP library. > > The first one is broadcast state and the ability to connect keyed and > non-keyed > streams. This one is to be added to Flink itself and the good news are that > this > feature is scheduled to be added to Flink 1.5. > > The second feature is to modify the CEP operator so that it can support > multiple > patterns and match incoming events against all of them. For this I have no > clear > deadline in my mind, but given that there are more and more people asking > for > it, I think it is going to be added soon. > > Thanks for raising the issue in the mailing list, > Kostas > > On Dec 27, 2017, at 2:44 PM, Ufuk Celebi <[hidden email]> wrote: > > Hey Rafi, > > this is indeed a very nice feature to have. :-) I'm afraid that this > is currently hard to do manually with CEP. Let me pull in Dawid and > Klou (cc'd) who have worked a lot on CEP. They can probably update you > on the plan for FLINK-7129. > > Best, > > Ufuk > > > On Tue, Dec 26, 2017 at 8:47 PM, Shivam Sharma <[hidden email]> > wrote: > > Hi Rafi, > > Even I also wanted this facility from Flink Core. But I think this is > already solved by Uber on Flink. > > https://eng.uber.com/athenax/ > > Best > > On Tue, Dec 26, 2017 at 6:21 PM, Rafi Aroch <[hidden email]> wrote: > > Hi, > > I'm Rafi, Data Architect at WalkMe. > > Our Mobile platform generates events coming from the end-users mobile > device for different actions that the user does. > > The use case I wanted to implement using Flink CEP is as follows: > > We would like to expose a UI where we could define a set of rules. Rules > can be statefull, like: > > User did X 4 times in the last hour > AND > User did Y and then did Z in a session > AND > User average session duration is > 60 seconds > > As the set of rules are met, we would like to trigger an action. Like a > REST call, fire event, etc. > > This sounds like a good fit for Flink CEP, except that currently, I > understand that CEP patterns have to be "hard-coded" in my jobs code in > order to build the graph. > This set of rules may change many times a day. So re-deploying a Flink job > is not an option (or is it?). > > I found this ticket: https://issues.apache.org/jira/browse/FLINK-7129 and > was hoping you plan to add this feature soon :) > > This would make a powerful feature and open up many interesting use-cases. > > Meanwhile, can you suggest a way of implementing this use-case? > > Hope this makes sense. > Would love to hear your thoughts. > > Thanks, > Rafi > > > > > -- > Shivam Sharma > Data Engineer @ Goibibo > Indian Institute Of Information Technology, Design and Manufacturing > Jabalpur > Mobile No- (+91) 8882114744 > Email:- [hidden email] > LinkedIn:-*https://www.linkedin.com/in/28shivamsharma > <https://www.linkedin.com/in/28shivamsharma>* > > > |
Hi Kostas & Ufuk,
Appreciate your detailed response. I'll give it a try. Thanks, Rafi On Thu, Dec 28, 2017 at 5:16 PM, Ufuk Celebi <[hidden email]> wrote: > Looks good to me, Klou. I would only add the following two things: > > 1) UIDs need to be unique for the complete program, so you would need > more specific ones, e.g. `filter-1-version-1` instead of only > `version-1`. These are used to map state from the savepoint back to > your program. We want to keep the state of those patterns that are not > changed, but want to get rid of patterns that are updated with this > approach. Note that new patterns will start from scratch whereas kept > patterns will continue from whey they left off (e.g. if things already > partially matched, this will be kept). > > 2) The flag has been renamed to > ``` > --allowNonRestoredState > `` > > – Ufuk > > > On Thu, Dec 28, 2017 at 3:48 PM, Kostas Kloudas > <[hidden email]> wrote: > > Hi again Rafi, > > > > Coming back to the second part of the question on what you can do right > now, > > I would suggest that you launch your initial job with your initial > patterns > > and > > in your code you assign UID’s to your sources and the CEP operators, > e.g.: > > > > CEP.pattern(input, mySuperPattern).select(muSelectFunction).uid(“ > version1") > > > > When you want to update the pattern (delete, add a new one or update an > > existing one): > > > > 1) take a savepoint from your running job and kill it. > > 2) update your patterns, and on the updated patterns only, change the > uids > > so that the > > state in the savepoint corresponding to your previous version is > > ignored. > > 3) restart your job from the savepoint with the “—ignoreUnmappedState” > > flag, as described here: > > https://issues.apache.org/jira/browse/FLINK-4445 > > > > This will allow your job to restart from where it left off (the sources > > checkpoint their state), > > unchanged patterns to continue from where they were (as you did not > change > > their uids) > > and new/updated patterns to start from scratch as they have a new uid. > > > > I have not really tried it but I think it can work although it requires > some > > manual stopping/ > > restarting. > > > > I do not know if Ufuk has something to add to this. > > > > Hope this helps, > > Kostas > > > > > > On Dec 28, 2017, at 2:57 PM, Kostas Kloudas <[hidden email] > > > > wrote: > > > > Hi Rafi, > > > > Currently this is unfortunately not supported out of the box. > > To support this, we need 2 features with one having to be added in Flink > > itself, > > and the other to the CEP library. > > > > The first one is broadcast state and the ability to connect keyed and > > non-keyed > > streams. This one is to be added to Flink itself and the good news are > that > > this > > feature is scheduled to be added to Flink 1.5. > > > > The second feature is to modify the CEP operator so that it can support > > multiple > > patterns and match incoming events against all of them. For this I have > no > > clear > > deadline in my mind, but given that there are more and more people asking > > for > > it, I think it is going to be added soon. > > > > Thanks for raising the issue in the mailing list, > > Kostas > > > > On Dec 27, 2017, at 2:44 PM, Ufuk Celebi <[hidden email]> wrote: > > > > Hey Rafi, > > > > this is indeed a very nice feature to have. :-) I'm afraid that this > > is currently hard to do manually with CEP. Let me pull in Dawid and > > Klou (cc'd) who have worked a lot on CEP. They can probably update you > > on the plan for FLINK-7129. > > > > Best, > > > > Ufuk > > > > > > On Tue, Dec 26, 2017 at 8:47 PM, Shivam Sharma <[hidden email] > > > > wrote: > > > > Hi Rafi, > > > > Even I also wanted this facility from Flink Core. But I think this is > > already solved by Uber on Flink. > > > > https://eng.uber.com/athenax/ > > > > Best > > > > On Tue, Dec 26, 2017 at 6:21 PM, Rafi Aroch <[hidden email]> wrote: > > > > Hi, > > > > I'm Rafi, Data Architect at WalkMe. > > > > Our Mobile platform generates events coming from the end-users mobile > > device for different actions that the user does. > > > > The use case I wanted to implement using Flink CEP is as follows: > > > > We would like to expose a UI where we could define a set of rules. Rules > > can be statefull, like: > > > > User did X 4 times in the last hour > > AND > > User did Y and then did Z in a session > > AND > > User average session duration is > 60 seconds > > > > As the set of rules are met, we would like to trigger an action. Like a > > REST call, fire event, etc. > > > > This sounds like a good fit for Flink CEP, except that currently, I > > understand that CEP patterns have to be "hard-coded" in my jobs code in > > order to build the graph. > > This set of rules may change many times a day. So re-deploying a Flink > job > > is not an option (or is it?). > > > > I found this ticket: https://issues.apache.org/jira/browse/FLINK-7129 > and > > was hoping you plan to add this feature soon :) > > > > This would make a powerful feature and open up many interesting > use-cases. > > > > Meanwhile, can you suggest a way of implementing this use-case? > > > > Hope this makes sense. > > Would love to hear your thoughts. > > > > Thanks, > > Rafi > > > > > > > > > > -- > > Shivam Sharma > > Data Engineer @ Goibibo > > Indian Institute Of Information Technology, Design and Manufacturing > > Jabalpur > > Mobile No- (+91) 8882114744 > > Email:- [hidden email] > > LinkedIn:-*https://www.linkedin.com/in/28shivamsharma > > <https://www.linkedin.com/in/28shivamsharma>* > > > > > > > |
Free forum by Nabble | Edit this page |