Hi everybody,
recently we've seen increased interest in complex event processing (CEP) among Flink users. Even though most of the functionality needed to solve many use cases is already there, it would still be helpful for most users to have an easy-to-use library. A library which allows users to define complex event patterns would extend Flink's reach to the CEP community. Once the foundation is laid, I'm optimistic that people will quickly pick it up and extend it further.

The major contribution of this library would be an efficient non-deterministic finite automaton (NFA) which can detect complex event patterns. For everything else, Flink already has most of the functionality in place.

I've drafted a design document for the first version. Please review it and comment:

https://docs.google.com/document/d/15iaBCZkNcpqSma_qrF0GUyobKV_JttEDVuhNd0Y1aAU/edit?usp=sharing

Thanks,
Till
This is a very comprehensive document, incredible job!
It seems that most of the machinery is already in place in Flink, which would make this a very valuable addition taking into account the implementation effort.
Thanks for sharing, Till! I think that this will be a very valuable addition to Flink. Looking forward to it. :-)

– Ufuk
A definite +1 for this feature, thanks for your effort Till!
I really look forward to the POC foundation and would like to help contribute wherever possible.

Pattern matching along with event time support seems to be another major breakthrough among the stream processing framework options currently on the table.

At our company, we've been using Flink to implement pattern matching very similar to the use cases detailed in Till's design doc for adtech-related applications. A comprehensive and expressive DSL for these applications would be fantastic.

--
Tzu-Li (Gordon) Tai
Data Engineer @ VMFive
vmfive.com
Looks super cool, Till!
Especially the section about the Patterns is great. For the other parts, I was wondering about the overlap with the TableAPI and the SQL efforts.

I was thinking that a first version could really focus on the Patterns and make the assumption that they are always applied on a KeyedStream. That way the effort would focus on the most important new addition, and we could evaluate whether we could reuse the TableAPI for the other grouping/windowing/etc. parts.

Basically, have something like this initially:

    Pattern<Event> pattern = Pattern.<Event>next("e1").where( (evt) -> evt.id == 42 )
        .followedBy("e2").where( (evt) -> evt.id == 1337 )
        .within(Time.minutes(10));

    KeyedStream<Event> ks = input.keyBy( (evt) -> evt.getId() );

    CEP.pattern(ks, pattern).select( new PatternSelectFunction<Event>() { ... } );

All other parts could still be constructed around that in the end. Any thoughts?

Greetings,
Stephan
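To make the semantics of the proposed pattern a bit more concrete, here is a minimal sketch in plain Java, independent of Flink and of the proposed API. It checks a single key's events for "an event with one id, followed (not necessarily immediately) by an event with another id, within a time window". The class `FollowedBySketch`, the `Event` type, and the `match` method are hypothetical names made up for illustration; the sketch also assumes that events arrive in timestamp order and that each pending first event is consumed by the first matching second event.

```java
import java.util.ArrayList;
import java.util.List;

class FollowedBySketch {
    // Hypothetical event type: an id plus an event-time timestamp in milliseconds.
    static final class Event {
        final int id;
        final long timestampMs;

        Event(int id, long timestampMs) {
            this.id = id;
            this.timestampMs = timestampMs;
        }
    }

    // Returns every {first, second} pair where an event with id == firstId is
    // followed by an event with id == secondId no more than windowMs later.
    // Assumption: events are already sorted by timestamp.
    static List<Event[]> match(List<Event> events, int firstId, int secondId, long windowMs) {
        List<Event> open = new ArrayList<>();      // first events still waiting for a partner
        List<Event[]> matches = new ArrayList<>();
        for (Event evt : events) {
            // Discard partial matches whose time window has expired.
            open.removeIf(first -> evt.timestampMs - first.timestampMs > windowMs);
            if (evt.id == secondId) {
                // Complete every pending partial match, then consume them.
                for (Event first : open) {
                    matches.add(new Event[] {first, evt});
                }
                open.clear();
            }
            if (evt.id == firstId) {
                open.add(evt);                     // start a new partial match
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<Event> events = new ArrayList<>();
        events.add(new Event(42, 0L));
        events.add(new Event(7, 60_000L));
        events.add(new Event(1337, 120_000L));
        long tenMinutesMs = 10 * 60 * 1000L;
        System.out.println("matches: " + match(events, 42, 1337, tenMinutesMs).size());
    }
}
```

Keeping the list of pending first events per key is what makes the KeyedStream assumption convenient: each key would carry its own small piece of matching state.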
Hi Till,

Have you created a JIRA ticket to keep track of this proposed new feature?

We should create one to keep track of updates on the effort.

Thanks,

Henry
+1 for the cool design proposal. I also agree with Stephan’s point to focus on the Pattern operator.
Ultimately, this could be merged into the SQL lib. There are a few “standards” you could check out, such as Oracle’s Pattern Matching extension on SQL [1], apart from EPL.

[1] https://docs.oracle.com/database/121/DWHSG/pattern.htm#DWHSG8956
Thanks for the valuable feedback.
@Stephan, you're totally right that the CEP DSLs and SQLs strongly resemble each other. How a pattern definition can be exposed in stream SQL is probably mainly a question of syntax. For that, we should take a closer look at Oracle's Pattern Matching extension on SQL. Thanks for the pointer, @Paris. I've included the reference in the design document.

I also agree that we should first concentrate on the pattern matching functionality. That way, we avoid duplicate work. This means implementing the NFA to detect event patterns and constructing this NFA from a pattern definition.

I've created the umbrella JIRA ticket https://issues.apache.org/jira/browse/FLINK-3215 to track the overall progress of the CEP implementation. As its first subtasks, it includes the implementation of the pattern definition and the NFA.

Cheers,
Till
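As a rough illustration of "constructing an NFA from a pattern definition", the following plain-Java sketch may help. It is not the design from the document and not Flink code: `NfaSketch`, `Stage`, and `process` are hypothetical names, and the semantics are an assumption chosen for brevity. A pattern is a non-empty list of stages; a strict stage must match the very next event, while a relaxed stage may skip events. The non-determinism shows up as several partial matches ("runs") being alive at once.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

class NfaSketch<T> {
    // One stage of the pattern definition (hypothetical type).
    static final class Stage<E> {
        final String name;
        final Predicate<E> condition;
        final boolean strict;   // true: must match the next event; false: may skip events

        Stage(String name, Predicate<E> condition, boolean strict) {
            this.name = name;
            this.condition = condition;
            this.strict = strict;
        }
    }

    private final List<Stage<T>> stages;               // assumed to be non-empty
    // Each run holds the events matched so far; its size is the index of the next stage.
    private List<List<T>> runs = new ArrayList<>();

    NfaSketch(List<Stage<T>> stages) {
        this.stages = stages;
    }

    // Feeds one event into the automaton and returns the matches it completes.
    List<List<T>> process(T event) {
        List<List<T>> nextRuns = new ArrayList<>();
        List<List<T>> matches = new ArrayList<>();
        List<List<T>> candidates = new ArrayList<>(runs);
        candidates.add(new ArrayList<>());             // a match may start at any event
        for (List<T> run : candidates) {
            Stage<T> stage = stages.get(run.size());
            if (stage.condition.test(event)) {
                // "Take" transition: extend a copy of the run with this event.
                List<T> extended = new ArrayList<>(run);
                extended.add(event);
                if (extended.size() == stages.size()) {
                    matches.add(extended);
                } else {
                    nextRuns.add(extended);
                }
            }
            // "Ignore" transition: a started run waiting on a relaxed stage stays alive,
            // even if the event was also taken (both branches are explored).
            if (!run.isEmpty() && !stage.strict) {
                nextRuns.add(run);
            }
        }
        runs = nextRuns;
        return matches;
    }
}
```

In this sketch a run waiting on a strict stage is dropped as soon as a non-matching event arrives, and there is no time-based pruning, so runs waiting on relaxed stages accumulate; a real implementation would need to expire them, for example with a time window.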