Add CEP library to Flink


Add CEP library to Flink

Till Rohrmann
Hi everybody,

Recently we've seen increased interest in complex event processing (CEP)
among Flink users. Even though most of the functionality needed to solve
many use cases is already there, it would still be helpful for most users
to have an easy-to-use library. A library that lets users define complex
event patterns would extend Flink's reach to the CEP community. Once the
foundation is laid, I'm optimistic that people will quickly pick it up and
extend it further.

The major contribution of this library would be an efficient
non-deterministic finite automaton (NFA) that can detect complex event
patterns. For everything else, Flink already has most of the functionality
in place.

I've drafted a design document for the first version. Please review it and
comment:

https://docs.google.com/document/d/15iaBCZkNcpqSma_qrF0GUyobKV_JttEDVuhNd0Y1aAU/edit?usp=sharing

Thanks,
Till

Re: Add CEP library to Flink

Kostas Tzoumas-2
This is a very comprehensive document, incredible job!

It seems that most of the machinery is already in place in Flink, which
would make this a very valuable addition relative to the implementation
effort.


On Fri, Jan 8, 2016 at 3:54 PM, Till Rohrmann <[hidden email]> wrote:

> Hi everybody,
>
> recently we've seen an increased interest in complex event processing (CEP)
> by Flink users. Even though most functionality is already there to solve
> many use cases it would still be helpful for most users to have an easy to
> use library. Having such a library which allows to define complex event
> patterns would increase Flink's user range to the CEP community. Once
> having laid the foundation, I'm optimistic that people will quickly pick it
> up and further extend it.
>
> The major contribution of this library would be to add an efficient
> non-deterministic finite automaton which can detect complex event patterns.
> For everything else, Flink already has most of the functionality in place.
>
> I've drafted a design document for the first version. Please review it and
> comment:
>
>
> https://docs.google.com/document/d/15iaBCZkNcpqSma_qrF0GUyobKV_JttEDVuhNd0Y1aAU/edit?usp=sharing
>
> Thanks,
> Till
>

Re: Add CEP library to Flink

Ufuk Celebi-2
In reply to this post by Till Rohrmann

Thanks for sharing, Till! I think that this will be a very valuable addition to Flink. Looking forward to it. :-)

– Ufuk


Re: Add CEP library to Flink

Tzu-Li Tai
A definite +1 for this feature, thanks for your effort, Till!
I really look forward to the POC foundation and would like to help
contribute wherever possible.

Pattern matching along with event time support seems to be another major
breakthrough among the stream processing frameworks currently available.

At our company, we've been using Flink to implement pattern matching very
similar to the use cases detailed in Till's design doc for adtech-related
applications. A comprehensive and expressive DSL for these applications
would be fantastic.

On Sat, Jan 9, 2016 at 12:36 AM, Ufuk Celebi <[hidden email]> wrote:

> Thanks for sharing, Till! I think that this will be a very valuable
> addition to Flink. Looking forward to it. :-)
>
> – Ufuk
>
>


--
Tzu-Li (Gordon) Tai
Data Engineer @ VMFive
vmfive.com

Re: Add CEP library to Flink

Stephan Ewen
Looks super cool, Till!

Especially the section about the Patterns is great.
For the other parts, I was wondering about the overlap with the TableAPI
and the SQL efforts.

I was thinking that a first version could really focus on the Patterns and
make the assumption that they are always applied on a KeyedStream.
That way the effort would focus on the most important new addition, and we
could evaluate whether we could reuse the TableAPI for the other
grouping/windowing/etc parts.

Basically, we would initially have something like this:

Pattern<Event> pattern = Pattern.<Event>next("e1").where( (evt) -> evt.id == 42 )
  .followedBy("e2").where( (evt) -> evt.id == 1337 )
  .within(Time.minutes(10));

KeyedStream<Event> ks = input.keyBy( (evt) -> evt.getId() );

CEP.pattern(ks, pattern).select( new PatternSelectFunction<Event>() { ... } );


All other parts could still be constructed around that in the end.

Any thoughts?


Greetings,
Stephan

On Fri, Jan 8, 2016 at 5:50 PM, Gordon Tai (戴資力) <[hidden email]> wrote:

> A definite +1 for this feature, thanks for your effort Till!
> Really look forward to the POC foundation and would like to help contribute
> wherever possible.
>
> Pattern matching along with event time support seems to be another major
> breakthrough for stream processing framework options currently on the
> table.
>
> At our company, we've been using Flink to implement pattern matching very
> similar to the use cases detailed in Till's design doc for adtech related
> applications. A comprehensive and expressive DSL for these applications
> will be fantastic.
>
> --
> Tzu-Li (Gordon) Tai
> Data Engineer @ VMFive
> vmfive.com
>

Re: Add CEP library to Flink

Henry Saputra
In reply to this post by Till Rohrmann
Hi Till,

Have you created a JIRA ticket to keep track of this proposed new feature?

We should create one to keep track of updates on the effort.

Thanks,

Henry


Re: Add CEP library to Flink

Paris Carbone
+1 for the cool design proposal. I also agree with Stephan’s point to focus on the Pattern operator.
Ultimately, this could be merged into the SQL lib in the future. There are a few “standards” you could check out, such as Oracle’s Pattern Matching extension on SQL [1], apart from EPL.

[1] https://docs.oracle.com/database/121/DWHSG/pattern.htm#DWHSG8956



On 10 Jan 2016, at 02:46, Henry Saputra <[hidden email]> wrote:

> Hi Till,
>
> Have you created a JIRA ticket to keep track of this proposed new feature?
>
> We should create one to keep track of updates on the effort.
>
> Thanks,
>
> Henry


Re: Add CEP library to Flink

till.rohrmann
Thanks for the valuable feedback.

@Stephan, you're totally right that CEP DSLs and SQL strongly resemble
each other. How a pattern definition can be exposed in stream SQL is
probably mainly a question of syntax. For that we should take a closer
look at Oracle's Pattern Matching extension on SQL. Thanks for the pointer,
@Paris. I've added the reference to the design document.

I also agree that we should first concentrate on the pattern matching
functionality. That way, we avoid duplicate work. This means implementing
the NFA to detect event patterns and constructing this NFA from a pattern
definition.
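
To make the NFA idea concrete, here is a minimal sketch in plain Java. It is
not the proposed implementation and uses no Flink classes. It assumes a
hypothetical Event type with an id and a millisecond timestamp, models a
pattern as an ordered list of predicates, and keeps every partial match alive
while also trying to advance it, which is where the non-determinism comes
from. Keying, state handling, and out-of-order events are ignored, and a
partial match may pair with more than one later event; the real library may
well decide these points differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

public class PatternNfaSketch {

    /** Hypothetical event type: an id plus an event-time timestamp in milliseconds. */
    static final class Event {
        final int id;
        final long timestamp;

        Event(int id, long timestamp) {
            this.id = id;
            this.timestamp = timestamp;
        }

        @Override
        public String toString() {
            return "Event(" + id + "@" + timestamp + ")";
        }
    }

    /** One partial match: the index of the next stage to satisfy and the events matched so far. */
    static final class Run {
        final int nextStage;
        final List<Event> matched;

        Run(int nextStage, List<Event> matched) {
            this.nextStage = nextStage;
            this.matched = matched;
        }
    }

    /**
     * Matches the stages in order, allowing unrelated events in between.
     * A match is only emitted if its last event is at most withinMillis
     * after its first event. Events are assumed to arrive in timestamp order.
     */
    static List<List<Event>> match(List<Predicate<Event>> stages,
                                   long withinMillis,
                                   List<Event> events) {
        List<List<Event>> results = new ArrayList<>();
        List<Run> runs = new ArrayList<>();

        for (Event e : events) {
            List<Run> next = new ArrayList<>();

            for (Run r : runs) {
                // Drop partial matches whose time window has expired.
                if (e.timestamp - r.matched.get(0).timestamp > withinMillis) {
                    continue;
                }
                // "Ignore" branch: the run keeps waiting for a later event.
                next.add(r);
                // "Take" branch: the event satisfies the next stage.
                if (stages.get(r.nextStage).test(e)) {
                    List<Event> extended = new ArrayList<>(r.matched);
                    extended.add(e);
                    if (r.nextStage + 1 == stages.size()) {
                        results.add(extended);
                    } else {
                        next.add(new Run(r.nextStage + 1, extended));
                    }
                }
            }

            // Any event satisfying the first stage starts a new run.
            if (stages.get(0).test(e)) {
                List<Event> started = new ArrayList<>();
                started.add(e);
                if (stages.size() == 1) {
                    results.add(started);
                } else {
                    next.add(new Run(1, started));
                }
            }

            runs = next;
        }
        return results;
    }

    public static void main(String[] args) {
        List<Predicate<Event>> stages = new ArrayList<>();
        stages.add(evt -> evt.id == 42);
        stages.add(evt -> evt.id == 1337);

        List<Event> events = new ArrayList<>();
        events.add(new Event(42, 0L));
        events.add(new Event(7, 60_000L));
        events.add(new Event(1337, 120_000L));
        events.add(new Event(42, 130_000L));
        events.add(new Event(1337, 900_000L));

        System.out.println(match(stages, 10 * 60_000L, events));
    }
}
```

With the sample input in main, the first 42 pairs with the 1337 that arrives
two minutes later, while the second 42 finds no partner because the only
later 1337 falls outside the ten-minute window.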

I've created the umbrella JIRA ticket
https://issues.apache.org/jira/browse/FLINK-3215 to track the overall
progress of the CEP implementation. Its first subtasks are the
implementation of the pattern definition and the NFA.

Cheers,
Till

On Sun, Jan 10, 2016 at 8:49 PM, Paris Carbone <[hidden email]> wrote:

> +1 for the cool design proposal. I also agree with Stephan’s point to
> focus on the Pattern operator.
> Ultimately in the future this could be merged into the SQL lib. There are
> a few “standards” you could check out such as Oracle’s Pattern Matching
> extension on SQL [1], apart from EPL.
>
> [1] https://docs.oracle.com/database/121/DWHSG/pattern.htm#DWHSG8956