Effort to add SQL / StreamSQL to Flink

Re: Reply: Effort to add SQL / StreamSQL to Flink

Fabian Hueske-2
Thanks for the initiative Vasia!
I went over the diff and didn't find anything crucial.

I would like to do another pass over the tests though and improve the
exceptions for invalid joins before merging.
Will open a PR later today.

2016-03-16 21:17 GMT+01:00 Vasiliki Kalavri <[hidden email]>:

> Yes, the current state corresponds to Task 1. PR #1770 corresponds to Task
> 5. Task 6 should come right after :)
>
> -V.
>
> On 16 March 2016 at 20:35, Robert Metzger <[hidden email]> wrote:
>
> > Cool, this is great news!
> > So "Task 1" from the document [1] is done with the merge? And PR #1770 is
> > going towards "Task 6".
> > I think good support for Stream SQL is a very interesting new feature for
> > Flink.
> >
> > [1] https://docs.google.com/document/d/1TLayJNOTBle_-m1rQfgA6Ouj1oYsfqRjPcp1h2TVqdI/edit#heading=h.28dvisn56su0
> >
> > On Wed, Mar 16, 2016 at 6:17 PM, Vasiliki Kalavri <[hidden email]> wrote:
> >
> > > Hello everyone,
> > >
> > > We are happy to announce that the "tableOnCalcite" branch is finally ready
> > > to be merged.
> > > It essentially provides the existing functionality of the Table API, but
> > > now the translation happens through Apache Calcite.
> > > You can find the changes rebased on top of the current master in [1].
> > > We have removed the prototype streaming Table API functionality, which will
> > > be added back once PR [2] is merged.
> > >
> > > We'll go through the changes once more and, if there are no objections, we
> > > would like to go ahead and merge this.
> > >
> > > Cheers,
> > > -Vasia.
> > >
> > > [1]: https://github.com/vasia/flink/tree/merge-table
> > > [2]: https://github.com/apache/flink/pull/1770
> > >
> > >
> > > On 15 January 2016 at 10:59, Fabian Hueske <[hidden email]> wrote:
> > >
> > > > Hi everybody,
> > > >
> > > > as previously announced, I pushed a feature branch called "tableOnCalcite"
> > > > to the Flink repository.
> > > > We will use this branch to work on FLINK-3221 and its sub-issues.
> > > >
> > > > Cheers, Fabian
> > > >
> > > > 2016-01-11 18:29 GMT+01:00 Fabian Hueske <[hidden email]>:
> > > >
> > > > > We haven't defined the StreamSQL syntax yet (and I think it will take some
> > > > > time until we are at that point).
> > > > > So we are quite flexible with both features.
> > > > >
> > > > > Let's keep this opportunity in mind and coordinate before making
> > > > > decisions about CEP or StreamSQL.
> > > > >
> > > > > Fabian
> > > > >
> > > > > 2016-01-11 17:29 GMT+01:00 Till Rohrmann <[hidden email]>:
> > > > >
> > > > > >> First of all, it's a great design document. Looking forward to having
> > > > > >> stream SQL in the foreseeable future :-)
> > > > > >>
> > > > > >> I think it is a good idea to consolidate stream SQL and CEP in the long
> > > > > >> run. CEP's additional features compared to SQL boil down to pattern
> > > > > >> detection. Once we have this, it should only be a question of defining the
> > > > > >> SQL syntax for event patterns in order to integrate CEP with stream SQL.
> > > > > >> Oracle has already defined an extension [1] to detect patterns in a set of
> > > > > >> table rows. This or Esper's event processing language (EPL) [2] could be a
> > > > > >> good starting point.
> > > > > >>
> > > > > >> [1] https://docs.oracle.com/database/121/DWHSG/pattern.htm#DWHSG8959
> > > > > >> [2] http://www.espertech.com/esper/release-5.2.0/esper-reference/html/
> > > > >>
> > > > >> Cheers,
> > > > >> Till
> > > > >>
> > > > > >> On Mon, Jan 11, 2016 at 10:12 AM, Fabian Hueske <[hidden email]> wrote:
> > > > > >>
> > > > > >> > Thanks for the feedback!
> > > > > >> >
> > > > > >> > We will start the SQL effort by putting the existing (batch) Table API on
> > > > > >> > top of Apache Calcite.
> > > > > >> > From there we continue to add streaming support for the Table API before we
> > > > > >> > put a StreamSQL interface on top.
> > > > > >> >
> > > > > >> > Consolidating the efforts with the CEP library sounds like a good idea to
> > > > > >> > me.
> > > > > >> > Maybe it can be nicely integrated with the streaming Table API and later
> > > > > >> > also with the StreamSQL interface (the StreamSQL dialect is not defined
> > > > > >> > yet).
> > > > > >> >
> > > > > >> > @Till: What do you think about adding CEP features to the Table API?
> > > > > >> > From the CEP design doc, it looks like we need to add a pattern matching
> > > > > >> > operator in addition to the window features that we need to add for the
> > > > > >> > streaming Table API in any case.
> > > > >> >
> > > > >> > Best, Fabian
> > > > >> >
> > > > >> > 2016-01-11 4:03 GMT+01:00 Jiangsong (Hi) <
> [hidden email]
> > >:
> > > > >> >
> > > > >> > > I suggest refering to Esper EPL[1], which is a SQL-standard
> > > language
> > > > >> > > extend to offering a cluster of window, pattern matching.  EPL
> > can
> > > > >> both
> > > > >> > > support Streaming SQL and CEP with one unified syntax.
> > > > >> > >
> > > > >> > > [1]
> > > > >> > >
> > > > >> >
> > > > >>
> > > >
> > >
> >
> http://www.espertech.com/esper/release-5.2.0/esper-reference/pdf/esper_reference.pdf
> > > > >> > >   (Chapter 5. EPL Reference: Clauses)
> > > > >> > >
> > > > >> > >
> > > > >> > > Regards
> > > > >> > > Song
> > > > >> > >
> > > > >> > >
> > > > >> > > -----邮件原件-----
> > > > >> > > 发件人: Chiwan Park [mailto:[hidden email]]
> > > > >> > > 发送时间: 2016年1月11日 10:31
> > > > >> > > 收件人: [hidden email]
> > > > >> > > 主题: Re: Effort to add SQL / StreamSQL to Flink
> > > > >> > >
> > > > >> > > We still don’t have a concensus about the streaming SQL and
> CEP
> > > > >> library
> > > > >> > on
> > > > >> > > Flink. Some people want to merge these two libraries. Maybe we
> > > have
> > > > to
> > > > >> > > discuss about this in mailing list.
> > > > >> > >
> > > > > >> > > > On Jan 11, 2016, at 10:53 AM, Nick Dimiduk <[hidden email]> wrote:
> > > > > >> > > >
> > > > > >> > > > What's the relationship between the streaming SQL proposed here and
> > > > > >> > > > the CEP syntax proposed earlier in the week?
> > > > > >> > > >
> > > > > >> > > > On Sunday, January 10, 2016, Henry Saputra <[hidden email]> wrote:
> > > > >> > > >
> > > > >> > > >> Awesome! Thanks for the reply, Fabian.
> > > > >> > > >>
> > > > >> > > >> - Henry
> > > > >> > > >>
> > > > > >> > > >> On Sunday, January 10, 2016, Fabian Hueske <[hidden email]> wrote:
> > > > > >> > > >>
> > > > > >> > > >>> Hi Henry,
> > > > > >> > > >>>
> > > > > >> > > >>> There is https://issues.apache.org/jira/browse/FLINK-2099 and a few
> > > > > >> > > >>> subissues.
> > > > > >> > > >>> I'll reorganize these and add more issues for the tasks described in
> > > > > >> > > >>> the design document in the next few days.
> > > > > >> > > >>>
> > > > > >> > > >>> Thanks, Fabian
> > > > > >> > > >>>
> > > > > >> > > >>> 2016-01-10 2:45 GMT+01:00 Henry Saputra <[hidden email]>:
> > > > > >> > > >>>
> > > > > >> > > >>>> Hi Fabian,
> > > > > >> > > >>>>
> > > > > >> > > >>>> Have you created a JIRA ticket to keep track of this new feature?
> > > > > >> > > >>>>
> > > > > >> > > >>>> - Henry
> > > > > >> > > >>>>
> > > > > >> > > >>>> On Thu, Jan 7, 2016 at 6:05 AM, Fabian Hueske <[hidden email]> wrote:
> > > > >> > > >>>>> Hi everybody,
> > > > >> > > >>>>>
> > > > > >> > > >>>>> over the last few days, Timo and I refined the design document for
> > > > > >> > > >>>>> adding a SQL / StreamSQL interface on top of Flink that was started
> > > > > >> > > >>>>> by Stephan.
> > > > > >> > > >>>>>
> > > > > >> > > >>>>> The document proposes an architecture that is centered around
> > > > > >> > > >>>>> Apache Calcite. Calcite is an Apache top-level project and
> > > > > >> > > >>>>> includes a SQL parser, a semantic validator for relational queries,
> > > > > >> > > >>>>> and a rule- and cost-based relational optimizer. Calcite is used by
> > > > > >> > > >>>>> Apache Hive and Apache Drill (among other projects). In a nutshell,
> > > > > >> > > >>>>> the plan is to translate Table API and SQL queries into Calcite's
> > > > > >> > > >>>>> relational expression trees, optimize these trees, and translate
> > > > > >> > > >>>>> them into DataSet and DataStream programs. The document breaks down
> > > > > >> > > >>>>> the work into several tasks and subtasks.
> > > > >> > > >>>>>
> > > > >> > > >>>>> Please review the design document and comment.
> > > > >> > > >>>>>
> > > > >> > > >>>>> -- >
> > > > >> > > >>>>>
> > > > >> > > >>>>
> > > > >> > > >>>
> > > > >> > > >>
> > > > >>
> > https://docs.google.com/document/d/1TLayJNOTBle_-m1rQfgA6Ouj1oYsfqRjP
> > > > >> > > >> cp1h2TVqdI/edit?usp=sharing
> > > > >> > > >>>>>
> > > > > >> > > >>>>> Unless there are major concerns with the design, Timo and I want to
> > > > > >> > > >>>>> start next week to move the current Table API on top of Apache
> > > > > >> > > >>>>> Calcite (Task 1 in the document). The goal of this task is to have
> > > > > >> > > >>>>> the same functionality as currently, but with Calcite in the
> > > > > >> > > >>>>> translation process. This is a blocking task that we hope to
> > > > > >> > > >>>>> complete soon. Afterwards, we can independently work on different
> > > > > >> > > >>>>> aspects such as extending the Table API, adding a SQL interface
> > > > > >> > > >>>>> (basically just a parser), integration with external data sources,
> > > > > >> > > >>>>> better code generation, optimization rules, streaming support for
> > > > > >> > > >>>>> the Table API, StreamSQL, etc.
> > > > > >> > > >>>>>
> > > > > >> > > >>>>> Timo and I plan to work on a WIP branch to implement Task 1 and
> > > > > >> > > >>>>> merge it to the master branch once the task is completed. Of course,
> > > > > >> > > >>>>> everybody is welcome to contribute to this effort. Please let us
> > > > > >> > > >>>>> know so that we can coordinate our efforts.
> > > > >> > > >>>>>
> > > > >> > > >>>>> Thanks,
> > > > >> > > >>>>> Fabian
> > > > >> > >
> > > > >> > > Regards,
> > > > >> > > Chiwan Park
> > > > >> > >
> > > > >> > >
> > > > >> > >
> > > > >> >
> > > > >>
> > > > >
> > > > >
> > > >
> > >
> >
>

Re: Reply: Effort to add SQL / StreamSQL to Flink

Vasiliki Kalavri
Hi all,

tableOnCalcite has been merged to master :)

Cheers,
-Vasia.

On 17 March 2016 at 11:11, Fabian Hueske <[hidden email]> wrote:

> Thanks for the initiative Vasia!
> I went over the diff and didn't find anything crucial.
>
> I would like to do another pass over the tests though and improve the
> exceptions for invalid joins before merging.
> Will open a PR later today.

Re: Reply: Effort to add SQL / StreamSQL to Flink

mxm
Yeah! I'm a little late to the party but exciting stuff! :)

On Fri, Mar 18, 2016 at 3:15 PM, Vasiliki Kalavri <[hidden email]> wrote:

> Hi all,
>
> tableOnCalcite has been merged to master :)
>
> Cheers,
> -Vasia.
> > > > > > >> > > >>>>
> > > > > > >> > > >>>
> > > > > > >> > > >>
> > > > > > >>
> > > >
> https://docs.google.com/document/d/1TLayJNOTBle_-m1rQfgA6Ouj1oYsfqRjP
> > > > > > >> > > >> cp1h2TVqdI/edit?usp=sharing
> > > > > > >> > > >>>>>
> > > > > > >> > > >>>>> Unless there are major concerns with the design,
> Timo
> > > and
> > > > I
> > > > > > want
> > > > > > >> > > >>>>> to
> > > > > > >> > > >>> start
> > > > > > >> > > >>>>> next week to move the current Table API on top of
> > Apache
> > > > > > Calcite
> > > > > > >> > > >> (Task
> > > > > > >> > > >>> 1
> > > > > > >> > > >>>> in
> > > > > > >> > > >>>>> the document). The goal of this task is to have the
> > same
> > > > > > >> > > >> functionality
> > > > > > >> > > >>> as
> > > > > > >> > > >>>>> currently, but with Calcite in the translation
> > process.
> > > > This
> > > > > > is
> > > > > > >> a
> > > > > > >> > > >>>> blocking
> > > > > > >> > > >>>>> task that we hope to complete soon. Afterwards, we
> can
> > > > > > >> > > >>>>> independently
> > > > > > >> > > >>> work
> > > > > > >> > > >>>>> on different aspects such as extending the Table
> API,
> > > > > adding a
> > > > > > >> SQL
> > > > > > >> > > >>>>> interface (basically just a parser), integration
> with
> > > > > external
> > > > > > >> > > >>>>> data sources, better code generation, optimization
> > > rules,
> > > > > > >> > > >>>>> streaming
> > > > > > >> > > >> support
> > > > > > >> > > >>>> for
> > > > > > >> > > >>>>> the Table API, StreamSQL, etc..
> > > > > > >> > > >>>>>
> > > > > > >> > > >>>>> Timo and I plan to work on a WIP branch to implement
> > > Task
> > > > 1
> > > > > > and
> > > > > > >> > > >>>>> merge
> > > > > > >> > > >>> it
> > > > > > >> > > >>>> to
> > > > > > >> > > >>>>> the master branch once the task is completed. Of
> > course,
> > > > > > >> everybody
> > > > > > >> > > >>>>> is welcome to contribute to this effort. Please let
> us
> > > > know
> > > > > > such
> > > > > > >> > > >>>>> that we
> > > > > > >> > > >>> can
> > > > > > >> > > >>>>> coordinate our efforts.
> > > > > > >> > > >>>>>
> > > > > > >> > > >>>>> Thanks,
> > > > > > >> > > >>>>> Fabian
> > > > > > >> > >
> > > > > > >> > > Regards,
> > > > > > >> > > Chiwan Park
> > > > > > >> > >
> > > > > > >> > >
> > > > > > >> > >
> > > > > > >> >
> > > > > > >>
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: Re: Effort to add SQL / StreamSQL to Flink

Stephan Ewen
Cool stuff!

SQL coming up next? ;-)


On Tue, Mar 29, 2016 at 1:39 PM, Maximilian Michels <[hidden email]> wrote:

> Yeah! I'm a little late to the party but exciting stuff! :)
>
> On Fri, Mar 18, 2016 at 3:15 PM, Vasiliki Kalavri <
> [hidden email]
> > wrote:
>
> > Hi all,
> >
> > tableOnCalcite has been merged to master :)
> >
> > Cheers,
> > -Vasia.
> >
> > On 17 March 2016 at 11:11, Fabian Hueske <[hidden email]> wrote:
> >
> > > Thanks for the initiative Vasia!
> > > I went over the diff and didn't find anything crucial.
> > >
> > > I would like to do another pass over the tests though and improve the
> > > exceptions for invalid joins before merging.
> > > Will open a PR later today.
> > >
> > > 2016-03-16 21:17 GMT+01:00 Vasiliki Kalavri <[hidden email]
> >:
> > >
> > > > Yes, the current state corresponds to Task 1. PR #1770 corresponds to
> > > Task
> > > > 5. Task 6 should come right after :)
> > > >
> > > > -V.
> > > >
> > > > On 16 March 2016 at 20:35, Robert Metzger <[hidden email]>
> wrote:
> > > >
> > > > > Cool, this is great news!
> > > > > So "Task 1" from the document [1] is done with the merge? And PR
> > #1770
> > > is
> > > > > going towards "Task 6".
> > > > > I think good support for Stream SQL is a very interesting new
> feature
> > > for
> > > > > Flink.
> > > > >
> > > > > [1]
> > > > >
> > > > >
> > > >
> > >
> >
> https://docs.google.com/document/d/1TLayJNOTBle_-m1rQfgA6Ouj1oYsfqRjPcp1h2TVqdI/edit#heading=h.28dvisn56su0
> > > > >
> > > > > On Wed, Mar 16, 2016 at 6:17 PM, Vasiliki Kalavri <
> > > > > [hidden email]
> > > > > > wrote:
> > > > >
> > > > > > Hello everyone,
> > > > > >
> > > > > > We are happy to announce that the "tableOnCalcite" branch is
> > finally
> > > > > ready
> > > > > > to be merged.
> > > > > > It essentially provides the existing functionality of the Table
> > API,
> > > > but
> > > > > > now the translation happens through Apache Calcite.
> > > > > > You can find the changes rebased on top of the current master in
> > [1].
> > > > > > We have removed the prototype streaming Table API functionality,
> > > which
> > > > > will
> > > > > > be added back once PR [2] is merged.
> > > > > >
> > > > > > We'll go through the changes once more and, if no objections, we
> > > would
> > > > > like
> > > > > > to go ahead and merge this.
> > > > > >
> > > > > > Cheers,
> > > > > > -Vasia.
> > > > > >
> > > > > > [1]: https://github.com/vasia/flink/tree/merge-table
> > > > > > [2]: https://github.com/apache/flink/pull/1770
> > > > > >
> > > > > >
> > > > > > On 15 January 2016 at 10:59, Fabian Hueske <[hidden email]>
> > > wrote:
> > > > > >
> > > > > > > Hi everybody,
> > > > > > >
> > > > > > > as previously announced, I pushed a feature branch called
> > > > > > "tableOnCalcite"
> > > > > > > to the Flink repository.
> > > > > > > We will use this branch to work on FLINK-3221 and its
> sub-issues.
> > > > > > >
> > > > > > > Cheers, Fabian
> > > > > > >
> > > > > > > 2016-01-11 18:29 GMT+01:00 Fabian Hueske <[hidden email]>:
> > > > > > >
> > > > > > > > > We haven't defined the StreamSQL syntax yet (and I think it will take
> > > > > > > > > some time until we are at that point).
> > > > > > > > > So we are quite flexible with both features.
> > > > > > > >
> > > > > > > > > Let's keep this opportunity in mind and coordinate before making
> > > > > > > > > decisions about CEP or StreamSQL.
> > > > > > > >
> > > > > > > > Fabian
> > > > > > > >
> > > > > > > > 2016-01-11 17:29 GMT+01:00 Till Rohrmann <
> [hidden email]
> > >:
> > > > > > > >
> > > > > > > > >> First of all, it's a great design document. Looking forward to having
> > > > > > > > >> stream SQL in the foreseeable future :-)
> > > > > > > >>
> > > > > > > >> I think it is a good idea to consolidate stream SQL and CEP
> in
> > > the
> > > > > > long
> > > > > > > >> run. CEP's additional features compared to SQL boil down to
> > > > pattern
> > > > > > > >> detection. Once we have this, it should be only a question
> of
> > > > > defining
> > > > > > > the
> > > > > > > >> SQL syntax for event patterns in order to integrate CEP with
> > > > stream
> > > > > > SQL.
> > > > > > > >> Oracle has already defined an extension [1] to detect
> patterns
> > > in
> > > > a
> > > > > > set
> > > > > > > of
> > > > > > > >> table rows. This or Esper's event processing language (EPL)
> > [2]
> > > > > could
> > > > > > > be a
> > > > > > > >> good starting point.
> > > > > > > >>
> > > > > > > >> [1]
> > > > > https://docs.oracle.com/database/121/DWHSG/pattern.htm#DWHSG8959
> > > > > > > >> [2]
> > > > > >
> http://www.espertech.com/esper/release-5.2.0/esper-reference/html/
> > > > > > > >>
> > > > > > > >> Cheers,
> > > > > > > >> Till
> > > > > > > >>
> > > > > > > >> On Mon, Jan 11, 2016 at 10:12 AM, Fabian Hueske <
> > > > [hidden email]>
> > > > > > > >> wrote:
> > > > > > > >>
> > > > > > > >> > Thanks for the feedback!
> > > > > > > >> >
> > > > > > > >> > We will start the SQL effort with putting the existing
> > (batch)
> > > > > Table
> > > > > > > >> API on
> > > > > > > >> > top of Apache Calcite.
> > > > > > > >> > From there we continue to add streaming support for the
> > Table
> > > > API
> > > > > > > >> before we
> > > > > > > >> > put a StreamSQL interface on top.
> > > > > > > >> >
> > > > > > > >> > Consolidating the efforts with the CEP library sounds
> like a
> > > > good
> > > > > > idea
> > > > > > > >> to
> > > > > > > >> > me.
> > > > > > > >> > Maybe it can be nicely integrated with the streaming table
> > API
> > > > and
> > > > > > > >> later as
> > > > > > > >> > well with the StreamSQL interface (the StreamSQL dialect
> is
> > > not
> > > > > > > defined
> > > > > > > >> > yet).
> > > > > > > >> >
> > > > > > > > >> > @Till: What do you think about adding CEP features to the Table API?
> > > > > > > > >> > From the CEP design doc, it looks like we need to add a pattern matching
> > > > > > > > >> > operator in addition to the window features that we need to add for the
> > > > > > > > >> > streaming Table API in any case.
> > > > > > > >> >
> > > > > > > >> > Best, Fabian
> > > > > > > >> >
> > > > > > > >> > 2016-01-11 4:03 GMT+01:00 Jiangsong (Hi) <
> > > > [hidden email]
> > > > > >:
> > > > > > > >> >
> > > > > > > > >> > > I suggest referring to Esper EPL [1], which extends the SQL-standard
> > > > > > > > >> > > language to offer a set of window and pattern-matching features. EPL
> > > > > > > > >> > > can support both streaming SQL and CEP with one unified syntax.
> > > > > > > > >> > >
> > > > > > > >> > >
> > > > > > > >> > > [1]
> > > > > > > >> > >
> > > > > > > >> >
> > > > > > > >>
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> http://www.espertech.com/esper/release-5.2.0/esper-reference/pdf/esper_reference.pdf
> > > > > > > >> > >   (Chapter 5. EPL Reference: Clauses)
> > > > > > > >> > >
> > > > > > > >> > >
> > > > > > > >> > > Regards
> > > > > > > >> > > Song
> > > > > > > >> > >
> > > > > > > >> > >
> > > > > > > >> > > -----邮件原件-----
> > > > > > > >> > > 发件人: Chiwan Park [mailto:[hidden email]]
> > > > > > > >> > > 发送时间: 2016年1月11日 10:31
> > > > > > > >> > > 收件人: [hidden email]
> > > > > > > >> > > 主题: Re: Effort to add SQL / StreamSQL to Flink
> > > > > > > >> > >
> > > > > > > > >> > > We still don't have a consensus about the streaming SQL and CEP
> > > > > > > > >> > > libraries on Flink. Some people want to merge these two libraries.
> > > > > > > > >> > > Maybe we have to discuss this on the mailing list.
> > > > > > > >> > >
> > > > > > > >> > > > On Jan 11, 2016, at 10:53 AM, Nick Dimiduk <
> > > > > [hidden email]>
> > > > > > > >> wrote:
> > > > > > > >> > > >
> > > > > > > >> > > > What's the relationship between the streaming SQL
> > proposed
> > > > > here
> > > > > > > and
> > > > > > > >> > > > the CEP syntax proposed earlier in the week?
> > > > > > > >> > > >
> > > > > > > >> > > > On Sunday, January 10, 2016, Henry Saputra <
> > > > > > > [hidden email]
> > > > > > > >> >
> > > > > > > >> > > wrote:
> > > > > > > >> > > >
> > > > > > > >> > > >> Awesome! Thanks for the reply, Fabian.
> > > > > > > >> > > >>
> > > > > > > >> > > >> - Henry
> > > > > > > >> > > >>
> > > > > > > > >> > > >> On Sunday, January 10, 2016, Fabian Hueske <[hidden email]> wrote:
> > > > > > > >> > > >>
> > > > > > > >> > > >>> Hi Henry,
> > > > > > > >> > > >>>
> > > > > > > >> > > >>> There is
> > > https://issues.apache.org/jira/browse/FLINK-2099
> > > > > > and a
> > > > > > > >> few
> > > > > > > >> > > >>> subissues.
> > > > > > > >> > > >>> I'll reorganize these and add more issues for the
> > tasks
> > > > > > > described
> > > > > > > >> in
> > > > > > > >> > > >>> the design document in the next days.
> > > > > > > >> > > >>>
> > > > > > > >> > > >>> Thanks, Fabian
> > > > > > > >> > > >>>
> > > > > > > >> > > >>> 2016-01-10 2:45 GMT+01:00 Henry Saputra <
> > > > > > > [hidden email]
> > > > > > > >> > > >> <javascript:;>
> > > > > > > >> > > >>> <javascript:;>>:
> > > > > > > >> > > >>>
> > > > > > > > >> > > >>>> Hi Fabian,
> > > > > > > > >> > > >>>>
> > > > > > > > >> > > >>>> Have you created a JIRA ticket to keep track of this new feature?
> > > > > > > >> > > >>>>
> > > > > > > >> > > >>>> - Henry
> > > > > > > >> > > >>>>
> > > > > > > > >> > > >>>> On Thu, Jan 7, 2016 at 6:05 AM, Fabian Hueske <[hidden email]> wrote:
> > > > > > > >> > > >>>>> Hi everybody,
> > > > > > > >> > > >>>>>
> > > > > > > > >> > > >>>>> in the last days, Timo and I refined the design document for
> > > > > > > > >> > > >>>>> adding a SQL / StreamSQL interface on top of Flink that was
> > > > > > > > >> > > >>>>> started by Stephan.
> > > > > > > >> > > >>>>>
> > > > > > > > >> > > >>>>> The document proposes an architecture that is centered around
> > > > > > > > >> > > >>>>> Apache Calcite. Calcite is an Apache top-level project and
> > > > > > > > >> > > >>>>> includes a SQL parser, a semantic validator for relational
> > > > > > > > >> > > >>>>> queries, and a rule- and cost-based relational optimizer.
> > > > > > > > >> > > >>>>> Calcite is used by Apache Hive and Apache Drill (among other
> > > > > > > > >> > > >>>>> projects). In a nutshell, the plan is to translate Table API
> > > > > > > > >> > > >>>>> and SQL queries into Calcite's relational expression trees,
> > > > > > > > >> > > >>>>> optimize these trees, and translate them into DataSet and
> > > > > > > > >> > > >>>>> DataStream programs. The document breaks down the work into
> > > > > > > > >> > > >>>>> several tasks and subtasks.
> > > > > > > >> > > >>>>>
> > > > > > > > >> > > >>>>> Please review the design document and comment.
> > > > > > > > >> > > >>>>>
> > > > > > > > >> > > >>>>> https://docs.google.com/document/d/1TLayJNOTBle_-m1rQfgA6Ouj1oYsfqRjPcp1h2TVqdI/edit?usp=sharing
> > > > > > > >> > > >>>>>
> > > > > > > > >> > > >>>>> Unless there are major concerns with the design, Timo and I
> > > > > > > > >> > > >>>>> want to start next week to move the current Table API on top
> > > > > > > > >> > > >>>>> of Apache Calcite (Task 1 in the document). The goal of this
> > > > > > > > >> > > >>>>> task is to have the same functionality as currently, but with
> > > > > > > > >> > > >>>>> Calcite in the translation process. This is a blocking task
> > > > > > > > >> > > >>>>> that we hope to complete soon. Afterwards, we can
> > > > > > > > >> > > >>>>> independently work on different aspects such as extending the
> > > > > > > > >> > > >>>>> Table API, adding a SQL interface (basically just a parser),
> > > > > > > > >> > > >>>>> integration with external data sources, better code
> > > > > > > > >> > > >>>>> generation, optimization rules, streaming support for the
> > > > > > > > >> > > >>>>> Table API, StreamSQL, etc.
> > > > > > > >> > > >>>>>
> > > > > > > > >> > > >>>>> Timo and I plan to work on a WIP branch to implement Task 1
> > > > > > > > >> > > >>>>> and merge it to the master branch once the task is completed.
> > > > > > > > >> > > >>>>> Of course, everybody is welcome to contribute to this effort.
> > > > > > > > >> > > >>>>> Please let us know so that we can coordinate our efforts.
> > > > > > > >> > > >>>>>
> > > > > > > >> > > >>>>> Thanks,
> > > > > > > >> > > >>>>> Fabian
> > > > > > > >> > >
> > > > > > > >> > > Regards,
> > > > > > > >> > > Chiwan Park
> > > > > > > >> > >
> > > > > > > >> > >
> > > > > > > >> > >
> > > > > > > >> >
> > > > > > > >>
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: Re: Effort to add SQL / StreamSQL to Flink

Jiangsong (Hi)
So excited! Is SQL on Flink ready?

Are there any showcases or how-to guides?



-----邮件原件-----
发件人: [hidden email] [mailto:[hidden email]] 代表 Stephan Ewen
发送时间: 2016年3月29日 20:00
收件人: [hidden email]
主题: Re: 答复: Effort to add SQL / StreamSQL to Flink

Cool stuff!

SQL coming up next? ;-)


On Tue, Mar 29, 2016 at 1:39 PM, Maximilian Michels <[hidden email]> wrote:

> Yeah! I'm a little late to the party but exciting stuff! :)
>
> On Fri, Mar 18, 2016 at 3:15 PM, Vasiliki Kalavri <
> [hidden email]
> > wrote:
>
> > Hi all,
> >
> > tableOnCalcite has been merged to master :)
> >
> > Cheers,
> > -Vasia.
> >
> > On 17 March 2016 at 11:11, Fabian Hueske <[hidden email]> wrote:
> >
> > > Thanks for the initiative Vasia!
> > > I went over the diff and didn't find anything crucial.
> > >
> > > I would like to do another pass over the tests though and improve
> > > the exceptions for invalid joins before merging.
> > > Will open a PR later today.
> > >
> > > 2016-03-16 21:17 GMT+01:00 Vasiliki Kalavri
> > > <[hidden email]
> >:
> > >
> > > > Yes, the current state corresponds to Task 1. PR #1770
> > > > corresponds to
> > > Task
> > > > 5. Task 6 should come right after :)
> > > >
> > > > -V.
> > > >
> > > > On 16 March 2016 at 20:35, Robert Metzger <[hidden email]>
> wrote:
> > > >
> > > > > Cool, this is great news!
> > > > > So "Task 1" from the document [1] is done with the merge? And
> > > > > PR
> > #1770
> > > is
> > > > > going towards "Task 6".
> > > > > I think good support for Stream SQL is a very interesting new
> feature
> > > for
> > > > > Flink.
> > > > >
> > > > > [1]
> > > > >
> > > > >
> > > >
> > >
> >
> https://docs.google.com/document/d/1TLayJNOTBle_-m1rQfgA6Ouj1oYsfqRjPc
> p1h2TVqdI/edit#heading=h.28dvisn56su0
> > > > >
> > > > > On Wed, Mar 16, 2016 at 6:17 PM, Vasiliki Kalavri <
> > > > > [hidden email]
> > > > > > wrote:
> > > > >
> > > > > > Hello everyone,
> > > > > >
> > > > > > We are happy to announce that the "tableOnCalcite" branch is
> > finally
> > > > > ready
> > > > > > to be merged.
> > > > > > It essentially provides the existing functionality of the
> > > > > > Table
> > API,
> > > > but
> > > > > > now the translation happens through Apache Calcite.
> > > > > > You can find the changes rebased on top of the current
> > > > > > master in
> > [1].
> > > > > > We have removed the prototype streaming Table API
> > > > > > functionality,
> > > which
> > > > > will
> > > > > > be added back once PR [2] is merged.
> > > > > >
> > > > > > We'll go through the changes once more and, if no
> > > > > > objections, we
> > > would
> > > > > like
> > > > > > to go ahead and merge this.
> > > > > >
> > > > > > Cheers,
> > > > > > -Vasia.
> > > > > >
> > > > > > [1]: https://github.com/vasia/flink/tree/merge-table
> > > > > > [2]: https://github.com/apache/flink/pull/1770
> > > > > >
> > > > > >
> > > > > > On 15 January 2016 at 10:59, Fabian Hueske
> > > > > > <[hidden email]>
> > > wrote:
> > > > > >
> > > > > > > Hi everybody,
> > > > > > >
> > > > > > > as previously announced, I pushed a feature branch called
> > > > > > "tableOnCalcite"
> > > > > > > to the Flink repository.
> > > > > > > We will use this branch to work on FLINK-3221 and its
> sub-issues.
> > > > > > >
> > > > > > > Cheers, Fabian
> > > > > > >
> > > > > > > 2016-01-11 18:29 GMT+01:00 Fabian Hueske <[hidden email]>:
> > > > > > >
> > > > > > > > We haven't defined the StreamSQL syntax yet (and I think
> > > > > > > > it
> > will
> > > > take
> > > > > > > some
> > > > > > > > time until we are at that point).
> > > > > > > > So we are quite flexible with both featurs.
> > > > > > > >
> > > > > > > > Let's keep this opportunity in mind and coordinate when
> before
> > > > making
> > > > > > > > decisions about CEP or StreamSQL.
> > > > > > > >
> > > > > > > > Fabian
> > > > > > > >
> > > > > > > > 2016-01-11 17:29 GMT+01:00 Till Rohrmann <
> [hidden email]
> > >:
> > > > > > > >
> > > > > > > >> First of all, it's a great design document. Looking
> > > > > > > >> forward
> > > having
> > > > > > > stream
> > > > > > > >> SQL in the foreseeable future :-)
> > > > > > > >>
> > > > > > > >> I think it is a good idea to consolidate stream SQL and
> > > > > > > >> CEP
> in
> > > the
> > > > > > long
> > > > > > > >> run. CEP's additional features compared to SQL boil
> > > > > > > >> down to
> > > > pattern
> > > > > > > >> detection. Once we have this, it should be only a
> > > > > > > >> question
> of
> > > > > defining
> > > > > > > the
> > > > > > > >> SQL syntax for event patterns in order to integrate CEP
> > > > > > > >> with
> > > > stream
> > > > > > SQL.
> > > > > > > >> Oracle has already defined an extension [1] to detect
> patterns
> > > in
> > > > a
> > > > > > set
> > > > > > > of
> > > > > > > >> table rows. This or Esper's event processing language
> > > > > > > >> (EPL)
> > [2]
> > > > > could
> > > > > > > be a
> > > > > > > >> good starting point.
> > > > > > > >>
> > > > > > > >> [1]
> > > > > https://docs.oracle.com/database/121/DWHSG/pattern.htm#DWHSG89
> > > > > 59
> > > > > > > >> [2]
> > > > > >
> http://www.espertech.com/esper/release-5.2.0/esper-reference/html/
> > > > > > > >>
> > > > > > > >> Cheers,
> > > > > > > >> Till
> > > > > > > >>
> > > > > > > >> On Mon, Jan 11, 2016 at 10:12 AM, Fabian Hueske <
> > > > [hidden email]>
> > > > > > > >> wrote:
> > > > > > > >>
> > > > > > > >> > Thanks for the feedback!
> > > > > > > >> >
> > > > > > > >> > We will start the SQL effort with putting the
> > > > > > > >> > existing
> > (batch)
> > > > > Table
> > > > > > > >> API on
> > > > > > > >> > top of Apache Calcite.
> > > > > > > >> > From there we continue to add streaming support for
> > > > > > > >> > the
> > Table
> > > > API
> > > > > > > >> before we
> > > > > > > >> > put a StreamSQL interface on top.
> > > > > > > >> >
> > > > > > > >> > Consolidating the efforts with the CEP library sounds
> like a
> > > > good
> > > > > > idea
> > > > > > > >> to
> > > > > > > >> > me.
> > > > > > > >> > Maybe it can be nicely integrated with the streaming
> > > > > > > >> > table
> > API
> > > > and
> > > > > > > >> later as
> > > > > > > >> > well with the StreamSQL interface (the StreamSQL
> > > > > > > >> > dialect
> is
> > > not
> > > > > > > defined
> > > > > > > >> > yet).
> > > > > > > >> >
> > > > > > > >> > @Till: What do you think about adding CEP features to
> > > > > > > >> > the
> > > Table
> > > > > API.
> > > > > > > >> From
> > > > > > > >> > the CEP design doc, it looks like we need to add a
> > > > > > > >> > pattern
> > > > > matching
> > > > > > > >> > operator in addition to the window features that we
> > > > > > > >> > need
> to
> > > add
> > > > > for
> > > > > > > >> > streaming Table API in any case.
> > > > > > > >> >
> > > > > > > >> > Best, Fabian
> > > > > > > >> >
> > > > > > > >> > 2016-01-11 4:03 GMT+01:00 Jiangsong (Hi) <
> > > > [hidden email]
> > > > > >:
> > > > > > > >> >
> > > > > > > >> > > I suggest refering to Esper EPL[1], which is a
> > SQL-standard
> > > > > > language
> > > > > > > >> > > extend to offering a cluster of window, pattern
> matching.
> > > EPL
> > > > > can
> > > > > > > >> both
> > > > > > > >> > > support Streaming SQL and CEP with one unified syntax.
> > > > > > > >> > >
> > > > > > > >> > > [1]
> > > > > > > >> > >
> > > > > > > >> >
> > > > > > > >>
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> http://www.espertech.com/esper/release-5.2.0/esper-reference/pdf/esper
> _reference.pdf
> > > > > > > >> > >   (Chapter 5. EPL Reference: Clauses)
> > > > > > > >> > >
> > > > > > > >> > >
> > > > > > > >> > > Regards
> > > > > > > >> > > Song
> > > > > > > >> > >
> > > > > > > >> > >
> > > > > > > >> > > -----邮件原件-----
> > > > > > > >> > > 发件人: Chiwan Park [mailto:[hidden email]]
> > > > > > > >> > > 发送时间: 2016年1月11日 10:31
> > > > > > > >> > > 收件人: [hidden email]
> > > > > > > >> > > 主题: Re: Effort to add SQL / StreamSQL to Flink
> > > > > > > >> > >
> > > > > > > >> > > We still don’t have a concensus about the streaming
> > > > > > > >> > > SQL
> > and
> > > > CEP
> > > > > > > >> library
> > > > > > > >> > on
> > > > > > > >> > > Flink. Some people want to merge these two libraries.
> > Maybe
> > > we
> > > > > > have
> > > > > > > to
> > > > > > > >> > > discuss about this in mailing list.
> > > > > > > >> > >
> > > > > > > >> > > > On Jan 11, 2016, at 10:53 AM, Nick Dimiduk <
> > > > > [hidden email]>
> > > > > > > >> wrote:
> > > > > > > >> > > >
> > > > > > > >> > > > What's the relationship between the streaming SQL
> > proposed
> > > > > here
> > > > > > > and
> > > > > > > >> > > > the CEP syntax proposed earlier in the week?
> > > > > > > >> > > >
> > > > > > > >> > > > On Sunday, January 10, 2016, Henry Saputra <
> > > > > > > [hidden email]
> > > > > > > >> >
> > > > > > > >> > > wrote:
> > > > > > > >> > > >
> > > > > > > >> > > >> Awesome! Thanks for the reply, Fabian.
> > > > > > > >> > > >>
> > > > > > > >> > > >> - Henry
> > > > > > > >> > > >>
> > > > > > > >> > > >> On Sunday, January 10, 2016, Fabian Hueske <
> > > > > [hidden email]
> > > > > > > >> > > >> <javascript:;>> wrote:
> > > > > > > >> > > >>
> > > > > > > >> > > >>> Hi Henry,
> > > > > > > >> > > >>>
> > > > > > > >> > > >>> There is
> > > https://issues.apache.org/jira/browse/FLINK-2099
> > > > > > and a
> > > > > > > >> few
> > > > > > > >> > > >>> subissues.
> > > > > > > >> > > >>> I'll reorganize these and add more issues for
> > > > > > > >> > > >>> the
> > tasks
> > > > > > > described
> > > > > > > >> in
> > > > > > > >> > > >>> the design document in the next days.
> > > > > > > >> > > >>>
> > > > > > > >> > > >>> Thanks, Fabian
> > > > > > > >> > > >>>
> > > > > > > >> > > >>> 2016-01-10 2:45 GMT+01:00 Henry Saputra <
> > > > > > > [hidden email]
> > > > > > > >> > > >> <javascript:;>
> > > > > > > >> > > >>> <javascript:;>>:
> > > > > > > >> > > >>>
> > > > > > > >> > > >>>> HI Fabian,
> > > > > > > >> > > >>>>
> > > > > > > >> > > >>>> Have you created JIRA ticket to keep track of
> > > > > > > >> > > >>>> this
> > new
> > > > > > feature?
> > > > > > > >> > > >>>>
> > > > > > > >> > > >>>> - Henry
> > > > > > > >> > > >>>>
> > > > > > > >> > > >>>> On Thu, Jan 7, 2016 at 6:05 AM, Fabian Hueske
> > > > > > > >> > > >>>> <
> > > > > > > [hidden email]
> > > > > > > >> > > >> <javascript:;>
> > > > > > > >> > > >>> <javascript:;>> wrote:
> > > > > > > >> > > >>>>> Hi everybody,
> > > > > > > >> > > >>>>>
> > > > > > > >> > > >>>>> in the last days, Timo and I refined the
> > > > > > > >> > > >>>>> design
> > > document
> > > > > for
> > > > > > > >> > > >>>>> adding a
> > > > > > > >> > > >>>> SQL /
> > > > > > > >> > > >>>>> StreamSQL interface on top of Flink that was
> started
> > > by
> > > > > > > Stephan.
> > > > > > > >> > > >>>>>
> > > > > > > >> > > >>>>> The document proposes an architecture that is
> > centered
> > > > > > around
> > > > > > > >> > > >>>>> Apache Calcite. Calcite is an Apache
> > > > > > > >> > > >>>>> top-level
> > project
> > > > and
> > > > > > > >> > > >>>>> includes a SQL
> > > > > > > >> > > >>>> parser,
> > > > > > > >> > > >>>>> a semantic validator for relational queries,
> > > > > > > >> > > >>>>> and a
> > > rule-
> > > > > and
> > > > > > > >> > > >> cost-based
> > > > > > > >> > > >>>>> relational optimizer. Calcite is used by
> > > > > > > >> > > >>>>> Apache
> Hive
> > > and
> > > > > > > Apache
> > > > > > > >> > > >>>>> Drill (among other projects). In a nutshell,
> > > > > > > >> > > >>>>> the
> > plan
> > > is
> > > > > to
> > > > > > > >> > > >>>>> translate Table
> > > > > > > >> > > >>> API
> > > > > > > >> > > >>>>> and SQL queries into Calcite's relational
> expression
> > > > > trees,
> > > > > > > >> > > >>>>> optimize
> > > > > > > >> > > >>>> these
> > > > > > > >> > > >>>>> trees, and translate them into DataSet and
> > DataStream
> > > > > > > >> programs.The
> > > > > > > >> > > >>>> document
> > > > > > > >> > > >>>>> breaks down the work into several tasks and
> > subtasks.
> > > > > > > >> > > >>>>>
> > > > > > > >> > > >>>>> Please review the design document and comment.
> > > > > > > >> > > >>>>>
> > > > > > > >> > > >>>>> -- >
> > > > > > > >> > > >>>>>
> > > > > > > >> > > >>>>
> > > > > > > >> > > >>>
> > > > > > > >> > > >>
> > > > > > > >>
> > > > >
> > https://docs.google.com/document/d/1TLayJNOTBle_-m1rQfgA6Ouj1oYsfqRj
> > P
> > > > > > > >> > > >> cp1h2TVqdI/edit?usp=sharing
> > > > > > > >> > > >>>>>
> > > > > > > >> > > >>>>> Unless there are major concerns with the
> > > > > > > >> > > >>>>> design,
> > Timo
> > > > and
> > > > > I
> > > > > > > want
> > > > > > > >> > > >>>>> to
> > > > > > > >> > > >>> start
> > > > > > > >> > > >>>>> next week to move the current Table API on
> > > > > > > >> > > >>>>> top of
> > > Apache
> > > > > > > Calcite
> > > > > > > >> > > >> (Task
> > > > > > > >> > > >>> 1
> > > > > > > >> > > >>>> in
> > > > > > > >> > > >>>>> the document). The goal of this task is to
> > > > > > > >> > > >>>>> have
> the
> > > same
> > > > > > > >> > > >> functionality
> > > > > > > >> > > >>> as
> > > > > > > >> > > >>>>> currently, but with Calcite in the
> > > > > > > >> > > >>>>> translation
> > > process.
> > > > > This
> > > > > > > is
> > > > > > > >> a
> > > > > > > >> > > >>>> blocking
> > > > > > > >> > > >>>>> task that we hope to complete soon.
> > > > > > > >> > > >>>>> Afterwards, we
> > can
> > > > > > > >> > > >>>>> independently
> > > > > > > >> > > >>> work
> > > > > > > >> > > >>>>> on different aspects such as extending the
> > > > > > > >> > > >>>>> Table
> > API,
> > > > > > adding a
> > > > > > > >> SQL
> > > > > > > >> > > >>>>> interface (basically just a parser),
> > > > > > > >> > > >>>>> integration
> > with
> > > > > > external
> > > > > > > >> > > >>>>> data sources, better code generation,
> > > > > > > >> > > >>>>> optimization
> > > > rules,
> > > > > > > >> > > >>>>> streaming
> > > > > > > >> > > >> support
> > > > > > > >> > > >>>> for
> > > > > > > >> > > >>>>> the Table API, StreamSQL, etc.
> > > > > > > >> > > >>>>>
> > > > > > > >> > > >>>>> Timo and I plan to work on a WIP branch to
> implement
> > > > Task
> > > > > 1
> > > > > > > and
> > > > > > > >> > > >>>>> merge
> > > > > > > >> > > >>> it
> > > > > > > >> > > >>>> to
> > > > > > > >> > > >>>>> the master branch once the task is completed.
> > > > > > > >> > > >>>>> Of
> > > course,
> > > > > > > >> everybody
> > > > > > > >> > > >>>>> is welcome to contribute to this effort.
> > > > > > > >> > > >>>>> Please
> let
> > us
> > > > > know
> > > > > > > such
> > > > > > > >> > > >>>>> that we
> > > > > > > >> > > >>> can
> > > > > > > >> > > >>>>> coordinate our efforts.
> > > > > > > >> > > >>>>>
> > > > > > > >> > > >>>>> Thanks,
> > > > > > > >> > > >>>>> Fabian
> > > > > > > >> > >
> > > > > > > >> > > Regards,
> > > > > > > >> > > Chiwan Park
> > > > > > > >> > >
> > > > > > > >> > >
> > > > > > > >> > >
> > > > > > > >> >
> > > > > > > >>
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: Reply: Reply: Effort to add SQL / StreamSQL to Flink

Vasiliki Kalavri
Great to see people excited about this :)
SQL is indeed coming up next. We should have SQL on DataSet programs
(see FLINK-3640 [1]) pretty soon.

-Vasia.

[1]: https://issues.apache.org/jira/browse/FLINK-3640

On 29 March 2016 at 14:02, Jiangsong (Hi) <[hidden email]> wrote:

> So excited!! Is SQL on Flink ready?
>
> Are there any showcases or how-to guides?
>
>
>
> -----Original Message-----
> From: [hidden email] [mailto:[hidden email]] On Behalf Of Stephan Ewen
> Sent: March 29, 2016 20:00
> To: [hidden email]
> Subject: Re: Reply: Effort to add SQL / StreamSQL to Flink
>
> Cool stuff!
>
> SQL coming up next? ;-)
>
>
> On Tue, Mar 29, 2016 at 1:39 PM, Maximilian Michels <[hidden email]>
> wrote:
>
> > Yeah! I'm a little late to the party but exciting stuff! :)
> >
> > On Fri, Mar 18, 2016 at 3:15 PM, Vasiliki Kalavri <
> > [hidden email]
> > > wrote:
> >
> > > Hi all,
> > >
> > > tableOnCalcite has been merged to master :)
> > >
> > > Cheers,
> > > -Vasia.
> > >
> > > On 17 March 2016 at 11:11, Fabian Hueske <[hidden email]> wrote:
> > >
> > > > Thanks for the initiative Vasia!
> > > > I went over the diff and didn't find anything crucial.
> > > >
> > > > I would like to do another pass over the tests though and improve
> > > > the exceptions for invalid joins before merging.
> > > > Will open a PR later today.
> > > >
> > > > 2016-03-16 21:17 GMT+01:00 Vasiliki Kalavri
> > > > <[hidden email]
> > >:
> > > >
> > > > > Yes, the current state corresponds to Task 1. PR #1770
> > > > > corresponds to
> > > > Task
> > > > > 5. Task 6 should come right after :)
> > > > >
> > > > > -V.
> > > > >
> > > > > On 16 March 2016 at 20:35, Robert Metzger <[hidden email]>
> > wrote:
> > > > >
> > > > > > Cool, this is great news!
> > > > > > So "Task 1" from the document [1] is done with the merge? And
> > > > > > PR
> > > #1770
> > > > is
> > > > > > going towards "Task 6".
> > > > > > I think good support for Stream SQL is a very interesting new
> > feature
> > > > for
> > > > > > Flink.
> > > > > >
> > > > > > [1]
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> > https://docs.google.com/document/d/1TLayJNOTBle_-m1rQfgA6Ouj1oYsfqRjPcp1h2TVqdI/edit#heading=h.28dvisn56su0
> > > > > >
> > > > > > On Wed, Mar 16, 2016 at 6:17 PM, Vasiliki Kalavri <
> > > > > > [hidden email]
> > > > > > > wrote:
> > > > > >
> > > > > > > Hello everyone,
> > > > > > >
> > > > > > > We are happy to announce that the "tableOnCalcite" branch is
> > > finally
> > > > > > ready
> > > > > > > to be merged.
> > > > > > > It essentially provides the existing functionality of the
> > > > > > > Table
> > > API,
> > > > > but
> > > > > > > now the translation happens through Apache Calcite.
> > > > > > > You can find the changes rebased on top of the current
> > > > > > > master in
> > > [1].
> > > > > > > We have removed the prototype streaming Table API
> > > > > > > functionality,
> > > > which
> > > > > > will
> > > > > > > be added back once PR [2] is merged.
> > > > > > >
> > > > > > > We'll go through the changes once more and, if no
> > > > > > > objections, we
> > > > would
> > > > > > like
> > > > > > > to go ahead and merge this.
> > > > > > >
> > > > > > > Cheers,
> > > > > > > -Vasia.
> > > > > > >
> > > > > > > [1]: https://github.com/vasia/flink/tree/merge-table
> > > > > > > [2]: https://github.com/apache/flink/pull/1770
> > > > > > >
> > > > > > >
> > > > > > > On 15 January 2016 at 10:59, Fabian Hueske
> > > > > > > <[hidden email]>
> > > > wrote:
> > > > > > >
> > > > > > > > Hi everybody,
> > > > > > > >
> > > > > > > > as previously announced, I pushed a feature branch called
> > > > > > > "tableOnCalcite"
> > > > > > > > to the Flink repository.
> > > > > > > > We will use this branch to work on FLINK-3221 and its
> > sub-issues.
> > > > > > > >
> > > > > > > > Cheers, Fabian
> > > > > > > >
> > > > > > > > 2016-01-11 18:29 GMT+01:00 Fabian Hueske <[hidden email]
> >:
> > > > > > > >
> > > > > > > > > We haven't defined the StreamSQL syntax yet (and I think
> > > > > > > > > it
> > > will
> > > > > take
> > > > > > > > some
> > > > > > > > > time until we are at that point).
> > > > > > > > > > So we are quite flexible with both features.
> > > > > > > > >
> > > > > > > > > Let's keep this opportunity in mind and coordinate when
> > before
> > > > > making
> > > > > > > > > decisions about CEP or StreamSQL.
> > > > > > > > >
> > > > > > > > > Fabian
> > > > > > > > >
> > > > > > > > > 2016-01-11 17:29 GMT+01:00 Till Rohrmann <
> > [hidden email]
> > > >:
> > > > > > > > >
> > > > > > > > > >> First of all, it's a great design document. Looking forward
> > > > > > > > > >> to having stream SQL in the foreseeable future :-)
> > > > > > > > >>
> > > > > > > > >> I think it is a good idea to consolidate stream SQL and
> > > > > > > > >> CEP
> > in
> > > > the
> > > > > > > long
> > > > > > > > >> run. CEP's additional features compared to SQL boil
> > > > > > > > >> down to
> > > > > pattern
> > > > > > > > >> detection. Once we have this, it should be only a
> > > > > > > > >> question
> > of
> > > > > > defining
> > > > > > > > the
> > > > > > > > >> SQL syntax for event patterns in order to integrate CEP
> > > > > > > > >> with
> > > > > stream
> > > > > > > SQL.
> > > > > > > > >> Oracle has already defined an extension [1] to detect
> > patterns
> > > > in
> > > > > a
> > > > > > > set
> > > > > > > > of
> > > > > > > > >> table rows. This or Esper's event processing language
> > > > > > > > >> (EPL)
> > > [2]
> > > > > > could
> > > > > > > > be a
> > > > > > > > >> good starting point.
> > > > > > > > >>
> > > > > > > > >> [1]
> > > > > > > https://docs.oracle.com/database/121/DWHSG/pattern.htm#DWHSG8959
> > > > > > > > >> [2]
> > > > > > >
> > http://www.espertech.com/esper/release-5.2.0/esper-reference/html/
> > > > > > > > >>
> > > > > > > > >> Cheers,
> > > > > > > > >> Till
> > > > > > > > >>
> > > > > > > > >> On Mon, Jan 11, 2016 at 10:12 AM, Fabian Hueske <
> > > > > [hidden email]>
> > > > > > > > >> wrote:
> > > > > > > > >>
> > > > > > > > >> > Thanks for the feedback!
> > > > > > > > >> >
> > > > > > > > >> > We will start the SQL effort with putting the
> > > > > > > > >> > existing
> > > (batch)
> > > > > > Table
> > > > > > > > >> API on
> > > > > > > > >> > top of Apache Calcite.
> > > > > > > > >> > From there we continue to add streaming support for
> > > > > > > > >> > the
> > > Table
> > > > > API
> > > > > > > > >> before we
> > > > > > > > >> > put a StreamSQL interface on top.
> > > > > > > > >> >
> > > > > > > > >> > Consolidating the efforts with the CEP library sounds
> > like a
> > > > > good
> > > > > > > idea
> > > > > > > > >> to
> > > > > > > > >> > me.
> > > > > > > > >> > Maybe it can be nicely integrated with the streaming
> > > > > > > > >> > table
> > > API
> > > > > and
> > > > > > > > >> later as
> > > > > > > > >> > well with the StreamSQL interface (the StreamSQL
> > > > > > > > >> > dialect
> > is
> > > > not
> > > > > > > > defined
> > > > > > > > >> > yet).
> > > > > > > > >> >
> > > > > > > > >> > @Till: What do you think about adding CEP features to
> > > > > > > > >> > the
> > > > Table
> > > > > > API.
> > > > > > > > >> From
> > > > > > > > >> > the CEP design doc, it looks like we need to add a
> > > > > > > > >> > pattern
> > > > > > matching
> > > > > > > > >> > operator in addition to the window features that we
> > > > > > > > >> > need
> > to
> > > > add
> > > > > > for
> > > > > > > > >> > streaming Table API in any case.
> > > > > > > > >> >
> > > > > > > > >> > Best, Fabian
> > > > > > > > >> >
> > > > > > > > >> > 2016-01-11 4:03 GMT+01:00 Jiangsong (Hi) <
> > > > > [hidden email]
> > > > > > >:
> > > > > > > > >> >
> > > > > > > > > >> > > I suggest referring to Esper EPL [1], a SQL-standard language
> > > > > > > > > >> > > extended to offer a set of window and pattern-matching
> > > > > > > > > >> > > features. EPL can support both streaming SQL and CEP with one
> > > > > > > > > >> > > unified syntax.
> > > > > > > > >> > >
> > > > > > > > >> > > [1]
> > > > > > > > >> > >
> > > > > > > > >> >
> > > > > > > > >>
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> > http://www.espertech.com/esper/release-5.2.0/esper-reference/pdf/esper_reference.pdf
> > > > > > > > >> > >   (Chapter 5. EPL Reference: Clauses)
> > > > > > > > >> > >
> > > > > > > > >> > >
> > > > > > > > >> > > Regards
> > > > > > > > >> > > Song
> > > > > > > > >> > >
> > > > > > > > >> > >
> > > > > > > > >> > > -----邮件原件-----
> > > > > > > > >> > > 发件人: Chiwan Park [mailto:[hidden email]]
> > > > > > > > >> > > 发送时间: 2016年1月11日 10:31
> > > > > > > > >> > > 收件人: [hidden email]
> > > > > > > > >> > > 主题: Re: Effort to add SQL / StreamSQL to Flink
> > > > > > > > >> > >
> > > > > > > > > >> > > We still don’t have a consensus about the streaming SQL and
> > > > > > > > > >> > > CEP libraries on Flink. Some people want to merge these two
> > > > > > > > > >> > > libraries. Maybe we have to discuss this on the mailing list.
> > > > > > > > >> > >
> > > > > > > > >> > > > On Jan 11, 2016, at 10:53 AM, Nick Dimiduk <
> > > > > > [hidden email]>
> > > > > > > > >> wrote:
> > > > > > > > >> > > >
> > > > > > > > >> > > > What's the relationship between the streaming SQL
> > > proposed
> > > > > > here
> > > > > > > > and
> > > > > > > > >> > > > the CEP syntax proposed earlier in the week?
> > > > > > > > >> > > >
> > > > > > > > >> > > > On Sunday, January 10, 2016, Henry Saputra <
> > > > > > > > [hidden email]
> > > > > > > > >> >
> > > > > > > > >> > > wrote:
> > > > > > > > >> > > >
> > > > > > > > >> > > >> Awesome! Thanks for the reply, Fabian.
> > > > > > > > >> > > >>
> > > > > > > > >> > > >> - Henry
> > > > > > > > >> > > >>
> > > > > > > > > >> > > >> On Sunday, January 10, 2016, Fabian Hueske <[hidden email]> wrote:
> > > > > > > > >> > > >>
> > > > > > > > >> > > >>> Hi Henry,
> > > > > > > > >> > > >>>
> > > > > > > > >> > > >>> There is
> > > > https://issues.apache.org/jira/browse/FLINK-2099
> > > > > > > and a
> > > > > > > > >> few
> > > > > > > > >> > > >>> subissues.
> > > > > > > > >> > > >>> I'll reorganize these and add more issues for
> > > > > > > > >> > > >>> the
> > > tasks
> > > > > > > > described
> > > > > > > > >> in
> > > > > > > > >> > > >>> the design document in the next days.
> > > > > > > > >> > > >>>
> > > > > > > > >> > > >>> Thanks, Fabian
> > > > > > > > >> > > >>>
> > > > > > > > >> > > >>> 2016-01-10 2:45 GMT+01:00 Henry Saputra <
> > > > > > > > [hidden email]
> > > > > > > > >> > > >> <javascript:;>
> > > > > > > > >> > > >>> <javascript:;>>:
> > > > > > > > >> > > >>>
> > > > > > > > > >> > > >>>> Hi Fabian,
> > > > > > > > > >> > > >>>>
> > > > > > > > > >> > > >>>> Have you created a JIRA ticket to keep track of this new feature?
> > > > > > > > >> > > >>>>
> > > > > > > > >> > > >>>> - Henry
> > > > > > > > >> > > >>>>
> > > > > > > > > >> > > >>>> On Thu, Jan 7, 2016 at 6:05 AM, Fabian Hueske <[hidden email]> wrote:
> > > > > > > > >> > > >>>>> Hi everybody,
> > > > > > > > >> > > >>>>>
> > > > > > > > >> > > >>>>> in the last days, Timo and I refined the
> > > > > > > > >> > > >>>>> design
> > > > document
> > > > > > for
> > > > > > > > >> > > >>>>> adding a
> > > > > > > > >> > > >>>> SQL /
> > > > > > > > >> > > >>>>> StreamSQL interface on top of Flink that was
> > started
> > > > by
> > > > > > > > Stephan.
> > > > > > > > >> > > >>>>>
> > > > > > > > >> > > >>>>> The document proposes an architecture that is
> > > centered
> > > > > > > around
> > > > > > > > >> > > >>>>> Apache Calcite. Calcite is an Apache
> > > > > > > > >> > > >>>>> top-level
> > > project
> > > > > and
> > > > > > > > >> > > >>>>> includes a SQL
> > > > > > > > >> > > >>>> parser,
> > > > > > > > >> > > >>>>> a semantic validator for relational queries,
> > > > > > > > >> > > >>>>> and a
> > > > rule-
> > > > > > and
> > > > > > > > >> > > >> cost-based
> > > > > > > > >> > > >>>>> relational optimizer. Calcite is used by
> > > > > > > > >> > > >>>>> Apache
> > Hive
> > > > and
> > > > > > > > Apache
> > > > > > > > >> > > >>>>> Drill (among other projects). In a nutshell,
> > > > > > > > >> > > >>>>> the
> > > plan
> > > > is
> > > > > > to
> > > > > > > > >> > > >>>>> translate Table
> > > > > > > > >> > > >>> API
> > > > > > > > >> > > >>>>> and SQL queries into Calcite's relational
> > expression
> > > > > > trees,
> > > > > > > > >> > > >>>>> optimize
> > > > > > > > >> > > >>>> these
> > > > > > > > > >> > > >>>>> trees, and translate them into DataSet and DataStream
> > > > > > > > > >> > > >>>>> programs. The document breaks down the work into several
> > > > > > > > > >> > > >>>>> tasks and subtasks.
> > > > > > > > >> > > >>>>>
> > > > > > > > >> > > >>>>> Please review the design document and comment.
> > > > > > > > >> > > >>>>>
> > > > > > > > >> > > >>>>> -- >
> > > > > > > > >> > > >>>>>
> > > > > > > > >> > > >>>>
> > > > > > > > >> > > >>>
> > > > > > > > >> > > >>
> > > > > > > > >>
> > > > > >
> > > https://docs.google.com/document/d/1TLayJNOTBle_-m1rQfgA6Ouj1oYsfqRjPcp1h2TVqdI/edit?usp=sharing
> > > > > > > > >> > > >>>>>
> > > > > > > > >> > > >>>>> Unless there are major concerns with the
> > > > > > > > >> > > >>>>> design,
> > > Timo
> > > > > and
> > > > > > I
> > > > > > > > want
> > > > > > > > >> > > >>>>> to
> > > > > > > > >> > > >>> start
> > > > > > > > >> > > >>>>> next week to move the current Table API on
> > > > > > > > >> > > >>>>> top of
> > > > Apache
> > > > > > > > Calcite
> > > > > > > > >> > > >> (Task
> > > > > > > > >> > > >>> 1
> > > > > > > > >> > > >>>> in
> > > > > > > > >> > > >>>>> the document). The goal of this task is to
> > > > > > > > >> > > >>>>> have
> > the
> > > > same
> > > > > > > > >> > > >> functionality
> > > > > > > > >> > > >>> as
> > > > > > > > >> > > >>>>> currently, but with Calcite in the
> > > > > > > > >> > > >>>>> translation
> > > > process.
> > > > > > This
> > > > > > > > is
> > > > > > > > >> a
> > > > > > > > >> > > >>>> blocking
> > > > > > > > >> > > >>>>> task that we hope to complete soon.
> > > > > > > > >> > > >>>>> Afterwards, we
> > > can
> > > > > > > > >> > > >>>>> independently
> > > > > > > > >> > > >>> work
> > > > > > > > >> > > >>>>> on different aspects such as extending the
> > > > > > > > >> > > >>>>> Table
> > > API,
> > > > > > > adding a
> > > > > > > > >> SQL
> > > > > > > > >> > > >>>>> interface (basically just a parser),
> > > > > > > > >> > > >>>>> integration
> > > with
> > > > > > > external
> > > > > > > > >> > > >>>>> data sources, better code generation,
> > > > > > > > >> > > >>>>> optimization
> > > > > rules,
> > > > > > > > >> > > >>>>> streaming
> > > > > > > > >> > > >> support
> > > > > > > > >> > > >>>> for
> > > > > > > > > >> > > >>>>> the Table API, StreamSQL, etc.
> > > > > > > > >> > > >>>>>
> > > > > > > > >> > > >>>>> Timo and I plan to work on a WIP branch to
> > implement
> > > > > Task
> > > > > > 1
> > > > > > > > and
> > > > > > > > >> > > >>>>> merge
> > > > > > > > >> > > >>> it
> > > > > > > > >> > > >>>> to
> > > > > > > > >> > > >>>>> the master branch once the task is completed.
> > > > > > > > >> > > >>>>> Of
> > > > course,
> > > > > > > > >> everybody
> > > > > > > > >> > > >>>>> is welcome to contribute to this effort.
> > > > > > > > >> > > >>>>> Please
> > let
> > > us
> > > > > > know
> > > > > > > > such
> > > > > > > > >> > > >>>>> that we
> > > > > > > > >> > > >>> can
> > > > > > > > >> > > >>>>> coordinate our efforts.
> > > > > > > > >> > > >>>>>
> > > > > > > > >> > > >>>>> Thanks,
> > > > > > > > >> > > >>>>> Fabian
> > > > > > > > >> > >
> > > > > > > > >> > > Regards,
> > > > > > > > >> > > Chiwan Park
> > > > > > > > >> > >
> > > > > > > > >> > >
> > > > > > > > >> > >
> > > > > > > > >> >
> > > > > > > > >>
> > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>