(DEPRECATED) Apache Flink Mailing List archive.

Effort to add SQL / StreamSQL to Flink

Fabian Hueske-2

Effort to add SQL / StreamSQL to Flink

Hi everybody,

Over the last few days, Timo and I refined the design document, started by
Stephan, for adding a SQL / StreamSQL interface on top of Flink.

The document proposes an architecture that is centered around Apache
Calcite. Calcite is an Apache top-level project and includes a SQL parser,
a semantic validator for relational queries, and a rule- and cost-based
relational optimizer. Calcite is used by Apache Hive and Apache Drill
(among other projects). In a nutshell, the plan is to translate Table API
and SQL queries into Calcite's relational expression trees, optimize these
trees, and translate them into DataSet and DataStream programs. The document
breaks down the work into several tasks and subtasks.
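
As a rough illustration of that flow (query to relational expression tree,
rule-based rewrite, translation into an executable program), a toy sketch in
Python might look as follows. It does not use Calcite's or Flink's API and is
not taken from the design document: the names Scan, Filter, Project,
push_filter_below_project, and translate are hypothetical, and a plain
function over in-memory rows stands in for a DataSet program.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union

Row = Dict[str, int]
Catalog = Dict[str, List[Row]]


# Hypothetical expression-tree nodes (stand-ins, not Calcite classes).
@dataclass(frozen=True)
class Scan:
    table: str


@dataclass(frozen=True)
class Filter:
    child: "Node"
    column: str
    min_value: int  # keeps rows where row[column] > min_value


@dataclass(frozen=True)
class Project:
    child: "Node"
    columns: Tuple[str, ...]


Node = Union[Scan, Filter, Project]


def push_filter_below_project(node: Node) -> Node:
    """One rewrite rule: Filter(Project(x)) becomes Project(Filter(x))."""
    if isinstance(node, Scan):
        return node
    if isinstance(node, Project):
        return Project(push_filter_below_project(node.child), node.columns)
    # Remaining case: node is a Filter.
    child = push_filter_below_project(node.child)
    if isinstance(child, Project) and node.column in child.columns:
        pushed = push_filter_below_project(
            Filter(child.child, node.column, node.min_value)
        )
        return Project(pushed, child.columns)
    return Filter(child, node.column, node.min_value)


def translate(node: Node):
    """Turn an expression tree into a callable 'program' over a catalog."""
    if isinstance(node, Scan):
        return lambda catalog: list(catalog[node.table])
    source = translate(node.child)
    if isinstance(node, Filter):
        return lambda catalog: [
            r for r in source(catalog) if r[node.column] > node.min_value
        ]
    # Remaining case: node is a Project.
    return lambda catalog: [
        {c: r[c] for c in node.columns} for r in source(catalog)
    ]


# Tree for: SELECT id, amount FROM orders WHERE amount > 10
logical = Filter(Project(Scan("orders"), ("id", "amount")), "amount", 10)
optimized = push_filter_below_project(logical)
program = translate(optimized)

orders = {
    "orders": [
        {"id": 1, "amount": 5, "region": 7},
        {"id": 2, "amount": 20, "region": 9},
    ]
}
result = program(orders)
```

In this sketch the rewritten tree filters before it projects, and both the
original and the rewritten tree produce the same rows. A cost-based optimizer
such as the one mentioned above would choose among many such rewrites instead
of applying a single fixed rule.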

Please review the design document and comment.

-- >
https://docs.google.com/document/d/1TLayJNOTBle_-m1rQfgA6Ouj1oYsfqRjPcp1h2TVqdI/edit?usp=sharing

Unless there are major concerns with the design, Timo and I want to start
next week to move the current Table API on top of Apache Calcite (Task 1 in
the document). The goal of this task is to provide the same functionality as
today, but with Calcite in the translation process. This is a blocking
task that we hope to complete soon. Afterwards, we can work independently
on different aspects such as extending the Table API, adding a SQL
interface (basically just a parser), integration with external data
sources, better code generation, optimization rules, streaming support for
the Table API, StreamSQL, etc.

Timo and I plan to work on a WIP branch to implement Task 1 and merge it to
the master branch once the task is completed. Of course, everybody is
welcome to contribute to this effort. Please let us know so that we can
coordinate our efforts.

Thanks,
Fabian

Matthias J. Sax-2

Re: Effort to add SQL / StreamSQL to Flink

Pretty cool!

On 01/07/2016 03:05 PM, Fabian Hueske wrote:

Stephan Ewen

Re: Effort to add SQL / StreamSQL to Flink

Super, thanks for that detailed effort, Fabian!

On Thu, Jan 7, 2016 at 3:40 PM, Matthias J. Sax <[hidden email]> wrote:

Kostas Tzoumas-2

Re: Effort to add SQL / StreamSQL to Flink

Wow! Thanks Fabian, this looks fantastic!

On Thu, Jan 7, 2016 at 4:35 PM, Stephan Ewen <[hidden email]> wrote:

Chiwan Park-2

Re: Effort to add SQL / StreamSQL to Flink

Really good! Many people want to use SQL. :)

> On Jan 8, 2016, at 2:36 AM, Kostas Tzoumas <[hidden email]> wrote:

Regards,
Chiwan Park

Li, Chengxiang

RE: Effort to add SQL / StreamSQL to Flink

Very cool work, looking forward to contributing.

-----Original Message-----
From: Chiwan Park [mailto:[hidden email]]
Sent: Friday, January 8, 2016 9:36 AM
To: [hidden email]
Subject: Re: Effort to add SQL / StreamSQL to Flink

Henry Saputra

Re: Effort to add SQL / StreamSQL to Flink

I am excited and nervous at the same time =)

- Henry

On Thu, Jan 7, 2016 at 6:05 AM, Fabian Hueske <[hidden email]> wrote:

Henry Saputra

Re: Effort to add SQL / StreamSQL to Flink

Hi Fabian,

Have you created a JIRA ticket to keep track of this new feature?

- Henry

On Thu, Jan 7, 2016 at 6:05 AM, Fabian Hueske <[hidden email]> wrote:

Fabian Hueske-2

Re: Effort to add SQL / StreamSQL to Flink

Hi Henry,

There is https://issues.apache.org/jira/browse/FLINK-2099 and a few
sub-issues.
I'll reorganize these and add more issues for the tasks described in the
design document in the next few days.

Thanks, Fabian

2016-01-10 2:45 GMT+01:00 Henry Saputra <[hidden email]>:

Henry Saputra

Re: Effort to add SQL / StreamSQL to Flink

Awesome! Thanks for the reply, Fabian.

- Henry

On Sunday, January 10, 2016, Fabian Hueske <[hidden email]> wrote:

Nick Dimiduk

Re: Effort to add SQL / StreamSQL to Flink

What's the relationship between the streaming SQL proposed here and the CEP
syntax proposed earlier in the week?

On Sunday, January 10, 2016, Henry Saputra <[hidden email]> wrote:

Chiwan Park-2

Re: Effort to add SQL / StreamSQL to Flink

We still don’t have a consensus about the streaming SQL and CEP libraries on Flink. Some people want to merge these two libraries. Maybe we should discuss this on the mailing list.

> On Jan 11, 2016, at 10:53 AM, Nick Dimiduk <[hidden email]> wrote:

Regards,
Chiwan Park

Jiangsong (Hi)

RE: Effort to add SQL / StreamSQL to Flink

I suggest referring to Esper EPL [1], a language based on standard SQL and extended to offer a set of window and pattern-matching constructs. EPL can support both streaming SQL and CEP with one unified syntax.

[1] http://www.espertech.com/esper/release-5.2.0/esper-reference/pdf/esper_reference.pdf (Chapter 5. EPL Reference: Clauses)

Regards
Song

-----Original Message-----
From: Chiwan Park [mailto:[hidden email]]
Sent: January 11, 2016 10:31
To: [hidden email]
Subject: Re: Effort to add SQL / StreamSQL to Flink

Fabian Hueske-2

Re: RE: Effort to add SQL / StreamSQL to Flink

Thanks for the feedback!

We will start the SQL effort by putting the existing (batch) Table API on
top of Apache Calcite.
From there, we will add streaming support for the Table API before we
put a StreamSQL interface on top.

Consolidating the efforts with the CEP library sounds like a good idea to
me.
Maybe it can be nicely integrated with the streaming Table API and later
also with the StreamSQL interface (the StreamSQL dialect is not defined
yet).

@Till: What do you think about adding CEP features to the Table API? From
the CEP design doc, it looks like we need to add a pattern-matching
operator in addition to the window features that we need to add for the
streaming Table API in any case.

Best, Fabian

Till Rohrmann

Re: Reply: Effort to add SQL / StreamSQL to Flink

First of all, it's a great design document. Looking forward to having stream
SQL in the foreseeable future :-)

I think it is a good idea to consolidate stream SQL and CEP in the long
run. CEP's additional features compared to SQL boil down to pattern
detection. Once we have this, it should be only a question of defining the
SQL syntax for event patterns in order to integrate CEP with stream SQL.
Oracle has already defined an extension [1] to detect patterns in a set of
table rows. This or Esper's event processing language (EPL) [2] could be a
good starting point.

[1] https://docs.oracle.com/database/121/DWHSG/pattern.htm#DWHSG8959
[2] http://www.espertech.com/esper/release-5.2.0/esper-reference/html/
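
To make "detecting patterns in a set of table rows" a bit more concrete, here
is a minimal Python sketch. It is only an illustration under stated
assumptions: the function name find_v_shapes and the "price falls, then
rises" pattern are hypothetical, and the code uses neither Flink, Oracle's
syntax, nor Esper's EPL. It scans an ordered list of prices for one or more
falling steps followed by one or more rising steps.

```python
from typing import List, Tuple


def find_v_shapes(prices: List[float]) -> List[Tuple[int, int, int]]:
    """Return (start, bottom, end) index triples for every run of strictly
    falling prices that is immediately followed by strictly rising prices.

    Hypothetical helper for illustration only; not part of any library.
    """
    matches = []
    n = len(prices)
    i = 0
    while i < n - 1:
        # DOWN+: consume one or more strictly decreasing steps.
        j = i
        while j + 1 < n and prices[j + 1] < prices[j]:
            j += 1
        if j == i:
            # No falling step starts at row i, so try the next row.
            i += 1
            continue
        # UP+: consume one or more strictly increasing steps.
        k = j
        while k + 1 < n and prices[k + 1] > prices[k]:
            k += 1
        if k > j:
            matches.append((i, j, k))
            # Resume at the last row of the match, so a peak row can
            # close one match and open the next one.
            i = k
        else:
            # Prices fell but never rose again; continue from the bottom.
            i = j
    return matches


if __name__ == "__main__":
    print(find_v_shapes([10, 8, 6, 7, 9, 5, 4, 6]))
```

A SQL-level integration would presumably express the same idea declaratively
and leave the scanning to a pattern matching operator; the loop above only
shows what such an operator has to compute.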

Cheers,
Till

Fabian Hueske-2

Re: Reply: Effort to add SQL / StreamSQL to Flink

We haven't defined the StreamSQL syntax yet (and I think it will take some
time until we are at that point).
So we are quite flexible with both features.

Let's keep this opportunity in mind and coordinate before making
decisions about CEP or StreamSQL.

Fabian

Fabian Hueske-2

Re: Reply: Effort to add SQL / StreamSQL to Flink

Hi everybody,

as previously announced, I pushed a feature branch called "tableOnCalcite"
to the Flink repository.
We will use this branch to work on FLINK-3221 and its sub-issues.

Cheers, Fabian

Vasiliki Kalavri

Re: Reply: Effort to add SQL / StreamSQL to Flink

Hello everyone,

We are happy to announce that the "tableOnCalcite" branch is finally ready
to be merged.
It essentially provides the existing functionality of the Table API, but
now the translation happens through Apache Calcite.
You can find the changes rebased on top of the current master in [1].
We have removed the prototype streaming Table API functionality, which will
be added back once PR [2] is merged.

We'll go through the changes once more and, if there are no objections, we
would like to go ahead and merge this.

Cheers,
-Vasia.

[1]: https://github.com/vasia/flink/tree/merge-table
[2]: https://github.com/apache/flink/pull/1770

Robert Metzger

Re: Reply: Effort to add SQL / StreamSQL to Flink

Cool, this is great news!
So "Task 1" from the document [1] is done with the merge? And PR #1770 is
going towards "Task 6".
I think good support for Stream SQL is a very interesting new feature for
Flink.

[1]
https://docs.google.com/document/d/1TLayJNOTBle_-m1rQfgA6Ouj1oYsfqRjPcp1h2TVqdI/edit#heading=h.28dvisn56su0

On Wed, Mar 16, 2016 at 6:17 PM, Vasiliki Kalavri <[hidden email]> wrote:

> Hello everyone,
>
> We are happy to announce that the "tableOnCalcite" branch is finally ready
> to be merged.
> It essentially provides the existing functionality of the Table API, but
> now the translation happens through Apache Calcite.
> You can find the changes rebased on top of the current master in [1].
> We have removed the prototype streaming Table API functionality, which will
> be added back once PR [2] is merged.
>
> We'll go through the changes once more and, if no objections, we would like
> to go ahead and merge this.
>
> Cheers,
> -Vasia.
>
> [1]: https://github.com/vasia/flink/tree/merge-table
> [2]: https://github.com/apache/flink/pull/1770

Vasiliki Kalavri

Re: Reply: Effort to add SQL / StreamSQL to Flink

Yes, the current state corresponds to Task 1. PR #1770 corresponds to Task
5. Task 6 should come right after :)

-V.

On 16 March 2016 at 20:35, Robert Metzger <[hidden email]> wrote:

> Cool, this is great news!
> So "Task 1" from the document [1] is done with the merge? And PR #1770 is
> going towards "Task 6".
> I think good support for Stream SQL is a very interesting new feature for
> Flink.
>
> [1]
> https://docs.google.com/document/d/1TLayJNOTBle_-m1rQfgA6Ouj1oYsfqRjPcp1h2TVqdI/edit#heading=h.28dvisn56su0