Hi everybody,
over the last few days, Timo and I refined the design document for adding a SQL / StreamSQL interface on top of Flink that was started by Stephan.

The document proposes an architecture that is centered around Apache Calcite. Calcite is an Apache top-level project and includes a SQL parser, a semantic validator for relational queries, and a rule- and cost-based relational optimizer. Calcite is used by Apache Hive and Apache Drill (among other projects). In a nutshell, the plan is to translate Table API and SQL queries into Calcite's relational expression trees, optimize these trees, and translate them into DataSet and DataStream programs. The document breaks down the work into several tasks and subtasks.

Please review the design document and comment.

--> https://docs.google.com/document/d/1TLayJNOTBle_-m1rQfgA6Ouj1oYsfqRjPcp1h2TVqdI/edit?usp=sharing

Unless there are major concerns with the design, Timo and I want to start next week to move the current Table API on top of Apache Calcite (Task 1 in the document). The goal of this task is to keep the current functionality, but with Calcite in the translation process. This is a blocking task that we hope to complete soon. Afterwards, we can work independently on different aspects such as extending the Table API, adding a SQL interface (basically just a parser), integration with external data sources, better code generation, optimization rules, streaming support for the Table API, StreamSQL, etc.

Timo and I plan to work on a WIP branch to implement Task 1 and merge it to the master branch once the task is completed. Of course, everybody is welcome to contribute to this effort. Please let us know so that we can coordinate our efforts.

Thanks,
Fabian
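To make the proposed pipeline (query, then relational expression tree, then optimization, then translation into a program) a bit more concrete, here is a minimal toy sketch in Python. It is not Calcite's or Flink's API: the names `Scan`, `Filter`, `Project`, `optimize`, and `translate` are hypothetical and exist only for illustration, and the single rewrite rule (pushing a filter below a projection) merely stands in for the rule- and cost-based optimizer described above.

```python
from dataclasses import dataclass, replace
from typing import Callable, Dict, List, Tuple

Row = Dict[str, object]


# Hypothetical relational expression nodes (NOT Calcite classes).
@dataclass(frozen=True)
class Scan:
    table: str


@dataclass(frozen=True)
class Filter:
    child: object
    column: str
    value: object


@dataclass(frozen=True)
class Project:
    child: object
    columns: Tuple[str, ...]


def push_filter_below_project(node):
    """Rewrite rule: Filter(Project(x)) -> Project(Filter(x))."""
    if isinstance(node, Filter) and isinstance(node.child, Project):
        proj = node.child
        if node.column in proj.columns:
            pushed = Filter(proj.child, node.column, node.value)
            return Project(pushed, proj.columns)
    return node  # rule does not match; return the node unchanged


def optimize(node):
    """Optimize children first, then apply the rule until it no longer fires."""
    if isinstance(node, (Filter, Project)):
        node = replace(node, child=optimize(node.child))
    rewritten = push_filter_below_project(node)
    if rewritten is not node:
        return optimize(rewritten)
    return node


def translate(node) -> Callable[[Dict[str, List[Row]]], List[Row]]:
    """Turn a tree into an executable function (a stand-in for a DataSet program)."""
    if isinstance(node, Scan):
        return lambda tables: list(tables[node.table])
    if isinstance(node, Filter):
        child = translate(node.child)
        return lambda tables: [r for r in child(tables) if r[node.column] == node.value]
    if isinstance(node, Project):
        child = translate(node.child)
        return lambda tables: [{c: r[c] for c in node.columns} for r in child(tables)]
    raise TypeError(f"unknown node: {node!r}")


# Roughly: SELECT id, country FROM orders, then filter country = 'DE'.
plan = Filter(Project(Scan("orders"), ("id", "country")), "country", "DE")
optimized = optimize(plan)
program = translate(optimized)

tables = {
    "orders": [
        {"id": 1, "country": "DE", "amount": 10},
        {"id": 2, "country": "US", "amount": 20},
    ]
}
result = program(tables)
print(optimized)
print(result)
```

In this sketch the optimized plan applies the filter directly on the scan, so fewer rows reach the projection, while the result is the same as for the unoptimized plan. A real optimizer would choose among many such rules using cost estimates.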
Pretty cool!
On 01/07/2016 03:05 PM, Fabian Hueske wrote:
> [...]
Super, thanks for that detailed effort, Fabian!
On Thu, Jan 7, 2016 at 3:40 PM, Matthias J. Sax <[hidden email]> wrote:
> Pretty cool!
Wow! Thanks Fabian, this looks fantastic!
On Thu, Jan 7, 2016 at 4:35 PM, Stephan Ewen <[hidden email]> wrote:
> Super, thanks for that detailed effort, Fabian!
Really good! Many people want to use SQL. :)
> On Jan 8, 2016, at 2:36 AM, Kostas Tzoumas <[hidden email]> wrote:
>
> Wow! Thanks Fabian, this looks fantastic!

Regards,
Chiwan Park
Very cool work; I look forward to contributing.
-----Original Message-----
From: Chiwan Park [mailto:[hidden email]]
Sent: Friday, January 8, 2016 9:36 AM
To: [hidden email]
Subject: Re: Effort to add SQL / StreamSQL to Flink

Really good! Many people want to use SQL. :)
I am excited and nervous at the same time =)
- Henry
Hi Fabian,
Have you created a JIRA ticket to keep track of this new feature?

- Henry
Hi Henry,
There is https://issues.apache.org/jira/browse/FLINK-2099 and a few subissues. I'll reorganize these and add more issues for the tasks described in the design document over the next few days.

Thanks, Fabian

2016-01-10 2:45 GMT+01:00 Henry Saputra <[hidden email]>:
> Hi Fabian,
>
> Have you created a JIRA ticket to keep track of this new feature?
>
> - Henry
Awesome! Thanks for the reply, Fabian.
- Henry

On Sunday, January 10, 2016, Fabian Hueske <[hidden email]> wrote:
> Hi Henry,
>
> There is https://issues.apache.org/jira/browse/FLINK-2099 and a few
> subissues.
> I'll reorganize these and add more issues for the tasks described in the
> design document over the next few days.
>
> Thanks, Fabian
What's the relationship between the streaming SQL proposed here and the CEP
syntax proposed earlier in the week?

On Sunday, January 10, 2016, Henry Saputra <[hidden email]> wrote:
> Awesome! Thanks for the reply, Fabian.
>
> - Henry
We still don't have a consensus about the streaming SQL and CEP libraries on Flink. Some people want to merge these two libraries. Maybe we have to discuss this on the mailing list.
> On Jan 11, 2016, at 10:53 AM, Nick Dimiduk <[hidden email]> wrote:
>
> What's the relationship between the streaming SQL proposed here and the CEP
> syntax proposed earlier in the week?

Regards,
Chiwan Park
I suggest referring to Esper EPL [1], a language that extends standard SQL with a set of window and pattern-matching constructs. EPL can support both streaming SQL and CEP with one unified syntax.
[1] http://www.espertech.com/esper/release-5.2.0/esper-reference/pdf/esper_reference.pdf (Chapter 5. EPL Reference: Clauses)

Regards
Song

-----Original Message-----
From: Chiwan Park [mailto:[hidden email]]
Sent: January 11, 2016 10:31
To: [hidden email]
Subject: Re: Effort to add SQL / StreamSQL to Flink

We still don't have a consensus about the streaming SQL and CEP libraries on Flink. Some people want to merge these two libraries. Maybe we have to discuss this on the mailing list.
Thanks for the feedback!
We will start the SQL effort by putting the existing (batch) Table API on top of Apache Calcite. From there we will continue to add streaming support for the Table API before we put a StreamSQL interface on top.

Consolidating the efforts with the CEP library sounds like a good idea to me. Maybe it can be nicely integrated with the streaming Table API and later with the StreamSQL interface as well (the StreamSQL dialect is not defined yet).

@Till: What do you think about adding CEP features to the Table API? From the CEP design doc, it looks like we need to add a pattern matching operator in addition to the window features that we need to add for the streaming Table API in any case.

Best, Fabian |
First of all, it's a great design document. Looking forward to having stream SQL in the foreseeable future :-)

I think it is a good idea to consolidate stream SQL and CEP in the long run. CEP's additional features compared to SQL boil down to pattern detection. Once we have this, it should only be a question of defining the SQL syntax for event patterns in order to integrate CEP with stream SQL. Oracle has already defined an extension [1] to detect patterns in a set of table rows. This or Esper's event processing language (EPL) [2] could be a good starting point.

[1] https://docs.oracle.com/database/121/DWHSG/pattern.htm#DWHSG8959
[2] http://www.espertech.com/esper/release-5.2.0/esper-reference/html/

Cheers,
Till |
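To make the "pattern detection" part of this discussion a bit more concrete, here is a minimal sketch of what such an operator does on an event stream: it finds an event matching one predicate that is followed, for the same key and within a time bound, by an event matching a second predicate. This is only an illustration in plain Python. The `Event` record, the `detect_followed_by` function, and all parameter names are hypothetical; it does not reproduce the API of Flink's CEP library, and it does not follow the semantics of Oracle's extension or Esper's EPL.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Event:
    """Hypothetical event record: a timestamp, a key, and a numeric value."""
    ts: int
    key: str
    value: float


def detect_followed_by(
    events: Iterable[Event],
    first: Callable[[Event], bool],
    second: Callable[[Event], bool],
    within: int,
) -> List[Tuple[Event, Event]]:
    """Return (a, b) pairs where `a` satisfies `first`, `b` satisfies `second`,
    both share a key, and `b` occurs no more than `within` time units after `a`.
    Each `a` is paired only with the first qualifying `b`."""
    pending: Dict[str, List[Event]] = {}  # key -> events waiting for a partner
    matches: List[Tuple[Event, Event]] = []
    for ev in sorted(events, key=lambda e: e.ts):
        waiting = pending.setdefault(ev.key, [])
        # Drop candidates whose time window has already expired.
        waiting[:] = [a for a in waiting if ev.ts - a.ts <= within]
        if second(ev) and waiting:
            matches.extend((a, ev) for a in waiting)
            waiting.clear()
        if first(ev):
            waiting.append(ev)
    return matches


if __name__ == "__main__":
    stream = [
        Event(1, "s1", 20.0),
        Event(2, "s1", 95.0),
        Event(3, "s2", 99.0),
        Event(4, "s1", 30.0),
        Event(20, "s2", 10.0),
    ]
    # A reading above 90 followed by a reading below 50 within 5 time units.
    for a, b in detect_followed_by(
        stream, lambda e: e.value > 90, lambda e: e.value < 50, within=5
    ):
        print(a, "->", b)
```

In this toy run only the `s1` pair (timestamps 2 and 4) is reported; the `s2` candidate expires because its partner arrives 17 time units later. A SQL syntax for event patterns would essentially be a declarative way of stating the two predicates, the key, and the time bound.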
We haven't defined the StreamSQL syntax yet (and I think it will take some time until we are at that point). So we are quite flexible with both features.

Let's keep this opportunity in mind and coordinate before making decisions about CEP or StreamSQL.

Fabian |
Hi everybody,
as previously announced, I pushed a feature branch called "tableOnCalcite" to the Flink repository. We will use this branch to work on FLINK-3221 and its sub-issues.

Cheers, Fabian |
Hello everyone,
We are happy to announce that the "tableOnCalcite" branch is finally ready to be merged. It essentially provides the existing functionality of the Table API, but now the translation happens through Apache Calcite. You can find the changes rebased on top of the current master in [1]. We have removed the prototype streaming Table API functionality, which will be added back once PR [2] is merged.

We'll go through the changes once more and, if there are no objections, we would like to go ahead and merge this.

Cheers,
-Vasia.

[1]: https://github.com/vasia/flink/tree/merge-table
[2]: https://github.com/apache/flink/pull/1770 |
Cool, this is great news!
So "Task 1" from the document [1] is done with the merge? And PR #1770 is going towards "Task 6".
I think good support for Stream SQL is a very interesting new feature for Flink.

[1] https://docs.google.com/document/d/1TLayJNOTBle_-m1rQfgA6Ouj1oYsfqRjPcp1h2TVqdI/edit#heading=h.28dvisn56su0

On Wed, Mar 16, 2016 at 6:17 PM, Vasiliki Kalavri <[hidden email]> wrote:
> Hello everyone,
>
> We are happy to announce that the "tableOnCalcite" branch is finally ready to be merged.
> It essentially provides the existing functionality of the Table API, but now the translation happens through Apache Calcite.
> You can find the changes rebased on top of the current master in [1].
> We have removed the prototype streaming Table API functionality, which will be added back once PR [2] is merged.
>
> We'll go through the changes once more and, if no objections, we would like to go ahead and merge this.
>
> Cheers,
> -Vasia.
>
> [1]: https://github.com/vasia/flink/tree/merge-table
> [2]: https://github.com/apache/flink/pull/1770
|
Yes, the current state corresponds to Task 1. PR #1770 corresponds to Task 5. Task 6 should come right after :)

-V.

On 16 March 2016 at 20:35, Robert Metzger <[hidden email]> wrote:
> Cool, this is great news!
> So "Task 1" from the document [1] is done with the merge? And PR #1770 is going towards "Task 6".
> I think good support for Stream SQL is a very interesting new feature for Flink.
>
> [1] https://docs.google.com/document/d/1TLayJNOTBle_-m1rQfgA6Ouj1oYsfqRjPcp1h2TVqdI/edit#heading=h.28dvisn56su0
|