Hi everyone,
some of you might have already read FLIP-32 [1], where we've described an approximate roadmap of how to handle the big Blink SQL contribution and how we can make the Table & SQL API equally important to the existing DataStream API.

As mentioned there (Advance the API and Unblock New Features, Item 1), the rework of the Table/SQL type system is a crucial step for unblocking future contributions. In particular, Flink's current type system has many shortcomings which make an integration with other systems (such as Hive), DDL statements, and a unified API for Java/Scala difficult. We propose a new type system that is closer to the SQL standard, integrates better with other SQL vendors, and solves most of the type-related issues we had in the past.

The design document for FLIP-37 can be found here:

https://docs.google.com/document/d/1a9HUb6OaBIoj9IRfbILcMFPrOL7ALeZ3rVI66dvA2_U/edit?usp=sharing

I'm looking forward to your feedback.

Thanks,
Timo

[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-32%3A+Restructure+flink-table+for+future+contributions
Big +1 to this! I left some comments in the Google doc.
Best,
Kurt

On Wed, Mar 27, 2019 at 11:32 PM Timo Walther <[hidden email]> wrote:
> [...]
Another big +1 from my side. Thank you Timo for preparing the document!
I really look forward to this as a standardized way of type handling; it should solve a lot of problems. I really like the separation of a logical type from its physical representation. I think we should aim to introduce that and keep the two separated.

Best,
Dawid

On 28/03/2019 08:51, Kurt Young wrote:
> [...]
Maybe to give some background about Dawid's latest email:
Kurt raised some good points regarding the conversion of data types at the boundaries of the API and SPI. After that, Dawid and I had a long discussion about how users can define those boundaries in a nicer way. The outcome of this discussion was similar to Blink's current distinction between InternalTypes and ExternalTypes.

I updated the document with an improved structure of DataTypes (for users, API, and SPI, with conversion information) and LogicalTypes (used internally and close to standard SQL types).

Thanks for the feedback so far,
Timo

Am 28.03.19 um 11:18 schrieb Dawid Wysakowicz:
> [...]
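[Editor's illustration] The DataType/LogicalType split described above can be sketched in a few lines of Java. All class and member names below (TypeSystemSketch, LogicalType, DataType, bridgedTo, conversionClass) are hypothetical stand-ins chosen for this sketch, not the API proposed in the design document: the point is only that the logical type carries the SQL-level semantics, while the data type pairs it with a physical conversion class for the API/SPI boundary.

```java
// Hypothetical sketch: a LogicalType describes SQL-level semantics and is
// used internally; a DataType wraps a logical type together with the Java
// class used as its physical representation at the API/SPI boundary.
public class TypeSystemSketch {

    // Logical side: close to standard SQL, used by the planner.
    static final class LogicalType {
        final String sqlName; // e.g. "TIMESTAMP(3)"
        LogicalType(String sqlName) { this.sqlName = sqlName; }
    }

    // User/connector side: logical type plus a conversion class.
    static final class DataType {
        final LogicalType logicalType;
        final Class<?> conversionClass;

        DataType(LogicalType logicalType, Class<?> conversionClass) {
            this.logicalType = logicalType;
            this.conversionClass = conversionClass;
        }

        // Same logical type, different physical representation.
        DataType bridgedTo(Class<?> newConversionClass) {
            return new DataType(logicalType, newConversionClass);
        }
    }

    public static void main(String[] args) {
        DataType ts = new DataType(
                new LogicalType("TIMESTAMP(3)"), java.time.LocalDateTime.class);
        // A connector may need the legacy java.sql.Timestamp instead;
        // the logical type stays untouched, only the physical view changes.
        DataType legacy = ts.bridgedTo(java.sql.Timestamp.class);

        System.out.println(ts.logicalType.sqlName + " <-> "
                + ts.conversionClass.getSimpleName());
        System.out.println(legacy.logicalType.sqlName + " <-> "
                + legacy.conversionClass.getSimpleName());
    }
}
```

This keeps planner logic independent of how a value is represented in user code, which is the core benefit the thread attributes to the separation.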
Thanks @Timo for starting this effort and preparing the document :-)
I took a pass and left some comments. I also very much like the idea of the DataType and LogicalType separation. As explained in the doc, we've also been looking into ways to improve the type system, so a huge +1 on our side.

One question I have: since this touches many external systems (such as the Hive/Blink comparison), does it make sense to share this with a broader audience (such as user@) later to gather more feedback?

Looking forward to this change and would love to contribute in any way I can!

Best,
Rong

On Thu, Mar 28, 2019 at 3:25 AM Timo Walther <[hidden email]> wrote:
> [...]
Hi everyone,
thanks for the valuable feedback I got so far. I updated the design document in several places based on the comments I received online and offline. In general, the feedback was very positive. It seems there is consensus to perform a big rework of the type system with a better long-term vision and semantics closer to other SQL vendors and the standard itself.

Since my last mail, we improved topics around date-time types (especially due to the cross-platform discussions [0]) and the general interoperability with UDF implementations and Java classes.

I would like to convert the design document [1] into a FLIP soon and start with an implementation of the basic structure. I'm sure we will have subsequent discussions about certain types or semantics, but those can also happen in the corresponding issues/PRs.

@Rong: Sorry for not responding earlier. I think we should avoid cross-posting design discussions on both mailing lists, because there are a lot of them right now. People who are interested should follow this ML.

Thanks,
Timo

[0] https://docs.google.com/document/d/1gNRww9mZJcHvUDCXklzjFEQGpefsuR_akCDfWsdE35Q/edit#
[1] https://docs.google.com/document/d/1a9HUb6OaBIoj9IRfbILcMFPrOL7ALeZ3rVI66dvA2_U/edit#

Am 28.03.19 um 17:24 schrieb Rong Rong:
> [...]