Hi devs, I wanted to bring another improvement proposal up for a discussion. Often users need to adjust existing tables slightly. This is especially useful when users need to enhance a table created from an external tool (e.g. HIVE) with Flink's specific information such as e.g watermarks. It can also be a useful tool for ETL processes, e.g. merging two tables into a single one with a different connector. My suggestion would be to support an optional Feature T171, “LIKE clause in table definition” of SQL standard 2008. You can see the description of the proposal here: https://cwiki.apache.org/confluence/display/FLINK/FLIP-110%3A+Support+LIKE+clause+in+CREATE+TABLE Looking forward for your comments. Best, Dawid signature.asc (849 bytes) Download Attachment |
Hi Dawid,
Thanks for starting this discussion. I like the idea. Once we support more intergrated catalogs, e.g. ConfluentSchemaRegistryCatalog, this problem will be more urgent. Because it's very common to adjust existing tables in catalog slightly. My initial thought was introducing INHERITS keyword, which is also supported in PostgreSQL [1]. This is also similar to the functionality of Hive CREATE TABLE LIKE [2]. CREATE TEMPORARY TABLE MyTable (WATERMARK FOR ts) INHERITS cat.db.KafkoTopic CREATE TEMPORARY TABLE MyTable (WATERMARK FOR ts) INHERITS cat.db.KafkoTopic WITH ('k' = 'v') The INHERITS can inherit an existing table with all columns, watermark, and properties, but the properties and watermark and be overwrited explicitly. The reason I prefer INHERITS rather than LIKE is the keyword position. We are copying an existing table definition including the properties. However, LIKE appears in the schema part, it sounds like copying properties into schema part of DDL. Besides of that, I'm not sure whether the use case stands "merging two tables into a single one with a different connector". From my understanding, most use cases are just slightly adjusting on an existing catalog table with new properties or watermarks. Do we really need to merge two table definitions into a single one? For example, is it possible to merge a Kafka table definition and a Filesystem table definition into a new Kafka table, and the new Kafka table exactly matches the underlying physical data format? Best, Jark [1]: https://www.postgresql.org/docs/9.5/sql-createtable.html [2]: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-CreateTableLike On Tue, 3 Mar 2020 at 21:12, Dawid Wysakowicz <[hidden email]> wrote: > Hi devs, > > I wanted to bring another improvement proposal up for a discussion. Often > users need to adjust existing tables slightly. This is especially useful > when users need to enhance a table created from an external tool (e.g. > HIVE) with Flink's specific information such as e.g watermarks. It can also > be a useful tool for ETL processes, e.g. merging two tables into a single > one with a different connector. My suggestion would be to support an > optional *Feature T171, “LIKE clause in table definition” *of SQL > standard 2008. > > You can see the description of the proposal here: > https://cwiki.apache.org/confluence/display/FLINK/FLIP-110%3A+Support+LIKE+clause+in+CREATE+TABLE > > Looking forward for your comments. > > Best, > > Dawid > |
Hi Jark,
I did investigate the INHERITS clause, but it has a semantic that in my opinion we definitely don't want to support. INHERITS creates a new table with a "link" to the original table. Therefore if you e.g change the schema of the original table it's also reflected in the child table. It's also possible for tables like A inherits B query them like Select * from only A, by default it returns results from both tables. I am pretty sure it's not what we're looking for. PostgreSQL implements both the LIKE clause and INHERITS. I am open for discussion if we should support multiple LIKE statements or not. Standard also allows declaring the clause after the schema part. We can also do it. Nevertheless I think including multiple tables might be useful, e.g. when you want to union two tables and output to the same Kafka cluster and just change the target topic. I know it's not a very common use case but it's not a big effort to support it. Let me know what you think. Best, Dawid On Wed, 4 Mar 2020, 04:55 Jark Wu, <[hidden email]> wrote: > Hi Dawid, > > Thanks for starting this discussion. I like the idea. > Once we support more intergrated catalogs, > e.g. ConfluentSchemaRegistryCatalog, this problem will be more urgent. > Because it's very common to adjust existing tables in catalog slightly. > > My initial thought was introducing INHERITS keyword, which is also > supported in PostgreSQL [1]. > This is also similar to the functionality of Hive CREATE TABLE LIKE [2]. > > CREATE TEMPORARY TABLE MyTable (WATERMARK FOR ts) INHERITS > cat.db.KafkoTopic > CREATE TEMPORARY TABLE MyTable (WATERMARK FOR ts) INHERITS > cat.db.KafkoTopic WITH ('k' = 'v') > > The INHERITS can inherit an existing table with all columns, watermark, and > properties, but the properties and watermark and be overwrited explicitly. > > The reason I prefer INHERITS rather than LIKE is the keyword position. We > are copying an existing table definition including the properties. > However, LIKE appears in the schema part, it sounds like copying properties > into schema part of DDL. > > Besides of that, I'm not sure whether the use case stands "merging two > tables into a single one with a different connector". > From my understanding, most use cases are just slightly adjusting on an > existing catalog table with new properties or watermarks. > Do we really need to merge two table definitions into a single one? For > example, is it possible to merge a Kafka table definition and > a Filesystem table definition into a new Kafka table, and the new Kafka > table exactly matches the underlying physical data format? > > Best, > Jark > > [1]: https://www.postgresql.org/docs/9.5/sql-createtable.html > [2]: > > https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-CreateTableLike > > > On Tue, 3 Mar 2020 at 21:12, Dawid Wysakowicz <[hidden email]> > wrote: > > > Hi devs, > > > > I wanted to bring another improvement proposal up for a discussion. Often > > users need to adjust existing tables slightly. This is especially useful > > when users need to enhance a table created from an external tool (e.g. > > HIVE) with Flink's specific information such as e.g watermarks. It can > also > > be a useful tool for ETL processes, e.g. merging two tables into a single > > one with a different connector. My suggestion would be to support an > > optional *Feature T171, “LIKE clause in table definition” *of SQL > > standard 2008. > > > > You can see the description of the proposal here: > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-110%3A+Support+LIKE+clause+in+CREATE+TABLE > > > > Looking forward for your comments. > > > > Best, > > > > Dawid > > > |
Thanks Dawid for starting this discussion.
I like the "LIKE". 1.For "INHERITS", I think this is a good feature too, yes, ALTER TABLE will propagate any changes in column data definitions and check constraints down the inheritance hierarchy. A inherits B, A and B share every things, they have the same kafka topic. If modify schema of B, this means underlying kafka topic schema changed, so I think it is good to modify A too. If this for "ConfluentSchemaRegistryCatalog" mention by Jark, I think sometimes this is just we want. But "LIKE" also very useful for many cases. 2.For LIKE statement in schema, I know two kinds of like syntax, one is MySQL/hive/sqlserver, the other is PostgreSQL. I prefer former: - In the FLIP, there is "OVERWRITING OPTIONS", this will overwrite properties in "with"? This looks weird, because "LIKE" is in schema, but it can affect outside properties. Best, Jingsong Lee On Wed, Mar 4, 2020 at 2:05 PM Dawid Wysakowicz <[hidden email]> wrote: > Hi Jark, > I did investigate the INHERITS clause, but it has a semantic that in my > opinion we definitely don't want to support. INHERITS creates a new table > with a "link" to the original table. Therefore if you e.g change the schema > of the original table it's also reflected in the child table. It's also > possible for tables like A inherits B query them like Select * from only A, > by default it returns results from both tables. I am pretty sure it's not > what we're looking for. > > PostgreSQL implements both the LIKE clause and INHERITS. I am open for > discussion if we should support multiple LIKE statements or not. Standard > also allows declaring the clause after the schema part. We can also do it. > Nevertheless I think including multiple tables might be useful, e.g. when > you want to union two tables and output to the same Kafka cluster and just > change the target topic. I know it's not a very common use case but it's > not a big effort to support it. > > Let me know what you think. > > Best, > Dawid > > On Wed, 4 Mar 2020, 04:55 Jark Wu, <[hidden email]> wrote: > > > Hi Dawid, > > > > Thanks for starting this discussion. I like the idea. > > Once we support more intergrated catalogs, > > e.g. ConfluentSchemaRegistryCatalog, this problem will be more urgent. > > Because it's very common to adjust existing tables in catalog slightly. > > > > My initial thought was introducing INHERITS keyword, which is also > > supported in PostgreSQL [1]. > > This is also similar to the functionality of Hive CREATE TABLE LIKE [2]. > > > > CREATE TEMPORARY TABLE MyTable (WATERMARK FOR ts) INHERITS > > cat.db.KafkoTopic > > CREATE TEMPORARY TABLE MyTable (WATERMARK FOR ts) INHERITS > > cat.db.KafkoTopic WITH ('k' = 'v') > > > > The INHERITS can inherit an existing table with all columns, watermark, > and > > properties, but the properties and watermark and be overwrited > explicitly. > > > > The reason I prefer INHERITS rather than LIKE is the keyword position. We > > are copying an existing table definition including the properties. > > However, LIKE appears in the schema part, it sounds like copying > properties > > into schema part of DDL. > > > > Besides of that, I'm not sure whether the use case stands "merging two > > tables into a single one with a different connector". > > From my understanding, most use cases are just slightly adjusting on an > > existing catalog table with new properties or watermarks. > > Do we really need to merge two table definitions into a single one? For > > example, is it possible to merge a Kafka table definition and > > a Filesystem table definition into a new Kafka table, and the new Kafka > > table exactly matches the underlying physical data format? > > > > Best, > > Jark > > > > [1]: https://www.postgresql.org/docs/9.5/sql-createtable.html > > [2]: > > > > > https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-CreateTableLike > > > > > > On Tue, 3 Mar 2020 at 21:12, Dawid Wysakowicz <[hidden email]> > > wrote: > > > > > Hi devs, > > > > > > I wanted to bring another improvement proposal up for a discussion. > Often > > > users need to adjust existing tables slightly. This is especially > useful > > > when users need to enhance a table created from an external tool (e.g. > > > HIVE) with Flink's specific information such as e.g watermarks. It can > > also > > > be a useful tool for ETL processes, e.g. merging two tables into a > single > > > one with a different connector. My suggestion would be to support an > > > optional *Feature T171, “LIKE clause in table definition” *of SQL > > > standard 2008. > > > > > > You can see the description of the proposal here: > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-110%3A+Support+LIKE+clause+in+CREATE+TABLE > > > > > > Looking forward for your comments. > > > > > > Best, > > > > > > Dawid > > > > > > -- Best, Jingsong Lee |
Hi Dawid,
> INHERITS creates a new table with a "link" to the original table. Yes, INHERITS is a "link" to the original table in PostgreSQL. But INHERITS is not SQL standard, I think it's fine for vendors to define theire semantics. > Standard also allows declaring the clause after the schema part. We can also do it. Is that true? I didn't find it in SQL standard. If this is true, I prefer to put LIKE after the schema part. ==================================== Hi Jingsong, The concern you mentioned in (2) is exactly my concern too. That's why I suggested INHERITS, or put LIKE after schema part. Best, Jark On Thu, 5 Mar 2020 at 12:05, Jingsong Li <[hidden email]> wrote: > Thanks Dawid for starting this discussion. > > I like the "LIKE". > > 1.For "INHERITS", I think this is a good feature too, yes, ALTER TABLE will > propagate any changes in column data definitions and check constraints down > the inheritance hierarchy. A inherits B, A and B share every things, they > have the same kafka topic. If modify schema of B, this means underlying > kafka topic schema changed, so I think it is good to modify A too. If this > for "ConfluentSchemaRegistryCatalog" mention by Jark, I think sometimes > this is just we want. > But "LIKE" also very useful for many cases. > > 2.For LIKE statement in schema, I know two kinds of like syntax, one is > MySQL/hive/sqlserver, the other is PostgreSQL. I prefer former: > - In the FLIP, there is "OVERWRITING OPTIONS", this will overwrite > properties in "with"? This looks weird, because "LIKE" is in schema, but it > can affect outside properties. > > Best, > Jingsong Lee > > On Wed, Mar 4, 2020 at 2:05 PM Dawid Wysakowicz <[hidden email]> > wrote: > > > Hi Jark, > > I did investigate the INHERITS clause, but it has a semantic that in my > > opinion we definitely don't want to support. INHERITS creates a new table > > with a "link" to the original table. Therefore if you e.g change the > schema > > of the original table it's also reflected in the child table. It's also > > possible for tables like A inherits B query them like Select * from only > A, > > by default it returns results from both tables. I am pretty sure it's not > > what we're looking for. > > > > PostgreSQL implements both the LIKE clause and INHERITS. I am open for > > discussion if we should support multiple LIKE statements or not. Standard > > also allows declaring the clause after the schema part. We can also do > it. > > Nevertheless I think including multiple tables might be useful, e.g. when > > you want to union two tables and output to the same Kafka cluster and > just > > change the target topic. I know it's not a very common use case but it's > > not a big effort to support it. > > > > Let me know what you think. > > > > Best, > > Dawid > > > > On Wed, 4 Mar 2020, 04:55 Jark Wu, <[hidden email]> wrote: > > > > > Hi Dawid, > > > > > > Thanks for starting this discussion. I like the idea. > > > Once we support more intergrated catalogs, > > > e.g. ConfluentSchemaRegistryCatalog, this problem will be more urgent. > > > Because it's very common to adjust existing tables in catalog slightly. > > > > > > My initial thought was introducing INHERITS keyword, which is also > > > supported in PostgreSQL [1]. > > > This is also similar to the functionality of Hive CREATE TABLE LIKE > [2]. > > > > > > CREATE TEMPORARY TABLE MyTable (WATERMARK FOR ts) INHERITS > > > cat.db.KafkoTopic > > > CREATE TEMPORARY TABLE MyTable (WATERMARK FOR ts) INHERITS > > > cat.db.KafkoTopic WITH ('k' = 'v') > > > > > > The INHERITS can inherit an existing table with all columns, watermark, > > and > > > properties, but the properties and watermark and be overwrited > > explicitly. > > > > > > The reason I prefer INHERITS rather than LIKE is the keyword position. > We > > > are copying an existing table definition including the properties. > > > However, LIKE appears in the schema part, it sounds like copying > > properties > > > into schema part of DDL. > > > > > > Besides of that, I'm not sure whether the use case stands "merging two > > > tables into a single one with a different connector". > > > From my understanding, most use cases are just slightly adjusting on an > > > existing catalog table with new properties or watermarks. > > > Do we really need to merge two table definitions into a single one? For > > > example, is it possible to merge a Kafka table definition and > > > a Filesystem table definition into a new Kafka table, and the new Kafka > > > table exactly matches the underlying physical data format? > > > > > > Best, > > > Jark > > > > > > [1]: https://www.postgresql.org/docs/9.5/sql-createtable.html > > > [2]: > > > > > > > > > https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-CreateTableLike > > > > > > > > > On Tue, 3 Mar 2020 at 21:12, Dawid Wysakowicz <[hidden email]> > > > wrote: > > > > > > > Hi devs, > > > > > > > > I wanted to bring another improvement proposal up for a discussion. > > Often > > > > users need to adjust existing tables slightly. This is especially > > useful > > > > when users need to enhance a table created from an external tool > (e.g. > > > > HIVE) with Flink's specific information such as e.g watermarks. It > can > > > also > > > > be a useful tool for ETL processes, e.g. merging two tables into a > > single > > > > one with a different connector. My suggestion would be to support an > > > > optional *Feature T171, “LIKE clause in table definition” *of SQL > > > > standard 2008. > > > > > > > > You can see the description of the proposal here: > > > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-110%3A+Support+LIKE+clause+in+CREATE+TABLE > > > > > > > > Looking forward for your comments. > > > > > > > > Best, > > > > > > > > Dawid > > > > > > > > > > > > -- > Best, Jingsong Lee > |
Sorry for a late reply, but I was on vacation.
As for putting the LIKE after the schema part. You're right, sql standard lets it be only in the schema part. I was mislead by examples for DB2 and MYSQL, which differ from the standard in that respect. My bad, sorry. Nevertheless I'd still be in favour of using the LIKE clause for that purpose rather than INHERITS. I'm fine with putting it after the schema part. The argument that it applies to the options part make sense to me. I must admit I am not a fan of the INHERITS clause. @Jar I'd not redefine the semantics of the INHERITS clause entirely. I am sure it will pose unnecessary confusion if it differs significantly from what was implemented for, let's be true, more popular vendors such as PostgreSQL. My biggest concern is that the INHERITS clause in PostgreSQL allows constructs such as SELECT * FROM ONLY B (where e.g. A INHERITS B). My understanding of the purpose of the INHERITS clause is that it really emulates inheritance that let's you create "nested" data sets. I think what we are more interested in is a way to adjust only the metadata of an already existing table. Moreover I prefer the LIKE clause as it is more widespread. In some way it is supported by PostgreSQL, DB2, SnowflakeDB, MySQL. Lastly @Jingsong, I am not sure about the "link" part. I know at first glance having a link and reflecting changes might seem appealing, but I am afraid it would pose more threads than it would give benefits. First of all it would make the LIKE/INHERITS clause unusable for creating e.g. hive tables or jdbc tables that could be used from other systems, as the link would not be understandable by those systems. Best, Dawid On 05/03/2020 07:46, Jark Wu wrote: > Hi Dawid, > >> INHERITS creates a new table with a "link" to the original table. > Yes, INHERITS is a "link" to the original table in PostgreSQL. > But INHERITS is not SQL standard, I think it's fine for vendors to define > theire semantics. > >> Standard also allows declaring the clause after the schema part. We can > also do it. > Is that true? I didn't find it in SQL standard. If this is true, I prefer > to put LIKE after the schema part. > > ==================================== > > Hi Jingsong, > > The concern you mentioned in (2) is exactly my concern too. That's why I > suggested INHERITS, or put LIKE after schema part. > > Best, > Jark > > On Thu, 5 Mar 2020 at 12:05, Jingsong Li <[hidden email]> wrote: > >> Thanks Dawid for starting this discussion. >> >> I like the "LIKE". >> >> 1.For "INHERITS", I think this is a good feature too, yes, ALTER TABLE will >> propagate any changes in column data definitions and check constraints down >> the inheritance hierarchy. A inherits B, A and B share every things, they >> have the same kafka topic. If modify schema of B, this means underlying >> kafka topic schema changed, so I think it is good to modify A too. If this >> for "ConfluentSchemaRegistryCatalog" mention by Jark, I think sometimes >> this is just we want. >> But "LIKE" also very useful for many cases. >> >> 2.For LIKE statement in schema, I know two kinds of like syntax, one is >> MySQL/hive/sqlserver, the other is PostgreSQL. I prefer former: >> - In the FLIP, there is "OVERWRITING OPTIONS", this will overwrite >> properties in "with"? This looks weird, because "LIKE" is in schema, but it >> can affect outside properties. >> >> Best, >> Jingsong Lee >> >> On Wed, Mar 4, 2020 at 2:05 PM Dawid Wysakowicz <[hidden email]> >> wrote: >> >>> Hi Jark, >>> I did investigate the INHERITS clause, but it has a semantic that in my >>> opinion we definitely don't want to support. INHERITS creates a new table >>> with a "link" to the original table. Therefore if you e.g change the >> schema >>> of the original table it's also reflected in the child table. It's also >>> possible for tables like A inherits B query them like Select * from only >> A, >>> by default it returns results from both tables. I am pretty sure it's not >>> what we're looking for. >>> >>> PostgreSQL implements both the LIKE clause and INHERITS. I am open for >>> discussion if we should support multiple LIKE statements or not. Standard >>> also allows declaring the clause after the schema part. We can also do >> it. >>> Nevertheless I think including multiple tables might be useful, e.g. when >>> you want to union two tables and output to the same Kafka cluster and >> just >>> change the target topic. I know it's not a very common use case but it's >>> not a big effort to support it. >>> >>> Let me know what you think. >>> >>> Best, >>> Dawid >>> >>> On Wed, 4 Mar 2020, 04:55 Jark Wu, <[hidden email]> wrote: >>> >>>> Hi Dawid, >>>> >>>> Thanks for starting this discussion. I like the idea. >>>> Once we support more intergrated catalogs, >>>> e.g. ConfluentSchemaRegistryCatalog, this problem will be more urgent. >>>> Because it's very common to adjust existing tables in catalog slightly. >>>> >>>> My initial thought was introducing INHERITS keyword, which is also >>>> supported in PostgreSQL [1]. >>>> This is also similar to the functionality of Hive CREATE TABLE LIKE >> [2]. >>>> CREATE TEMPORARY TABLE MyTable (WATERMARK FOR ts) INHERITS >>>> cat.db.KafkoTopic >>>> CREATE TEMPORARY TABLE MyTable (WATERMARK FOR ts) INHERITS >>>> cat.db.KafkoTopic WITH ('k' = 'v') >>>> >>>> The INHERITS can inherit an existing table with all columns, watermark, >>> and >>>> properties, but the properties and watermark and be overwrited >>> explicitly. >>>> The reason I prefer INHERITS rather than LIKE is the keyword position. >> We >>>> are copying an existing table definition including the properties. >>>> However, LIKE appears in the schema part, it sounds like copying >>> properties >>>> into schema part of DDL. >>>> >>>> Besides of that, I'm not sure whether the use case stands "merging two >>>> tables into a single one with a different connector". >>>> From my understanding, most use cases are just slightly adjusting on an >>>> existing catalog table with new properties or watermarks. >>>> Do we really need to merge two table definitions into a single one? For >>>> example, is it possible to merge a Kafka table definition and >>>> a Filesystem table definition into a new Kafka table, and the new Kafka >>>> table exactly matches the underlying physical data format? >>>> >>>> Best, >>>> Jark >>>> >>>> [1]: https://www.postgresql.org/docs/9.5/sql-createtable.html >>>> [2]: >>>> >>>> >> https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-CreateTableLike >>>> >>>> On Tue, 3 Mar 2020 at 21:12, Dawid Wysakowicz <[hidden email]> >>>> wrote: >>>> >>>>> Hi devs, >>>>> >>>>> I wanted to bring another improvement proposal up for a discussion. >>> Often >>>>> users need to adjust existing tables slightly. This is especially >>> useful >>>>> when users need to enhance a table created from an external tool >> (e.g. >>>>> HIVE) with Flink's specific information such as e.g watermarks. It >> can >>>> also >>>>> be a useful tool for ETL processes, e.g. merging two tables into a >>> single >>>>> one with a different connector. My suggestion would be to support an >>>>> optional *Feature T171, “LIKE clause in table definition” *of SQL >>>>> standard 2008. >>>>> >>>>> You can see the description of the proposal here: >>>>> >> https://cwiki.apache.org/confluence/display/FLINK/FLIP-110%3A+Support+LIKE+clause+in+CREATE+TABLE >>>>> Looking forward for your comments. >>>>> >>>>> Best, >>>>> >>>>> Dawid >>>>> >> >> -- >> Best, Jingsong Lee >> signature.asc (849 bytes) Download Attachment |
+1 to use LIKE and put after schema part.
I also prefer the keyword LIKE than INHERITS, because it's easier to type and understand, for a non-native English user :) But I would like to limit a single LIKE clause in the DDL in the first version. We can allow multiple LIKE clause in the future if needed. Best, Jark On Tue, 24 Mar 2020 at 19:03, Dawid Wysakowicz <[hidden email]> wrote: > Sorry for a late reply, but I was on vacation. > > As for putting the LIKE after the schema part. You're right, sql > standard lets it be only in the schema part. I was mislead by examples > for DB2 and MYSQL, which differ from the standard in that respect. My > bad, sorry. > > Nevertheless I'd still be in favour of using the LIKE clause for that > purpose rather than INHERITS. I'm fine with putting it after the schema > part. The argument that it applies to the options part make sense to me. > > I must admit I am not a fan of the INHERITS clause. @Jar I'd not > redefine the semantics of the INHERITS clause entirely. I am sure it > will pose unnecessary confusion if it differs significantly from what > was implemented for, let's be true, more popular vendors such as > PostgreSQL. My biggest concern is that the INHERITS clause in PostgreSQL > allows constructs such as SELECT * FROM ONLY B (where e.g. A INHERITS > B). My understanding of the purpose of the INHERITS clause is that it > really emulates inheritance that let's you create "nested" data sets. I > think what we are more interested in is a way to adjust only the > metadata of an already existing table. > > Moreover I prefer the LIKE clause as it is more widespread. In some way > it is supported by PostgreSQL, DB2, SnowflakeDB, MySQL. > > Lastly @Jingsong, I am not sure about the "link" part. I know at first > glance having a link and reflecting changes might seem appealing, but I > am afraid it would pose more threads than it would give benefits. First > of all it would make the LIKE/INHERITS clause unusable for creating e.g. > hive tables or jdbc tables that could be used from other systems, as the > link would not be understandable by those systems. > > Best, > > Dawid > > > > On 05/03/2020 07:46, Jark Wu wrote: > > Hi Dawid, > > > >> INHERITS creates a new table with a "link" to the original table. > > Yes, INHERITS is a "link" to the original table in PostgreSQL. > > But INHERITS is not SQL standard, I think it's fine for vendors to define > > theire semantics. > > > >> Standard also allows declaring the clause after the schema part. We can > > also do it. > > Is that true? I didn't find it in SQL standard. If this is true, I prefer > > to put LIKE after the schema part. > > > > ==================================== > > > > Hi Jingsong, > > > > The concern you mentioned in (2) is exactly my concern too. That's why I > > suggested INHERITS, or put LIKE after schema part. > > > > Best, > > Jark > > > > On Thu, 5 Mar 2020 at 12:05, Jingsong Li <[hidden email]> wrote: > > > >> Thanks Dawid for starting this discussion. > >> > >> I like the "LIKE". > >> > >> 1.For "INHERITS", I think this is a good feature too, yes, ALTER TABLE > will > >> propagate any changes in column data definitions and check constraints > down > >> the inheritance hierarchy. A inherits B, A and B share every things, > they > >> have the same kafka topic. If modify schema of B, this means underlying > >> kafka topic schema changed, so I think it is good to modify A too. If > this > >> for "ConfluentSchemaRegistryCatalog" mention by Jark, I think sometimes > >> this is just we want. > >> But "LIKE" also very useful for many cases. > >> > >> 2.For LIKE statement in schema, I know two kinds of like syntax, one is > >> MySQL/hive/sqlserver, the other is PostgreSQL. I prefer former: > >> - In the FLIP, there is "OVERWRITING OPTIONS", this will overwrite > >> properties in "with"? This looks weird, because "LIKE" is in schema, > but it > >> can affect outside properties. > >> > >> Best, > >> Jingsong Lee > >> > >> On Wed, Mar 4, 2020 at 2:05 PM Dawid Wysakowicz <[hidden email] > > > >> wrote: > >> > >>> Hi Jark, > >>> I did investigate the INHERITS clause, but it has a semantic that in my > >>> opinion we definitely don't want to support. INHERITS creates a new > table > >>> with a "link" to the original table. Therefore if you e.g change the > >> schema > >>> of the original table it's also reflected in the child table. It's also > >>> possible for tables like A inherits B query them like Select * from > only > >> A, > >>> by default it returns results from both tables. I am pretty sure it's > not > >>> what we're looking for. > >>> > >>> PostgreSQL implements both the LIKE clause and INHERITS. I am open for > >>> discussion if we should support multiple LIKE statements or not. > Standard > >>> also allows declaring the clause after the schema part. We can also do > >> it. > >>> Nevertheless I think including multiple tables might be useful, e.g. > when > >>> you want to union two tables and output to the same Kafka cluster and > >> just > >>> change the target topic. I know it's not a very common use case but > it's > >>> not a big effort to support it. > >>> > >>> Let me know what you think. > >>> > >>> Best, > >>> Dawid > >>> > >>> On Wed, 4 Mar 2020, 04:55 Jark Wu, <[hidden email]> wrote: > >>> > >>>> Hi Dawid, > >>>> > >>>> Thanks for starting this discussion. I like the idea. > >>>> Once we support more intergrated catalogs, > >>>> e.g. ConfluentSchemaRegistryCatalog, this problem will be more urgent. > >>>> Because it's very common to adjust existing tables in catalog > slightly. > >>>> > >>>> My initial thought was introducing INHERITS keyword, which is also > >>>> supported in PostgreSQL [1]. > >>>> This is also similar to the functionality of Hive CREATE TABLE LIKE > >> [2]. > >>>> CREATE TEMPORARY TABLE MyTable (WATERMARK FOR ts) INHERITS > >>>> cat.db.KafkoTopic > >>>> CREATE TEMPORARY TABLE MyTable (WATERMARK FOR ts) INHERITS > >>>> cat.db.KafkoTopic WITH ('k' = 'v') > >>>> > >>>> The INHERITS can inherit an existing table with all columns, > watermark, > >>> and > >>>> properties, but the properties and watermark and be overwrited > >>> explicitly. > >>>> The reason I prefer INHERITS rather than LIKE is the keyword position. > >> We > >>>> are copying an existing table definition including the properties. > >>>> However, LIKE appears in the schema part, it sounds like copying > >>> properties > >>>> into schema part of DDL. > >>>> > >>>> Besides of that, I'm not sure whether the use case stands "merging two > >>>> tables into a single one with a different connector". > >>>> From my understanding, most use cases are just slightly adjusting on > an > >>>> existing catalog table with new properties or watermarks. > >>>> Do we really need to merge two table definitions into a single one? > For > >>>> example, is it possible to merge a Kafka table definition and > >>>> a Filesystem table definition into a new Kafka table, and the new > Kafka > >>>> table exactly matches the underlying physical data format? > >>>> > >>>> Best, > >>>> Jark > >>>> > >>>> [1]: https://www.postgresql.org/docs/9.5/sql-createtable.html > >>>> [2]: > >>>> > >>>> > >> > https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-CreateTableLike > >>>> > >>>> On Tue, 3 Mar 2020 at 21:12, Dawid Wysakowicz <[hidden email] > > > >>>> wrote: > >>>> > >>>>> Hi devs, > >>>>> > >>>>> I wanted to bring another improvement proposal up for a discussion. > >>> Often > >>>>> users need to adjust existing tables slightly. This is especially > >>> useful > >>>>> when users need to enhance a table created from an external tool > >> (e.g. > >>>>> HIVE) with Flink's specific information such as e.g watermarks. It > >> can > >>>> also > >>>>> be a useful tool for ETL processes, e.g. merging two tables into a > >>> single > >>>>> one with a different connector. My suggestion would be to support an > >>>>> optional *Feature T171, “LIKE clause in table definition” *of SQL > >>>>> standard 2008. > >>>>> > >>>>> You can see the description of the proposal here: > >>>>> > >> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-110%3A+Support+LIKE+clause+in+CREATE+TABLE > >>>>> Looking forward for your comments. > >>>>> > >>>>> Best, > >>>>> > >>>>> Dawid > >>>>> > >> > >> -- > >> Best, Jingsong Lee > >> > > |
Hi Dawid,
thanks for your design document. LIKE vs. INHERITS: I would also not start creating transitive dependencies for table metadata. This is very complicated to maintain in a long-term, esp. when we ALTER or DELETE a table. Instead the new table metadata should be materialized immediately before storing it in a catalog. We can consider INHERITS in the future, once we see a need for it. LIKE in schema vs. out of schema: I'm fine with following DB2 syntax here. A LIKE after the schema but with more options what should be inherited. That would be richer than Hive and closer to the standard (even though it comes after the schema part). I also think that inheriting columns from multiple tables is not a very common case, except for dimension tables maybe. Regards, Timo On 24.03.20 14:02, Jark Wu wrote: > +1 to use LIKE and put after schema part. > I also prefer the keyword LIKE than INHERITS, because it's easier to type > and understand, for a non-native English user :) > But I would like to limit a single LIKE clause in the DDL in the first > version. We can allow multiple LIKE clause in the future if needed. > > Best, > Jark > > On Tue, 24 Mar 2020 at 19:03, Dawid Wysakowicz <[hidden email]> > wrote: > >> Sorry for a late reply, but I was on vacation. >> >> As for putting the LIKE after the schema part. You're right, sql >> standard lets it be only in the schema part. I was mislead by examples >> for DB2 and MYSQL, which differ from the standard in that respect. My >> bad, sorry. >> >> Nevertheless I'd still be in favour of using the LIKE clause for that >> purpose rather than INHERITS. I'm fine with putting it after the schema >> part. The argument that it applies to the options part make sense to me. >> >> I must admit I am not a fan of the INHERITS clause. @Jar I'd not >> redefine the semantics of the INHERITS clause entirely. I am sure it >> will pose unnecessary confusion if it differs significantly from what >> was implemented for, let's be true, more popular vendors such as >> PostgreSQL. My biggest concern is that the INHERITS clause in PostgreSQL >> allows constructs such as SELECT * FROM ONLY B (where e.g. A INHERITS >> B). My understanding of the purpose of the INHERITS clause is that it >> really emulates inheritance that let's you create "nested" data sets. I >> think what we are more interested in is a way to adjust only the >> metadata of an already existing table. >> >> Moreover I prefer the LIKE clause as it is more widespread. In some way >> it is supported by PostgreSQL, DB2, SnowflakeDB, MySQL. >> >> Lastly @Jingsong, I am not sure about the "link" part. I know at first >> glance having a link and reflecting changes might seem appealing, but I >> am afraid it would pose more threads than it would give benefits. First >> of all it would make the LIKE/INHERITS clause unusable for creating e.g. >> hive tables or jdbc tables that could be used from other systems, as the >> link would not be understandable by those systems. >> >> Best, >> >> Dawid >> >> >> >> On 05/03/2020 07:46, Jark Wu wrote: >>> Hi Dawid, >>> >>>> INHERITS creates a new table with a "link" to the original table. >>> Yes, INHERITS is a "link" to the original table in PostgreSQL. >>> But INHERITS is not SQL standard, I think it's fine for vendors to define >>> theire semantics. >>> >>>> Standard also allows declaring the clause after the schema part. We can >>> also do it. >>> Is that true? I didn't find it in SQL standard. If this is true, I prefer >>> to put LIKE after the schema part. >>> >>> ==================================== >>> >>> Hi Jingsong, >>> >>> The concern you mentioned in (2) is exactly my concern too. That's why I >>> suggested INHERITS, or put LIKE after schema part. >>> >>> Best, >>> Jark >>> >>> On Thu, 5 Mar 2020 at 12:05, Jingsong Li <[hidden email]> wrote: >>> >>>> Thanks Dawid for starting this discussion. >>>> >>>> I like the "LIKE". >>>> >>>> 1.For "INHERITS", I think this is a good feature too, yes, ALTER TABLE >> will >>>> propagate any changes in column data definitions and check constraints >> down >>>> the inheritance hierarchy. A inherits B, A and B share every things, >> they >>>> have the same kafka topic. If modify schema of B, this means underlying >>>> kafka topic schema changed, so I think it is good to modify A too. If >> this >>>> for "ConfluentSchemaRegistryCatalog" mention by Jark, I think sometimes >>>> this is just we want. >>>> But "LIKE" also very useful for many cases. >>>> >>>> 2.For LIKE statement in schema, I know two kinds of like syntax, one is >>>> MySQL/hive/sqlserver, the other is PostgreSQL. I prefer former: >>>> - In the FLIP, there is "OVERWRITING OPTIONS", this will overwrite >>>> properties in "with"? This looks weird, because "LIKE" is in schema, >> but it >>>> can affect outside properties. >>>> >>>> Best, >>>> Jingsong Lee >>>> >>>> On Wed, Mar 4, 2020 at 2:05 PM Dawid Wysakowicz <[hidden email] >>> >>>> wrote: >>>> >>>>> Hi Jark, >>>>> I did investigate the INHERITS clause, but it has a semantic that in my >>>>> opinion we definitely don't want to support. INHERITS creates a new >> table >>>>> with a "link" to the original table. Therefore if you e.g change the >>>> schema >>>>> of the original table it's also reflected in the child table. It's also >>>>> possible for tables like A inherits B query them like Select * from >> only >>>> A, >>>>> by default it returns results from both tables. I am pretty sure it's >> not >>>>> what we're looking for. >>>>> >>>>> PostgreSQL implements both the LIKE clause and INHERITS. I am open for >>>>> discussion if we should support multiple LIKE statements or not. >> Standard >>>>> also allows declaring the clause after the schema part. We can also do >>>> it. >>>>> Nevertheless I think including multiple tables might be useful, e.g. >> when >>>>> you want to union two tables and output to the same Kafka cluster and >>>> just >>>>> change the target topic. I know it's not a very common use case but >> it's >>>>> not a big effort to support it. >>>>> >>>>> Let me know what you think. >>>>> >>>>> Best, >>>>> Dawid >>>>> >>>>> On Wed, 4 Mar 2020, 04:55 Jark Wu, <[hidden email]> wrote: >>>>> >>>>>> Hi Dawid, >>>>>> >>>>>> Thanks for starting this discussion. I like the idea. >>>>>> Once we support more intergrated catalogs, >>>>>> e.g. ConfluentSchemaRegistryCatalog, this problem will be more urgent. >>>>>> Because it's very common to adjust existing tables in catalog >> slightly. >>>>>> >>>>>> My initial thought was introducing INHERITS keyword, which is also >>>>>> supported in PostgreSQL [1]. >>>>>> This is also similar to the functionality of Hive CREATE TABLE LIKE >>>> [2]. >>>>>> CREATE TEMPORARY TABLE MyTable (WATERMARK FOR ts) INHERITS >>>>>> cat.db.KafkoTopic >>>>>> CREATE TEMPORARY TABLE MyTable (WATERMARK FOR ts) INHERITS >>>>>> cat.db.KafkoTopic WITH ('k' = 'v') >>>>>> >>>>>> The INHERITS can inherit an existing table with all columns, >> watermark, >>>>> and >>>>>> properties, but the properties and watermark and be overwrited >>>>> explicitly. >>>>>> The reason I prefer INHERITS rather than LIKE is the keyword position. >>>> We >>>>>> are copying an existing table definition including the properties. >>>>>> However, LIKE appears in the schema part, it sounds like copying >>>>> properties >>>>>> into schema part of DDL. >>>>>> >>>>>> Besides of that, I'm not sure whether the use case stands "merging two >>>>>> tables into a single one with a different connector". >>>>>> From my understanding, most use cases are just slightly adjusting on >> an >>>>>> existing catalog table with new properties or watermarks. >>>>>> Do we really need to merge two table definitions into a single one? >> For >>>>>> example, is it possible to merge a Kafka table definition and >>>>>> a Filesystem table definition into a new Kafka table, and the new >> Kafka >>>>>> table exactly matches the underlying physical data format? >>>>>> >>>>>> Best, >>>>>> Jark >>>>>> >>>>>> [1]: https://www.postgresql.org/docs/9.5/sql-createtable.html >>>>>> [2]: >>>>>> >>>>>> >>>> >> https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-CreateTableLike >>>>>> >>>>>> On Tue, 3 Mar 2020 at 21:12, Dawid Wysakowicz <[hidden email] >>> >>>>>> wrote: >>>>>> >>>>>>> Hi devs, >>>>>>> >>>>>>> I wanted to bring another improvement proposal up for a discussion. >>>>> Often >>>>>>> users need to adjust existing tables slightly. This is especially >>>>> useful >>>>>>> when users need to enhance a table created from an external tool >>>> (e.g. >>>>>>> HIVE) with Flink's specific information such as e.g watermarks. It >>>> can >>>>>> also >>>>>>> be a useful tool for ETL processes, e.g. merging two tables into a >>>>> single >>>>>>> one with a different connector. My suggestion would be to support an >>>>>>> optional *Feature T171, “LIKE clause in table definition” *of SQL >>>>>>> standard 2008. >>>>>>> >>>>>>> You can see the description of the proposal here: >>>>>>> >>>> >> https://cwiki.apache.org/confluence/display/FLINK/FLIP-110%3A+Support+LIKE+clause+in+CREATE+TABLE >>>>>>> Looking forward for your comments. >>>>>>> >>>>>>> Best, >>>>>>> >>>>>>> Dawid >>>>>>> >>>> >>>> -- >>>> Best, Jingsong Lee >>>> >> >> > |
In reply to this post by Jark Wu-2
Thank you for your opinions. I updated the FLIP with results of the
discussion. Let me know if you have further concerns. Best, Dawid On 05/03/2020 07:46, Jark Wu wrote: > Hi Dawid, > >> INHERITS creates a new table with a "link" to the original table. > Yes, INHERITS is a "link" to the original table in PostgreSQL. > But INHERITS is not SQL standard, I think it's fine for vendors to define > theire semantics. > >> Standard also allows declaring the clause after the schema part. We can > also do it. > Is that true? I didn't find it in SQL standard. If this is true, I prefer > to put LIKE after the schema part. > > ==================================== > > Hi Jingsong, > > The concern you mentioned in (2) is exactly my concern too. That's why I > suggested INHERITS, or put LIKE after schema part. > > Best, > Jark > > On Thu, 5 Mar 2020 at 12:05, Jingsong Li <[hidden email]> wrote: > >> Thanks Dawid for starting this discussion. >> >> I like the "LIKE". >> >> 1.For "INHERITS", I think this is a good feature too, yes, ALTER TABLE will >> propagate any changes in column data definitions and check constraints down >> the inheritance hierarchy. A inherits B, A and B share every things, they >> have the same kafka topic. If modify schema of B, this means underlying >> kafka topic schema changed, so I think it is good to modify A too. If this >> for "ConfluentSchemaRegistryCatalog" mention by Jark, I think sometimes >> this is just we want. >> But "LIKE" also very useful for many cases. >> >> 2.For LIKE statement in schema, I know two kinds of like syntax, one is >> MySQL/hive/sqlserver, the other is PostgreSQL. I prefer former: >> - In the FLIP, there is "OVERWRITING OPTIONS", this will overwrite >> properties in "with"? This looks weird, because "LIKE" is in schema, but it >> can affect outside properties. >> >> Best, >> Jingsong Lee >> >> On Wed, Mar 4, 2020 at 2:05 PM Dawid Wysakowicz <[hidden email]> >> wrote: >> >>> Hi Jark, >>> I did investigate the INHERITS clause, but it has a semantic that in my >>> opinion we definitely don't want to support. INHERITS creates a new table >>> with a "link" to the original table. Therefore if you e.g change the >> schema >>> of the original table it's also reflected in the child table. It's also >>> possible for tables like A inherits B query them like Select * from only >> A, >>> by default it returns results from both tables. I am pretty sure it's not >>> what we're looking for. >>> >>> PostgreSQL implements both the LIKE clause and INHERITS. I am open for >>> discussion if we should support multiple LIKE statements or not. Standard >>> also allows declaring the clause after the schema part. We can also do >> it. >>> Nevertheless I think including multiple tables might be useful, e.g. when >>> you want to union two tables and output to the same Kafka cluster and >> just >>> change the target topic. I know it's not a very common use case but it's >>> not a big effort to support it. >>> >>> Let me know what you think. >>> >>> Best, >>> Dawid >>> >>> On Wed, 4 Mar 2020, 04:55 Jark Wu, <[hidden email]> wrote: >>> >>>> Hi Dawid, >>>> >>>> Thanks for starting this discussion. I like the idea. >>>> Once we support more intergrated catalogs, >>>> e.g. ConfluentSchemaRegistryCatalog, this problem will be more urgent. >>>> Because it's very common to adjust existing tables in catalog slightly. >>>> >>>> My initial thought was introducing INHERITS keyword, which is also >>>> supported in PostgreSQL [1]. >>>> This is also similar to the functionality of Hive CREATE TABLE LIKE >> [2]. >>>> CREATE TEMPORARY TABLE MyTable (WATERMARK FOR ts) INHERITS >>>> cat.db.KafkoTopic >>>> CREATE TEMPORARY TABLE MyTable (WATERMARK FOR ts) INHERITS >>>> cat.db.KafkoTopic WITH ('k' = 'v') >>>> >>>> The INHERITS can inherit an existing table with all columns, watermark, >>> and >>>> properties, but the properties and watermark and be overwrited >>> explicitly. >>>> The reason I prefer INHERITS rather than LIKE is the keyword position. >> We >>>> are copying an existing table definition including the properties. >>>> However, LIKE appears in the schema part, it sounds like copying >>> properties >>>> into schema part of DDL. >>>> >>>> Besides of that, I'm not sure whether the use case stands "merging two >>>> tables into a single one with a different connector". >>>> From my understanding, most use cases are just slightly adjusting on an >>>> existing catalog table with new properties or watermarks. >>>> Do we really need to merge two table definitions into a single one? For >>>> example, is it possible to merge a Kafka table definition and >>>> a Filesystem table definition into a new Kafka table, and the new Kafka >>>> table exactly matches the underlying physical data format? >>>> >>>> Best, >>>> Jark >>>> >>>> [1]: https://www.postgresql.org/docs/9.5/sql-createtable.html >>>> [2]: >>>> >>>> >> https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-CreateTableLike >>>> >>>> On Tue, 3 Mar 2020 at 21:12, Dawid Wysakowicz <[hidden email]> >>>> wrote: >>>> >>>>> Hi devs, >>>>> >>>>> I wanted to bring another improvement proposal up for a discussion. >>> Often >>>>> users need to adjust existing tables slightly. This is especially >>> useful >>>>> when users need to enhance a table created from an external tool >> (e.g. >>>>> HIVE) with Flink's specific information such as e.g watermarks. It >> can >>>> also >>>>> be a useful tool for ETL processes, e.g. merging two tables into a >>> single >>>>> one with a different connector. My suggestion would be to support an >>>>> optional *Feature T171, “LIKE clause in table definition” *of SQL >>>>> standard 2008. >>>>> >>>>> You can see the description of the proposal here: >>>>> >> https://cwiki.apache.org/confluence/display/FLINK/FLIP-110%3A+Support+LIKE+clause+in+CREATE+TABLE >>>>> Looking forward for your comments. >>>>> >>>>> Best, >>>>> >>>>> Dawid >>>>> >> >> -- >> Best, Jingsong Lee >> signature.asc (849 bytes) Download Attachment |
Hi Dawid,
thanks for updating the FLIP. One minor comment from my side, should we move the LIKE clause to the very end? CREATE TABLE X () WITH () LIKE ... Otherwise, the LIKE clause looks a bit lost if there are options afterwards. Otherwise, +1 for start a vote from my side. Regards, Timo On 25.03.20 15:30, Dawid Wysakowicz wrote: > Thank you for your opinions. I updated the FLIP with results of the > discussion. Let me know if you have further concerns. > > Best, > > Dawid > > On 05/03/2020 07:46, Jark Wu wrote: >> Hi Dawid, >> >>> INHERITS creates a new table with a "link" to the original table. >> Yes, INHERITS is a "link" to the original table in PostgreSQL. >> But INHERITS is not SQL standard, I think it's fine for vendors to define >> theire semantics. >> >>> Standard also allows declaring the clause after the schema part. We can >> also do it. >> Is that true? I didn't find it in SQL standard. If this is true, I prefer >> to put LIKE after the schema part. >> >> ==================================== >> >> Hi Jingsong, >> >> The concern you mentioned in (2) is exactly my concern too. That's why I >> suggested INHERITS, or put LIKE after schema part. >> >> Best, >> Jark >> >> On Thu, 5 Mar 2020 at 12:05, Jingsong Li <[hidden email]> wrote: >> >>> Thanks Dawid for starting this discussion. >>> >>> I like the "LIKE". >>> >>> 1.For "INHERITS", I think this is a good feature too, yes, ALTER TABLE will >>> propagate any changes in column data definitions and check constraints down >>> the inheritance hierarchy. A inherits B, A and B share every things, they >>> have the same kafka topic. If modify schema of B, this means underlying >>> kafka topic schema changed, so I think it is good to modify A too. If this >>> for "ConfluentSchemaRegistryCatalog" mention by Jark, I think sometimes >>> this is just we want. >>> But "LIKE" also very useful for many cases. >>> >>> 2.For LIKE statement in schema, I know two kinds of like syntax, one is >>> MySQL/hive/sqlserver, the other is PostgreSQL. I prefer former: >>> - In the FLIP, there is "OVERWRITING OPTIONS", this will overwrite >>> properties in "with"? This looks weird, because "LIKE" is in schema, but it >>> can affect outside properties. >>> >>> Best, >>> Jingsong Lee >>> >>> On Wed, Mar 4, 2020 at 2:05 PM Dawid Wysakowicz <[hidden email]> >>> wrote: >>> >>>> Hi Jark, >>>> I did investigate the INHERITS clause, but it has a semantic that in my >>>> opinion we definitely don't want to support. INHERITS creates a new table >>>> with a "link" to the original table. Therefore if you e.g change the >>> schema >>>> of the original table it's also reflected in the child table. It's also >>>> possible for tables like A inherits B query them like Select * from only >>> A, >>>> by default it returns results from both tables. I am pretty sure it's not >>>> what we're looking for. >>>> >>>> PostgreSQL implements both the LIKE clause and INHERITS. I am open for >>>> discussion if we should support multiple LIKE statements or not. Standard >>>> also allows declaring the clause after the schema part. We can also do >>> it. >>>> Nevertheless I think including multiple tables might be useful, e.g. when >>>> you want to union two tables and output to the same Kafka cluster and >>> just >>>> change the target topic. I know it's not a very common use case but it's >>>> not a big effort to support it. >>>> >>>> Let me know what you think. >>>> >>>> Best, >>>> Dawid >>>> >>>> On Wed, 4 Mar 2020, 04:55 Jark Wu, <[hidden email]> wrote: >>>> >>>>> Hi Dawid, >>>>> >>>>> Thanks for starting this discussion. I like the idea. >>>>> Once we support more intergrated catalogs, >>>>> e.g. ConfluentSchemaRegistryCatalog, this problem will be more urgent. >>>>> Because it's very common to adjust existing tables in catalog slightly. >>>>> >>>>> My initial thought was introducing INHERITS keyword, which is also >>>>> supported in PostgreSQL [1]. >>>>> This is also similar to the functionality of Hive CREATE TABLE LIKE >>> [2]. >>>>> CREATE TEMPORARY TABLE MyTable (WATERMARK FOR ts) INHERITS >>>>> cat.db.KafkoTopic >>>>> CREATE TEMPORARY TABLE MyTable (WATERMARK FOR ts) INHERITS >>>>> cat.db.KafkoTopic WITH ('k' = 'v') >>>>> >>>>> The INHERITS can inherit an existing table with all columns, watermark, >>>> and >>>>> properties, but the properties and watermark and be overwrited >>>> explicitly. >>>>> The reason I prefer INHERITS rather than LIKE is the keyword position. >>> We >>>>> are copying an existing table definition including the properties. >>>>> However, LIKE appears in the schema part, it sounds like copying >>>> properties >>>>> into schema part of DDL. >>>>> >>>>> Besides of that, I'm not sure whether the use case stands "merging two >>>>> tables into a single one with a different connector". >>>>> From my understanding, most use cases are just slightly adjusting on an >>>>> existing catalog table with new properties or watermarks. >>>>> Do we really need to merge two table definitions into a single one? For >>>>> example, is it possible to merge a Kafka table definition and >>>>> a Filesystem table definition into a new Kafka table, and the new Kafka >>>>> table exactly matches the underlying physical data format? >>>>> >>>>> Best, >>>>> Jark >>>>> >>>>> [1]: https://www.postgresql.org/docs/9.5/sql-createtable.html >>>>> [2]: >>>>> >>>>> >>> https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-CreateTableLike >>>>> >>>>> On Tue, 3 Mar 2020 at 21:12, Dawid Wysakowicz <[hidden email]> >>>>> wrote: >>>>> >>>>>> Hi devs, >>>>>> >>>>>> I wanted to bring another improvement proposal up for a discussion. >>>> Often >>>>>> users need to adjust existing tables slightly. This is especially >>>> useful >>>>>> when users need to enhance a table created from an external tool >>> (e.g. >>>>>> HIVE) with Flink's specific information such as e.g watermarks. It >>> can >>>>> also >>>>>> be a useful tool for ETL processes, e.g. merging two tables into a >>>> single >>>>>> one with a different connector. My suggestion would be to support an >>>>>> optional *Feature T171, “LIKE clause in table definition” *of SQL >>>>>> standard 2008. >>>>>> >>>>>> You can see the description of the proposal here: >>>>>> >>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-110%3A+Support+LIKE+clause+in+CREATE+TABLE >>>>>> Looking forward for your comments. >>>>>> >>>>>> Best, >>>>>> >>>>>> Dawid >>>>>> >>> >>> -- >>> Best, Jingsong Lee >>> > |
Hi Timo,
I think your suggestion makes sense. I updated the document. As there are no more comments I will start a vote for it. Best, Dawid On 30/03/2020 16:40, Timo Walther wrote: > Hi Dawid, > > thanks for updating the FLIP. One minor comment from my side, should > we move the LIKE clause to the very end? > > CREATE TABLE X () WITH () LIKE ... > > Otherwise, the LIKE clause looks a bit lost if there are options > afterwards. Otherwise, +1 for start a vote from my side. > > Regards, > Timo > > > On 25.03.20 15:30, Dawid Wysakowicz wrote: >> Thank you for your opinions. I updated the FLIP with results of the >> discussion. Let me know if you have further concerns. >> >> Best, >> >> Dawid >> >> On 05/03/2020 07:46, Jark Wu wrote: >>> Hi Dawid, >>> >>>> INHERITS creates a new table with a "link" to the original table. >>> Yes, INHERITS is a "link" to the original table in PostgreSQL. >>> But INHERITS is not SQL standard, I think it's fine for vendors to >>> define >>> theire semantics. >>> >>>> Standard also allows declaring the clause after the schema part. We >>>> can >>> also do it. >>> Is that true? I didn't find it in SQL standard. If this is true, I >>> prefer >>> to put LIKE after the schema part. >>> >>> ==================================== >>> >>> Hi Jingsong, >>> >>> The concern you mentioned in (2) is exactly my concern too. That's >>> why I >>> suggested INHERITS, or put LIKE after schema part. >>> >>> Best, >>> Jark >>> >>> On Thu, 5 Mar 2020 at 12:05, Jingsong Li <[hidden email]> >>> wrote: >>> >>>> Thanks Dawid for starting this discussion. >>>> >>>> I like the "LIKE". >>>> >>>> 1.For "INHERITS", I think this is a good feature too, yes, ALTER >>>> TABLE will >>>> propagate any changes in column data definitions and check >>>> constraints down >>>> the inheritance hierarchy. A inherits B, A and B share every >>>> things, they >>>> have the same kafka topic. If modify schema of B, this means >>>> underlying >>>> kafka topic schema changed, so I think it is good to modify A too. >>>> If this >>>> for "ConfluentSchemaRegistryCatalog" mention by Jark, I think >>>> sometimes >>>> this is just we want. >>>> But "LIKE" also very useful for many cases. >>>> >>>> 2.For LIKE statement in schema, I know two kinds of like syntax, >>>> one is >>>> MySQL/hive/sqlserver, the other is PostgreSQL. I prefer former: >>>> - In the FLIP, there is "OVERWRITING OPTIONS", this will overwrite >>>> properties in "with"? This looks weird, because "LIKE" is in >>>> schema, but it >>>> can affect outside properties. >>>> >>>> Best, >>>> Jingsong Lee >>>> >>>> On Wed, Mar 4, 2020 at 2:05 PM Dawid Wysakowicz >>>> <[hidden email]> >>>> wrote: >>>> >>>>> Hi Jark, >>>>> I did investigate the INHERITS clause, but it has a semantic that >>>>> in my >>>>> opinion we definitely don't want to support. INHERITS creates a >>>>> new table >>>>> with a "link" to the original table. Therefore if you e.g change the >>>> schema >>>>> of the original table it's also reflected in the child table. It's >>>>> also >>>>> possible for tables like A inherits B query them like Select * >>>>> from only >>>> A, >>>>> by default it returns results from both tables. I am pretty sure >>>>> it's not >>>>> what we're looking for. >>>>> >>>>> PostgreSQL implements both the LIKE clause and INHERITS. I am open >>>>> for >>>>> discussion if we should support multiple LIKE statements or not. >>>>> Standard >>>>> also allows declaring the clause after the schema part. We can >>>>> also do >>>> it. >>>>> Nevertheless I think including multiple tables might be useful, >>>>> e.g. when >>>>> you want to union two tables and output to the same Kafka cluster and >>>> just >>>>> change the target topic. I know it's not a very common use case >>>>> but it's >>>>> not a big effort to support it. >>>>> >>>>> Let me know what you think. >>>>> >>>>> Best, >>>>> Dawid >>>>> >>>>> On Wed, 4 Mar 2020, 04:55 Jark Wu, <[hidden email]> wrote: >>>>> >>>>>> Hi Dawid, >>>>>> >>>>>> Thanks for starting this discussion. I like the idea. >>>>>> Once we support more intergrated catalogs, >>>>>> e.g. ConfluentSchemaRegistryCatalog, this problem will be more >>>>>> urgent. >>>>>> Because it's very common to adjust existing tables in catalog >>>>>> slightly. >>>>>> >>>>>> My initial thought was introducing INHERITS keyword, which is also >>>>>> supported in PostgreSQL [1]. >>>>>> This is also similar to the functionality of Hive CREATE TABLE LIKE >>>> [2]. >>>>>> CREATE TEMPORARY TABLE MyTable (WATERMARK FOR ts) INHERITS >>>>>> cat.db.KafkoTopic >>>>>> CREATE TEMPORARY TABLE MyTable (WATERMARK FOR ts) INHERITS >>>>>> cat.db.KafkoTopic WITH ('k' = 'v') >>>>>> >>>>>> The INHERITS can inherit an existing table with all columns, >>>>>> watermark, >>>>> and >>>>>> properties, but the properties and watermark and be overwrited >>>>> explicitly. >>>>>> The reason I prefer INHERITS rather than LIKE is the keyword >>>>>> position. >>>> We >>>>>> are copying an existing table definition including the properties. >>>>>> However, LIKE appears in the schema part, it sounds like copying >>>>> properties >>>>>> into schema part of DDL. >>>>>> >>>>>> Besides of that, I'm not sure whether the use case stands >>>>>> "merging two >>>>>> tables into a single one with a different connector". >>>>>> From my understanding, most use cases are just slightly >>>>>> adjusting on an >>>>>> existing catalog table with new properties or watermarks. >>>>>> Do we really need to merge two table definitions into a single >>>>>> one? For >>>>>> example, is it possible to merge a Kafka table definition and >>>>>> a Filesystem table definition into a new Kafka table, and the new >>>>>> Kafka >>>>>> table exactly matches the underlying physical data format? >>>>>> >>>>>> Best, >>>>>> Jark >>>>>> >>>>>> [1]: https://www.postgresql.org/docs/9.5/sql-createtable.html >>>>>> [2]: >>>>>> >>>>>> >>>> https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-CreateTableLike >>>> >>>>>> >>>>>> On Tue, 3 Mar 2020 at 21:12, Dawid Wysakowicz >>>>>> <[hidden email]> >>>>>> wrote: >>>>>> >>>>>>> Hi devs, >>>>>>> >>>>>>> I wanted to bring another improvement proposal up for a discussion. >>>>> Often >>>>>>> users need to adjust existing tables slightly. This is especially >>>>> useful >>>>>>> when users need to enhance a table created from an external tool >>>> (e.g. >>>>>>> HIVE) with Flink's specific information such as e.g watermarks. It >>>> can >>>>>> also >>>>>>> be a useful tool for ETL processes, e.g. merging two tables into a >>>>> single >>>>>>> one with a different connector. My suggestion would be to >>>>>>> support an >>>>>>> optional *Feature T171, “LIKE clause in table definition” *of SQL >>>>>>> standard 2008. >>>>>>> >>>>>>> You can see the description of the proposal here: >>>>>>> >>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-110%3A+Support+LIKE+clause+in+CREATE+TABLE >>>> >>>>>>> Looking forward for your comments. >>>>>>> >>>>>>> Best, >>>>>>> >>>>>>> Dawid >>>>>>> >>>> >>>> -- >>>> Best, Jingsong Lee >>>> >> > signature.asc (849 bytes) Download Attachment |
Hi Dawid,
Just two small questions: - Can you explain more about "CONSTRAINTS, GENERATED, OPTIONS" in the FLIP? I can image the meaning of "CONSTRAINTS, OPTIONS" in the example, but it is hard to guess "GENERATED". - Which category does partition keys belong to? (I am sorry if I've disturbed the vote thread, because in my Gmail view, they're the same thread.) Best, Jingsong Lee On Tue, Mar 31, 2020 at 3:30 PM Dawid Wysakowicz <[hidden email]> wrote: > Hi Timo, > > I think your suggestion makes sense. I updated the document. > > As there are no more comments I will start a vote for it. > > Best, > > Dawid > > On 30/03/2020 16:40, Timo Walther wrote: > > Hi Dawid, > > > > thanks for updating the FLIP. One minor comment from my side, should > > we move the LIKE clause to the very end? > > > > CREATE TABLE X () WITH () LIKE ... > > > > Otherwise, the LIKE clause looks a bit lost if there are options > > afterwards. Otherwise, +1 for start a vote from my side. > > > > Regards, > > Timo > > > > > > On 25.03.20 15:30, Dawid Wysakowicz wrote: > >> Thank you for your opinions. I updated the FLIP with results of the > >> discussion. Let me know if you have further concerns. > >> > >> Best, > >> > >> Dawid > >> > >> On 05/03/2020 07:46, Jark Wu wrote: > >>> Hi Dawid, > >>> > >>>> INHERITS creates a new table with a "link" to the original table. > >>> Yes, INHERITS is a "link" to the original table in PostgreSQL. > >>> But INHERITS is not SQL standard, I think it's fine for vendors to > >>> define > >>> theire semantics. > >>> > >>>> Standard also allows declaring the clause after the schema part. We > >>>> can > >>> also do it. > >>> Is that true? I didn't find it in SQL standard. If this is true, I > >>> prefer > >>> to put LIKE after the schema part. > >>> > >>> ==================================== > >>> > >>> Hi Jingsong, > >>> > >>> The concern you mentioned in (2) is exactly my concern too. That's > >>> why I > >>> suggested INHERITS, or put LIKE after schema part. > >>> > >>> Best, > >>> Jark > >>> > >>> On Thu, 5 Mar 2020 at 12:05, Jingsong Li <[hidden email]> > >>> wrote: > >>> > >>>> Thanks Dawid for starting this discussion. > >>>> > >>>> I like the "LIKE". > >>>> > >>>> 1.For "INHERITS", I think this is a good feature too, yes, ALTER > >>>> TABLE will > >>>> propagate any changes in column data definitions and check > >>>> constraints down > >>>> the inheritance hierarchy. A inherits B, A and B share every > >>>> things, they > >>>> have the same kafka topic. If modify schema of B, this means > >>>> underlying > >>>> kafka topic schema changed, so I think it is good to modify A too. > >>>> If this > >>>> for "ConfluentSchemaRegistryCatalog" mention by Jark, I think > >>>> sometimes > >>>> this is just we want. > >>>> But "LIKE" also very useful for many cases. > >>>> > >>>> 2.For LIKE statement in schema, I know two kinds of like syntax, > >>>> one is > >>>> MySQL/hive/sqlserver, the other is PostgreSQL. I prefer former: > >>>> - In the FLIP, there is "OVERWRITING OPTIONS", this will overwrite > >>>> properties in "with"? This looks weird, because "LIKE" is in > >>>> schema, but it > >>>> can affect outside properties. > >>>> > >>>> Best, > >>>> Jingsong Lee > >>>> > >>>> On Wed, Mar 4, 2020 at 2:05 PM Dawid Wysakowicz > >>>> <[hidden email]> > >>>> wrote: > >>>> > >>>>> Hi Jark, > >>>>> I did investigate the INHERITS clause, but it has a semantic that > >>>>> in my > >>>>> opinion we definitely don't want to support. INHERITS creates a > >>>>> new table > >>>>> with a "link" to the original table. Therefore if you e.g change the > >>>> schema > >>>>> of the original table it's also reflected in the child table. It's > >>>>> also > >>>>> possible for tables like A inherits B query them like Select * > >>>>> from only > >>>> A, > >>>>> by default it returns results from both tables. I am pretty sure > >>>>> it's not > >>>>> what we're looking for. > >>>>> > >>>>> PostgreSQL implements both the LIKE clause and INHERITS. I am open > >>>>> for > >>>>> discussion if we should support multiple LIKE statements or not. > >>>>> Standard > >>>>> also allows declaring the clause after the schema part. We can > >>>>> also do > >>>> it. > >>>>> Nevertheless I think including multiple tables might be useful, > >>>>> e.g. when > >>>>> you want to union two tables and output to the same Kafka cluster and > >>>> just > >>>>> change the target topic. I know it's not a very common use case > >>>>> but it's > >>>>> not a big effort to support it. > >>>>> > >>>>> Let me know what you think. > >>>>> > >>>>> Best, > >>>>> Dawid > >>>>> > >>>>> On Wed, 4 Mar 2020, 04:55 Jark Wu, <[hidden email]> wrote: > >>>>> > >>>>>> Hi Dawid, > >>>>>> > >>>>>> Thanks for starting this discussion. I like the idea. > >>>>>> Once we support more intergrated catalogs, > >>>>>> e.g. ConfluentSchemaRegistryCatalog, this problem will be more > >>>>>> urgent. > >>>>>> Because it's very common to adjust existing tables in catalog > >>>>>> slightly. > >>>>>> > >>>>>> My initial thought was introducing INHERITS keyword, which is also > >>>>>> supported in PostgreSQL [1]. > >>>>>> This is also similar to the functionality of Hive CREATE TABLE LIKE > >>>> [2]. > >>>>>> CREATE TEMPORARY TABLE MyTable (WATERMARK FOR ts) INHERITS > >>>>>> cat.db.KafkoTopic > >>>>>> CREATE TEMPORARY TABLE MyTable (WATERMARK FOR ts) INHERITS > >>>>>> cat.db.KafkoTopic WITH ('k' = 'v') > >>>>>> > >>>>>> The INHERITS can inherit an existing table with all columns, > >>>>>> watermark, > >>>>> and > >>>>>> properties, but the properties and watermark and be overwrited > >>>>> explicitly. > >>>>>> The reason I prefer INHERITS rather than LIKE is the keyword > >>>>>> position. > >>>> We > >>>>>> are copying an existing table definition including the properties. > >>>>>> However, LIKE appears in the schema part, it sounds like copying > >>>>> properties > >>>>>> into schema part of DDL. > >>>>>> > >>>>>> Besides of that, I'm not sure whether the use case stands > >>>>>> "merging two > >>>>>> tables into a single one with a different connector". > >>>>>> From my understanding, most use cases are just slightly > >>>>>> adjusting on an > >>>>>> existing catalog table with new properties or watermarks. > >>>>>> Do we really need to merge two table definitions into a single > >>>>>> one? For > >>>>>> example, is it possible to merge a Kafka table definition and > >>>>>> a Filesystem table definition into a new Kafka table, and the new > >>>>>> Kafka > >>>>>> table exactly matches the underlying physical data format? > >>>>>> > >>>>>> Best, > >>>>>> Jark > >>>>>> > >>>>>> [1]: https://www.postgresql.org/docs/9.5/sql-createtable.html > >>>>>> [2]: > >>>>>> > >>>>>> > >>>> > https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-CreateTableLike > >>>> > >>>>>> > >>>>>> On Tue, 3 Mar 2020 at 21:12, Dawid Wysakowicz > >>>>>> <[hidden email]> > >>>>>> wrote: > >>>>>> > >>>>>>> Hi devs, > >>>>>>> > >>>>>>> I wanted to bring another improvement proposal up for a discussion. > >>>>> Often > >>>>>>> users need to adjust existing tables slightly. This is especially > >>>>> useful > >>>>>>> when users need to enhance a table created from an external tool > >>>> (e.g. > >>>>>>> HIVE) with Flink's specific information such as e.g watermarks. It > >>>> can > >>>>>> also > >>>>>>> be a useful tool for ETL processes, e.g. merging two tables into a > >>>>> single > >>>>>>> one with a different connector. My suggestion would be to > >>>>>>> support an > >>>>>>> optional *Feature T171, “LIKE clause in table definition” *of SQL > >>>>>>> standard 2008. > >>>>>>> > >>>>>>> You can see the description of the proposal here: > >>>>>>> > >>>> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-110%3A+Support+LIKE+clause+in+CREATE+TABLE > >>>> > >>>>>>> Looking forward for your comments. > >>>>>>> > >>>>>>> Best, > >>>>>>> > >>>>>>> Dawid > >>>>>>> > >>>> > >>>> -- > >>>> Best, Jingsong Lee > >>>> > >> > > > > -- Best, Jingsong Lee |
Hi Jingsong, I added a short description for the options:
I think partitions are a valid point. I think they are not included in any of the options. It makes sense to include them as well. I would suggest adding INCLUDING | EXCLUDING PARTITIONS as another alternative. I will not cancel the vote for now, as the comment came soon after starting the vote. If anyone thinks I should give more time to discuss the partitions topic, feel free to comment in this thread. Best, Dawid On 31/03/2020 10:05, Jingsong Li wrote:
Hi Dawid, Just two small questions: - Can you explain more about "CONSTRAINTS, GENERATED, OPTIONS" in the FLIP? I can image the meaning of "CONSTRAINTS, OPTIONS" in the example, but it is hard to guess "GENERATED". - Which category does partition keys belong to? (I am sorry if I've disturbed the vote thread, because in my Gmail view, they're the same thread.) Best, Jingsong Lee On Tue, Mar 31, 2020 at 3:30 PM Dawid Wysakowicz [hidden email] wrote:Hi Timo, I think your suggestion makes sense. I updated the document. As there are no more comments I will start a vote for it. Best, Dawid On 30/03/2020 16:40, Timo Walther wrote:Hi Dawid, thanks for updating the FLIP. One minor comment from my side, should we move the LIKE clause to the very end? CREATE TABLE X () WITH () LIKE ... Otherwise, the LIKE clause looks a bit lost if there are options afterwards. Otherwise, +1 for start a vote from my side. Regards, Timo On 25.03.20 15:30, Dawid Wysakowicz wrote:Thank you for your opinions. I updated the FLIP with results of the discussion. Let me know if you have further concerns. Best, Dawid On 05/03/2020 07:46, Jark Wu wrote:Hi Dawid,INHERITS creates a new table with a "link" to the original table.Yes, INHERITS is a "link" to the original table in PostgreSQL. But INHERITS is not SQL standard, I think it's fine for vendors to define theire semantics.Standard also allows declaring the clause after the schema part. We canalso do it. Is that true? I didn't find it in SQL standard. If this is true, I prefer to put LIKE after the schema part. ==================================== Hi Jingsong, The concern you mentioned in (2) is exactly my concern too. That's why I suggested INHERITS, or put LIKE after schema part. Best, Jark On Thu, 5 Mar 2020 at 12:05, Jingsong Li [hidden email] wrote:Thanks Dawid for starting this discussion. I like the "LIKE". 1.For "INHERITS", I think this is a good feature too, yes, ALTER TABLE will propagate any changes in column data definitions and check constraints down the inheritance hierarchy. A inherits B, A and B share every things, they have the same kafka topic. If modify schema of B, this means underlying kafka topic schema changed, so I think it is good to modify A too. If this for "ConfluentSchemaRegistryCatalog" mention by Jark, I think sometimes this is just we want. But "LIKE" also very useful for many cases. 2.For LIKE statement in schema, I know two kinds of like syntax, one is MySQL/hive/sqlserver, the other is PostgreSQL. I prefer former: - In the FLIP, there is "OVERWRITING OPTIONS", this will overwrite properties in "with"? This looks weird, because "LIKE" is in schema, but it can affect outside properties. Best, Jingsong Lee On Wed, Mar 4, 2020 at 2:05 PM Dawid Wysakowicz [hidden email] wrote:Hi Jark, I did investigate the INHERITS clause, but it has a semantic that in my opinion we definitely don't want to support. INHERITS creates a new table with a "link" to the original table. Therefore if you e.g change theschemaof the original table it's also reflected in the child table. It's also possible for tables like A inherits B query them like Select * from onlyA,by default it returns results from both tables. I am pretty sure it's not what we're looking for. PostgreSQL implements both the LIKE clause and INHERITS. I am open for discussion if we should support multiple LIKE statements or not. Standard also allows declaring the clause after the schema part. We can also doit.Nevertheless I think including multiple tables might be useful, e.g. when you want to union two tables and output to the same Kafka cluster andjustchange the target topic. I know it's not a very common use case but it's not a big effort to support it. Let me know what you think. Best, Dawid On Wed, 4 Mar 2020, 04:55 Jark Wu, [hidden email] wrote:Hi Dawid, Thanks for starting this discussion. I like the idea. Once we support more intergrated catalogs, e.g. ConfluentSchemaRegistryCatalog, this problem will be more urgent. Because it's very common to adjust existing tables in catalog slightly. My initial thought was introducing INHERITS keyword, which is also supported in PostgreSQL [1]. This is also similar to the functionality of Hive CREATE TABLE LIKE[2].CREATE TEMPORARY TABLE MyTable (WATERMARK FOR ts) INHERITS cat.db.KafkoTopic CREATE TEMPORARY TABLE MyTable (WATERMARK FOR ts) INHERITS cat.db.KafkoTopic WITH ('k' = 'v') The INHERITS can inherit an existing table with all columns, watermark,andproperties, but the properties and watermark and be overwritedexplicitly.The reason I prefer INHERITS rather than LIKE is the keyword position.Weare copying an existing table definition including the properties. However, LIKE appears in the schema part, it sounds like copyingpropertiesinto schema part of DDL. Besides of that, I'm not sure whether the use case stands "merging two tables into a single one with a different connector". From my understanding, most use cases are just slightly adjusting on an existing catalog table with new properties or watermarks. Do we really need to merge two table definitions into a single one? For example, is it possible to merge a Kafka table definition and a Filesystem table definition into a new Kafka table, and the new Kafka table exactly matches the underlying physical data format? Best, Jark [1]: https://www.postgresql.org/docs/9.5/sql-createtable.html [2]:https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-CreateTableLikeOn Tue, 3 Mar 2020 at 21:12, Dawid Wysakowicz [hidden email] wrote:Hi devs, I wanted to bring another improvement proposal up for a discussion.Oftenusers need to adjust existing tables slightly. This is especiallyusefulwhen users need to enhance a table created from an external tool(e.g.HIVE) with Flink's specific information such as e.g watermarks. Itcanalsobe a useful tool for ETL processes, e.g. merging two tables into asingleone with a different connector. My suggestion would be to support an optional *Feature T171, “LIKE clause in table definition” *of SQL standard 2008. You can see the description of the proposal here:https://cwiki.apache.org/confluence/display/FLINK/FLIP-110%3A+Support+LIKE+clause+in+CREATE+TABLELooking forward for your comments. Best, Dawid-- Best, Jingsong Lee signature.asc (849 bytes) Download Attachment |
Free forum by Nabble | Edit this page |