Hi everyone,
as some of you might have noticed, in the last two releases we aimed to unify the SQL connectors and make them more modular. The first connectors and formats have been implemented and are usable via the SQL Client and the Java/Scala/SQL APIs.

However, after writing more connectors and example programs and talking to users, there are still a couple of improvements that should be applied to the unified SQL connector API.

I wrote a design document [1] that discusses limitations that I have observed and considers feedback that I have collected over the last months. I don't know whether we will implement all of these improvements, but it would be great to get feedback for a satisfactory API and for future prioritization.

The general goal should be to connect to external systems as conveniently and type-safely as possible. Any feedback is highly appreciated.

Thanks,
Timo

[1] https://docs.google.com/document/d/1Yaxp1UJUFW-peGLt8EIidwKIZEWrrA-pznWLuvaH39Y/edit?usp=sharing
Thanks for the proposal Timo!
I've done a pass and added some comments (mostly asking for clarification and details).
Overall, this is going in a very good direction.
I think two topics require more discussion: tables that are stored in different systems, and using a format definition to define other formats.
However, these are also not the features that we would start with.

From a compatibility point of view, an important question to answer is whether we can drop the support for field mapping, i.e., do we have users who take advantage of mapping format fields to fields with a different name in the schema?
Besides that, all existing functionality is preserved, although the syntax changes a bit.

Best,
Fabian

On Mon, Oct 1, 2018 at 10:53, Timo Walther <[hidden email]> wrote:
> [...]
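For readers unfamiliar with the feature in question, here is a minimal sketch of what "field mapping" means. It is plain Python rather than Flink code; the function name `apply_field_mapping` and the direction of the mapping (schema field name to format field name) are assumptions made for illustration only.

```python
from typing import Any, Dict


def apply_field_mapping(record: Dict[str, Any], mapping: Dict[str, str]) -> Dict[str, Any]:
    """Rename the fields of a format record to the names used in the table schema.

    `mapping` is assumed to map schema field name -> format field name.
    """
    return {schema_name: record[format_name] for schema_name, format_name in mapping.items()}


# A record as produced by a format, with positional-style field names.
format_record = {"f0": 42, "f1": "alice"}

# The table schema wants to expose these fields as "id" and "name".
schema_row = apply_field_mapping(format_record, {"id": "f0", "name": "f1"})
```

Dropping field mapping would mean that the schema names always have to match the format's field names, so users who rely on a rename like the one above would need another way to express it.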
Thanks for the feedback, Fabian. I updated the document and addressed your comments.

I agree that tables which are stored in different systems need more discussion. I would suggest deprecating the field mapping interfaces in this release and removing them in the next release.

Regards,
Timo

On Oct 2, 2018 at 11:06, Fabian Hueske wrote:
> [...]
Thanks for the proposal!
I like the proposed changes a lot. In particular, support for reading and writing the key data of systems that have a key/value split will be very nice to have.

On Oct 2, 2018 at 11:58, Timo Walther <[hidden email]> wrote:
> [...]
In reply to this post by Timo Walther-2
Thanks a lot for the proposal, Timo. I left a few comments. Also, it seems the example in the doc does not have the table type (source, sink, or both) property anymore. Are you suggesting that we drop it? I think the table type property is still useful, as it can restrict a connector to be source-only or sink-only. For example, we usually want a Kafka topic to be either read-only or write-only, but not both.

Shuyi

On Mon, Oct 1, 2018 at 1:53 AM Timo Walther <[hidden email]> wrote:
> [...]

--
"So you have to trust that the dots will somehow connect in your future."
Hi,
Thanks a lot for the proposal. I like the idea of unifying table definitions.
I think we can drop the table type, since the type can be derived from the SQL, i.e., a table that is inserted into can only be a sink table.

I left some minor suggestions in the document, mainly:
- Maybe we also need to allow defining properties for tables.
- Support specifying computed columns in a table.
- Support defining keys for sources.

Best, Hequn

On Thu, Oct 4, 2018 at 4:09 PM Shuyi Chen <[hidden email]> wrote:
> [...]
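The idea of deriving the table type from the SQL can be illustrated with a small sketch. This is plain Python, not Flink code: the function `infer_table_roles` is hypothetical, and the regular expressions are a deliberately naive stand-in for a real SQL parser (they ignore subqueries, quoting, and comments).

```python
import re
from typing import Dict, Set

# Naive patterns: the target of INSERT INTO is written to, anything after FROM/JOIN is read.
_SINK_PATTERN = re.compile(r"\bINSERT\s+INTO\s+([A-Za-z_][\w.]*)", re.IGNORECASE)
_SOURCE_PATTERN = re.compile(r"\b(?:FROM|JOIN)\s+([A-Za-z_][\w.]*)", re.IGNORECASE)


def infer_table_roles(statement: str) -> Dict[str, Set[str]]:
    """Return, per table name, the roles ("source", "sink") implied by one statement."""
    roles: Dict[str, Set[str]] = {}
    for name in _SINK_PATTERN.findall(statement):
        roles.setdefault(name, set()).add("sink")
    for name in _SOURCE_PATTERN.findall(statement):
        roles.setdefault(name, set()).add("source")
    return roles


roles = infer_table_roles("INSERT INTO Clicks SELECT user_id, url FROM RawEvents")
same_table = infer_table_roles("INSERT INTO T SELECT * FROM T")
```

In the first statement `Clicks` is only a sink and `RawEvents` only a source. In the second statement the same table ends up with both roles, which is exactly the case where inference alone cannot tell whether the dual use was intended.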
Hi Timo,
Thanks for putting together the proposal! I really love the idea of a combined solution for historic and recent data, and I left some suggestions on that part.

Regarding the table type, e.g. for Kafka streams, I agree with @hequn's idea that it should be pretty much inferable from the SQL context. I think there are some questions that need to be addressed when unifying the definition, for example:
- Should a Kafka table used in an "INSERT INTO" statement be usable again in a "FROM" statement, and vice versa?
- How do we enforce checks in combo-table use cases?
- Can a user change the way a table is used (e.g. source/sink) in an interactive environment such as the SQL Client?

Thanks,
Rong

On Thu, Oct 4, 2018 at 7:31 AM Hequn Cheng <[hidden email]> wrote:
> [...]
In reply to this post by Hequn Cheng
In the case of a normal Flink job, I agree we can infer the table type from the queries. However, for the SQL Client, the query is ad hoc and not known beforehand. In such a case, we might want to enforce the table open mode at startup time, so users won't accidentally write to a Kafka topic that is supposed to be written only by some producer. What do you guys think?

Shuyi

On Thu, Oct 4, 2018 at 7:31 AM Hequn Cheng <[hidden email]> wrote:
> [...]
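A minimal sketch of the kind of check described here, assuming the table type is declared up front as source, sink, or both. This is plain Python and not how Flink implements it; `TableType`, `TableAccessError`, and `check_access` are hypothetical names.

```python
from enum import Enum


class TableType(Enum):
    SOURCE = "source"
    SINK = "sink"
    BOTH = "both"


class TableAccessError(Exception):
    """Raised when a statement uses a table in a way its declaration forbids."""


# Roles permitted by each declared table type.
_ALLOWED_ROLES = {
    TableType.SOURCE: {"source"},
    TableType.SINK: {"sink"},
    TableType.BOTH: {"source", "sink"},
}


def check_access(table_name: str, declared: TableType, requested_role: str) -> None:
    """Reject a read or write that the declared table type does not allow."""
    if requested_role not in _ALLOWED_ROLES[declared]:
        raise TableAccessError(
            f"table '{table_name}' is declared as '{declared.value}' "
            f"and cannot be used as a {requested_role}"
        )


# Reading from a source-only table is allowed; writing to it would raise.
check_access("ProducerTopic", TableType.SOURCE, "source")
```

With such a declaration in place, an ad hoc `INSERT INTO ProducerTopic ...` in an interactive session would be rejected before any data is written, regardless of what the query itself implies.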
Hi,
How to avoid accidentally writing to a table is a good question.
I think there are other ways to solve the problem: for example, we could provide users with a view instead of a table, or add a table constraint.

Best, Hequn

On Fri, Oct 5, 2018 at 1:30 PM Shuyi Chen <[hidden email]> wrote:
> [...]
In reply to this post by Shuyi Chen
Hi everyone,
thanks for the feedback that we got so far. I will update the document in the next couple of hours so that we can continue with the discussion.

Regarding the table type: Actually, I just didn't mention it in the document, because the table type is a property specific to the SQL Client/external catalog interface that is evaluated before the unified connector API (depending on the table type, a source and/or a sink is discovered). I agree with Shuyi's comments that it should be possible to restrict read/write access. The general goal should be that properties defined in the design document apply to both sources and sinks, i.e., no special source-only or sink-only properties.

@Rong: Currently, a user cannot change the way a table is used in the interactive shell. Tables defined in an environment file are immutable. Changing this will be possible using a SQL DDL in the future.

Regards,
Timo

On Oct 5, 2018 at 07:30, Shuyi Chen wrote:
> [...]
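The two-step evaluation described here (table type first, connector properties second) can be sketched as follows. This is plain Python under stated assumptions, not the actual discovery code: `plan_discovery` is a hypothetical function, and the property keys in the example are illustrative.

```python
from typing import Dict

# Which roles need to be discovered for each declared table type.
_ROLES_BY_TYPE = {
    "source": ("source",),
    "sink": ("sink",),
    "both": ("source", "sink"),
}


def plan_discovery(declaration: Dict[str, str]) -> Dict[str, Dict[str, str]]:
    """Split a table declaration into the roles to discover and their properties.

    The "type" entry is consumed first and never reaches the connector;
    every discovered role receives the same remaining properties.
    """
    properties = dict(declaration)  # copy so the caller's declaration stays untouched
    table_type = properties.pop("type")
    if table_type not in _ROLES_BY_TYPE:
        raise ValueError(f"unknown table type: {table_type}")
    return {role: dict(properties) for role in _ROLES_BY_TYPE[table_type]}


declaration = {"type": "both", "connector.type": "kafka", "format.type": "json"}
plan = plan_discovery(declaration)
```

Because the source and the sink are handed identical property sets, the sketch reflects the stated goal that no source-only or sink-only properties are needed.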