Hi everyone,
I want to start a discussion about further improve and simplify our current connector porperty keys, aka WITH options. Currently, we have a 'connector.' prefix for many properties, but they are verbose, and we see a big inconsistency between the properties when designing FLIP-107. So we propose to remove all the 'connector.' prefix and rename 'connector.type' to 'connector', 'format.type' to 'format'. So a new Kafka DDL may look like this: CREATE TABLE kafka_table ( ... ) WITH ( 'connector' = 'kafka', 'version' = '0.10', 'topic' = 'test-topic', 'startup-mode' = 'earliest-offset', 'properties.bootstrap.servers' = 'localhost:9092', 'properties.group.id' = 'testGroup', 'format' = 'json', 'format.fail-on-missing-field' = 'false' ); The new connector property key set will come together with new Factory inferface which is proposed in FLIP-95. Old properties are still compatible with their existing implementation. New properties are only available in new DynamicTableFactory implementations. You can access the detailed FLIP here: https://cwiki.apache.org/confluence/display/FLINK/FLIP-122%3A+New+Connector+Property+Keys+for+New+Factory Best, Jark |
Hi Jark,
I am really looking forward to this feature. I think this feature could simplify flink sql code,and at the same time , it could make the developer more easlier to config the flink sql WITH options. Now when I am using flink sql to write flink task , sometimes I think the WITH options is too long for user. For example,I config the kafka source connector parameter,for consumer group and brokers parameter: 'connector.properties.0.key' = 'group.id' > , 'connector.properties.0.value' = 'xxx' > , 'connector.properties.1.key' = 'bootstrap.servers' > , 'connector.properties.1.value' = 'xxxxx' > I can understand this config , but for the flink fresh man,maybe it is confused for him. In my thought, I am really looking forward to this feature,thank you to propose this feature. Best wishes, LakeShen Jark Wu <[hidden email]> 于2020年3月30日周一 下午2:02写道: > Hi everyone, > > I want to start a discussion about further improve and simplify our current > connector porperty keys, aka WITH options. Currently, we have a > 'connector.' prefix for many properties, but they are verbose, and we see a > big inconsistency between the properties when designing FLIP-107. > > So we propose to remove all the 'connector.' prefix and rename > 'connector.type' to 'connector', 'format.type' to 'format'. So a new Kafka > DDL may look like this: > > CREATE TABLE kafka_table ( > ... > ) WITH ( > 'connector' = 'kafka', > 'version' = '0.10', > 'topic' = 'test-topic', > 'startup-mode' = 'earliest-offset', > 'properties.bootstrap.servers' = 'localhost:9092', > 'properties.group.id' = 'testGroup', > 'format' = 'json', > 'format.fail-on-missing-field' = 'false' > ); > > The new connector property key set will come together with new Factory > inferface which is proposed in FLIP-95. Old properties are still compatible > with their existing implementation. New properties are only available in > new DynamicTableFactory implementations. > > You can access the detailed FLIP here: > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-122%3A+New+Connector+Property+Keys+for+New+Factory > > Best, > Jark > |
Hi Jark,
Thanks for the proposal, I'm +1 to the general idea. However I have a question about "version", in the old design, the version seems to be aimed for tracking property version, with different version, we could evolve these step by step without breaking backward compatibility. But in this design, version is representing external system's version, like "0.11" for kafka, "6" or "7" for ES. I'm not sure if this is necessary, what's the benefit of using two keys instead of one, like "kafka-0.11" or "ES6" directly? And how about the old capability which could let us evolving connector properties? Best, Kurt On Mon, Mar 30, 2020 at 2:36 PM LakeShen <[hidden email]> wrote: > Hi Jark, > > I am really looking forward to this feature. I think this feature > could simplify flink sql code,and at the same time , > it could make the developer more easlier to config the flink sql WITH > options. > > Now when I am using flink sql to write flink task , sometimes I think the > WITH options is too long for user. > For example,I config the kafka source connector parameter,for consumer > group and brokers parameter: > > 'connector.properties.0.key' = 'group.id' > > , 'connector.properties.0.value' = 'xxx' > > , 'connector.properties.1.key' = 'bootstrap.servers' > > , 'connector.properties.1.value' = 'xxxxx' > > > > I can understand this config , but for the flink fresh man,maybe it > is confused for him. > In my thought, I am really looking forward to this feature,thank you to > propose this feature. > > Best wishes, > LakeShen > > > Jark Wu <[hidden email]> 于2020年3月30日周一 下午2:02写道: > > > Hi everyone, > > > > I want to start a discussion about further improve and simplify our > current > > connector porperty keys, aka WITH options. Currently, we have a > > 'connector.' prefix for many properties, but they are verbose, and we > see a > > big inconsistency between the properties when designing FLIP-107. > > > > So we propose to remove all the 'connector.' prefix and rename > > 'connector.type' to 'connector', 'format.type' to 'format'. So a new > Kafka > > DDL may look like this: > > > > CREATE TABLE kafka_table ( > > ... > > ) WITH ( > > 'connector' = 'kafka', > > 'version' = '0.10', > > 'topic' = 'test-topic', > > 'startup-mode' = 'earliest-offset', > > 'properties.bootstrap.servers' = 'localhost:9092', > > 'properties.group.id' = 'testGroup', > > 'format' = 'json', > > 'format.fail-on-missing-field' = 'false' > > ); > > > > The new connector property key set will come together with new Factory > > inferface which is proposed in FLIP-95. Old properties are still > compatible > > with their existing implementation. New properties are only available in > > new DynamicTableFactory implementations. > > > > You can access the detailed FLIP here: > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-122%3A+New+Connector+Property+Keys+for+New+Factory > > > > Best, > > Jark > > > |
Hi Kurt,
That's good questions. > the meaning of "version" There are two versions in the old design. One is property version "connector.property-version" which can be used for backward compatibility. The other one is "connector.version" which defines the version of external system, e.g. 0.11" for kafka, "6" or "7" for ES. In this proposal, the "version" is the previous "connector.version". The ""connector.property-version" is not introduced in new design. > how to keep the old capability which can evolve connector properties The "connector.property-version" is an optional key in the old design and is never bumped up. I'm not sure how "connector.property-version" should work in the initial design. Maybe @Timo Walther <[hidden email]> has more knowledge on this. But for the new properties, every options should be expressed as `ConfigOption` which provides `withDeprecatedKeys(...)` method to easily support evolving keys. > a single keys instead of two, e.g. "kafka-0.11", "kafka-universal"? There are several benefit to use separate "version" key I can see: 1) it's more readable to separete them into different keys, because they are orthogonal concepts. 2) the planner can give all the availble versions in the exception message, if user uses a wrong version (this is often reported in user ML). 3) If we use "kafka-0.11" as connector identifier, we may have to write a full documentation for each version, because they are different "connector"? IMO, for 0.11, 0.11, etc... kafka, they are actually the same connector but with different "client jar" version, they share all the same supported property keys and should reside together. 4) IMO, the future vision is version-free. At some point in the future, we may don't need users to specify kafka version anymore, and make "version=universal" as optional or removed in the future. This is can be done easily if they are separate keys. Best, Jark On Mon, 30 Mar 2020 at 14:45, Kurt Young <[hidden email]> wrote: > Hi Jark, > > Thanks for the proposal, I'm +1 to the general idea. However I have a > question about "version", > in the old design, the version seems to be aimed for tracking property > version, with different > version, we could evolve these step by step without breaking backward > compatibility. But in this > design, version is representing external system's version, like "0.11" for > kafka, "6" or "7" for ES. > I'm not sure if this is necessary, what's the benefit of using two keys > instead of one, like "kafka-0.11" > or "ES6" directly? And how about the old capability which could let us > evolving connector properties? > > Best, > Kurt > > > On Mon, Mar 30, 2020 at 2:36 PM LakeShen <[hidden email]> > wrote: > > > Hi Jark, > > > > I am really looking forward to this feature. I think this feature > > could simplify flink sql code,and at the same time , > > it could make the developer more easlier to config the flink sql WITH > > options. > > > > Now when I am using flink sql to write flink task , sometimes I think the > > WITH options is too long for user. > > For example,I config the kafka source connector parameter,for consumer > > group and brokers parameter: > > > > 'connector.properties.0.key' = 'group.id' > > > , 'connector.properties.0.value' = 'xxx' > > > , 'connector.properties.1.key' = 'bootstrap.servers' > > > , 'connector.properties.1.value' = 'xxxxx' > > > > > > > I can understand this config , but for the flink fresh man,maybe it > > is confused for him. > > In my thought, I am really looking forward to this feature,thank you to > > propose this feature. > > > > Best wishes, > > LakeShen > > > > > > Jark Wu <[hidden email]> 于2020年3月30日周一 下午2:02写道: > > > > > Hi everyone, > > > > > > I want to start a discussion about further improve and simplify our > > current > > > connector porperty keys, aka WITH options. Currently, we have a > > > 'connector.' prefix for many properties, but they are verbose, and we > > see a > > > big inconsistency between the properties when designing FLIP-107. > > > > > > So we propose to remove all the 'connector.' prefix and rename > > > 'connector.type' to 'connector', 'format.type' to 'format'. So a new > > Kafka > > > DDL may look like this: > > > > > > CREATE TABLE kafka_table ( > > > ... > > > ) WITH ( > > > 'connector' = 'kafka', > > > 'version' = '0.10', > > > 'topic' = 'test-topic', > > > 'startup-mode' = 'earliest-offset', > > > 'properties.bootstrap.servers' = 'localhost:9092', > > > 'properties.group.id' = 'testGroup', > > > 'format' = 'json', > > > 'format.fail-on-missing-field' = 'false' > > > ); > > > > > > The new connector property key set will come together with new Factory > > > inferface which is proposed in FLIP-95. Old properties are still > > compatible > > > with their existing implementation. New properties are only available > in > > > new DynamicTableFactory implementations. > > > > > > You can access the detailed FLIP here: > > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-122%3A+New+Connector+Property+Keys+for+New+Factory > > > > > > Best, > > > Jark > > > > > > |
> 1) it's more readable to separete them into different keys, because they
are orthogonal concepts. I'm not sure whether this is more clear, but using one key will be definitely one line less. > 2) the planner can give all the availble versions in the exception message, if user uses a wrong version (this is often reported in user ML). Lists all available connectors seems also quite straightforward, e.g user provided a wrong "kafka-0.8", we tell user we have candidates of "kakfa-0.11", "kafka-universal" > 3) we may have to write a full documentation for each version, because they are different "connector"? I don't think so. We can still treat it as the same connector but with different versions. > 4) At some point in the future, we may don't need users to specify kafka version anymore. I think this could make user confuse. For example, let's say current "universal" kafka connector is the unified connector, which can support all kinds of versions. Then users will have different experience regarding to version: If you provide "version=0.11", we will use "kafka-0.11" for connector. But if users don't set any version, we will use "kafka-universal" instead. The behavior is inconsistent IMO. Best, Kurt On Mon, Mar 30, 2020 at 3:56 PM Jark Wu <[hidden email]> wrote: > Hi Kurt, > > That's good questions. > > > the meaning of "version" > There are two versions in the old design. One is property version > "connector.property-version" which can be used for backward compatibility. > The other one is "connector.version" which defines the version of external > system, e.g. 0.11" for kafka, "6" or "7" for ES. > In this proposal, the "version" is the previous "connector.version". The > ""connector.property-version" is not introduced in new design. > > > how to keep the old capability which can evolve connector properties > The "connector.property-version" is an optional key in the old design and > is never bumped up. > I'm not sure how "connector.property-version" should work in the initial > design. Maybe @Timo Walther <[hidden email]> has more knowledge on > this. > But for the new properties, every options should be expressed as > `ConfigOption` which provides `withDeprecatedKeys(...)` method to easily > support evolving keys. > > > a single keys instead of two, e.g. "kafka-0.11", "kafka-universal"? > There are several benefit to use separate "version" key I can see: > 1) it's more readable to separete them into different keys, because they > are orthogonal concepts. > 2) the planner can give all the availble versions in the exception message, > if user uses a wrong version (this is often reported in user ML). > 3) If we use "kafka-0.11" as connector identifier, we may have to write a > full documentation for each version, because they are different > "connector"? > IMO, for 0.11, 0.11, etc... kafka, they are actually the same connector > but with different "client jar" version, > they share all the same supported property keys and should reside > together. > 4) IMO, the future vision is version-free. At some point in the future, we > may don't need users to specify kafka version anymore, and make > "version=universal" as optional or removed in the future. This is can be > done easily if they are separate keys. > > Best, > Jark > > > On Mon, 30 Mar 2020 at 14:45, Kurt Young <[hidden email]> wrote: > > > Hi Jark, > > > > Thanks for the proposal, I'm +1 to the general idea. However I have a > > question about "version", > > in the old design, the version seems to be aimed for tracking property > > version, with different > > version, we could evolve these step by step without breaking backward > > compatibility. But in this > > design, version is representing external system's version, like "0.11" > for > > kafka, "6" or "7" for ES. > > I'm not sure if this is necessary, what's the benefit of using two keys > > instead of one, like "kafka-0.11" > > or "ES6" directly? And how about the old capability which could let us > > evolving connector properties? > > > > Best, > > Kurt > > > > > > On Mon, Mar 30, 2020 at 2:36 PM LakeShen <[hidden email]> > > wrote: > > > > > Hi Jark, > > > > > > I am really looking forward to this feature. I think this feature > > > could simplify flink sql code,and at the same time , > > > it could make the developer more easlier to config the flink sql WITH > > > options. > > > > > > Now when I am using flink sql to write flink task , sometimes I think > the > > > WITH options is too long for user. > > > For example,I config the kafka source connector parameter,for consumer > > > group and brokers parameter: > > > > > > 'connector.properties.0.key' = 'group.id' > > > > , 'connector.properties.0.value' = 'xxx' > > > > , 'connector.properties.1.key' = 'bootstrap.servers' > > > > , 'connector.properties.1.value' = 'xxxxx' > > > > > > > > > > I can understand this config , but for the flink fresh man,maybe it > > > is confused for him. > > > In my thought, I am really looking forward to this feature,thank you to > > > propose this feature. > > > > > > Best wishes, > > > LakeShen > > > > > > > > > Jark Wu <[hidden email]> 于2020年3月30日周一 下午2:02写道: > > > > > > > Hi everyone, > > > > > > > > I want to start a discussion about further improve and simplify our > > > current > > > > connector porperty keys, aka WITH options. Currently, we have a > > > > 'connector.' prefix for many properties, but they are verbose, and we > > > see a > > > > big inconsistency between the properties when designing FLIP-107. > > > > > > > > So we propose to remove all the 'connector.' prefix and rename > > > > 'connector.type' to 'connector', 'format.type' to 'format'. So a new > > > Kafka > > > > DDL may look like this: > > > > > > > > CREATE TABLE kafka_table ( > > > > ... > > > > ) WITH ( > > > > 'connector' = 'kafka', > > > > 'version' = '0.10', > > > > 'topic' = 'test-topic', > > > > 'startup-mode' = 'earliest-offset', > > > > 'properties.bootstrap.servers' = 'localhost:9092', > > > > 'properties.group.id' = 'testGroup', > > > > 'format' = 'json', > > > > 'format.fail-on-missing-field' = 'false' > > > > ); > > > > > > > > The new connector property key set will come together with new > Factory > > > > inferface which is proposed in FLIP-95. Old properties are still > > > compatible > > > > with their existing implementation. New properties are only available > > in > > > > new DynamicTableFactory implementations. > > > > > > > > You can access the detailed FLIP here: > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-122%3A+New+Connector+Property+Keys+for+New+Factory > > > > > > > > Best, > > > > Jark > > > > > > > > > > |
In reply to this post by Jark Wu-2
Thanks Jark for the proposal.
+1 to the general idea. For "version", what about "kafka.version"? It is obvious to know its meaning. And I'd like to start a new topic: Should we need to explicitly separate source from sink? With the development of batch and streaming, more and more connectors have both source and sink. So should we set a rule for table properties: - properties for both source and sink: without prefix, like "topic" - properties for source only: with "source." prefix, like "source.startup-mode" - properties for sink only: with "sink." prefix, like "sink.partitioner" What do you think? Best, Jingsong Lee On Mon, Mar 30, 2020 at 3:56 PM Jark Wu <[hidden email]> wrote: > Hi Kurt, > > That's good questions. > > > the meaning of "version" > There are two versions in the old design. One is property version > "connector.property-version" which can be used for backward compatibility. > The other one is "connector.version" which defines the version of external > system, e.g. 0.11" for kafka, "6" or "7" for ES. > In this proposal, the "version" is the previous "connector.version". The > ""connector.property-version" is not introduced in new design. > > > how to keep the old capability which can evolve connector properties > The "connector.property-version" is an optional key in the old design and > is never bumped up. > I'm not sure how "connector.property-version" should work in the initial > design. Maybe @Timo Walther <[hidden email]> has more knowledge on > this. > But for the new properties, every options should be expressed as > `ConfigOption` which provides `withDeprecatedKeys(...)` method to easily > support evolving keys. > > > a single keys instead of two, e.g. "kafka-0.11", "kafka-universal"? > There are several benefit to use separate "version" key I can see: > 1) it's more readable to separete them into different keys, because they > are orthogonal concepts. > 2) the planner can give all the availble versions in the exception message, > if user uses a wrong version (this is often reported in user ML). > 3) If we use "kafka-0.11" as connector identifier, we may have to write a > full documentation for each version, because they are different > "connector"? > IMO, for 0.11, 0.11, etc... kafka, they are actually the same connector > but with different "client jar" version, > they share all the same supported property keys and should reside > together. > 4) IMO, the future vision is version-free. At some point in the future, we > may don't need users to specify kafka version anymore, and make > "version=universal" as optional or removed in the future. This is can be > done easily if they are separate keys. > > Best, > Jark > > > On Mon, 30 Mar 2020 at 14:45, Kurt Young <[hidden email]> wrote: > > > Hi Jark, > > > > Thanks for the proposal, I'm +1 to the general idea. However I have a > > question about "version", > > in the old design, the version seems to be aimed for tracking property > > version, with different > > version, we could evolve these step by step without breaking backward > > compatibility. But in this > > design, version is representing external system's version, like "0.11" > for > > kafka, "6" or "7" for ES. > > I'm not sure if this is necessary, what's the benefit of using two keys > > instead of one, like "kafka-0.11" > > or "ES6" directly? And how about the old capability which could let us > > evolving connector properties? > > > > Best, > > Kurt > > > > > > On Mon, Mar 30, 2020 at 2:36 PM LakeShen <[hidden email]> > > wrote: > > > > > Hi Jark, > > > > > > I am really looking forward to this feature. I think this feature > > > could simplify flink sql code,and at the same time , > > > it could make the developer more easlier to config the flink sql WITH > > > options. > > > > > > Now when I am using flink sql to write flink task , sometimes I think > the > > > WITH options is too long for user. > > > For example,I config the kafka source connector parameter,for consumer > > > group and brokers parameter: > > > > > > 'connector.properties.0.key' = 'group.id' > > > > , 'connector.properties.0.value' = 'xxx' > > > > , 'connector.properties.1.key' = 'bootstrap.servers' > > > > , 'connector.properties.1.value' = 'xxxxx' > > > > > > > > > > I can understand this config , but for the flink fresh man,maybe it > > > is confused for him. > > > In my thought, I am really looking forward to this feature,thank you to > > > propose this feature. > > > > > > Best wishes, > > > LakeShen > > > > > > > > > Jark Wu <[hidden email]> 于2020年3月30日周一 下午2:02写道: > > > > > > > Hi everyone, > > > > > > > > I want to start a discussion about further improve and simplify our > > > current > > > > connector porperty keys, aka WITH options. Currently, we have a > > > > 'connector.' prefix for many properties, but they are verbose, and we > > > see a > > > > big inconsistency between the properties when designing FLIP-107. > > > > > > > > So we propose to remove all the 'connector.' prefix and rename > > > > 'connector.type' to 'connector', 'format.type' to 'format'. So a new > > > Kafka > > > > DDL may look like this: > > > > > > > > CREATE TABLE kafka_table ( > > > > ... > > > > ) WITH ( > > > > 'connector' = 'kafka', > > > > 'version' = '0.10', > > > > 'topic' = 'test-topic', > > > > 'startup-mode' = 'earliest-offset', > > > > 'properties.bootstrap.servers' = 'localhost:9092', > > > > 'properties.group.id' = 'testGroup', > > > > 'format' = 'json', > > > > 'format.fail-on-missing-field' = 'false' > > > > ); > > > > > > > > The new connector property key set will come together with new > Factory > > > > inferface which is proposed in FLIP-95. Old properties are still > > > compatible > > > > with their existing implementation. New properties are only available > > in > > > > new DynamicTableFactory implementations. > > > > > > > > You can access the detailed FLIP here: > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-122%3A+New+Connector+Property+Keys+for+New+Factory > > > > > > > > Best, > > > > Jark > > > > > > > > > > -- Best, Jingsong Lee |
Hi Kurt,
> 2) Lists all available connectors seems also quite straightforward, e.g user provided a wrong "kafka-0.8", we tell user we have candidates of "kakfa-0.11", "kafka-universal" It's not possible for the framework to throw such exception. Because the framework doesn't know what versions do the connector support. All the version information is a blackbox in the identifier. But with `Factory#factoryVersion()` interface, we can know all the supported versions. > 3) I don't think so. We can still treat it as the same connector but with different versions. That's true but that's weird. Because from the plain DDL definition, they look like different connectors with different "connector" value, e.g. 'connector=kafka-0.8', 'connector=kafka-0.10'. > If users don't set any version, we will use "kafka-universal" instead. The behavior is inconsistent IMO. That is a long term vision when there is no kafka clusters with <0.11 version. At that point, "universal" is the only supported version in Flink and the "version" key can be optional. --------------------------------------------- Hi Jingsong, > "version" vs "kafka.version" I though about it. But if we prefix "kafka" to version, we should prefix "kafka" for all other property keys, because they are all kafka specific options. However, that will make the property set verbose, see rejected option#2 in the FLIP. > explicitly separate options for source and sink That's a good topic. It's good to have a guideline for the new property keys. I'm fine to prefix with a "source"/"sink" for some connector keys. Actually, we already do this in some connectors, e.g. jdbc and hbase. Best, Jark On Mon, 30 Mar 2020 at 16:36, Jingsong Li <[hidden email]> wrote: > Thanks Jark for the proposal. > > +1 to the general idea. > > For "version", what about "kafka.version"? It is obvious to know its > meaning. > > And I'd like to start a new topic: > Should we need to explicitly separate source from sink? > With the development of batch and streaming, more and more connectors have > both source and sink. > > So should we set a rule for table properties: > - properties for both source and sink: without prefix, like "topic" > - properties for source only: with "source." prefix, like > "source.startup-mode" > - properties for sink only: with "sink." prefix, like "sink.partitioner" > > What do you think? > > Best, > Jingsong Lee > > On Mon, Mar 30, 2020 at 3:56 PM Jark Wu <[hidden email]> wrote: > > > Hi Kurt, > > > > That's good questions. > > > > > the meaning of "version" > > There are two versions in the old design. One is property version > > "connector.property-version" which can be used for backward > compatibility. > > The other one is "connector.version" which defines the version of > external > > system, e.g. 0.11" for kafka, "6" or "7" for ES. > > In this proposal, the "version" is the previous "connector.version". The > > ""connector.property-version" is not introduced in new design. > > > > > how to keep the old capability which can evolve connector properties > > The "connector.property-version" is an optional key in the old design and > > is never bumped up. > > I'm not sure how "connector.property-version" should work in the initial > > design. Maybe @Timo Walther <[hidden email]> has more knowledge on > > this. > > But for the new properties, every options should be expressed as > > `ConfigOption` which provides `withDeprecatedKeys(...)` method to easily > > support evolving keys. > > > > > a single keys instead of two, e.g. "kafka-0.11", "kafka-universal"? > > There are several benefit to use separate "version" key I can see: > > 1) it's more readable to separete them into different keys, because they > > are orthogonal concepts. > > 2) the planner can give all the availble versions in the exception > message, > > if user uses a wrong version (this is often reported in user ML). > > 3) If we use "kafka-0.11" as connector identifier, we may have to write a > > full documentation for each version, because they are different > > "connector"? > > IMO, for 0.11, 0.11, etc... kafka, they are actually the same > connector > > but with different "client jar" version, > > they share all the same supported property keys and should reside > > together. > > 4) IMO, the future vision is version-free. At some point in the future, > we > > may don't need users to specify kafka version anymore, and make > > "version=universal" as optional or removed in the future. This is can be > > done easily if they are separate keys. > > > > Best, > > Jark > > > > > > On Mon, 30 Mar 2020 at 14:45, Kurt Young <[hidden email]> wrote: > > > > > Hi Jark, > > > > > > Thanks for the proposal, I'm +1 to the general idea. However I have a > > > question about "version", > > > in the old design, the version seems to be aimed for tracking property > > > version, with different > > > version, we could evolve these step by step without breaking backward > > > compatibility. But in this > > > design, version is representing external system's version, like "0.11" > > for > > > kafka, "6" or "7" for ES. > > > I'm not sure if this is necessary, what's the benefit of using two keys > > > instead of one, like "kafka-0.11" > > > or "ES6" directly? And how about the old capability which could let us > > > evolving connector properties? > > > > > > Best, > > > Kurt > > > > > > > > > On Mon, Mar 30, 2020 at 2:36 PM LakeShen <[hidden email]> > > > wrote: > > > > > > > Hi Jark, > > > > > > > > I am really looking forward to this feature. I think this feature > > > > could simplify flink sql code,and at the same time , > > > > it could make the developer more easlier to config the flink sql WITH > > > > options. > > > > > > > > Now when I am using flink sql to write flink task , sometimes I think > > the > > > > WITH options is too long for user. > > > > For example,I config the kafka source connector parameter,for > consumer > > > > group and brokers parameter: > > > > > > > > 'connector.properties.0.key' = 'group.id' > > > > > , 'connector.properties.0.value' = 'xxx' > > > > > , 'connector.properties.1.key' = 'bootstrap.servers' > > > > > , 'connector.properties.1.value' = 'xxxxx' > > > > > > > > > > > > > I can understand this config , but for the flink fresh man,maybe it > > > > is confused for him. > > > > In my thought, I am really looking forward to this feature,thank you > to > > > > propose this feature. > > > > > > > > Best wishes, > > > > LakeShen > > > > > > > > > > > > Jark Wu <[hidden email]> 于2020年3月30日周一 下午2:02写道: > > > > > > > > > Hi everyone, > > > > > > > > > > I want to start a discussion about further improve and simplify our > > > > current > > > > > connector porperty keys, aka WITH options. Currently, we have a > > > > > 'connector.' prefix for many properties, but they are verbose, and > we > > > > see a > > > > > big inconsistency between the properties when designing FLIP-107. > > > > > > > > > > So we propose to remove all the 'connector.' prefix and rename > > > > > 'connector.type' to 'connector', 'format.type' to 'format'. So a > new > > > > Kafka > > > > > DDL may look like this: > > > > > > > > > > CREATE TABLE kafka_table ( > > > > > ... > > > > > ) WITH ( > > > > > 'connector' = 'kafka', > > > > > 'version' = '0.10', > > > > > 'topic' = 'test-topic', > > > > > 'startup-mode' = 'earliest-offset', > > > > > 'properties.bootstrap.servers' = 'localhost:9092', > > > > > 'properties.group.id' = 'testGroup', > > > > > 'format' = 'json', > > > > > 'format.fail-on-missing-field' = 'false' > > > > > ); > > > > > > > > > > The new connector property key set will come together with new > > Factory > > > > > inferface which is proposed in FLIP-95. Old properties are still > > > > compatible > > > > > with their existing implementation. New properties are only > available > > > in > > > > > new DynamicTableFactory implementations. > > > > > > > > > > You can access the detailed FLIP here: > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-122%3A+New+Connector+Property+Keys+for+New+Factory > > > > > > > > > > Best, > > > > > Jark > > > > > > > > > > > > > > > > > -- > Best, Jingsong Lee > |
> It's not possible for the framework to throw such exception. Because the
framework doesn't know what versions do the connector support. Not really, we can list all valid connectors framework could found. E.g. user mistyped 'kafka-0.x', the error message will looks like: we don't have any connector named "kafka-0.x", but we have: FileSystem Kafka-0.10 Kafka-0.11 ElasticSearch6 ElasticSearch7 Best, Kurt On Mon, Mar 30, 2020 at 5:11 PM Jark Wu <[hidden email]> wrote: > Hi Kurt, > > > 2) Lists all available connectors seems also quite straightforward, e.g > user provided a wrong "kafka-0.8", we tell user we have candidates of > "kakfa-0.11", "kafka-universal" > It's not possible for the framework to throw such exception. Because the > framework doesn't know what versions do the connector support. All the > version information is a blackbox in the identifier. But with > `Factory#factoryVersion()` interface, we can know all the supported > versions. > > > 3) I don't think so. We can still treat it as the same connector but with > different versions. > That's true but that's weird. Because from the plain DDL definition, they > look like different connectors with different "connector" value, e.g. > 'connector=kafka-0.8', 'connector=kafka-0.10'. > > > If users don't set any version, we will use "kafka-universal" instead. > The behavior is inconsistent IMO. > That is a long term vision when there is no kafka clusters with <0.11 > version. > At that point, "universal" is the only supported version in Flink and the > "version" key can be optional. > > --------------------------------------------- > > Hi Jingsong, > > > "version" vs "kafka.version" > I though about it. But if we prefix "kafka" to version, we should prefix > "kafka" for all other property keys, because they are all kafka specific > options. > However, that will make the property set verbose, see rejected option#2 in > the FLIP. > > > explicitly separate options for source and sink > That's a good topic. It's good to have a guideline for the new property > keys. > I'm fine to prefix with a "source"/"sink" for some connector keys. > Actually, we already do this in some connectors, e.g. jdbc and hbase. > > Best, > Jark > > On Mon, 30 Mar 2020 at 16:36, Jingsong Li <[hidden email]> wrote: > > > Thanks Jark for the proposal. > > > > +1 to the general idea. > > > > For "version", what about "kafka.version"? It is obvious to know its > > meaning. > > > > And I'd like to start a new topic: > > Should we need to explicitly separate source from sink? > > With the development of batch and streaming, more and more connectors > have > > both source and sink. > > > > So should we set a rule for table properties: > > - properties for both source and sink: without prefix, like "topic" > > - properties for source only: with "source." prefix, like > > "source.startup-mode" > > - properties for sink only: with "sink." prefix, like "sink.partitioner" > > > > What do you think? > > > > Best, > > Jingsong Lee > > > > On Mon, Mar 30, 2020 at 3:56 PM Jark Wu <[hidden email]> wrote: > > > > > Hi Kurt, > > > > > > That's good questions. > > > > > > > the meaning of "version" > > > There are two versions in the old design. One is property version > > > "connector.property-version" which can be used for backward > > compatibility. > > > The other one is "connector.version" which defines the version of > > external > > > system, e.g. 0.11" for kafka, "6" or "7" for ES. > > > In this proposal, the "version" is the previous "connector.version". > The > > > ""connector.property-version" is not introduced in new design. > > > > > > > how to keep the old capability which can evolve connector properties > > > The "connector.property-version" is an optional key in the old design > and > > > is never bumped up. > > > I'm not sure how "connector.property-version" should work in the > initial > > > design. Maybe @Timo Walther <[hidden email]> has more knowledge on > > > this. > > > But for the new properties, every options should be expressed as > > > `ConfigOption` which provides `withDeprecatedKeys(...)` method to > easily > > > support evolving keys. > > > > > > > a single keys instead of two, e.g. "kafka-0.11", "kafka-universal"? > > > There are several benefit to use separate "version" key I can see: > > > 1) it's more readable to separete them into different keys, because > they > > > are orthogonal concepts. > > > 2) the planner can give all the availble versions in the exception > > message, > > > if user uses a wrong version (this is often reported in user ML). > > > 3) If we use "kafka-0.11" as connector identifier, we may have to > write a > > > full documentation for each version, because they are different > > > "connector"? > > > IMO, for 0.11, 0.11, etc... kafka, they are actually the same > > connector > > > but with different "client jar" version, > > > they share all the same supported property keys and should reside > > > together. > > > 4) IMO, the future vision is version-free. At some point in the future, > > we > > > may don't need users to specify kafka version anymore, and make > > > "version=universal" as optional or removed in the future. This is can > be > > > done easily if they are separate keys. > > > > > > Best, > > > Jark > > > > > > > > > On Mon, 30 Mar 2020 at 14:45, Kurt Young <[hidden email]> wrote: > > > > > > > Hi Jark, > > > > > > > > Thanks for the proposal, I'm +1 to the general idea. However I have a > > > > question about "version", > > > > in the old design, the version seems to be aimed for tracking > property > > > > version, with different > > > > version, we could evolve these step by step without breaking backward > > > > compatibility. But in this > > > > design, version is representing external system's version, like > "0.11" > > > for > > > > kafka, "6" or "7" for ES. > > > > I'm not sure if this is necessary, what's the benefit of using two > keys > > > > instead of one, like "kafka-0.11" > > > > or "ES6" directly? And how about the old capability which could let > us > > > > evolving connector properties? > > > > > > > > Best, > > > > Kurt > > > > > > > > > > > > On Mon, Mar 30, 2020 at 2:36 PM LakeShen <[hidden email]> > > > > wrote: > > > > > > > > > Hi Jark, > > > > > > > > > > I am really looking forward to this feature. I think this feature > > > > > could simplify flink sql code,and at the same time , > > > > > it could make the developer more easlier to config the flink sql > WITH > > > > > options. > > > > > > > > > > Now when I am using flink sql to write flink task , sometimes I > think > > > the > > > > > WITH options is too long for user. > > > > > For example,I config the kafka source connector parameter,for > > consumer > > > > > group and brokers parameter: > > > > > > > > > > 'connector.properties.0.key' = 'group.id' > > > > > > , 'connector.properties.0.value' = 'xxx' > > > > > > , 'connector.properties.1.key' = 'bootstrap.servers' > > > > > > , 'connector.properties.1.value' = 'xxxxx' > > > > > > > > > > > > > > > > I can understand this config , but for the flink fresh man,maybe it > > > > > is confused for him. > > > > > In my thought, I am really looking forward to this feature,thank > you > > to > > > > > propose this feature. > > > > > > > > > > Best wishes, > > > > > LakeShen > > > > > > > > > > > > > > > Jark Wu <[hidden email]> 于2020年3月30日周一 下午2:02写道: > > > > > > > > > > > Hi everyone, > > > > > > > > > > > > I want to start a discussion about further improve and simplify > our > > > > > current > > > > > > connector porperty keys, aka WITH options. Currently, we have a > > > > > > 'connector.' prefix for many properties, but they are verbose, > and > > we > > > > > see a > > > > > > big inconsistency between the properties when designing FLIP-107. > > > > > > > > > > > > So we propose to remove all the 'connector.' prefix and rename > > > > > > 'connector.type' to 'connector', 'format.type' to 'format'. So a > > new > > > > > Kafka > > > > > > DDL may look like this: > > > > > > > > > > > > CREATE TABLE kafka_table ( > > > > > > ... > > > > > > ) WITH ( > > > > > > 'connector' = 'kafka', > > > > > > 'version' = '0.10', > > > > > > 'topic' = 'test-topic', > > > > > > 'startup-mode' = 'earliest-offset', > > > > > > 'properties.bootstrap.servers' = 'localhost:9092', > > > > > > 'properties.group.id' = 'testGroup', > > > > > > 'format' = 'json', > > > > > > 'format.fail-on-missing-field' = 'false' > > > > > > ); > > > > > > > > > > > > The new connector property key set will come together with new > > > Factory > > > > > > inferface which is proposed in FLIP-95. Old properties are still > > > > > compatible > > > > > > with their existing implementation. New properties are only > > available > > > > in > > > > > > new DynamicTableFactory implementations. > > > > > > > > > > > > You can access the detailed FLIP here: > > > > > > > > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-122%3A+New+Connector+Property+Keys+for+New+Factory > > > > > > > > > > > > Best, > > > > > > Jark > > > > > > > > > > > > > > > > > > > > > > > > -- > > Best, Jingsong Lee > > > |
Hi Jark,
Thanks for starting the discussion. The FLIP LGTM generally. Regarding connector version VS put version into connector's name, I favor the latter personally, using one property to locate a connector can make the error log more precise. From the users' side, using one property to match a connector will be easier. Especially we have many connectors, and some of the need version property required, and some of them not. Regarding Jingsong's suggestion, IMO, it's a very good complement to the FLIP. Distinguishing properties for source and sink can be very useful, and also this will make the connector property more precise. We are also sick of this for now, we cannot know whether a DDL is a source or sink unless we look through all queries where the table is used. Even more, some of the required properties are only required for source, bug we cannot leave it blank for sink, and vice versa. I think we can also add a type for dimension tables except source and sink. Kurt Young <[hidden email]> 于2020年3月30日周一 下午8:16写道: > > It's not possible for the framework to throw such exception. Because the > framework doesn't know what versions do the connector support. > > Not really, we can list all valid connectors framework could found. E.g. > user mistyped 'kafka-0.x', the error message will looks like: > > we don't have any connector named "kafka-0.x", but we have: > FileSystem > Kafka-0.10 > Kafka-0.11 > ElasticSearch6 > ElasticSearch7 > > Best, > Kurt > > > On Mon, Mar 30, 2020 at 5:11 PM Jark Wu <[hidden email]> wrote: > > > Hi Kurt, > > > > > 2) Lists all available connectors seems also quite straightforward, e.g > > user provided a wrong "kafka-0.8", we tell user we have candidates of > > "kakfa-0.11", "kafka-universal" > > It's not possible for the framework to throw such exception. Because the > > framework doesn't know what versions do the connector support. All the > > version information is a blackbox in the identifier. But with > > `Factory#factoryVersion()` interface, we can know all the supported > > versions. > > > > > 3) I don't think so. We can still treat it as the same connector but > with > > different versions. > > That's true but that's weird. Because from the plain DDL definition, they > > look like different connectors with different "connector" value, e.g. > > 'connector=kafka-0.8', 'connector=kafka-0.10'. > > > > > If users don't set any version, we will use "kafka-universal" instead. > > The behavior is inconsistent IMO. > > That is a long term vision when there is no kafka clusters with <0.11 > > version. > > At that point, "universal" is the only supported version in Flink and the > > "version" key can be optional. > > > > --------------------------------------------- > > > > Hi Jingsong, > > > > > "version" vs "kafka.version" > > I though about it. But if we prefix "kafka" to version, we should prefix > > "kafka" for all other property keys, because they are all kafka specific > > options. > > However, that will make the property set verbose, see rejected option#2 > in > > the FLIP. > > > > > explicitly separate options for source and sink > > That's a good topic. It's good to have a guideline for the new property > > keys. > > I'm fine to prefix with a "source"/"sink" for some connector keys. > > Actually, we already do this in some connectors, e.g. jdbc and hbase. > > > > Best, > > Jark > > > > On Mon, 30 Mar 2020 at 16:36, Jingsong Li <[hidden email]> > wrote: > > > > > Thanks Jark for the proposal. > > > > > > +1 to the general idea. > > > > > > For "version", what about "kafka.version"? It is obvious to know its > > > meaning. > > > > > > And I'd like to start a new topic: > > > Should we need to explicitly separate source from sink? > > > With the development of batch and streaming, more and more connectors > > have > > > both source and sink. > > > > > > So should we set a rule for table properties: > > > - properties for both source and sink: without prefix, like "topic" > > > - properties for source only: with "source." prefix, like > > > "source.startup-mode" > > > - properties for sink only: with "sink." prefix, like > "sink.partitioner" > > > > > > What do you think? > > > > > > Best, > > > Jingsong Lee > > > > > > On Mon, Mar 30, 2020 at 3:56 PM Jark Wu <[hidden email]> wrote: > > > > > > > Hi Kurt, > > > > > > > > That's good questions. > > > > > > > > > the meaning of "version" > > > > There are two versions in the old design. One is property version > > > > "connector.property-version" which can be used for backward > > > compatibility. > > > > The other one is "connector.version" which defines the version of > > > external > > > > system, e.g. 0.11" for kafka, "6" or "7" for ES. > > > > In this proposal, the "version" is the previous "connector.version". > > The > > > > ""connector.property-version" is not introduced in new design. > > > > > > > > > how to keep the old capability which can evolve connector > properties > > > > The "connector.property-version" is an optional key in the old design > > and > > > > is never bumped up. > > > > I'm not sure how "connector.property-version" should work in the > > initial > > > > design. Maybe @Timo Walther <[hidden email]> has more knowledge > on > > > > this. > > > > But for the new properties, every options should be expressed as > > > > `ConfigOption` which provides `withDeprecatedKeys(...)` method to > > easily > > > > support evolving keys. > > > > > > > > > a single keys instead of two, e.g. "kafka-0.11", "kafka-universal"? > > > > There are several benefit to use separate "version" key I can see: > > > > 1) it's more readable to separete them into different keys, because > > they > > > > are orthogonal concepts. > > > > 2) the planner can give all the availble versions in the exception > > > message, > > > > if user uses a wrong version (this is often reported in user ML). > > > > 3) If we use "kafka-0.11" as connector identifier, we may have to > > write a > > > > full documentation for each version, because they are different > > > > "connector"? > > > > IMO, for 0.11, 0.11, etc... kafka, they are actually the same > > > connector > > > > but with different "client jar" version, > > > > they share all the same supported property keys and should reside > > > > together. > > > > 4) IMO, the future vision is version-free. At some point in the > future, > > > we > > > > may don't need users to specify kafka version anymore, and make > > > > "version=universal" as optional or removed in the future. This is can > > be > > > > done easily if they are separate keys. > > > > > > > > Best, > > > > Jark > > > > > > > > > > > > On Mon, 30 Mar 2020 at 14:45, Kurt Young <[hidden email]> wrote: > > > > > > > > > Hi Jark, > > > > > > > > > > Thanks for the proposal, I'm +1 to the general idea. However I > have a > > > > > question about "version", > > > > > in the old design, the version seems to be aimed for tracking > > property > > > > > version, with different > > > > > version, we could evolve these step by step without breaking > backward > > > > > compatibility. But in this > > > > > design, version is representing external system's version, like > > "0.11" > > > > for > > > > > kafka, "6" or "7" for ES. > > > > > I'm not sure if this is necessary, what's the benefit of using two > > keys > > > > > instead of one, like "kafka-0.11" > > > > > or "ES6" directly? And how about the old capability which could let > > us > > > > > evolving connector properties? > > > > > > > > > > Best, > > > > > Kurt > > > > > > > > > > > > > > > On Mon, Mar 30, 2020 at 2:36 PM LakeShen < > [hidden email]> > > > > > wrote: > > > > > > > > > > > Hi Jark, > > > > > > > > > > > > I am really looking forward to this feature. I think this feature > > > > > > could simplify flink sql code,and at the same time , > > > > > > it could make the developer more easlier to config the flink sql > > WITH > > > > > > options. > > > > > > > > > > > > Now when I am using flink sql to write flink task , sometimes I > > think > > > > the > > > > > > WITH options is too long for user. > > > > > > For example,I config the kafka source connector parameter,for > > > consumer > > > > > > group and brokers parameter: > > > > > > > > > > > > 'connector.properties.0.key' = 'group.id' > > > > > > > , 'connector.properties.0.value' = 'xxx' > > > > > > > , 'connector.properties.1.key' = 'bootstrap.servers' > > > > > > > , 'connector.properties.1.value' = 'xxxxx' > > > > > > > > > > > > > > > > > > > I can understand this config , but for the flink fresh man,maybe > it > > > > > > is confused for him. > > > > > > In my thought, I am really looking forward to this feature,thank > > you > > > to > > > > > > propose this feature. > > > > > > > > > > > > Best wishes, > > > > > > LakeShen > > > > > > > > > > > > > > > > > > Jark Wu <[hidden email]> 于2020年3月30日周一 下午2:02写道: > > > > > > > > > > > > > Hi everyone, > > > > > > > > > > > > > > I want to start a discussion about further improve and simplify > > our > > > > > > current > > > > > > > connector porperty keys, aka WITH options. Currently, we have a > > > > > > > 'connector.' prefix for many properties, but they are verbose, > > and > > > we > > > > > > see a > > > > > > > big inconsistency between the properties when designing > FLIP-107. > > > > > > > > > > > > > > So we propose to remove all the 'connector.' prefix and rename > > > > > > > 'connector.type' to 'connector', 'format.type' to 'format'. So > a > > > new > > > > > > Kafka > > > > > > > DDL may look like this: > > > > > > > > > > > > > > CREATE TABLE kafka_table ( > > > > > > > ... > > > > > > > ) WITH ( > > > > > > > 'connector' = 'kafka', > > > > > > > 'version' = '0.10', > > > > > > > 'topic' = 'test-topic', > > > > > > > 'startup-mode' = 'earliest-offset', > > > > > > > 'properties.bootstrap.servers' = 'localhost:9092', > > > > > > > 'properties.group.id' = 'testGroup', > > > > > > > 'format' = 'json', > > > > > > > 'format.fail-on-missing-field' = 'false' > > > > > > > ); > > > > > > > > > > > > > > The new connector property key set will come together with new > > > > Factory > > > > > > > inferface which is proposed in FLIP-95. Old properties are > still > > > > > > compatible > > > > > > > with their existing implementation. New properties are only > > > available > > > > > in > > > > > > > new DynamicTableFactory implementations. > > > > > > > > > > > > > > You can access the detailed FLIP here: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-122%3A+New+Connector+Property+Keys+for+New+Factory > > > > > > > > > > > > > > Best, > > > > > > > Jark > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > Best, Jingsong Lee > > > > > > -- Benchao Li School of Electronics Engineering and Computer Science, Peking University Tel:+86-15650713730 Email: [hidden email]; [hidden email] |
Hi Jark,
thanks for the FLIP. I don't have a strong opinion on DynamicTableFactory#factoryVersion() for distiguishing connector versions. We can also include it in the factory identifier itself. For Kafka, I would then just use `kafka` because the naming "universal" was just a helper solution at that time. What we need is a good guide for how to design the options. We should include that in the JavaDoc of factory interface itself, once it is in-place. I like the difference between source and sink in properties. Regarding "connector.property-version": It was meant for backwards compatibility along the dimension of "property design" compared to "connector version". However, since the connector is now responsible for parsing all properties, it is not as necessary as it was before. However, a change of the property design as it is done in FLIP-107 becomes more difficult without a property version and versioning is always a good idea in API world. Regards, Timo On 30.03.20 16:38, Benchao Li wrote: > Hi Jark, > > Thanks for starting the discussion. The FLIP LGTM generally. > > Regarding connector version VS put version into connector's name, > I favor the latter personally, using one property to locate a connector > can make the error log more precise. > From the users' side, using one property to match a connector will be > easier. Especially we have many connectors, > and some of the need version property required, and some of them not. > > Regarding Jingsong's suggestion, > IMO, it's a very good complement to the FLIP. Distinguishing properties > for source and sink can be very useful, and > also this will make the connector property more precise. > We are also sick of this for now, we cannot know whether a DDL is a > source or sink unless we look through all queries where > the table is used. > Even more, some of the required properties are only required for source, > bug we cannot leave it blank for sink, and vice versa. > I think we can also add a type for dimension tables except source and sink. > > Kurt Young <[hidden email] <mailto:[hidden email]>> 于2020年3月30日 > 周一 下午8:16写道: > > > It's not possible for the framework to throw such exception. > Because the > framework doesn't know what versions do the connector support. > > Not really, we can list all valid connectors framework could found. E.g. > user mistyped 'kafka-0.x', the error message will looks like: > > we don't have any connector named "kafka-0.x", but we have: > FileSystem > Kafka-0.10 > Kafka-0.11 > ElasticSearch6 > ElasticSearch7 > > Best, > Kurt > > > On Mon, Mar 30, 2020 at 5:11 PM Jark Wu <[hidden email] > <mailto:[hidden email]>> wrote: > > > Hi Kurt, > > > > > 2) Lists all available connectors seems also quite > straightforward, e.g > > user provided a wrong "kafka-0.8", we tell user we have candidates of > > "kakfa-0.11", "kafka-universal" > > It's not possible for the framework to throw such exception. > Because the > > framework doesn't know what versions do the connector support. > All the > > version information is a blackbox in the identifier. But with > > `Factory#factoryVersion()` interface, we can know all the supported > > versions. > > > > > 3) I don't think so. We can still treat it as the same > connector but with > > different versions. > > That's true but that's weird. Because from the plain DDL > definition, they > > look like different connectors with different "connector" value, e.g. > > 'connector=kafka-0.8', 'connector=kafka-0.10'. > > > > > If users don't set any version, we will use "kafka-universal" > instead. > > The behavior is inconsistent IMO. > > That is a long term vision when there is no kafka clusters with <0.11 > > version. > > At that point, "universal" is the only supported version in Flink > and the > > "version" key can be optional. > > > > --------------------------------------------- > > > > Hi Jingsong, > > > > > "version" vs "kafka.version" > > I though about it. But if we prefix "kafka" to version, we should > prefix > > "kafka" for all other property keys, because they are all kafka > specific > > options. > > However, that will make the property set verbose, see rejected > option#2 in > > the FLIP. > > > > > explicitly separate options for source and sink > > That's a good topic. It's good to have a guideline for the new > property > > keys. > > I'm fine to prefix with a "source"/"sink" for some connector keys. > > Actually, we already do this in some connectors, e.g. jdbc and hbase. > > > > Best, > > Jark > > > > On Mon, 30 Mar 2020 at 16:36, Jingsong Li <[hidden email] > <mailto:[hidden email]>> wrote: > > > > > Thanks Jark for the proposal. > > > > > > +1 to the general idea. > > > > > > For "version", what about "kafka.version"? It is obvious to > know its > > > meaning. > > > > > > And I'd like to start a new topic: > > > Should we need to explicitly separate source from sink? > > > With the development of batch and streaming, more and more > connectors > > have > > > both source and sink. > > > > > > So should we set a rule for table properties: > > > - properties for both source and sink: without prefix, like "topic" > > > - properties for source only: with "source." prefix, like > > > "source.startup-mode" > > > - properties for sink only: with "sink." prefix, like > "sink.partitioner" > > > > > > What do you think? > > > > > > Best, > > > Jingsong Lee > > > > > > On Mon, Mar 30, 2020 at 3:56 PM Jark Wu <[hidden email] > <mailto:[hidden email]>> wrote: > > > > > > > Hi Kurt, > > > > > > > > That's good questions. > > > > > > > > > the meaning of "version" > > > > There are two versions in the old design. One is property version > > > > "connector.property-version" which can be used for backward > > > compatibility. > > > > The other one is "connector.version" which defines the version of > > > external > > > > system, e.g. 0.11" for kafka, "6" or "7" for ES. > > > > In this proposal, the "version" is the previous > "connector.version". > > The > > > > ""connector.property-version" is not introduced in new design. > > > > > > > > > how to keep the old capability which can evolve connector > properties > > > > The "connector.property-version" is an optional key in the > old design > > and > > > > is never bumped up. > > > > I'm not sure how "connector.property-version" should work in the > > initial > > > > design. Maybe @Timo Walther <[hidden email] > <mailto:[hidden email]>> has more knowledge on > > > > this. > > > > But for the new properties, every options should be expressed as > > > > `ConfigOption` which provides `withDeprecatedKeys(...)` method to > > easily > > > > support evolving keys. > > > > > > > > > a single keys instead of two, e.g. "kafka-0.11", > "kafka-universal"? > > > > There are several benefit to use separate "version" key I can > see: > > > > 1) it's more readable to separete them into different keys, > because > > they > > > > are orthogonal concepts. > > > > 2) the planner can give all the availble versions in the > exception > > > message, > > > > if user uses a wrong version (this is often reported in user ML). > > > > 3) If we use "kafka-0.11" as connector identifier, we may have to > > write a > > > > full documentation for each version, because they are different > > > > "connector"? > > > > IMO, for 0.11, 0.11, etc... kafka, they are actually the same > > > connector > > > > but with different "client jar" version, > > > > they share all the same supported property keys and > should reside > > > > together. > > > > 4) IMO, the future vision is version-free. At some point in > the future, > > > we > > > > may don't need users to specify kafka version anymore, and make > > > > "version=universal" as optional or removed in the future. > This is can > > be > > > > done easily if they are separate keys. > > > > > > > > Best, > > > > Jark > > > > > > > > > > > > On Mon, 30 Mar 2020 at 14:45, Kurt Young <[hidden email] > <mailto:[hidden email]>> wrote: > > > > > > > > > Hi Jark, > > > > > > > > > > Thanks for the proposal, I'm +1 to the general idea. > However I have a > > > > > question about "version", > > > > > in the old design, the version seems to be aimed for tracking > > property > > > > > version, with different > > > > > version, we could evolve these step by step without > breaking backward > > > > > compatibility. But in this > > > > > design, version is representing external system's version, like > > "0.11" > > > > for > > > > > kafka, "6" or "7" for ES. > > > > > I'm not sure if this is necessary, what's the benefit of > using two > > keys > > > > > instead of one, like "kafka-0.11" > > > > > or "ES6" directly? And how about the old capability which > could let > > us > > > > > evolving connector properties? > > > > > > > > > > Best, > > > > > Kurt > > > > > > > > > > > > > > > On Mon, Mar 30, 2020 at 2:36 PM LakeShen > <[hidden email] <mailto:[hidden email]>> > > > > > wrote: > > > > > > > > > > > Hi Jark, > > > > > > > > > > > > I am really looking forward to this feature. I think this > feature > > > > > > could simplify flink sql code,and at the same time , > > > > > > it could make the developer more easlier to config the > flink sql > > WITH > > > > > > options. > > > > > > > > > > > > Now when I am using flink sql to write flink task , > sometimes I > > think > > > > the > > > > > > WITH options is too long for user. > > > > > > For example,I config the kafka source connector > parameter,for > > > consumer > > > > > > group and brokers parameter: > > > > > > > > > > > > 'connector.properties.0.key' = 'group.id > <http://group.id>' > > > > > > > , 'connector.properties.0.value' = 'xxx' > > > > > > > , 'connector.properties.1.key' = > 'bootstrap.servers' > > > > > > > , 'connector.properties.1.value' = 'xxxxx' > > > > > > > > > > > > > > > > > > > I can understand this config , but for the flink fresh > man,maybe it > > > > > > is confused for him. > > > > > > In my thought, I am really looking forward to this > feature,thank > > you > > > to > > > > > > propose this feature. > > > > > > > > > > > > Best wishes, > > > > > > LakeShen > > > > > > > > > > > > > > > > > > Jark Wu <[hidden email] <mailto:[hidden email]>> 于 > 2020年3月30日周一 下午2:02写道: > > > > > > > > > > > > > Hi everyone, > > > > > > > > > > > > > > I want to start a discussion about further improve and > simplify > > our > > > > > > current > > > > > > > connector porperty keys, aka WITH options. Currently, > we have a > > > > > > > 'connector.' prefix for many properties, but they are > verbose, > > and > > > we > > > > > > see a > > > > > > > big inconsistency between the properties when designing > FLIP-107. > > > > > > > > > > > > > > So we propose to remove all the 'connector.' prefix and > rename > > > > > > > 'connector.type' to 'connector', 'format.type' to > 'format'. So a > > > new > > > > > > Kafka > > > > > > > DDL may look like this: > > > > > > > > > > > > > > CREATE TABLE kafka_table ( > > > > > > > ... > > > > > > > ) WITH ( > > > > > > > 'connector' = 'kafka', > > > > > > > 'version' = '0.10', > > > > > > > 'topic' = 'test-topic', > > > > > > > 'startup-mode' = 'earliest-offset', > > > > > > > 'properties.bootstrap.servers' = 'localhost:9092', > > > > > > > 'properties.group.id <http://properties.group.id>' = > 'testGroup', > > > > > > > 'format' = 'json', > > > > > > > 'format.fail-on-missing-field' = 'false' > > > > > > > ); > > > > > > > > > > > > > > The new connector property key set will come together > with new > > > > Factory > > > > > > > inferface which is proposed in FLIP-95. Old properties > are still > > > > > > compatible > > > > > > > with their existing implementation. New properties are only > > > available > > > > > in > > > > > > > new DynamicTableFactory implementations. > > > > > > > > > > > > > > You can access the detailed FLIP here: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-122%3A+New+Connector+Property+Keys+for+New+Factory > > > > > > > > > > > > > > Best, > > > > > > > Jark > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > Best, Jingsong Lee > > > > > > > > > -- > > Benchao Li > School of Electronics Engineering and Computer Science, Peking University > Tel:+86-15650713730 > Email:[hidden email] <mailto:[hidden email]>;[hidden email] <mailto:[hidden email]> > |
Hi all,
I like the overall design of the FLIP. As for the withstanding concerns. I kind of like the approach to put the version into the factory identifier. I think it's the cleanest way to say that this version actually applies to the connector itself and not to the system it connects to. BTW, I think the outcome of this discussion will affect interfaces described in FLIP-95. If we put the version into the functionIdentifier, do we need Factory#factoryVersion? I also think it does make sense to have a versioning for the properties as well. Are we able to read all the current properties with the new factories? I think we could use the "connector.property-version" to alternate between different Factory interfaces to support the old set of properties. Otherwise the new factories need to understand both set of properties, don't they? Best, Dawid On 30/03/2020 17:07, Timo Walther wrote: > Hi Jark, > > thanks for the FLIP. I don't have a strong opinion on > DynamicTableFactory#factoryVersion() for distiguishing connector > versions. We can also include it in the factory identifier itself. For > Kafka, I would then just use `kafka` because the naming "universal" > was just a helper solution at that time. > > What we need is a good guide for how to design the options. We should > include that in the JavaDoc of factory interface itself, once it is > in-place. I like the difference between source and sink in properties. > > Regarding "connector.property-version": > > It was meant for backwards compatibility along the dimension of > "property design" compared to "connector version". However, since the > connector is now responsible for parsing all properties, it is not as > necessary as it was before. > > However, a change of the property design as it is done in FLIP-107 > becomes more difficult without a property version and versioning is > always a good idea in API world. > > Regards, > Timo > > > On 30.03.20 16:38, Benchao Li wrote: >> Hi Jark, >> >> Thanks for starting the discussion. The FLIP LGTM generally. >> >> Regarding connector version VS put version into connector's name, >> I favor the latter personally, using one property to locate a >> connector can make the error log more precise. >> From the users' side, using one property to match a connector will >> be easier. Especially we have many connectors, >> and some of the need version property required, and some of them not. >> >> Regarding Jingsong's suggestion, >> IMO, it's a very good complement to the FLIP. Distinguishing >> properties for source and sink can be very useful, and >> also this will make the connector property more precise. >> We are also sick of this for now, we cannot know whether a DDL is a >> source or sink unless we look through all queries where >> the table is used. >> Even more, some of the required properties are only required for >> source, bug we cannot leave it blank for sink, and vice versa. >> I think we can also add a type for dimension tables except source and >> sink. >> >> Kurt Young <[hidden email] <mailto:[hidden email]>> 于2020年3月30日 >> 周一 下午8:16写道: >> >> > It's not possible for the framework to throw such exception. >> Because the >> framework doesn't know what versions do the connector support. >> >> Not really, we can list all valid connectors framework could >> found. E.g. >> user mistyped 'kafka-0.x', the error message will looks like: >> >> we don't have any connector named "kafka-0.x", but we have: >> FileSystem >> Kafka-0.10 >> Kafka-0.11 >> ElasticSearch6 >> ElasticSearch7 >> >> Best, >> Kurt >> >> >> On Mon, Mar 30, 2020 at 5:11 PM Jark Wu <[hidden email] >> <mailto:[hidden email]>> wrote: >> >> > Hi Kurt, >> > >> > > 2) Lists all available connectors seems also quite >> straightforward, e.g >> > user provided a wrong "kafka-0.8", we tell user we have >> candidates of >> > "kakfa-0.11", "kafka-universal" >> > It's not possible for the framework to throw such exception. >> Because the >> > framework doesn't know what versions do the connector support. >> All the >> > version information is a blackbox in the identifier. But with >> > `Factory#factoryVersion()` interface, we can know all the >> supported >> > versions. >> > >> > > 3) I don't think so. We can still treat it as the same >> connector but with >> > different versions. >> > That's true but that's weird. Because from the plain DDL >> definition, they >> > look like different connectors with different "connector" >> value, e.g. >> > 'connector=kafka-0.8', 'connector=kafka-0.10'. >> > >> > > If users don't set any version, we will use "kafka-universal" >> instead. >> > The behavior is inconsistent IMO. >> > That is a long term vision when there is no kafka clusters >> with <0.11 >> > version. >> > At that point, "universal" is the only supported version in Flink >> and the >> > "version" key can be optional. >> > >> > --------------------------------------------- >> > >> > Hi Jingsong, >> > >> > > "version" vs "kafka.version" >> > I though about it. But if we prefix "kafka" to version, we should >> prefix >> > "kafka" for all other property keys, because they are all kafka >> specific >> > options. >> > However, that will make the property set verbose, see rejected >> option#2 in >> > the FLIP. >> > >> > > explicitly separate options for source and sink >> > That's a good topic. It's good to have a guideline for the new >> property >> > keys. >> > I'm fine to prefix with a "source"/"sink" for some connector >> keys. >> > Actually, we already do this in some connectors, e.g. jdbc and >> hbase. >> > >> > Best, >> > Jark >> > >> > On Mon, 30 Mar 2020 at 16:36, Jingsong Li <[hidden email] >> <mailto:[hidden email]>> wrote: >> > >> > > Thanks Jark for the proposal. >> > > >> > > +1 to the general idea. >> > > >> > > For "version", what about "kafka.version"? It is obvious to >> know its >> > > meaning. >> > > >> > > And I'd like to start a new topic: >> > > Should we need to explicitly separate source from sink? >> > > With the development of batch and streaming, more and more >> connectors >> > have >> > > both source and sink. >> > > >> > > So should we set a rule for table properties: >> > > - properties for both source and sink: without prefix, like >> "topic" >> > > - properties for source only: with "source." prefix, like >> > > "source.startup-mode" >> > > - properties for sink only: with "sink." prefix, like >> "sink.partitioner" >> > > >> > > What do you think? >> > > >> > > Best, >> > > Jingsong Lee >> > > >> > > On Mon, Mar 30, 2020 at 3:56 PM Jark Wu <[hidden email] >> <mailto:[hidden email]>> wrote: >> > > >> > > > Hi Kurt, >> > > > >> > > > That's good questions. >> > > > >> > > > > the meaning of "version" >> > > > There are two versions in the old design. One is property >> version >> > > > "connector.property-version" which can be used for backward >> > > compatibility. >> > > > The other one is "connector.version" which defines the >> version of >> > > external >> > > > system, e.g. 0.11" for kafka, "6" or "7" for ES. >> > > > In this proposal, the "version" is the previous >> "connector.version". >> > The >> > > > ""connector.property-version" is not introduced in new >> design. >> > > > >> > > > > how to keep the old capability which can evolve connector >> properties >> > > > The "connector.property-version" is an optional key in the >> old design >> > and >> > > > is never bumped up. >> > > > I'm not sure how "connector.property-version" should work >> in the >> > initial >> > > > design. Maybe @Timo Walther <[hidden email] >> <mailto:[hidden email]>> has more knowledge on >> > > > this. >> > > > But for the new properties, every options should be >> expressed as >> > > > `ConfigOption` which provides `withDeprecatedKeys(...)` >> method to >> > easily >> > > > support evolving keys. >> > > > >> > > > > a single keys instead of two, e.g. "kafka-0.11", >> "kafka-universal"? >> > > > There are several benefit to use separate "version" key I can >> see: >> > > > 1) it's more readable to separete them into different keys, >> because >> > they >> > > > are orthogonal concepts. >> > > > 2) the planner can give all the availble versions in the >> exception >> > > message, >> > > > if user uses a wrong version (this is often reported in >> user ML). >> > > > 3) If we use "kafka-0.11" as connector identifier, we may >> have to >> > write a >> > > > full documentation for each version, because they are >> different >> > > > "connector"? >> > > > IMO, for 0.11, 0.11, etc... kafka, they are actually >> the same >> > > connector >> > > > but with different "client jar" version, >> > > > they share all the same supported property keys and >> should reside >> > > > together. >> > > > 4) IMO, the future vision is version-free. At some point in >> the future, >> > > we >> > > > may don't need users to specify kafka version anymore, and >> make >> > > > "version=universal" as optional or removed in the future. >> This is can >> > be >> > > > done easily if they are separate keys. >> > > > >> > > > Best, >> > > > Jark >> > > > >> > > > >> > > > On Mon, 30 Mar 2020 at 14:45, Kurt Young <[hidden email] >> <mailto:[hidden email]>> wrote: >> > > > >> > > > > Hi Jark, >> > > > > >> > > > > Thanks for the proposal, I'm +1 to the general idea. >> However I have a >> > > > > question about "version", >> > > > > in the old design, the version seems to be aimed for >> tracking >> > property >> > > > > version, with different >> > > > > version, we could evolve these step by step without >> breaking backward >> > > > > compatibility. But in this >> > > > > design, version is representing external system's >> version, like >> > "0.11" >> > > > for >> > > > > kafka, "6" or "7" for ES. >> > > > > I'm not sure if this is necessary, what's the benefit of >> using two >> > keys >> > > > > instead of one, like "kafka-0.11" >> > > > > or "ES6" directly? And how about the old capability which >> could let >> > us >> > > > > evolving connector properties? >> > > > > >> > > > > Best, >> > > > > Kurt >> > > > > >> > > > > >> > > > > On Mon, Mar 30, 2020 at 2:36 PM LakeShen >> <[hidden email] <mailto:[hidden email]>> >> > > > > wrote: >> > > > > >> > > > > > Hi Jark, >> > > > > > >> > > > > > I am really looking forward to this feature. I think this >> feature >> > > > > > could simplify flink sql code,and at the same time , >> > > > > > it could make the developer more easlier to config the >> flink sql >> > WITH >> > > > > > options. >> > > > > > >> > > > > > Now when I am using flink sql to write flink task , >> sometimes I >> > think >> > > > the >> > > > > > WITH options is too long for user. >> > > > > > For example,I config the kafka source connector >> parameter,for >> > > consumer >> > > > > > group and brokers parameter: >> > > > > > >> > > > > > 'connector.properties.0.key' = 'group.id >> <http://group.id>' >> > > > > > > , 'connector.properties.0.value' = 'xxx' >> > > > > > > , 'connector.properties.1.key' = >> 'bootstrap.servers' >> > > > > > > , 'connector.properties.1.value' = 'xxxxx' >> > > > > > > >> > > > > > >> > > > > > I can understand this config , but for the flink fresh >> man,maybe it >> > > > > > is confused for him. >> > > > > > In my thought, I am really looking forward to this >> feature,thank >> > you >> > > to >> > > > > > propose this feature. >> > > > > > >> > > > > > Best wishes, >> > > > > > LakeShen >> > > > > > >> > > > > > >> > > > > > Jark Wu <[hidden email] <mailto:[hidden email]>> 于 >> 2020年3月30日周一 下午2:02写道: >> > > > > > >> > > > > > > Hi everyone, >> > > > > > > >> > > > > > > I want to start a discussion about further improve and >> simplify >> > our >> > > > > > current >> > > > > > > connector porperty keys, aka WITH options. Currently, >> we have a >> > > > > > > 'connector.' prefix for many properties, but they are >> verbose, >> > and >> > > we >> > > > > > see a >> > > > > > > big inconsistency between the properties when designing >> FLIP-107. >> > > > > > > >> > > > > > > So we propose to remove all the 'connector.' prefix and >> rename >> > > > > > > 'connector.type' to 'connector', 'format.type' to >> 'format'. So a >> > > new >> > > > > > Kafka >> > > > > > > DDL may look like this: >> > > > > > > >> > > > > > > CREATE TABLE kafka_table ( >> > > > > > > ... >> > > > > > > ) WITH ( >> > > > > > > 'connector' = 'kafka', >> > > > > > > 'version' = '0.10', >> > > > > > > 'topic' = 'test-topic', >> > > > > > > 'startup-mode' = 'earliest-offset', >> > > > > > > 'properties.bootstrap.servers' = 'localhost:9092', >> > > > > > > 'properties.group.id <http://properties.group.id>' = >> 'testGroup', >> > > > > > > 'format' = 'json', >> > > > > > > 'format.fail-on-missing-field' = 'false' >> > > > > > > ); >> > > > > > > >> > > > > > > The new connector property key set will come together >> with new >> > > > Factory >> > > > > > > inferface which is proposed in FLIP-95. Old properties >> are still >> > > > > > compatible >> > > > > > > with their existing implementation. New properties >> are only >> > > available >> > > > > in >> > > > > > > new DynamicTableFactory implementations. >> > > > > > > >> > > > > > > You can access the detailed FLIP here: >> > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > >> > >> >> https://cwiki.apache.org/confluence/display/FLINK/FLIP-122%3A+New+Connector+Property+Keys+for+New+Factory >> > > > > > > >> > > > > > > Best, >> > > > > > > Jark >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > >> > > >> > > -- >> > > Best, Jingsong Lee >> > > >> > >> >> >> >> -- >> >> Benchao Li >> School of Electronics Engineering and Computer Science, Peking >> University >> Tel:+86-15650713730 >> Email:[hidden email] >> <mailto:[hidden email]>;[hidden email] >> <mailto:[hidden email]> >> > signature.asc (849 bytes) Download Attachment |
Hi all,
Thanks for the feedbacks. It seems that we have a conclusion to put the version into the factory identifier. I'm also fine with this. If we have this outcome, the interface of Factory#factoryVersion is not needed anymore, this can simplify the learning cost of new factory. We may need to update FLIP-95 and re-vote for it? cc @Timo Walther <[hidden email]> Btw, I would like to use "_" instead of "-" as the version delimiter, because "-" looks like minus and may confuse users, e.g. "elasticsearch-6". This is not forced, but should be a guilde in the Javadoc of Factory. I propose to use the following identifiers for existing connectors, kafka => kafka for 0.11+ versions, we don't suffix "-universal", because the meaning of "universal" not easy to understand. kafka-0.11 => kafka for 0.11 version kafka-0.10 => kafka for 0.10 version elasticsearch-6 => elasticsearch for 6.x versions elasticsearch-7 => elasticsearch for 7.x versions hbase-1.4 => hbase for 1.4.x versions jdbc filesystem We use "-" as the version delimiter to make them to be more consistent. This is not forces, users can also use other delimiters or without delimiter. But this can be a guilde in the Javadoc of Factory, to make the connector ecosystem to be more consistent. What do you think? ------------------------ Regarding "connector.property-version": Hi @Dawid Wysakowicz <[hidden email]> , the new fatories are designed not support to read current properties. All the current properties are routed to the old factories if they are using "connector.type". Otherwise, properties are routed to new factories. If I understand correctly, the "connector.property-version" is attched implicitly by system, not manually set by users. For example, the framework should add "connector.property-version=1" to properties when processing DDL statement. I'm fine to add a "connector.property-version=1" when processing DDL statement, but I think it's also fine if we don't, because this can be done in the future if need and the default version can be 1. Best, Jark On Mon, 30 Mar 2020 at 23:21, Dawid Wysakowicz <[hidden email]> wrote: > Hi all, > > I like the overall design of the FLIP. > > As for the withstanding concerns. I kind of like the approach to put the > version into the factory identifier. I think it's the cleanest way to > say that this version actually applies to the connector itself and not > to the system it connects to. BTW, I think the outcome of this > discussion will affect interfaces described in FLIP-95. If we put the > version into the functionIdentifier, do we need Factory#factoryVersion? > > I also think it does make sense to have a versioning for the properties > as well. Are we able to read all the current properties with the new > factories? I think we could use the "connector.property-version" to > alternate between different Factory interfaces to support the old set of > properties. Otherwise the new factories need to understand both set of > properties, don't they? > > Best, > > Dawid > > On 30/03/2020 17:07, Timo Walther wrote: > > Hi Jark, > > > > thanks for the FLIP. I don't have a strong opinion on > > DynamicTableFactory#factoryVersion() for distiguishing connector > > versions. We can also include it in the factory identifier itself. For > > Kafka, I would then just use `kafka` because the naming "universal" > > was just a helper solution at that time. > > > > What we need is a good guide for how to design the options. We should > > include that in the JavaDoc of factory interface itself, once it is > > in-place. I like the difference between source and sink in properties. > > > > Regarding "connector.property-version": > > > > It was meant for backwards compatibility along the dimension of > > "property design" compared to "connector version". However, since the > > connector is now responsible for parsing all properties, it is not as > > necessary as it was before. > > > > However, a change of the property design as it is done in FLIP-107 > > becomes more difficult without a property version and versioning is > > always a good idea in API world. > > > > Regards, > > Timo > > > > > > On 30.03.20 16:38, Benchao Li wrote: > >> Hi Jark, > >> > >> Thanks for starting the discussion. The FLIP LGTM generally. > >> > >> Regarding connector version VS put version into connector's name, > >> I favor the latter personally, using one property to locate a > >> connector can make the error log more precise. > >> From the users' side, using one property to match a connector will > >> be easier. Especially we have many connectors, > >> and some of the need version property required, and some of them not. > >> > >> Regarding Jingsong's suggestion, > >> IMO, it's a very good complement to the FLIP. Distinguishing > >> properties for source and sink can be very useful, and > >> also this will make the connector property more precise. > >> We are also sick of this for now, we cannot know whether a DDL is a > >> source or sink unless we look through all queries where > >> the table is used. > >> Even more, some of the required properties are only required for > >> source, bug we cannot leave it blank for sink, and vice versa. > >> I think we can also add a type for dimension tables except source and > >> sink. > >> > >> Kurt Young <[hidden email] <mailto:[hidden email]>> 于2020年3月30日 > >> 周一 下午8:16写道: > >> > >> > It's not possible for the framework to throw such exception. > >> Because the > >> framework doesn't know what versions do the connector support. > >> > >> Not really, we can list all valid connectors framework could > >> found. E.g. > >> user mistyped 'kafka-0.x', the error message will looks like: > >> > >> we don't have any connector named "kafka-0.x", but we have: > >> FileSystem > >> Kafka-0.10 > >> Kafka-0.11 > >> ElasticSearch6 > >> ElasticSearch7 > >> > >> Best, > >> Kurt > >> > >> > >> On Mon, Mar 30, 2020 at 5:11 PM Jark Wu <[hidden email] > >> <mailto:[hidden email]>> wrote: > >> > >> > Hi Kurt, > >> > > >> > > 2) Lists all available connectors seems also quite > >> straightforward, e.g > >> > user provided a wrong "kafka-0.8", we tell user we have > >> candidates of > >> > "kakfa-0.11", "kafka-universal" > >> > It's not possible for the framework to throw such exception. > >> Because the > >> > framework doesn't know what versions do the connector support. > >> All the > >> > version information is a blackbox in the identifier. But with > >> > `Factory#factoryVersion()` interface, we can know all the > >> supported > >> > versions. > >> > > >> > > 3) I don't think so. We can still treat it as the same > >> connector but with > >> > different versions. > >> > That's true but that's weird. Because from the plain DDL > >> definition, they > >> > look like different connectors with different "connector" > >> value, e.g. > >> > 'connector=kafka-0.8', 'connector=kafka-0.10'. > >> > > >> > > If users don't set any version, we will use "kafka-universal" > >> instead. > >> > The behavior is inconsistent IMO. > >> > That is a long term vision when there is no kafka clusters > >> with <0.11 > >> > version. > >> > At that point, "universal" is the only supported version in Flink > >> and the > >> > "version" key can be optional. > >> > > >> > --------------------------------------------- > >> > > >> > Hi Jingsong, > >> > > >> > > "version" vs "kafka.version" > >> > I though about it. But if we prefix "kafka" to version, we should > >> prefix > >> > "kafka" for all other property keys, because they are all kafka > >> specific > >> > options. > >> > However, that will make the property set verbose, see rejected > >> option#2 in > >> > the FLIP. > >> > > >> > > explicitly separate options for source and sink > >> > That's a good topic. It's good to have a guideline for the new > >> property > >> > keys. > >> > I'm fine to prefix with a "source"/"sink" for some connector > >> keys. > >> > Actually, we already do this in some connectors, e.g. jdbc and > >> hbase. > >> > > >> > Best, > >> > Jark > >> > > >> > On Mon, 30 Mar 2020 at 16:36, Jingsong Li < > [hidden email] > >> <mailto:[hidden email]>> wrote: > >> > > >> > > Thanks Jark for the proposal. > >> > > > >> > > +1 to the general idea. > >> > > > >> > > For "version", what about "kafka.version"? It is obvious to > >> know its > >> > > meaning. > >> > > > >> > > And I'd like to start a new topic: > >> > > Should we need to explicitly separate source from sink? > >> > > With the development of batch and streaming, more and more > >> connectors > >> > have > >> > > both source and sink. > >> > > > >> > > So should we set a rule for table properties: > >> > > - properties for both source and sink: without prefix, like > >> "topic" > >> > > - properties for source only: with "source." prefix, like > >> > > "source.startup-mode" > >> > > - properties for sink only: with "sink." prefix, like > >> "sink.partitioner" > >> > > > >> > > What do you think? > >> > > > >> > > Best, > >> > > Jingsong Lee > >> > > > >> > > On Mon, Mar 30, 2020 at 3:56 PM Jark Wu <[hidden email] > >> <mailto:[hidden email]>> wrote: > >> > > > >> > > > Hi Kurt, > >> > > > > >> > > > That's good questions. > >> > > > > >> > > > > the meaning of "version" > >> > > > There are two versions in the old design. One is property > >> version > >> > > > "connector.property-version" which can be used for backward > >> > > compatibility. > >> > > > The other one is "connector.version" which defines the > >> version of > >> > > external > >> > > > system, e.g. 0.11" for kafka, "6" or "7" for ES. > >> > > > In this proposal, the "version" is the previous > >> "connector.version". > >> > The > >> > > > ""connector.property-version" is not introduced in new > >> design. > >> > > > > >> > > > > how to keep the old capability which can evolve connector > >> properties > >> > > > The "connector.property-version" is an optional key in the > >> old design > >> > and > >> > > > is never bumped up. > >> > > > I'm not sure how "connector.property-version" should work > >> in the > >> > initial > >> > > > design. Maybe @Timo Walther <[hidden email] > >> <mailto:[hidden email]>> has more knowledge on > >> > > > this. > >> > > > But for the new properties, every options should be > >> expressed as > >> > > > `ConfigOption` which provides `withDeprecatedKeys(...)` > >> method to > >> > easily > >> > > > support evolving keys. > >> > > > > >> > > > > a single keys instead of two, e.g. "kafka-0.11", > >> "kafka-universal"? > >> > > > There are several benefit to use separate "version" key I can > >> see: > >> > > > 1) it's more readable to separete them into different keys, > >> because > >> > they > >> > > > are orthogonal concepts. > >> > > > 2) the planner can give all the availble versions in the > >> exception > >> > > message, > >> > > > if user uses a wrong version (this is often reported in > >> user ML). > >> > > > 3) If we use "kafka-0.11" as connector identifier, we may > >> have to > >> > write a > >> > > > full documentation for each version, because they are > >> different > >> > > > "connector"? > >> > > > IMO, for 0.11, 0.11, etc... kafka, they are actually > >> the same > >> > > connector > >> > > > but with different "client jar" version, > >> > > > they share all the same supported property keys and > >> should reside > >> > > > together. > >> > > > 4) IMO, the future vision is version-free. At some point in > >> the future, > >> > > we > >> > > > may don't need users to specify kafka version anymore, and > >> make > >> > > > "version=universal" as optional or removed in the future. > >> This is can > >> > be > >> > > > done easily if they are separate keys. > >> > > > > >> > > > Best, > >> > > > Jark > >> > > > > >> > > > > >> > > > On Mon, 30 Mar 2020 at 14:45, Kurt Young <[hidden email] > >> <mailto:[hidden email]>> wrote: > >> > > > > >> > > > > Hi Jark, > >> > > > > > >> > > > > Thanks for the proposal, I'm +1 to the general idea. > >> However I have a > >> > > > > question about "version", > >> > > > > in the old design, the version seems to be aimed for > >> tracking > >> > property > >> > > > > version, with different > >> > > > > version, we could evolve these step by step without > >> breaking backward > >> > > > > compatibility. But in this > >> > > > > design, version is representing external system's > >> version, like > >> > "0.11" > >> > > > for > >> > > > > kafka, "6" or "7" for ES. > >> > > > > I'm not sure if this is necessary, what's the benefit of > >> using two > >> > keys > >> > > > > instead of one, like "kafka-0.11" > >> > > > > or "ES6" directly? And how about the old capability which > >> could let > >> > us > >> > > > > evolving connector properties? > >> > > > > > >> > > > > Best, > >> > > > > Kurt > >> > > > > > >> > > > > > >> > > > > On Mon, Mar 30, 2020 at 2:36 PM LakeShen > >> <[hidden email] <mailto:[hidden email]>> > >> > > > > wrote: > >> > > > > > >> > > > > > Hi Jark, > >> > > > > > > >> > > > > > I am really looking forward to this feature. I think this > >> feature > >> > > > > > could simplify flink sql code,and at the same time , > >> > > > > > it could make the developer more easlier to config the > >> flink sql > >> > WITH > >> > > > > > options. > >> > > > > > > >> > > > > > Now when I am using flink sql to write flink task , > >> sometimes I > >> > think > >> > > > the > >> > > > > > WITH options is too long for user. > >> > > > > > For example,I config the kafka source connector > >> parameter,for > >> > > consumer > >> > > > > > group and brokers parameter: > >> > > > > > > >> > > > > > 'connector.properties.0.key' = 'group.id > >> <http://group.id>' > >> > > > > > > , 'connector.properties.0.value' = 'xxx' > >> > > > > > > , 'connector.properties.1.key' = > >> 'bootstrap.servers' > >> > > > > > > , 'connector.properties.1.value' = 'xxxxx' > >> > > > > > > > >> > > > > > > >> > > > > > I can understand this config , but for the flink fresh > >> man,maybe it > >> > > > > > is confused for him. > >> > > > > > In my thought, I am really looking forward to this > >> feature,thank > >> > you > >> > > to > >> > > > > > propose this feature. > >> > > > > > > >> > > > > > Best wishes, > >> > > > > > LakeShen > >> > > > > > > >> > > > > > > >> > > > > > Jark Wu <[hidden email] <mailto:[hidden email]>> 于 > >> 2020年3月30日周一 下午2:02写道: > >> > > > > > > >> > > > > > > Hi everyone, > >> > > > > > > > >> > > > > > > I want to start a discussion about further improve and > >> simplify > >> > our > >> > > > > > current > >> > > > > > > connector porperty keys, aka WITH options. Currently, > >> we have a > >> > > > > > > 'connector.' prefix for many properties, but they are > >> verbose, > >> > and > >> > > we > >> > > > > > see a > >> > > > > > > big inconsistency between the properties when designing > >> FLIP-107. > >> > > > > > > > >> > > > > > > So we propose to remove all the 'connector.' prefix and > >> rename > >> > > > > > > 'connector.type' to 'connector', 'format.type' to > >> 'format'. So a > >> > > new > >> > > > > > Kafka > >> > > > > > > DDL may look like this: > >> > > > > > > > >> > > > > > > CREATE TABLE kafka_table ( > >> > > > > > > ... > >> > > > > > > ) WITH ( > >> > > > > > > 'connector' = 'kafka', > >> > > > > > > 'version' = '0.10', > >> > > > > > > 'topic' = 'test-topic', > >> > > > > > > 'startup-mode' = 'earliest-offset', > >> > > > > > > 'properties.bootstrap.servers' = 'localhost:9092', > >> > > > > > > 'properties.group.id <http://properties.group.id>' = > >> 'testGroup', > >> > > > > > > 'format' = 'json', > >> > > > > > > 'format.fail-on-missing-field' = 'false' > >> > > > > > > ); > >> > > > > > > > >> > > > > > > The new connector property key set will come together > >> with new > >> > > > Factory > >> > > > > > > inferface which is proposed in FLIP-95. Old properties > >> are still > >> > > > > > compatible > >> > > > > > > with their existing implementation. New properties > >> are only > >> > > available > >> > > > > in > >> > > > > > > new DynamicTableFactory implementations. > >> > > > > > > > >> > > > > > > You can access the detailed FLIP here: > >> > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > >> > >> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-122%3A+New+Connector+Property+Keys+for+New+Factory > >> > > > > > > > >> > > > > > > Best, > >> > > > > > > Jark > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > > >> > > -- > >> > > Best, Jingsong Lee > >> > > > >> > > >> > >> > >> > >> -- > >> > >> Benchao Li > >> School of Electronics Engineering and Computer Science, Peking > >> University > >> Tel:+86-15650713730 > >> Email:[hidden email] > >> <mailto:[hidden email]>;[hidden email] > >> <mailto:[hidden email]> > >> > > > |
Hi all,
Thanks for the feedbacks. It seems that we have a conclusion to put the version into the factory identifier. I'm also fine with this. If we have this outcome, the interface of Factory#factoryVersion is not needed anymore, this can simplify the learning cost of new factory. We may need to update FLIP-95 and re-vote for it? cc @Timo Walther <[hidden email]> kafka => kafka for 0.11+ versions, we don't suffix "-universal", because the meaning of "universal" not easy to understand. kafka-0.11 => kafka for 0.11 version kafka-0.10 => kafka for 0.10 version elasticsearch-6 => elasticsearch for 6.x versions elasticsearch-7 => elasticsearch for 7.x versions hbase-1.4 => hbase for 1.4.x versions jdbc filesystem We use "-" as the version delimiter to make them to be more consistent. This is not forces, users can also use other delimiters or without delimiter. But this can be a guilde in the Javadoc of Factory, to make the connector ecosystem to be more consistent. What do you think? ------------------------ Regarding "connector.property-version": Hi @Dawid Wysakowicz <[hidden email]> , the new fatories are designed not support to read current properties. All the current properties are routed to the old factories if they are using "connector.type". Otherwise, properties are routed to new factories. If I understand correctly, the "connector.property-version" is attched implicitly by system, not manually set by users. For example, the framework should add "connector.property-version=1" to properties when processing DDL statement. I'm fine to add a "connector.property-version=1" when processing DDL statement, but I think it's also fine if we don't, because this can be done in the future if need and the default version can be 1. Best, Jark On Tue, 31 Mar 2020 at 00:36, Jark Wu <[hidden email]> wrote: > Hi all, > > Thanks for the feedbacks. > > It seems that we have a conclusion to put the version into the factory > identifier. I'm also fine with this. > If we have this outcome, the interface of Factory#factoryVersion is not > needed anymore, this can simplify the learning cost of new factory. > We may need to update FLIP-95 and re-vote for it? cc @Timo Walther > <[hidden email]> > > Btw, I would like to use "_" instead of "-" as the version delimiter, > because "-" looks like minus and may confuse users, e.g. "elasticsearch-6". > This is not forced, but should be a guilde in the Javadoc of Factory. > I propose to use the following identifiers for existing connectors, > > kafka => kafka for 0.11+ versions, we don't suffix "-universal", because > the meaning of "universal" not easy to understand. > kafka-0.11 => kafka for 0.11 version > kafka-0.10 => kafka for 0.10 version > elasticsearch-6 => elasticsearch for 6.x versions > elasticsearch-7 => elasticsearch for 7.x versions > hbase-1.4 => hbase for 1.4.x versions > jdbc > filesystem > > We use "-" as the version delimiter to make them to be more consistent. > This is not forces, users can also use other delimiters or without > delimiter. > But this can be a guilde in the Javadoc of Factory, to make the connector > ecosystem to be more consistent. > > What do you think? > > ------------------------ > > Regarding "connector.property-version": > > Hi @Dawid Wysakowicz <[hidden email]> , the new fatories are > designed not support to read current properties. > All the current properties are routed to the old factories if they are > using "connector.type". Otherwise, properties are routed to new factories. > > If I understand correctly, the "connector.property-version" is attched > implicitly by system, not manually set by users. > For example, the framework should add "connector.property-version=1" to > properties when processing DDL statement. > I'm fine to add a "connector.property-version=1" when processing DDL > statement, but I think it's also fine if we don't, > because this can be done in the future if need and the default version can > be 1. > > Best, > Jark > > > > > > On Mon, 30 Mar 2020 at 23:21, Dawid Wysakowicz <[hidden email]> > wrote: > >> Hi all, >> >> I like the overall design of the FLIP. >> >> As for the withstanding concerns. I kind of like the approach to put the >> version into the factory identifier. I think it's the cleanest way to >> say that this version actually applies to the connector itself and not >> to the system it connects to. BTW, I think the outcome of this >> discussion will affect interfaces described in FLIP-95. If we put the >> version into the functionIdentifier, do we need Factory#factoryVersion? >> >> I also think it does make sense to have a versioning for the properties >> as well. Are we able to read all the current properties with the new >> factories? I think we could use the "connector.property-version" to >> alternate between different Factory interfaces to support the old set of >> properties. Otherwise the new factories need to understand both set of >> properties, don't they? >> >> Best, >> >> Dawid >> >> On 30/03/2020 17:07, Timo Walther wrote: >> > Hi Jark, >> > >> > thanks for the FLIP. I don't have a strong opinion on >> > DynamicTableFactory#factoryVersion() for distiguishing connector >> > versions. We can also include it in the factory identifier itself. For >> > Kafka, I would then just use `kafka` because the naming "universal" >> > was just a helper solution at that time. >> > >> > What we need is a good guide for how to design the options. We should >> > include that in the JavaDoc of factory interface itself, once it is >> > in-place. I like the difference between source and sink in properties. >> > >> > Regarding "connector.property-version": >> > >> > It was meant for backwards compatibility along the dimension of >> > "property design" compared to "connector version". However, since the >> > connector is now responsible for parsing all properties, it is not as >> > necessary as it was before. >> > >> > However, a change of the property design as it is done in FLIP-107 >> > becomes more difficult without a property version and versioning is >> > always a good idea in API world. >> > >> > Regards, >> > Timo >> > >> > >> > On 30.03.20 16:38, Benchao Li wrote: >> >> Hi Jark, >> >> >> >> Thanks for starting the discussion. The FLIP LGTM generally. >> >> >> >> Regarding connector version VS put version into connector's name, >> >> I favor the latter personally, using one property to locate a >> >> connector can make the error log more precise. >> >> From the users' side, using one property to match a connector will >> >> be easier. Especially we have many connectors, >> >> and some of the need version property required, and some of them not. >> >> >> >> Regarding Jingsong's suggestion, >> >> IMO, it's a very good complement to the FLIP. Distinguishing >> >> properties for source and sink can be very useful, and >> >> also this will make the connector property more precise. >> >> We are also sick of this for now, we cannot know whether a DDL is a >> >> source or sink unless we look through all queries where >> >> the table is used. >> >> Even more, some of the required properties are only required for >> >> source, bug we cannot leave it blank for sink, and vice versa. >> >> I think we can also add a type for dimension tables except source and >> >> sink. >> >> >> >> Kurt Young <[hidden email] <mailto:[hidden email]>> 于2020年3月30日 >> >> 周一 下午8:16写道: >> >> >> >> > It's not possible for the framework to throw such exception. >> >> Because the >> >> framework doesn't know what versions do the connector support. >> >> >> >> Not really, we can list all valid connectors framework could >> >> found. E.g. >> >> user mistyped 'kafka-0.x', the error message will looks like: >> >> >> >> we don't have any connector named "kafka-0.x", but we have: >> >> FileSystem >> >> Kafka-0.10 >> >> Kafka-0.11 >> >> ElasticSearch6 >> >> ElasticSearch7 >> >> >> >> Best, >> >> Kurt >> >> >> >> >> >> On Mon, Mar 30, 2020 at 5:11 PM Jark Wu <[hidden email] >> >> <mailto:[hidden email]>> wrote: >> >> >> >> > Hi Kurt, >> >> > >> >> > > 2) Lists all available connectors seems also quite >> >> straightforward, e.g >> >> > user provided a wrong "kafka-0.8", we tell user we have >> >> candidates of >> >> > "kakfa-0.11", "kafka-universal" >> >> > It's not possible for the framework to throw such exception. >> >> Because the >> >> > framework doesn't know what versions do the connector support. >> >> All the >> >> > version information is a blackbox in the identifier. But with >> >> > `Factory#factoryVersion()` interface, we can know all the >> >> supported >> >> > versions. >> >> > >> >> > > 3) I don't think so. We can still treat it as the same >> >> connector but with >> >> > different versions. >> >> > That's true but that's weird. Because from the plain DDL >> >> definition, they >> >> > look like different connectors with different "connector" >> >> value, e.g. >> >> > 'connector=kafka-0.8', 'connector=kafka-0.10'. >> >> > >> >> > > If users don't set any version, we will use "kafka-universal" >> >> instead. >> >> > The behavior is inconsistent IMO. >> >> > That is a long term vision when there is no kafka clusters >> >> with <0.11 >> >> > version. >> >> > At that point, "universal" is the only supported version in >> Flink >> >> and the >> >> > "version" key can be optional. >> >> > >> >> > --------------------------------------------- >> >> > >> >> > Hi Jingsong, >> >> > >> >> > > "version" vs "kafka.version" >> >> > I though about it. But if we prefix "kafka" to version, we >> should >> >> prefix >> >> > "kafka" for all other property keys, because they are all kafka >> >> specific >> >> > options. >> >> > However, that will make the property set verbose, see rejected >> >> option#2 in >> >> > the FLIP. >> >> > >> >> > > explicitly separate options for source and sink >> >> > That's a good topic. It's good to have a guideline for the new >> >> property >> >> > keys. >> >> > I'm fine to prefix with a "source"/"sink" for some connector >> >> keys. >> >> > Actually, we already do this in some connectors, e.g. jdbc and >> >> hbase. >> >> > >> >> > Best, >> >> > Jark >> >> > >> >> > On Mon, 30 Mar 2020 at 16:36, Jingsong Li < >> [hidden email] >> >> <mailto:[hidden email]>> wrote: >> >> > >> >> > > Thanks Jark for the proposal. >> >> > > >> >> > > +1 to the general idea. >> >> > > >> >> > > For "version", what about "kafka.version"? It is obvious to >> >> know its >> >> > > meaning. >> >> > > >> >> > > And I'd like to start a new topic: >> >> > > Should we need to explicitly separate source from sink? >> >> > > With the development of batch and streaming, more and more >> >> connectors >> >> > have >> >> > > both source and sink. >> >> > > >> >> > > So should we set a rule for table properties: >> >> > > - properties for both source and sink: without prefix, like >> >> "topic" >> >> > > - properties for source only: with "source." prefix, like >> >> > > "source.startup-mode" >> >> > > - properties for sink only: with "sink." prefix, like >> >> "sink.partitioner" >> >> > > >> >> > > What do you think? >> >> > > >> >> > > Best, >> >> > > Jingsong Lee >> >> > > >> >> > > On Mon, Mar 30, 2020 at 3:56 PM Jark Wu <[hidden email] >> >> <mailto:[hidden email]>> wrote: >> >> > > >> >> > > > Hi Kurt, >> >> > > > >> >> > > > That's good questions. >> >> > > > >> >> > > > > the meaning of "version" >> >> > > > There are two versions in the old design. One is property >> >> version >> >> > > > "connector.property-version" which can be used for backward >> >> > > compatibility. >> >> > > > The other one is "connector.version" which defines the >> >> version of >> >> > > external >> >> > > > system, e.g. 0.11" for kafka, "6" or "7" for ES. >> >> > > > In this proposal, the "version" is the previous >> >> "connector.version". >> >> > The >> >> > > > ""connector.property-version" is not introduced in new >> >> design. >> >> > > > >> >> > > > > how to keep the old capability which can evolve connector >> >> properties >> >> > > > The "connector.property-version" is an optional key in the >> >> old design >> >> > and >> >> > > > is never bumped up. >> >> > > > I'm not sure how "connector.property-version" should work >> >> in the >> >> > initial >> >> > > > design. Maybe @Timo Walther <[hidden email] >> >> <mailto:[hidden email]>> has more knowledge on >> >> > > > this. >> >> > > > But for the new properties, every options should be >> >> expressed as >> >> > > > `ConfigOption` which provides `withDeprecatedKeys(...)` >> >> method to >> >> > easily >> >> > > > support evolving keys. >> >> > > > >> >> > > > > a single keys instead of two, e.g. "kafka-0.11", >> >> "kafka-universal"? >> >> > > > There are several benefit to use separate "version" key I >> can >> >> see: >> >> > > > 1) it's more readable to separete them into different keys, >> >> because >> >> > they >> >> > > > are orthogonal concepts. >> >> > > > 2) the planner can give all the availble versions in the >> >> exception >> >> > > message, >> >> > > > if user uses a wrong version (this is often reported in >> >> user ML). >> >> > > > 3) If we use "kafka-0.11" as connector identifier, we may >> >> have to >> >> > write a >> >> > > > full documentation for each version, because they are >> >> different >> >> > > > "connector"? >> >> > > > IMO, for 0.11, 0.11, etc... kafka, they are actually >> >> the same >> >> > > connector >> >> > > > but with different "client jar" version, >> >> > > > they share all the same supported property keys and >> >> should reside >> >> > > > together. >> >> > > > 4) IMO, the future vision is version-free. At some point in >> >> the future, >> >> > > we >> >> > > > may don't need users to specify kafka version anymore, and >> >> make >> >> > > > "version=universal" as optional or removed in the future. >> >> This is can >> >> > be >> >> > > > done easily if they are separate keys. >> >> > > > >> >> > > > Best, >> >> > > > Jark >> >> > > > >> >> > > > >> >> > > > On Mon, 30 Mar 2020 at 14:45, Kurt Young <[hidden email] >> >> <mailto:[hidden email]>> wrote: >> >> > > > >> >> > > > > Hi Jark, >> >> > > > > >> >> > > > > Thanks for the proposal, I'm +1 to the general idea. >> >> However I have a >> >> > > > > question about "version", >> >> > > > > in the old design, the version seems to be aimed for >> >> tracking >> >> > property >> >> > > > > version, with different >> >> > > > > version, we could evolve these step by step without >> >> breaking backward >> >> > > > > compatibility. But in this >> >> > > > > design, version is representing external system's >> >> version, like >> >> > "0.11" >> >> > > > for >> >> > > > > kafka, "6" or "7" for ES. >> >> > > > > I'm not sure if this is necessary, what's the benefit of >> >> using two >> >> > keys >> >> > > > > instead of one, like "kafka-0.11" >> >> > > > > or "ES6" directly? And how about the old capability which >> >> could let >> >> > us >> >> > > > > evolving connector properties? >> >> > > > > >> >> > > > > Best, >> >> > > > > Kurt >> >> > > > > >> >> > > > > >> >> > > > > On Mon, Mar 30, 2020 at 2:36 PM LakeShen >> >> <[hidden email] <mailto:[hidden email]>> >> >> > > > > wrote: >> >> > > > > >> >> > > > > > Hi Jark, >> >> > > > > > >> >> > > > > > I am really looking forward to this feature. I think >> this >> >> feature >> >> > > > > > could simplify flink sql code,and at the same time , >> >> > > > > > it could make the developer more easlier to config the >> >> flink sql >> >> > WITH >> >> > > > > > options. >> >> > > > > > >> >> > > > > > Now when I am using flink sql to write flink task , >> >> sometimes I >> >> > think >> >> > > > the >> >> > > > > > WITH options is too long for user. >> >> > > > > > For example,I config the kafka source connector >> >> parameter,for >> >> > > consumer >> >> > > > > > group and brokers parameter: >> >> > > > > > >> >> > > > > > 'connector.properties.0.key' = 'group.id >> >> <http://group.id>' >> >> > > > > > > , 'connector.properties.0.value' = 'xxx' >> >> > > > > > > , 'connector.properties.1.key' = >> >> 'bootstrap.servers' >> >> > > > > > > , 'connector.properties.1.value' = 'xxxxx' >> >> > > > > > > >> >> > > > > > >> >> > > > > > I can understand this config , but for the flink fresh >> >> man,maybe it >> >> > > > > > is confused for him. >> >> > > > > > In my thought, I am really looking forward to this >> >> feature,thank >> >> > you >> >> > > to >> >> > > > > > propose this feature. >> >> > > > > > >> >> > > > > > Best wishes, >> >> > > > > > LakeShen >> >> > > > > > >> >> > > > > > >> >> > > > > > Jark Wu <[hidden email] <mailto:[hidden email]>> 于 >> >> 2020年3月30日周一 下午2:02写道: >> >> > > > > > >> >> > > > > > > Hi everyone, >> >> > > > > > > >> >> > > > > > > I want to start a discussion about further improve and >> >> simplify >> >> > our >> >> > > > > > current >> >> > > > > > > connector porperty keys, aka WITH options. Currently, >> >> we have a >> >> > > > > > > 'connector.' prefix for many properties, but they are >> >> verbose, >> >> > and >> >> > > we >> >> > > > > > see a >> >> > > > > > > big inconsistency between the properties when >> designing >> >> FLIP-107. >> >> > > > > > > >> >> > > > > > > So we propose to remove all the 'connector.' prefix >> and >> >> rename >> >> > > > > > > 'connector.type' to 'connector', 'format.type' to >> >> 'format'. So a >> >> > > new >> >> > > > > > Kafka >> >> > > > > > > DDL may look like this: >> >> > > > > > > >> >> > > > > > > CREATE TABLE kafka_table ( >> >> > > > > > > ... >> >> > > > > > > ) WITH ( >> >> > > > > > > 'connector' = 'kafka', >> >> > > > > > > 'version' = '0.10', >> >> > > > > > > 'topic' = 'test-topic', >> >> > > > > > > 'startup-mode' = 'earliest-offset', >> >> > > > > > > 'properties.bootstrap.servers' = 'localhost:9092', >> >> > > > > > > 'properties.group.id <http://properties.group.id>' = >> >> 'testGroup', >> >> > > > > > > 'format' = 'json', >> >> > > > > > > 'format.fail-on-missing-field' = 'false' >> >> > > > > > > ); >> >> > > > > > > >> >> > > > > > > The new connector property key set will come together >> >> with new >> >> > > > Factory >> >> > > > > > > inferface which is proposed in FLIP-95. Old properties >> >> are still >> >> > > > > > compatible >> >> > > > > > > with their existing implementation. New properties >> >> are only >> >> > > available >> >> > > > > in >> >> > > > > > > new DynamicTableFactory implementations. >> >> > > > > > > >> >> > > > > > > You can access the detailed FLIP here: >> >> > > > > > > >> >> > > > > > > >> >> > > > > > >> >> > > > > >> >> > > > >> >> > > >> >> > >> >> >> >> >> https://cwiki.apache.org/confluence/display/FLINK/FLIP-122%3A+New+Connector+Property+Keys+for+New+Factory >> >> > > > > > > >> >> > > > > > > Best, >> >> > > > > > > Jark >> >> > > > > > > >> >> > > > > > >> >> > > > > >> >> > > > >> >> > > >> >> > > >> >> > > -- >> >> > > Best, Jingsong Lee >> >> > > >> >> > >> >> >> >> >> >> >> >> -- >> >> >> >> Benchao Li >> >> School of Electronics Engineering and Computer Science, Peking >> >> University >> >> Tel:+86-15650713730 >> >> Email:[hidden email] >> >> <mailto:[hidden email]>;[hidden email] >> >> <mailto:[hidden email]> >> >> >> > >> > |
Regarding to version delimiter, I'm in favor of using "-", not "_". Not
only because we are using "-" for config keys in other module, like CoreOptions, DeploymentOptions, but also "-" is easier to type than "_" which doesn't need to press "shift" ;-) Best, Kurt On Tue, Mar 31, 2020 at 12:37 AM Jark Wu <[hidden email]> wrote: > Hi all, > > Thanks for the feedbacks. > > It seems that we have a conclusion to put the version into the factory > identifier. I'm also fine with this. > If we have this outcome, the interface of Factory#factoryVersion is not > needed anymore, this can simplify the learning cost of new factory. > We may need to update FLIP-95 and re-vote for it? cc @Timo Walther > <[hidden email]> > > kafka => kafka for 0.11+ versions, we don't suffix "-universal", because > the meaning of "universal" not easy to understand. > kafka-0.11 => kafka for 0.11 version > kafka-0.10 => kafka for 0.10 version > elasticsearch-6 => elasticsearch for 6.x versions > elasticsearch-7 => elasticsearch for 7.x versions > hbase-1.4 => hbase for 1.4.x versions > jdbc > filesystem > > We use "-" as the version delimiter to make them to be more consistent. > This is not forces, users can also use other delimiters or without > delimiter. > But this can be a guilde in the Javadoc of Factory, to make the connector > ecosystem to be more consistent. > > What do you think? > > ------------------------ > > Regarding "connector.property-version": > > Hi @Dawid Wysakowicz <[hidden email]> , the new fatories are > designed not support to read current properties. > All the current properties are routed to the old factories if they are > using "connector.type". Otherwise, properties are routed to new factories. > > If I understand correctly, the "connector.property-version" is attched > implicitly by system, not manually set by users. > For example, the framework should add "connector.property-version=1" to > properties when processing DDL statement. > I'm fine to add a "connector.property-version=1" when processing DDL > statement, but I think it's also fine if we don't, > because this can be done in the future if need and the default version can > be 1. > > Best, > Jark > > On Tue, 31 Mar 2020 at 00:36, Jark Wu <[hidden email]> wrote: > > > Hi all, > > > > Thanks for the feedbacks. > > > > It seems that we have a conclusion to put the version into the factory > > identifier. I'm also fine with this. > > If we have this outcome, the interface of Factory#factoryVersion is not > > needed anymore, this can simplify the learning cost of new factory. > > We may need to update FLIP-95 and re-vote for it? cc @Timo Walther > > <[hidden email]> > > > > Btw, I would like to use "_" instead of "-" as the version delimiter, > > because "-" looks like minus and may confuse users, e.g. > "elasticsearch-6". > > This is not forced, but should be a guilde in the Javadoc of Factory. > > I propose to use the following identifiers for existing connectors, > > > > kafka => kafka for 0.11+ versions, we don't suffix "-universal", because > > the meaning of "universal" not easy to understand. > > kafka-0.11 => kafka for 0.11 version > > kafka-0.10 => kafka for 0.10 version > > elasticsearch-6 => elasticsearch for 6.x versions > > elasticsearch-7 => elasticsearch for 7.x versions > > hbase-1.4 => hbase for 1.4.x versions > > jdbc > > filesystem > > > > We use "-" as the version delimiter to make them to be more consistent. > > This is not forces, users can also use other delimiters or without > > delimiter. > > But this can be a guilde in the Javadoc of Factory, to make the connector > > ecosystem to be more consistent. > > > > What do you think? > > > > ------------------------ > > > > Regarding "connector.property-version": > > > > Hi @Dawid Wysakowicz <[hidden email]> , the new fatories are > > designed not support to read current properties. > > All the current properties are routed to the old factories if they are > > using "connector.type". Otherwise, properties are routed to new > factories. > > > > If I understand correctly, the "connector.property-version" is attched > > implicitly by system, not manually set by users. > > For example, the framework should add "connector.property-version=1" to > > properties when processing DDL statement. > > I'm fine to add a "connector.property-version=1" when processing DDL > > statement, but I think it's also fine if we don't, > > because this can be done in the future if need and the default version > can > > be 1. > > > > Best, > > Jark > > > > > > > > > > > > On Mon, 30 Mar 2020 at 23:21, Dawid Wysakowicz <[hidden email]> > > wrote: > > > >> Hi all, > >> > >> I like the overall design of the FLIP. > >> > >> As for the withstanding concerns. I kind of like the approach to put the > >> version into the factory identifier. I think it's the cleanest way to > >> say that this version actually applies to the connector itself and not > >> to the system it connects to. BTW, I think the outcome of this > >> discussion will affect interfaces described in FLIP-95. If we put the > >> version into the functionIdentifier, do we need Factory#factoryVersion? > >> > >> I also think it does make sense to have a versioning for the properties > >> as well. Are we able to read all the current properties with the new > >> factories? I think we could use the "connector.property-version" to > >> alternate between different Factory interfaces to support the old set of > >> properties. Otherwise the new factories need to understand both set of > >> properties, don't they? > >> > >> Best, > >> > >> Dawid > >> > >> On 30/03/2020 17:07, Timo Walther wrote: > >> > Hi Jark, > >> > > >> > thanks for the FLIP. I don't have a strong opinion on > >> > DynamicTableFactory#factoryVersion() for distiguishing connector > >> > versions. We can also include it in the factory identifier itself. For > >> > Kafka, I would then just use `kafka` because the naming "universal" > >> > was just a helper solution at that time. > >> > > >> > What we need is a good guide for how to design the options. We should > >> > include that in the JavaDoc of factory interface itself, once it is > >> > in-place. I like the difference between source and sink in properties. > >> > > >> > Regarding "connector.property-version": > >> > > >> > It was meant for backwards compatibility along the dimension of > >> > "property design" compared to "connector version". However, since the > >> > connector is now responsible for parsing all properties, it is not as > >> > necessary as it was before. > >> > > >> > However, a change of the property design as it is done in FLIP-107 > >> > becomes more difficult without a property version and versioning is > >> > always a good idea in API world. > >> > > >> > Regards, > >> > Timo > >> > > >> > > >> > On 30.03.20 16:38, Benchao Li wrote: > >> >> Hi Jark, > >> >> > >> >> Thanks for starting the discussion. The FLIP LGTM generally. > >> >> > >> >> Regarding connector version VS put version into connector's name, > >> >> I favor the latter personally, using one property to locate a > >> >> connector can make the error log more precise. > >> >> From the users' side, using one property to match a connector will > >> >> be easier. Especially we have many connectors, > >> >> and some of the need version property required, and some of them not. > >> >> > >> >> Regarding Jingsong's suggestion, > >> >> IMO, it's a very good complement to the FLIP. Distinguishing > >> >> properties for source and sink can be very useful, and > >> >> also this will make the connector property more precise. > >> >> We are also sick of this for now, we cannot know whether a DDL is a > >> >> source or sink unless we look through all queries where > >> >> the table is used. > >> >> Even more, some of the required properties are only required for > >> >> source, bug we cannot leave it blank for sink, and vice versa. > >> >> I think we can also add a type for dimension tables except source and > >> >> sink. > >> >> > >> >> Kurt Young <[hidden email] <mailto:[hidden email]>> 于2020年3月30日 > >> >> 周一 下午8:16写道: > >> >> > >> >> > It's not possible for the framework to throw such exception. > >> >> Because the > >> >> framework doesn't know what versions do the connector support. > >> >> > >> >> Not really, we can list all valid connectors framework could > >> >> found. E.g. > >> >> user mistyped 'kafka-0.x', the error message will looks like: > >> >> > >> >> we don't have any connector named "kafka-0.x", but we have: > >> >> FileSystem > >> >> Kafka-0.10 > >> >> Kafka-0.11 > >> >> ElasticSearch6 > >> >> ElasticSearch7 > >> >> > >> >> Best, > >> >> Kurt > >> >> > >> >> > >> >> On Mon, Mar 30, 2020 at 5:11 PM Jark Wu <[hidden email] > >> >> <mailto:[hidden email]>> wrote: > >> >> > >> >> > Hi Kurt, > >> >> > > >> >> > > 2) Lists all available connectors seems also quite > >> >> straightforward, e.g > >> >> > user provided a wrong "kafka-0.8", we tell user we have > >> >> candidates of > >> >> > "kakfa-0.11", "kafka-universal" > >> >> > It's not possible for the framework to throw such exception. > >> >> Because the > >> >> > framework doesn't know what versions do the connector support. > >> >> All the > >> >> > version information is a blackbox in the identifier. But with > >> >> > `Factory#factoryVersion()` interface, we can know all the > >> >> supported > >> >> > versions. > >> >> > > >> >> > > 3) I don't think so. We can still treat it as the same > >> >> connector but with > >> >> > different versions. > >> >> > That's true but that's weird. Because from the plain DDL > >> >> definition, they > >> >> > look like different connectors with different "connector" > >> >> value, e.g. > >> >> > 'connector=kafka-0.8', 'connector=kafka-0.10'. > >> >> > > >> >> > > If users don't set any version, we will use > "kafka-universal" > >> >> instead. > >> >> > The behavior is inconsistent IMO. > >> >> > That is a long term vision when there is no kafka clusters > >> >> with <0.11 > >> >> > version. > >> >> > At that point, "universal" is the only supported version in > >> Flink > >> >> and the > >> >> > "version" key can be optional. > >> >> > > >> >> > --------------------------------------------- > >> >> > > >> >> > Hi Jingsong, > >> >> > > >> >> > > "version" vs "kafka.version" > >> >> > I though about it. But if we prefix "kafka" to version, we > >> should > >> >> prefix > >> >> > "kafka" for all other property keys, because they are all > kafka > >> >> specific > >> >> > options. > >> >> > However, that will make the property set verbose, see rejected > >> >> option#2 in > >> >> > the FLIP. > >> >> > > >> >> > > explicitly separate options for source and sink > >> >> > That's a good topic. It's good to have a guideline for the new > >> >> property > >> >> > keys. > >> >> > I'm fine to prefix with a "source"/"sink" for some connector > >> >> keys. > >> >> > Actually, we already do this in some connectors, e.g. jdbc and > >> >> hbase. > >> >> > > >> >> > Best, > >> >> > Jark > >> >> > > >> >> > On Mon, 30 Mar 2020 at 16:36, Jingsong Li < > >> [hidden email] > >> >> <mailto:[hidden email]>> wrote: > >> >> > > >> >> > > Thanks Jark for the proposal. > >> >> > > > >> >> > > +1 to the general idea. > >> >> > > > >> >> > > For "version", what about "kafka.version"? It is obvious to > >> >> know its > >> >> > > meaning. > >> >> > > > >> >> > > And I'd like to start a new topic: > >> >> > > Should we need to explicitly separate source from sink? > >> >> > > With the development of batch and streaming, more and more > >> >> connectors > >> >> > have > >> >> > > both source and sink. > >> >> > > > >> >> > > So should we set a rule for table properties: > >> >> > > - properties for both source and sink: without prefix, like > >> >> "topic" > >> >> > > - properties for source only: with "source." prefix, like > >> >> > > "source.startup-mode" > >> >> > > - properties for sink only: with "sink." prefix, like > >> >> "sink.partitioner" > >> >> > > > >> >> > > What do you think? > >> >> > > > >> >> > > Best, > >> >> > > Jingsong Lee > >> >> > > > >> >> > > On Mon, Mar 30, 2020 at 3:56 PM Jark Wu <[hidden email] > >> >> <mailto:[hidden email]>> wrote: > >> >> > > > >> >> > > > Hi Kurt, > >> >> > > > > >> >> > > > That's good questions. > >> >> > > > > >> >> > > > > the meaning of "version" > >> >> > > > There are two versions in the old design. One is property > >> >> version > >> >> > > > "connector.property-version" which can be used for > backward > >> >> > > compatibility. > >> >> > > > The other one is "connector.version" which defines the > >> >> version of > >> >> > > external > >> >> > > > system, e.g. 0.11" for kafka, "6" or "7" for ES. > >> >> > > > In this proposal, the "version" is the previous > >> >> "connector.version". > >> >> > The > >> >> > > > ""connector.property-version" is not introduced in new > >> >> design. > >> >> > > > > >> >> > > > > how to keep the old capability which can evolve > connector > >> >> properties > >> >> > > > The "connector.property-version" is an optional key in the > >> >> old design > >> >> > and > >> >> > > > is never bumped up. > >> >> > > > I'm not sure how "connector.property-version" should work > >> >> in the > >> >> > initial > >> >> > > > design. Maybe @Timo Walther <[hidden email] > >> >> <mailto:[hidden email]>> has more knowledge on > >> >> > > > this. > >> >> > > > But for the new properties, every options should be > >> >> expressed as > >> >> > > > `ConfigOption` which provides `withDeprecatedKeys(...)` > >> >> method to > >> >> > easily > >> >> > > > support evolving keys. > >> >> > > > > >> >> > > > > a single keys instead of two, e.g. "kafka-0.11", > >> >> "kafka-universal"? > >> >> > > > There are several benefit to use separate "version" key I > >> can > >> >> see: > >> >> > > > 1) it's more readable to separete them into different > keys, > >> >> because > >> >> > they > >> >> > > > are orthogonal concepts. > >> >> > > > 2) the planner can give all the availble versions in the > >> >> exception > >> >> > > message, > >> >> > > > if user uses a wrong version (this is often reported in > >> >> user ML). > >> >> > > > 3) If we use "kafka-0.11" as connector identifier, we may > >> >> have to > >> >> > write a > >> >> > > > full documentation for each version, because they are > >> >> different > >> >> > > > "connector"? > >> >> > > > IMO, for 0.11, 0.11, etc... kafka, they are actually > >> >> the same > >> >> > > connector > >> >> > > > but with different "client jar" version, > >> >> > > > they share all the same supported property keys and > >> >> should reside > >> >> > > > together. > >> >> > > > 4) IMO, the future vision is version-free. At some point > in > >> >> the future, > >> >> > > we > >> >> > > > may don't need users to specify kafka version anymore, and > >> >> make > >> >> > > > "version=universal" as optional or removed in the future. > >> >> This is can > >> >> > be > >> >> > > > done easily if they are separate keys. > >> >> > > > > >> >> > > > Best, > >> >> > > > Jark > >> >> > > > > >> >> > > > > >> >> > > > On Mon, 30 Mar 2020 at 14:45, Kurt Young < > [hidden email] > >> >> <mailto:[hidden email]>> wrote: > >> >> > > > > >> >> > > > > Hi Jark, > >> >> > > > > > >> >> > > > > Thanks for the proposal, I'm +1 to the general idea. > >> >> However I have a > >> >> > > > > question about "version", > >> >> > > > > in the old design, the version seems to be aimed for > >> >> tracking > >> >> > property > >> >> > > > > version, with different > >> >> > > > > version, we could evolve these step by step without > >> >> breaking backward > >> >> > > > > compatibility. But in this > >> >> > > > > design, version is representing external system's > >> >> version, like > >> >> > "0.11" > >> >> > > > for > >> >> > > > > kafka, "6" or "7" for ES. > >> >> > > > > I'm not sure if this is necessary, what's the benefit of > >> >> using two > >> >> > keys > >> >> > > > > instead of one, like "kafka-0.11" > >> >> > > > > or "ES6" directly? And how about the old capability > which > >> >> could let > >> >> > us > >> >> > > > > evolving connector properties? > >> >> > > > > > >> >> > > > > Best, > >> >> > > > > Kurt > >> >> > > > > > >> >> > > > > > >> >> > > > > On Mon, Mar 30, 2020 at 2:36 PM LakeShen > >> >> <[hidden email] <mailto:[hidden email]>> > >> >> > > > > wrote: > >> >> > > > > > >> >> > > > > > Hi Jark, > >> >> > > > > > > >> >> > > > > > I am really looking forward to this feature. I think > >> this > >> >> feature > >> >> > > > > > could simplify flink sql code,and at the same time , > >> >> > > > > > it could make the developer more easlier to config the > >> >> flink sql > >> >> > WITH > >> >> > > > > > options. > >> >> > > > > > > >> >> > > > > > Now when I am using flink sql to write flink task , > >> >> sometimes I > >> >> > think > >> >> > > > the > >> >> > > > > > WITH options is too long for user. > >> >> > > > > > For example,I config the kafka source connector > >> >> parameter,for > >> >> > > consumer > >> >> > > > > > group and brokers parameter: > >> >> > > > > > > >> >> > > > > > 'connector.properties.0.key' = 'group.id > >> >> <http://group.id>' > >> >> > > > > > > , 'connector.properties.0.value' = 'xxx' > >> >> > > > > > > , 'connector.properties.1.key' = > >> >> 'bootstrap.servers' > >> >> > > > > > > , 'connector.properties.1.value' = 'xxxxx' > >> >> > > > > > > > >> >> > > > > > > >> >> > > > > > I can understand this config , but for the flink fresh > >> >> man,maybe it > >> >> > > > > > is confused for him. > >> >> > > > > > In my thought, I am really looking forward to this > >> >> feature,thank > >> >> > you > >> >> > > to > >> >> > > > > > propose this feature. > >> >> > > > > > > >> >> > > > > > Best wishes, > >> >> > > > > > LakeShen > >> >> > > > > > > >> >> > > > > > > >> >> > > > > > Jark Wu <[hidden email] <mailto:[hidden email]>> > 于 > >> >> 2020年3月30日周一 下午2:02写道: > >> >> > > > > > > >> >> > > > > > > Hi everyone, > >> >> > > > > > > > >> >> > > > > > > I want to start a discussion about further improve > and > >> >> simplify > >> >> > our > >> >> > > > > > current > >> >> > > > > > > connector porperty keys, aka WITH options. > Currently, > >> >> we have a > >> >> > > > > > > 'connector.' prefix for many properties, but they > are > >> >> verbose, > >> >> > and > >> >> > > we > >> >> > > > > > see a > >> >> > > > > > > big inconsistency between the properties when > >> designing > >> >> FLIP-107. > >> >> > > > > > > > >> >> > > > > > > So we propose to remove all the 'connector.' prefix > >> and > >> >> rename > >> >> > > > > > > 'connector.type' to 'connector', 'format.type' to > >> >> 'format'. So a > >> >> > > new > >> >> > > > > > Kafka > >> >> > > > > > > DDL may look like this: > >> >> > > > > > > > >> >> > > > > > > CREATE TABLE kafka_table ( > >> >> > > > > > > ... > >> >> > > > > > > ) WITH ( > >> >> > > > > > > 'connector' = 'kafka', > >> >> > > > > > > 'version' = '0.10', > >> >> > > > > > > 'topic' = 'test-topic', > >> >> > > > > > > 'startup-mode' = 'earliest-offset', > >> >> > > > > > > 'properties.bootstrap.servers' = 'localhost:9092', > >> >> > > > > > > 'properties.group.id <http://properties.group.id>' > = > >> >> 'testGroup', > >> >> > > > > > > 'format' = 'json', > >> >> > > > > > > 'format.fail-on-missing-field' = 'false' > >> >> > > > > > > ); > >> >> > > > > > > > >> >> > > > > > > The new connector property key set will come > together > >> >> with new > >> >> > > > Factory > >> >> > > > > > > inferface which is proposed in FLIP-95. Old > properties > >> >> are still > >> >> > > > > > compatible > >> >> > > > > > > with their existing implementation. New properties > >> >> are only > >> >> > > available > >> >> > > > > in > >> >> > > > > > > new DynamicTableFactory implementations. > >> >> > > > > > > > >> >> > > > > > > You can access the detailed FLIP here: > >> >> > > > > > > > >> >> > > > > > > > >> >> > > > > > > >> >> > > > > > >> >> > > > > >> >> > > > >> >> > > >> >> > >> >> > >> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-122%3A+New+Connector+Property+Keys+for+New+Factory > >> >> > > > > > > > >> >> > > > > > > Best, > >> >> > > > > > > Jark > >> >> > > > > > > > >> >> > > > > > > >> >> > > > > > >> >> > > > > >> >> > > > >> >> > > > >> >> > > -- > >> >> > > Best, Jingsong Lee > >> >> > > > >> >> > > >> >> > >> >> > >> >> > >> >> -- > >> >> > >> >> Benchao Li > >> >> School of Electronics Engineering and Computer Science, Peking > >> >> University > >> >> Tel:+86-15650713730 > >> >> Email:[hidden email] > >> >> <mailto:[hidden email]>;[hidden email] > >> >> <mailto:[hidden email]> > >> >> > >> > > >> > > > |
Regarding to "property-version", even this is brought up by myself, but I'm
actually not a fan of this. As a user, I think this property is too much for me to learn, and I probably won't use this anyway. If framework can follow a good procedure to upgrade properties, show some error messages (but still can work) when user used some deprecated properties, and then finally drop them 2-3 versions later. I think this is good enough. Best, Kurt On Tue, Mar 31, 2020 at 9:07 AM Kurt Young <[hidden email]> wrote: > Regarding to version delimiter, I'm in favor of using "-", not "_". Not > only because we are using > "-" for config keys in other module, like CoreOptions, DeploymentOptions, > but also "-" is easier > to type than "_" which doesn't need to press "shift" ;-) > > Best, > Kurt > > > On Tue, Mar 31, 2020 at 12:37 AM Jark Wu <[hidden email]> wrote: > >> Hi all, >> >> Thanks for the feedbacks. >> >> It seems that we have a conclusion to put the version into the factory >> identifier. I'm also fine with this. >> If we have this outcome, the interface of Factory#factoryVersion is not >> needed anymore, this can simplify the learning cost of new factory. >> We may need to update FLIP-95 and re-vote for it? cc @Timo Walther >> <[hidden email]> >> >> kafka => kafka for 0.11+ versions, we don't suffix "-universal", because >> the meaning of "universal" not easy to understand. >> kafka-0.11 => kafka for 0.11 version >> kafka-0.10 => kafka for 0.10 version >> elasticsearch-6 => elasticsearch for 6.x versions >> elasticsearch-7 => elasticsearch for 7.x versions >> hbase-1.4 => hbase for 1.4.x versions >> jdbc >> filesystem >> >> We use "-" as the version delimiter to make them to be more consistent. >> This is not forces, users can also use other delimiters or without >> delimiter. >> But this can be a guilde in the Javadoc of Factory, to make the connector >> ecosystem to be more consistent. >> >> What do you think? >> >> ------------------------ >> >> Regarding "connector.property-version": >> >> Hi @Dawid Wysakowicz <[hidden email]> , the new fatories are >> designed not support to read current properties. >> All the current properties are routed to the old factories if they are >> using "connector.type". Otherwise, properties are routed to new factories. >> >> If I understand correctly, the "connector.property-version" is attched >> implicitly by system, not manually set by users. >> For example, the framework should add "connector.property-version=1" to >> properties when processing DDL statement. >> I'm fine to add a "connector.property-version=1" when processing DDL >> statement, but I think it's also fine if we don't, >> because this can be done in the future if need and the default version can >> be 1. >> >> Best, >> Jark >> >> On Tue, 31 Mar 2020 at 00:36, Jark Wu <[hidden email]> wrote: >> >> > Hi all, >> > >> > Thanks for the feedbacks. >> > >> > It seems that we have a conclusion to put the version into the factory >> > identifier. I'm also fine with this. >> > If we have this outcome, the interface of Factory#factoryVersion is not >> > needed anymore, this can simplify the learning cost of new factory. >> > We may need to update FLIP-95 and re-vote for it? cc @Timo Walther >> > <[hidden email]> >> > >> > Btw, I would like to use "_" instead of "-" as the version delimiter, >> > because "-" looks like minus and may confuse users, e.g. >> "elasticsearch-6". >> > This is not forced, but should be a guilde in the Javadoc of Factory. >> > I propose to use the following identifiers for existing connectors, >> > >> > kafka => kafka for 0.11+ versions, we don't suffix "-universal", >> because >> > the meaning of "universal" not easy to understand. >> > kafka-0.11 => kafka for 0.11 version >> > kafka-0.10 => kafka for 0.10 version >> > elasticsearch-6 => elasticsearch for 6.x versions >> > elasticsearch-7 => elasticsearch for 7.x versions >> > hbase-1.4 => hbase for 1.4.x versions >> > jdbc >> > filesystem >> > >> > We use "-" as the version delimiter to make them to be more consistent. >> > This is not forces, users can also use other delimiters or without >> > delimiter. >> > But this can be a guilde in the Javadoc of Factory, to make the >> connector >> > ecosystem to be more consistent. >> > >> > What do you think? >> > >> > ------------------------ >> > >> > Regarding "connector.property-version": >> > >> > Hi @Dawid Wysakowicz <[hidden email]> , the new fatories are >> > designed not support to read current properties. >> > All the current properties are routed to the old factories if they are >> > using "connector.type". Otherwise, properties are routed to new >> factories. >> > >> > If I understand correctly, the "connector.property-version" is attched >> > implicitly by system, not manually set by users. >> > For example, the framework should add "connector.property-version=1" to >> > properties when processing DDL statement. >> > I'm fine to add a "connector.property-version=1" when processing DDL >> > statement, but I think it's also fine if we don't, >> > because this can be done in the future if need and the default version >> can >> > be 1. >> > >> > Best, >> > Jark >> > >> > >> > >> > >> > >> > On Mon, 30 Mar 2020 at 23:21, Dawid Wysakowicz <[hidden email]> >> > wrote: >> > >> >> Hi all, >> >> >> >> I like the overall design of the FLIP. >> >> >> >> As for the withstanding concerns. I kind of like the approach to put >> the >> >> version into the factory identifier. I think it's the cleanest way to >> >> say that this version actually applies to the connector itself and not >> >> to the system it connects to. BTW, I think the outcome of this >> >> discussion will affect interfaces described in FLIP-95. If we put the >> >> version into the functionIdentifier, do we need Factory#factoryVersion? >> >> >> >> I also think it does make sense to have a versioning for the properties >> >> as well. Are we able to read all the current properties with the new >> >> factories? I think we could use the "connector.property-version" to >> >> alternate between different Factory interfaces to support the old set >> of >> >> properties. Otherwise the new factories need to understand both set of >> >> properties, don't they? >> >> >> >> Best, >> >> >> >> Dawid >> >> >> >> On 30/03/2020 17:07, Timo Walther wrote: >> >> > Hi Jark, >> >> > >> >> > thanks for the FLIP. I don't have a strong opinion on >> >> > DynamicTableFactory#factoryVersion() for distiguishing connector >> >> > versions. We can also include it in the factory identifier itself. >> For >> >> > Kafka, I would then just use `kafka` because the naming "universal" >> >> > was just a helper solution at that time. >> >> > >> >> > What we need is a good guide for how to design the options. We should >> >> > include that in the JavaDoc of factory interface itself, once it is >> >> > in-place. I like the difference between source and sink in >> properties. >> >> > >> >> > Regarding "connector.property-version": >> >> > >> >> > It was meant for backwards compatibility along the dimension of >> >> > "property design" compared to "connector version". However, since the >> >> > connector is now responsible for parsing all properties, it is not as >> >> > necessary as it was before. >> >> > >> >> > However, a change of the property design as it is done in FLIP-107 >> >> > becomes more difficult without a property version and versioning is >> >> > always a good idea in API world. >> >> > >> >> > Regards, >> >> > Timo >> >> > >> >> > >> >> > On 30.03.20 16:38, Benchao Li wrote: >> >> >> Hi Jark, >> >> >> >> >> >> Thanks for starting the discussion. The FLIP LGTM generally. >> >> >> >> >> >> Regarding connector version VS put version into connector's name, >> >> >> I favor the latter personally, using one property to locate a >> >> >> connector can make the error log more precise. >> >> >> From the users' side, using one property to match a connector will >> >> >> be easier. Especially we have many connectors, >> >> >> and some of the need version property required, and some of them >> not. >> >> >> >> >> >> Regarding Jingsong's suggestion, >> >> >> IMO, it's a very good complement to the FLIP. Distinguishing >> >> >> properties for source and sink can be very useful, and >> >> >> also this will make the connector property more precise. >> >> >> We are also sick of this for now, we cannot know whether a DDL is a >> >> >> source or sink unless we look through all queries where >> >> >> the table is used. >> >> >> Even more, some of the required properties are only required for >> >> >> source, bug we cannot leave it blank for sink, and vice versa. >> >> >> I think we can also add a type for dimension tables except source >> and >> >> >> sink. >> >> >> >> >> >> Kurt Young <[hidden email] <mailto:[hidden email]>> 于2020年3月30日 >> >> >> 周一 下午8:16写道: >> >> >> >> >> >> > It's not possible for the framework to throw such exception. >> >> >> Because the >> >> >> framework doesn't know what versions do the connector support. >> >> >> >> >> >> Not really, we can list all valid connectors framework could >> >> >> found. E.g. >> >> >> user mistyped 'kafka-0.x', the error message will looks like: >> >> >> >> >> >> we don't have any connector named "kafka-0.x", but we have: >> >> >> FileSystem >> >> >> Kafka-0.10 >> >> >> Kafka-0.11 >> >> >> ElasticSearch6 >> >> >> ElasticSearch7 >> >> >> >> >> >> Best, >> >> >> Kurt >> >> >> >> >> >> >> >> >> On Mon, Mar 30, 2020 at 5:11 PM Jark Wu <[hidden email] >> >> >> <mailto:[hidden email]>> wrote: >> >> >> >> >> >> > Hi Kurt, >> >> >> > >> >> >> > > 2) Lists all available connectors seems also quite >> >> >> straightforward, e.g >> >> >> > user provided a wrong "kafka-0.8", we tell user we have >> >> >> candidates of >> >> >> > "kakfa-0.11", "kafka-universal" >> >> >> > It's not possible for the framework to throw such exception. >> >> >> Because the >> >> >> > framework doesn't know what versions do the connector >> support. >> >> >> All the >> >> >> > version information is a blackbox in the identifier. But with >> >> >> > `Factory#factoryVersion()` interface, we can know all the >> >> >> supported >> >> >> > versions. >> >> >> > >> >> >> > > 3) I don't think so. We can still treat it as the same >> >> >> connector but with >> >> >> > different versions. >> >> >> > That's true but that's weird. Because from the plain DDL >> >> >> definition, they >> >> >> > look like different connectors with different "connector" >> >> >> value, e.g. >> >> >> > 'connector=kafka-0.8', 'connector=kafka-0.10'. >> >> >> > >> >> >> > > If users don't set any version, we will use >> "kafka-universal" >> >> >> instead. >> >> >> > The behavior is inconsistent IMO. >> >> >> > That is a long term vision when there is no kafka clusters >> >> >> with <0.11 >> >> >> > version. >> >> >> > At that point, "universal" is the only supported version in >> >> Flink >> >> >> and the >> >> >> > "version" key can be optional. >> >> >> > >> >> >> > --------------------------------------------- >> >> >> > >> >> >> > Hi Jingsong, >> >> >> > >> >> >> > > "version" vs "kafka.version" >> >> >> > I though about it. But if we prefix "kafka" to version, we >> >> should >> >> >> prefix >> >> >> > "kafka" for all other property keys, because they are all >> kafka >> >> >> specific >> >> >> > options. >> >> >> > However, that will make the property set verbose, see >> rejected >> >> >> option#2 in >> >> >> > the FLIP. >> >> >> > >> >> >> > > explicitly separate options for source and sink >> >> >> > That's a good topic. It's good to have a guideline for the >> new >> >> >> property >> >> >> > keys. >> >> >> > I'm fine to prefix with a "source"/"sink" for some connector >> >> >> keys. >> >> >> > Actually, we already do this in some connectors, e.g. jdbc >> and >> >> >> hbase. >> >> >> > >> >> >> > Best, >> >> >> > Jark >> >> >> > >> >> >> > On Mon, 30 Mar 2020 at 16:36, Jingsong Li < >> >> [hidden email] >> >> >> <mailto:[hidden email]>> wrote: >> >> >> > >> >> >> > > Thanks Jark for the proposal. >> >> >> > > >> >> >> > > +1 to the general idea. >> >> >> > > >> >> >> > > For "version", what about "kafka.version"? It is obvious to >> >> >> know its >> >> >> > > meaning. >> >> >> > > >> >> >> > > And I'd like to start a new topic: >> >> >> > > Should we need to explicitly separate source from sink? >> >> >> > > With the development of batch and streaming, more and more >> >> >> connectors >> >> >> > have >> >> >> > > both source and sink. >> >> >> > > >> >> >> > > So should we set a rule for table properties: >> >> >> > > - properties for both source and sink: without prefix, like >> >> >> "topic" >> >> >> > > - properties for source only: with "source." prefix, like >> >> >> > > "source.startup-mode" >> >> >> > > - properties for sink only: with "sink." prefix, like >> >> >> "sink.partitioner" >> >> >> > > >> >> >> > > What do you think? >> >> >> > > >> >> >> > > Best, >> >> >> > > Jingsong Lee >> >> >> > > >> >> >> > > On Mon, Mar 30, 2020 at 3:56 PM Jark Wu <[hidden email] >> >> >> <mailto:[hidden email]>> wrote: >> >> >> > > >> >> >> > > > Hi Kurt, >> >> >> > > > >> >> >> > > > That's good questions. >> >> >> > > > >> >> >> > > > > the meaning of "version" >> >> >> > > > There are two versions in the old design. One is property >> >> >> version >> >> >> > > > "connector.property-version" which can be used for >> backward >> >> >> > > compatibility. >> >> >> > > > The other one is "connector.version" which defines the >> >> >> version of >> >> >> > > external >> >> >> > > > system, e.g. 0.11" for kafka, "6" or "7" for ES. >> >> >> > > > In this proposal, the "version" is the previous >> >> >> "connector.version". >> >> >> > The >> >> >> > > > ""connector.property-version" is not introduced in new >> >> >> design. >> >> >> > > > >> >> >> > > > > how to keep the old capability which can evolve >> connector >> >> >> properties >> >> >> > > > The "connector.property-version" is an optional key in >> the >> >> >> old design >> >> >> > and >> >> >> > > > is never bumped up. >> >> >> > > > I'm not sure how "connector.property-version" should work >> >> >> in the >> >> >> > initial >> >> >> > > > design. Maybe @Timo Walther <[hidden email] >> >> >> <mailto:[hidden email]>> has more knowledge on >> >> >> > > > this. >> >> >> > > > But for the new properties, every options should be >> >> >> expressed as >> >> >> > > > `ConfigOption` which provides `withDeprecatedKeys(...)` >> >> >> method to >> >> >> > easily >> >> >> > > > support evolving keys. >> >> >> > > > >> >> >> > > > > a single keys instead of two, e.g. "kafka-0.11", >> >> >> "kafka-universal"? >> >> >> > > > There are several benefit to use separate "version" key I >> >> can >> >> >> see: >> >> >> > > > 1) it's more readable to separete them into different >> keys, >> >> >> because >> >> >> > they >> >> >> > > > are orthogonal concepts. >> >> >> > > > 2) the planner can give all the availble versions in the >> >> >> exception >> >> >> > > message, >> >> >> > > > if user uses a wrong version (this is often reported in >> >> >> user ML). >> >> >> > > > 3) If we use "kafka-0.11" as connector identifier, we may >> >> >> have to >> >> >> > write a >> >> >> > > > full documentation for each version, because they are >> >> >> different >> >> >> > > > "connector"? >> >> >> > > > IMO, for 0.11, 0.11, etc... kafka, they are actually >> >> >> the same >> >> >> > > connector >> >> >> > > > but with different "client jar" version, >> >> >> > > > they share all the same supported property keys and >> >> >> should reside >> >> >> > > > together. >> >> >> > > > 4) IMO, the future vision is version-free. At some point >> in >> >> >> the future, >> >> >> > > we >> >> >> > > > may don't need users to specify kafka version anymore, >> and >> >> >> make >> >> >> > > > "version=universal" as optional or removed in the future. >> >> >> This is can >> >> >> > be >> >> >> > > > done easily if they are separate keys. >> >> >> > > > >> >> >> > > > Best, >> >> >> > > > Jark >> >> >> > > > >> >> >> > > > >> >> >> > > > On Mon, 30 Mar 2020 at 14:45, Kurt Young < >> [hidden email] >> >> >> <mailto:[hidden email]>> wrote: >> >> >> > > > >> >> >> > > > > Hi Jark, >> >> >> > > > > >> >> >> > > > > Thanks for the proposal, I'm +1 to the general idea. >> >> >> However I have a >> >> >> > > > > question about "version", >> >> >> > > > > in the old design, the version seems to be aimed for >> >> >> tracking >> >> >> > property >> >> >> > > > > version, with different >> >> >> > > > > version, we could evolve these step by step without >> >> >> breaking backward >> >> >> > > > > compatibility. But in this >> >> >> > > > > design, version is representing external system's >> >> >> version, like >> >> >> > "0.11" >> >> >> > > > for >> >> >> > > > > kafka, "6" or "7" for ES. >> >> >> > > > > I'm not sure if this is necessary, what's the benefit >> of >> >> >> using two >> >> >> > keys >> >> >> > > > > instead of one, like "kafka-0.11" >> >> >> > > > > or "ES6" directly? And how about the old capability >> which >> >> >> could let >> >> >> > us >> >> >> > > > > evolving connector properties? >> >> >> > > > > >> >> >> > > > > Best, >> >> >> > > > > Kurt >> >> >> > > > > >> >> >> > > > > >> >> >> > > > > On Mon, Mar 30, 2020 at 2:36 PM LakeShen >> >> >> <[hidden email] <mailto:[hidden email]>> >> >> >> > > > > wrote: >> >> >> > > > > >> >> >> > > > > > Hi Jark, >> >> >> > > > > > >> >> >> > > > > > I am really looking forward to this feature. I think >> >> this >> >> >> feature >> >> >> > > > > > could simplify flink sql code,and at the same time , >> >> >> > > > > > it could make the developer more easlier to config >> the >> >> >> flink sql >> >> >> > WITH >> >> >> > > > > > options. >> >> >> > > > > > >> >> >> > > > > > Now when I am using flink sql to write flink task , >> >> >> sometimes I >> >> >> > think >> >> >> > > > the >> >> >> > > > > > WITH options is too long for user. >> >> >> > > > > > For example,I config the kafka source connector >> >> >> parameter,for >> >> >> > > consumer >> >> >> > > > > > group and brokers parameter: >> >> >> > > > > > >> >> >> > > > > > 'connector.properties.0.key' = 'group.id >> >> >> <http://group.id>' >> >> >> > > > > > > , 'connector.properties.0.value' = 'xxx' >> >> >> > > > > > > , 'connector.properties.1.key' = >> >> >> 'bootstrap.servers' >> >> >> > > > > > > , 'connector.properties.1.value' = 'xxxxx' >> >> >> > > > > > > >> >> >> > > > > > >> >> >> > > > > > I can understand this config , but for the flink >> fresh >> >> >> man,maybe it >> >> >> > > > > > is confused for him. >> >> >> > > > > > In my thought, I am really looking forward to this >> >> >> feature,thank >> >> >> > you >> >> >> > > to >> >> >> > > > > > propose this feature. >> >> >> > > > > > >> >> >> > > > > > Best wishes, >> >> >> > > > > > LakeShen >> >> >> > > > > > >> >> >> > > > > > >> >> >> > > > > > Jark Wu <[hidden email] <mailto:[hidden email]>> >> 于 >> >> >> 2020年3月30日周一 下午2:02写道: >> >> >> > > > > > >> >> >> > > > > > > Hi everyone, >> >> >> > > > > > > >> >> >> > > > > > > I want to start a discussion about further improve >> and >> >> >> simplify >> >> >> > our >> >> >> > > > > > current >> >> >> > > > > > > connector porperty keys, aka WITH options. >> Currently, >> >> >> we have a >> >> >> > > > > > > 'connector.' prefix for many properties, but they >> are >> >> >> verbose, >> >> >> > and >> >> >> > > we >> >> >> > > > > > see a >> >> >> > > > > > > big inconsistency between the properties when >> >> designing >> >> >> FLIP-107. >> >> >> > > > > > > >> >> >> > > > > > > So we propose to remove all the 'connector.' prefix >> >> and >> >> >> rename >> >> >> > > > > > > 'connector.type' to 'connector', 'format.type' to >> >> >> 'format'. So a >> >> >> > > new >> >> >> > > > > > Kafka >> >> >> > > > > > > DDL may look like this: >> >> >> > > > > > > >> >> >> > > > > > > CREATE TABLE kafka_table ( >> >> >> > > > > > > ... >> >> >> > > > > > > ) WITH ( >> >> >> > > > > > > 'connector' = 'kafka', >> >> >> > > > > > > 'version' = '0.10', >> >> >> > > > > > > 'topic' = 'test-topic', >> >> >> > > > > > > 'startup-mode' = 'earliest-offset', >> >> >> > > > > > > 'properties.bootstrap.servers' = 'localhost:9092', >> >> >> > > > > > > 'properties.group.id <http://properties.group.id>' >> = >> >> >> 'testGroup', >> >> >> > > > > > > 'format' = 'json', >> >> >> > > > > > > 'format.fail-on-missing-field' = 'false' >> >> >> > > > > > > ); >> >> >> > > > > > > >> >> >> > > > > > > The new connector property key set will come >> together >> >> >> with new >> >> >> > > > Factory >> >> >> > > > > > > inferface which is proposed in FLIP-95. Old >> properties >> >> >> are still >> >> >> > > > > > compatible >> >> >> > > > > > > with their existing implementation. New properties >> >> >> are only >> >> >> > > available >> >> >> > > > > in >> >> >> > > > > > > new DynamicTableFactory implementations. >> >> >> > > > > > > >> >> >> > > > > > > You can access the detailed FLIP here: >> >> >> > > > > > > >> >> >> > > > > > > >> >> >> > > > > > >> >> >> > > > > >> >> >> > > > >> >> >> > > >> >> >> > >> >> >> >> >> >> >> >> >> https://cwiki.apache.org/confluence/display/FLINK/FLIP-122%3A+New+Connector+Property+Keys+for+New+Factory >> >> >> > > > > > > >> >> >> > > > > > > Best, >> >> >> > > > > > > Jark >> >> >> > > > > > > >> >> >> > > > > > >> >> >> > > > > >> >> >> > > > >> >> >> > > >> >> >> > > >> >> >> > > -- >> >> >> > > Best, Jingsong Lee >> >> >> > > >> >> >> > >> >> >> >> >> >> >> >> >> >> >> >> -- >> >> >> >> >> >> Benchao Li >> >> >> School of Electronics Engineering and Computer Science, Peking >> >> >> University >> >> >> Tel:+86-15650713730 >> >> >> Email:[hidden email] >> >> >> <mailto:[hidden email]>;[hidden email] >> >> >> <mailto:[hidden email]> >> >> >> >> >> > >> >> >> > >> > |
In reply to this post by Jark Wu-2
Thanks Jark for bring up this discussion, +1 for this idea, I believe the user has suffered from the verbose property key for long time.
Just one question, how do we handle the keys with wildcard, such as the “connector.properties.*” in Kafka connector which would then hand-over to Kafka client directly. As what suggested in FLIP-95, we use a ConfigOption to describe the “supported properties”, then I have to concerns: • For the new keys, do we still need to put multi-lines there the such key, such as “connector.properties.abc” “connector.properties.def”, or should we inline them, such as “some-key-prefix” = “k1=v1, k2=v2 ..." • Should the ConfigOption support the wildcard ? (If we plan to support the current multi-line style) Best, Danny Chan 在 2020年3月31日 +0800 AM12:37,Jark Wu <[hidden email]>,写道: > Hi all, > > Thanks for the feedbacks. > > It seems that we have a conclusion to put the version into the factory > identifier. I'm also fine with this. > If we have this outcome, the interface of Factory#factoryVersion is not > needed anymore, this can simplify the learning cost of new factory. > We may need to update FLIP-95 and re-vote for it? cc @Timo Walther > <[hidden email]> > > kafka => kafka for 0.11+ versions, we don't suffix "-universal", because > the meaning of "universal" not easy to understand. > kafka-0.11 => kafka for 0.11 version > kafka-0.10 => kafka for 0.10 version > elasticsearch-6 => elasticsearch for 6.x versions > elasticsearch-7 => elasticsearch for 7.x versions > hbase-1.4 => hbase for 1.4.x versions > jdbc > filesystem > > We use "-" as the version delimiter to make them to be more consistent. > This is not forces, users can also use other delimiters or without > delimiter. > But this can be a guilde in the Javadoc of Factory, to make the connector > ecosystem to be more consistent. > > What do you think? > > ------------------------ > > Regarding "connector.property-version": > > Hi @Dawid Wysakowicz <[hidden email]> , the new fatories are > designed not support to read current properties. > All the current properties are routed to the old factories if they are > using "connector.type". Otherwise, properties are routed to new factories. > > If I understand correctly, the "connector.property-version" is attched > implicitly by system, not manually set by users. > For example, the framework should add "connector.property-version=1" to > properties when processing DDL statement. > I'm fine to add a "connector.property-version=1" when processing DDL > statement, but I think it's also fine if we don't, > because this can be done in the future if need and the default version can > be 1. > > Best, > Jark > > On Tue, 31 Mar 2020 at 00:36, Jark Wu <[hidden email]> wrote: > > > Hi all, > > > > Thanks for the feedbacks. > > > > It seems that we have a conclusion to put the version into the factory > > identifier. I'm also fine with this. > > If we have this outcome, the interface of Factory#factoryVersion is not > > needed anymore, this can simplify the learning cost of new factory. > > We may need to update FLIP-95 and re-vote for it? cc @Timo Walther > > <[hidden email]> > > > > Btw, I would like to use "_" instead of "-" as the version delimiter, > > because "-" looks like minus and may confuse users, e.g. "elasticsearch-6". > > This is not forced, but should be a guilde in the Javadoc of Factory. > > I propose to use the following identifiers for existing connectors, > > > > kafka => kafka for 0.11+ versions, we don't suffix "-universal", because > > the meaning of "universal" not easy to understand. > > kafka-0.11 => kafka for 0.11 version > > kafka-0.10 => kafka for 0.10 version > > elasticsearch-6 => elasticsearch for 6.x versions > > elasticsearch-7 => elasticsearch for 7.x versions > > hbase-1.4 => hbase for 1.4.x versions > > jdbc > > filesystem > > > > We use "-" as the version delimiter to make them to be more consistent. > > This is not forces, users can also use other delimiters or without > > delimiter. > > But this can be a guilde in the Javadoc of Factory, to make the connector > > ecosystem to be more consistent. > > > > What do you think? > > > > ------------------------ > > > > Regarding "connector.property-version": > > > > Hi @Dawid Wysakowicz <[hidden email]> , the new fatories are > > designed not support to read current properties. > > All the current properties are routed to the old factories if they are > > using "connector.type". Otherwise, properties are routed to new factories. > > > > If I understand correctly, the "connector.property-version" is attched > > implicitly by system, not manually set by users. > > For example, the framework should add "connector.property-version=1" to > > properties when processing DDL statement. > > I'm fine to add a "connector.property-version=1" when processing DDL > > statement, but I think it's also fine if we don't, > > because this can be done in the future if need and the default version can > > be 1. > > > > Best, > > Jark > > > > > > > > > > > > On Mon, 30 Mar 2020 at 23:21, Dawid Wysakowicz <[hidden email]> > > wrote: > > > > > Hi all, > > > > > > I like the overall design of the FLIP. > > > > > > As for the withstanding concerns. I kind of like the approach to put the > > > version into the factory identifier. I think it's the cleanest way to > > > say that this version actually applies to the connector itself and not > > > to the system it connects to. BTW, I think the outcome of this > > > discussion will affect interfaces described in FLIP-95. If we put the > > > version into the functionIdentifier, do we need Factory#factoryVersion? > > > > > > I also think it does make sense to have a versioning for the properties > > > as well. Are we able to read all the current properties with the new > > > factories? I think we could use the "connector.property-version" to > > > alternate between different Factory interfaces to support the old set of > > > properties. Otherwise the new factories need to understand both set of > > > properties, don't they? > > > > > > Best, > > > > > > Dawid > > > > > > On 30/03/2020 17:07, Timo Walther wrote: > > > > Hi Jark, > > > > > > > > thanks for the FLIP. I don't have a strong opinion on > > > > DynamicTableFactory#factoryVersion() for distiguishing connector > > > > versions. We can also include it in the factory identifier itself. For > > > > Kafka, I would then just use `kafka` because the naming "universal" > > > > was just a helper solution at that time. > > > > > > > > What we need is a good guide for how to design the options. We should > > > > include that in the JavaDoc of factory interface itself, once it is > > > > in-place. I like the difference between source and sink in properties. > > > > > > > > Regarding "connector.property-version": > > > > > > > > It was meant for backwards compatibility along the dimension of > > > > "property design" compared to "connector version". However, since the > > > > connector is now responsible for parsing all properties, it is not as > > > > necessary as it was before. > > > > > > > > However, a change of the property design as it is done in FLIP-107 > > > > becomes more difficult without a property version and versioning is > > > > always a good idea in API world. > > > > > > > > Regards, > > > > Timo > > > > > > > > > > > > On 30.03.20 16:38, Benchao Li wrote: > > > > > Hi Jark, > > > > > > > > > > Thanks for starting the discussion. The FLIP LGTM generally. > > > > > > > > > > Regarding connector version VS put version into connector's name, > > > > > I favor the latter personally, using one property to locate a > > > > > connector can make the error log more precise. > > > > > From the users' side, using one property to match a connector will > > > > > be easier. Especially we have many connectors, > > > > > and some of the need version property required, and some of them not. > > > > > > > > > > Regarding Jingsong's suggestion, > > > > > IMO, it's a very good complement to the FLIP. Distinguishing > > > > > properties for source and sink can be very useful, and > > > > > also this will make the connector property more precise. > > > > > We are also sick of this for now, we cannot know whether a DDL is a > > > > > source or sink unless we look through all queries where > > > > > the table is used. > > > > > Even more, some of the required properties are only required for > > > > > source, bug we cannot leave it blank for sink, and vice versa. > > > > > I think we can also add a type for dimension tables except source and > > > > > sink. > > > > > > > > > > Kurt Young <[hidden email] <mailto:[hidden email]>> 于2020年3月30日 > > > > > 周一 下午8:16写道: > > > > > > > > > > > It's not possible for the framework to throw such exception. > > > > > Because the > > > > > framework doesn't know what versions do the connector support. > > > > > > > > > > Not really, we can list all valid connectors framework could > > > > > found. E.g. > > > > > user mistyped 'kafka-0.x', the error message will looks like: > > > > > > > > > > we don't have any connector named "kafka-0.x", but we have: > > > > > FileSystem > > > > > Kafka-0.10 > > > > > Kafka-0.11 > > > > > ElasticSearch6 > > > > > ElasticSearch7 > > > > > > > > > > Best, > > > > > Kurt > > > > > > > > > > > > > > > On Mon, Mar 30, 2020 at 5:11 PM Jark Wu <[hidden email] > > > > > <mailto:[hidden email]>> wrote: > > > > > > > > > > > Hi Kurt, > > > > > > > > > > > > > 2) Lists all available connectors seems also quite > > > > > straightforward, e.g > > > > > > user provided a wrong "kafka-0.8", we tell user we have > > > > > candidates of > > > > > > "kakfa-0.11", "kafka-universal" > > > > > > It's not possible for the framework to throw such exception. > > > > > Because the > > > > > > framework doesn't know what versions do the connector support. > > > > > All the > > > > > > version information is a blackbox in the identifier. But with > > > > > > `Factory#factoryVersion()` interface, we can know all the > > > > > supported > > > > > > versions. > > > > > > > > > > > > > 3) I don't think so. We can still treat it as the same > > > > > connector but with > > > > > > different versions. > > > > > > That's true but that's weird. Because from the plain DDL > > > > > definition, they > > > > > > look like different connectors with different "connector" > > > > > value, e.g. > > > > > > 'connector=kafka-0.8', 'connector=kafka-0.10'. > > > > > > > > > > > > > If users don't set any version, we will use "kafka-universal" > > > > > instead. > > > > > > The behavior is inconsistent IMO. > > > > > > That is a long term vision when there is no kafka clusters > > > > > with <0.11 > > > > > > version. > > > > > > At that point, "universal" is the only supported version in > > > Flink > > > > > and the > > > > > > "version" key can be optional. > > > > > > > > > > > > --------------------------------------------- > > > > > > > > > > > > Hi Jingsong, > > > > > > > > > > > > > "version" vs "kafka.version" > > > > > > I though about it. But if we prefix "kafka" to version, we > > > should > > > > > prefix > > > > > > "kafka" for all other property keys, because they are all kafka > > > > > specific > > > > > > options. > > > > > > However, that will make the property set verbose, see rejected > > > > > option#2 in > > > > > > the FLIP. > > > > > > > > > > > > > explicitly separate options for source and sink > > > > > > That's a good topic. It's good to have a guideline for the new > > > > > property > > > > > > keys. > > > > > > I'm fine to prefix with a "source"/"sink" for some connector > > > > > keys. > > > > > > Actually, we already do this in some connectors, e.g. jdbc and > > > > > hbase. > > > > > > > > > > > > Best, > > > > > > Jark > > > > > > > > > > > > On Mon, 30 Mar 2020 at 16:36, Jingsong Li < > > > [hidden email] > > > > > <mailto:[hidden email]>> wrote: > > > > > > > > > > > > > Thanks Jark for the proposal. > > > > > > > > > > > > > > +1 to the general idea. > > > > > > > > > > > > > > For "version", what about "kafka.version"? It is obvious to > > > > > know its > > > > > > > meaning. > > > > > > > > > > > > > > And I'd like to start a new topic: > > > > > > > Should we need to explicitly separate source from sink? > > > > > > > With the development of batch and streaming, more and more > > > > > connectors > > > > > > have > > > > > > > both source and sink. > > > > > > > > > > > > > > So should we set a rule for table properties: > > > > > > > - properties for both source and sink: without prefix, like > > > > > "topic" > > > > > > > - properties for source only: with "source." prefix, like > > > > > > > "source.startup-mode" > > > > > > > - properties for sink only: with "sink." prefix, like > > > > > "sink.partitioner" > > > > > > > > > > > > > > What do you think? > > > > > > > > > > > > > > Best, > > > > > > > Jingsong Lee > > > > > > > > > > > > > > On Mon, Mar 30, 2020 at 3:56 PM Jark Wu <[hidden email] > > > > > <mailto:[hidden email]>> wrote: > > > > > > > > > > > > > > > Hi Kurt, > > > > > > > > > > > > > > > > That's good questions. > > > > > > > > > > > > > > > > > the meaning of "version" > > > > > > > > There are two versions in the old design. One is property > > > > > version > > > > > > > > "connector.property-version" which can be used for backward > > > > > > > compatibility. > > > > > > > > The other one is "connector.version" which defines the > > > > > version of > > > > > > > external > > > > > > > > system, e.g. 0.11" for kafka, "6" or "7" for ES. > > > > > > > > In this proposal, the "version" is the previous > > > > > "connector.version". > > > > > > The > > > > > > > > ""connector.property-version" is not introduced in new > > > > > design. > > > > > > > > > > > > > > > > > how to keep the old capability which can evolve connector > > > > > properties > > > > > > > > The "connector.property-version" is an optional key in the > > > > > old design > > > > > > and > > > > > > > > is never bumped up. > > > > > > > > I'm not sure how "connector.property-version" should work > > > > > in the > > > > > > initial > > > > > > > > design. Maybe @Timo Walther <[hidden email] > > > > > <mailto:[hidden email]>> has more knowledge on > > > > > > > > this. > > > > > > > > But for the new properties, every options should be > > > > > expressed as > > > > > > > > `ConfigOption` which provides `withDeprecatedKeys(...)` > > > > > method to > > > > > > easily > > > > > > > > support evolving keys. > > > > > > > > > > > > > > > > > a single keys instead of two, e.g. "kafka-0.11", > > > > > "kafka-universal"? > > > > > > > > There are several benefit to use separate "version" key I > > > can > > > > > see: > > > > > > > > 1) it's more readable to separete them into different keys, > > > > > because > > > > > > they > > > > > > > > are orthogonal concepts. > > > > > > > > 2) the planner can give all the availble versions in the > > > > > exception > > > > > > > message, > > > > > > > > if user uses a wrong version (this is often reported in > > > > > user ML). > > > > > > > > 3) If we use "kafka-0.11" as connector identifier, we may > > > > > have to > > > > > > write a > > > > > > > > full documentation for each version, because they are > > > > > different > > > > > > > > "connector"? > > > > > > > > IMO, for 0.11, 0.11, etc... kafka, they are actually > > > > > the same > > > > > > > connector > > > > > > > > but with different "client jar" version, > > > > > > > > they share all the same supported property keys and > > > > > should reside > > > > > > > > together. > > > > > > > > 4) IMO, the future vision is version-free. At some point in > > > > > the future, > > > > > > > we > > > > > > > > may don't need users to specify kafka version anymore, and > > > > > make > > > > > > > > "version=universal" as optional or removed in the future. > > > > > This is can > > > > > > be > > > > > > > > done easily if they are separate keys. > > > > > > > > > > > > > > > > Best, > > > > > > > > Jark > > > > > > > > > > > > > > > > > > > > > > > > On Mon, 30 Mar 2020 at 14:45, Kurt Young <[hidden email] > > > > > <mailto:[hidden email]>> wrote: > > > > > > > > > > > > > > > > > Hi Jark, > > > > > > > > > > > > > > > > > > Thanks for the proposal, I'm +1 to the general idea. > > > > > However I have a > > > > > > > > > question about "version", > > > > > > > > > in the old design, the version seems to be aimed for > > > > > tracking > > > > > > property > > > > > > > > > version, with different > > > > > > > > > version, we could evolve these step by step without > > > > > breaking backward > > > > > > > > > compatibility. But in this > > > > > > > > > design, version is representing external system's > > > > > version, like > > > > > > "0.11" > > > > > > > > for > > > > > > > > > kafka, "6" or "7" for ES. > > > > > > > > > I'm not sure if this is necessary, what's the benefit of > > > > > using two > > > > > > keys > > > > > > > > > instead of one, like "kafka-0.11" > > > > > > > > > or "ES6" directly? And how about the old capability which > > > > > could let > > > > > > us > > > > > > > > > evolving connector properties? > > > > > > > > > > > > > > > > > > Best, > > > > > > > > > Kurt > > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Mar 30, 2020 at 2:36 PM LakeShen > > > > > <[hidden email] <mailto:[hidden email]>> > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > Hi Jark, > > > > > > > > > > > > > > > > > > > > I am really looking forward to this feature. I think > > > this > > > > > feature > > > > > > > > > > could simplify flink sql code,and at the same time , > > > > > > > > > > it could make the developer more easlier to config the > > > > > flink sql > > > > > > WITH > > > > > > > > > > options. > > > > > > > > > > > > > > > > > > > > Now when I am using flink sql to write flink task , > > > > > sometimes I > > > > > > think > > > > > > > > the > > > > > > > > > > WITH options is too long for user. > > > > > > > > > > For example,I config the kafka source connector > > > > > parameter,for > > > > > > > consumer > > > > > > > > > > group and brokers parameter: > > > > > > > > > > > > > > > > > > > > 'connector.properties.0.key' = 'group.id > > > > > <http://group.id>' > > > > > > > > > > > , 'connector.properties.0.value' = 'xxx' > > > > > > > > > > > , 'connector.properties.1.key' = > > > > > 'bootstrap.servers' > > > > > > > > > > > , 'connector.properties.1.value' = 'xxxxx' > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I can understand this config , but for the flink fresh > > > > > man,maybe it > > > > > > > > > > is confused for him. > > > > > > > > > > In my thought, I am really looking forward to this > > > > > feature,thank > > > > > > you > > > > > > > to > > > > > > > > > > propose this feature. > > > > > > > > > > > > > > > > > > > > Best wishes, > > > > > > > > > > LakeShen > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Jark Wu <[hidden email] <mailto:[hidden email]>> 于 > > > > > 2020年3月30日周一 下午2:02写道: > > > > > > > > > > > > > > > > > > > > > Hi everyone, > > > > > > > > > > > > > > > > > > > > > > I want to start a discussion about further improve and > > > > > simplify > > > > > > our > > > > > > > > > > current > > > > > > > > > > > connector porperty keys, aka WITH options. Currently, > > > > > we have a > > > > > > > > > > > 'connector.' prefix for many properties, but they are > > > > > verbose, > > > > > > and > > > > > > > we > > > > > > > > > > see a > > > > > > > > > > > big inconsistency between the properties when > > > designing > > > > > FLIP-107. > > > > > > > > > > > > > > > > > > > > > > So we propose to remove all the 'connector.' prefix > > > and > > > > > rename > > > > > > > > > > > 'connector.type' to 'connector', 'format.type' to > > > > > 'format'. So a > > > > > > > new > > > > > > > > > > Kafka > > > > > > > > > > > DDL may look like this: > > > > > > > > > > > > > > > > > > > > > > CREATE TABLE kafka_table ( > > > > > > > > > > > ... > > > > > > > > > > > ) WITH ( > > > > > > > > > > > 'connector' = 'kafka', > > > > > > > > > > > 'version' = '0.10', > > > > > > > > > > > 'topic' = 'test-topic', > > > > > > > > > > > 'startup-mode' = 'earliest-offset', > > > > > > > > > > > 'properties.bootstrap.servers' = 'localhost:9092', > > > > > > > > > > > 'properties.group.id <http://properties.group.id>' = > > > > > 'testGroup', > > > > > > > > > > > 'format' = 'json', > > > > > > > > > > > 'format.fail-on-missing-field' = 'false' > > > > > > > > > > > ); > > > > > > > > > > > > > > > > > > > > > > The new connector property key set will come together > > > > > with new > > > > > > > > Factory > > > > > > > > > > > inferface which is proposed in FLIP-95. Old properties > > > > > are still > > > > > > > > > > compatible > > > > > > > > > > > with their existing implementation. New properties > > > > > are only > > > > > > > available > > > > > > > > > in > > > > > > > > > > > new DynamicTableFactory implementations. > > > > > > > > > > > > > > > > > > > > > > You can access the detailed FLIP here: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-122%3A+New+Connector+Property+Keys+for+New+Factory > > > > > > > > > > > > > > > > > > > > > > Best, > > > > > > > > > > > Jark > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > Best, Jingsong Lee > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > Benchao Li > > > > > School of Electronics Engineering and Computer Science, Peking > > > > > University > > > > > Tel:+86-15650713730 > > > > > Email:[hidden email] > > > > > <mailto:[hidden email]>;[hidden email] > > > > > <mailto:[hidden email]> > > > > > > > > > > > > > > |
Hi Kurt,
I also prefer "-" as version delimiter now. I didn't remove the "_" proposal by mistake, that's why I sent another email last night :) Regarding to "property-version", I also think we shouldn't let users to learn about this. And ConfigOption provides a good ability to support deprecated keys and auto-generate documentation for deprecated keys. Hi Danny, Regarding to “connector.properties.*”: In FLIP-95, the Factory#requiredOptions() and Factory#optionalOptions() inferfaces are only used for generation of documentation. It does not influence the discovery and validation of a factory. The validation logic is defined by connectors in createDynamicTableSource/Sink(). So you don't have to provide an option for "connector.properties.*". But I think we should make ConfigOption support wildcard in the long term for a full story. I don't think we should inline all the "connector.properties.*", otherwise, it will be very tricky for users to configure the properties. Regarding to FLIP-113, I suggest to provide some ConfigOptions for commonly used kafka properties and put them in the supportedHintOptions(), e.g. "connector.properties.group.id", "connector.properties.fetch.min.bytes". Best, Jark On Tue, 31 Mar 2020 at 12:04, Danny Chan <[hidden email]> wrote: > Thanks Jark for bring up this discussion, +1 for this idea, I believe the > user has suffered from the verbose property key for long time. > > Just one question, how do we handle the keys with wildcard, such as the > “connector.properties.*” in Kafka connector which would then hand-over to > Kafka client directly. As what suggested in FLIP-95, we use a ConfigOption > to describe the “supported properties”, then I have to concerns: > > • For the new keys, do we still need to put multi-lines there the such > key, such as “connector.properties.abc” “connector.properties.def”, or > should we inline them, such as “some-key-prefix” = “k1=v1, k2=v2 ..." > • Should the ConfigOption support the wildcard ? (If we plan to support > the current multi-line style) > > > Best, > Danny Chan > 在 2020年3月31日 +0800 AM12:37,Jark Wu <[hidden email]>,写道: > > Hi all, > > > > Thanks for the feedbacks. > > > > It seems that we have a conclusion to put the version into the factory > > identifier. I'm also fine with this. > > If we have this outcome, the interface of Factory#factoryVersion is not > > needed anymore, this can simplify the learning cost of new factory. > > We may need to update FLIP-95 and re-vote for it? cc @Timo Walther > > <[hidden email]> > > > > kafka => kafka for 0.11+ versions, we don't suffix "-universal", because > > the meaning of "universal" not easy to understand. > > kafka-0.11 => kafka for 0.11 version > > kafka-0.10 => kafka for 0.10 version > > elasticsearch-6 => elasticsearch for 6.x versions > > elasticsearch-7 => elasticsearch for 7.x versions > > hbase-1.4 => hbase for 1.4.x versions > > jdbc > > filesystem > > > > We use "-" as the version delimiter to make them to be more consistent. > > This is not forces, users can also use other delimiters or without > > delimiter. > > But this can be a guilde in the Javadoc of Factory, to make the connector > > ecosystem to be more consistent. > > > > What do you think? > > > > ------------------------ > > > > Regarding "connector.property-version": > > > > Hi @Dawid Wysakowicz <[hidden email]> , the new fatories are > > designed not support to read current properties. > > All the current properties are routed to the old factories if they are > > using "connector.type". Otherwise, properties are routed to new > factories. > > > > If I understand correctly, the "connector.property-version" is attched > > implicitly by system, not manually set by users. > > For example, the framework should add "connector.property-version=1" to > > properties when processing DDL statement. > > I'm fine to add a "connector.property-version=1" when processing DDL > > statement, but I think it's also fine if we don't, > > because this can be done in the future if need and the default version > can > > be 1. > > > > Best, > > Jark > > > > On Tue, 31 Mar 2020 at 00:36, Jark Wu <[hidden email]> wrote: > > > > > Hi all, > > > > > > Thanks for the feedbacks. > > > > > > It seems that we have a conclusion to put the version into the factory > > > identifier. I'm also fine with this. > > > If we have this outcome, the interface of Factory#factoryVersion is not > > > needed anymore, this can simplify the learning cost of new factory. > > > We may need to update FLIP-95 and re-vote for it? cc @Timo Walther > > > <[hidden email]> > > > > > > Btw, I would like to use "_" instead of "-" as the version delimiter, > > > because "-" looks like minus and may confuse users, e.g. > "elasticsearch-6". > > > This is not forced, but should be a guilde in the Javadoc of Factory. > > > I propose to use the following identifiers for existing connectors, > > > > > > kafka => kafka for 0.11+ versions, we don't suffix "-universal", > because > > > the meaning of "universal" not easy to understand. > > > kafka-0.11 => kafka for 0.11 version > > > kafka-0.10 => kafka for 0.10 version > > > elasticsearch-6 => elasticsearch for 6.x versions > > > elasticsearch-7 => elasticsearch for 7.x versions > > > hbase-1.4 => hbase for 1.4.x versions > > > jdbc > > > filesystem > > > > > > We use "-" as the version delimiter to make them to be more consistent. > > > This is not forces, users can also use other delimiters or without > > > delimiter. > > > But this can be a guilde in the Javadoc of Factory, to make the > connector > > > ecosystem to be more consistent. > > > > > > What do you think? > > > > > > ------------------------ > > > > > > Regarding "connector.property-version": > > > > > > Hi @Dawid Wysakowicz <[hidden email]> , the new fatories are > > > designed not support to read current properties. > > > All the current properties are routed to the old factories if they are > > > using "connector.type". Otherwise, properties are routed to new > factories. > > > > > > If I understand correctly, the "connector.property-version" is attched > > > implicitly by system, not manually set by users. > > > For example, the framework should add "connector.property-version=1" to > > > properties when processing DDL statement. > > > I'm fine to add a "connector.property-version=1" when processing DDL > > > statement, but I think it's also fine if we don't, > > > because this can be done in the future if need and the default version > can > > > be 1. > > > > > > Best, > > > Jark > > > > > > > > > > > > > > > > > > On Mon, 30 Mar 2020 at 23:21, Dawid Wysakowicz <[hidden email] > > > > > wrote: > > > > > > > Hi all, > > > > > > > > I like the overall design of the FLIP. > > > > > > > > As for the withstanding concerns. I kind of like the approach to put > the > > > > version into the factory identifier. I think it's the cleanest way to > > > > say that this version actually applies to the connector itself and > not > > > > to the system it connects to. BTW, I think the outcome of this > > > > discussion will affect interfaces described in FLIP-95. If we put the > > > > version into the functionIdentifier, do we need > Factory#factoryVersion? > > > > > > > > I also think it does make sense to have a versioning for the > properties > > > > as well. Are we able to read all the current properties with the new > > > > factories? I think we could use the "connector.property-version" to > > > > alternate between different Factory interfaces to support the old > set of > > > > properties. Otherwise the new factories need to understand both set > of > > > > properties, don't they? > > > > > > > > Best, > > > > > > > > Dawid > > > > > > > > On 30/03/2020 17:07, Timo Walther wrote: > > > > > Hi Jark, > > > > > > > > > > thanks for the FLIP. I don't have a strong opinion on > > > > > DynamicTableFactory#factoryVersion() for distiguishing connector > > > > > versions. We can also include it in the factory identifier itself. > For > > > > > Kafka, I would then just use `kafka` because the naming "universal" > > > > > was just a helper solution at that time. > > > > > > > > > > What we need is a good guide for how to design the options. We > should > > > > > include that in the JavaDoc of factory interface itself, once it is > > > > > in-place. I like the difference between source and sink in > properties. > > > > > > > > > > Regarding "connector.property-version": > > > > > > > > > > It was meant for backwards compatibility along the dimension of > > > > > "property design" compared to "connector version". However, since > the > > > > > connector is now responsible for parsing all properties, it is not > as > > > > > necessary as it was before. > > > > > > > > > > However, a change of the property design as it is done in FLIP-107 > > > > > becomes more difficult without a property version and versioning is > > > > > always a good idea in API world. > > > > > > > > > > Regards, > > > > > Timo > > > > > > > > > > > > > > > On 30.03.20 16:38, Benchao Li wrote: > > > > > > Hi Jark, > > > > > > > > > > > > Thanks for starting the discussion. The FLIP LGTM generally. > > > > > > > > > > > > Regarding connector version VS put version into connector's name, > > > > > > I favor the latter personally, using one property to locate a > > > > > > connector can make the error log more precise. > > > > > > From the users' side, using one property to match a connector > will > > > > > > be easier. Especially we have many connectors, > > > > > > and some of the need version property required, and some of them > not. > > > > > > > > > > > > Regarding Jingsong's suggestion, > > > > > > IMO, it's a very good complement to the FLIP. Distinguishing > > > > > > properties for source and sink can be very useful, and > > > > > > also this will make the connector property more precise. > > > > > > We are also sick of this for now, we cannot know whether a DDL > is a > > > > > > source or sink unless we look through all queries where > > > > > > the table is used. > > > > > > Even more, some of the required properties are only required for > > > > > > source, bug we cannot leave it blank for sink, and vice versa. > > > > > > I think we can also add a type for dimension tables except > source and > > > > > > sink. > > > > > > > > > > > > Kurt Young <[hidden email] <mailto:[hidden email]>> > 于2020年3月30日 > > > > > > 周一 下午8:16写道: > > > > > > > > > > > > > It's not possible for the framework to throw such exception. > > > > > > Because the > > > > > > framework doesn't know what versions do the connector support. > > > > > > > > > > > > Not really, we can list all valid connectors framework could > > > > > > found. E.g. > > > > > > user mistyped 'kafka-0.x', the error message will looks like: > > > > > > > > > > > > we don't have any connector named "kafka-0.x", but we have: > > > > > > FileSystem > > > > > > Kafka-0.10 > > > > > > Kafka-0.11 > > > > > > ElasticSearch6 > > > > > > ElasticSearch7 > > > > > > > > > > > > Best, > > > > > > Kurt > > > > > > > > > > > > > > > > > > On Mon, Mar 30, 2020 at 5:11 PM Jark Wu <[hidden email] > > > > > > <mailto:[hidden email]>> wrote: > > > > > > > > > > > > > Hi Kurt, > > > > > > > > > > > > > > > 2) Lists all available connectors seems also quite > > > > > > straightforward, e.g > > > > > > > user provided a wrong "kafka-0.8", we tell user we have > > > > > > candidates of > > > > > > > "kakfa-0.11", "kafka-universal" > > > > > > > It's not possible for the framework to throw such exception. > > > > > > Because the > > > > > > > framework doesn't know what versions do the connector support. > > > > > > All the > > > > > > > version information is a blackbox in the identifier. But with > > > > > > > `Factory#factoryVersion()` interface, we can know all the > > > > > > supported > > > > > > > versions. > > > > > > > > > > > > > > > 3) I don't think so. We can still treat it as the same > > > > > > connector but with > > > > > > > different versions. > > > > > > > That's true but that's weird. Because from the plain DDL > > > > > > definition, they > > > > > > > look like different connectors with different "connector" > > > > > > value, e.g. > > > > > > > 'connector=kafka-0.8', 'connector=kafka-0.10'. > > > > > > > > > > > > > > > If users don't set any version, we will use "kafka-universal" > > > > > > instead. > > > > > > > The behavior is inconsistent IMO. > > > > > > > That is a long term vision when there is no kafka clusters > > > > > > with <0.11 > > > > > > > version. > > > > > > > At that point, "universal" is the only supported version in > > > > Flink > > > > > > and the > > > > > > > "version" key can be optional. > > > > > > > > > > > > > > --------------------------------------------- > > > > > > > > > > > > > > Hi Jingsong, > > > > > > > > > > > > > > > "version" vs "kafka.version" > > > > > > > I though about it. But if we prefix "kafka" to version, we > > > > should > > > > > > prefix > > > > > > > "kafka" for all other property keys, because they are all kafka > > > > > > specific > > > > > > > options. > > > > > > > However, that will make the property set verbose, see rejected > > > > > > option#2 in > > > > > > > the FLIP. > > > > > > > > > > > > > > > explicitly separate options for source and sink > > > > > > > That's a good topic. It's good to have a guideline for the new > > > > > > property > > > > > > > keys. > > > > > > > I'm fine to prefix with a "source"/"sink" for some connector > > > > > > keys. > > > > > > > Actually, we already do this in some connectors, e.g. jdbc and > > > > > > hbase. > > > > > > > > > > > > > > Best, > > > > > > > Jark > > > > > > > > > > > > > > On Mon, 30 Mar 2020 at 16:36, Jingsong Li < > > > > [hidden email] > > > > > > <mailto:[hidden email]>> wrote: > > > > > > > > > > > > > > > Thanks Jark for the proposal. > > > > > > > > > > > > > > > > +1 to the general idea. > > > > > > > > > > > > > > > > For "version", what about "kafka.version"? It is obvious to > > > > > > know its > > > > > > > > meaning. > > > > > > > > > > > > > > > > And I'd like to start a new topic: > > > > > > > > Should we need to explicitly separate source from sink? > > > > > > > > With the development of batch and streaming, more and more > > > > > > connectors > > > > > > > have > > > > > > > > both source and sink. > > > > > > > > > > > > > > > > So should we set a rule for table properties: > > > > > > > > - properties for both source and sink: without prefix, like > > > > > > "topic" > > > > > > > > - properties for source only: with "source." prefix, like > > > > > > > > "source.startup-mode" > > > > > > > > - properties for sink only: with "sink." prefix, like > > > > > > "sink.partitioner" > > > > > > > > > > > > > > > > What do you think? > > > > > > > > > > > > > > > > Best, > > > > > > > > Jingsong Lee > > > > > > > > > > > > > > > > On Mon, Mar 30, 2020 at 3:56 PM Jark Wu <[hidden email] > > > > > > <mailto:[hidden email]>> wrote: > > > > > > > > > > > > > > > > > Hi Kurt, > > > > > > > > > > > > > > > > > > That's good questions. > > > > > > > > > > > > > > > > > > > the meaning of "version" > > > > > > > > > There are two versions in the old design. One is property > > > > > > version > > > > > > > > > "connector.property-version" which can be used for backward > > > > > > > > compatibility. > > > > > > > > > The other one is "connector.version" which defines the > > > > > > version of > > > > > > > > external > > > > > > > > > system, e.g. 0.11" for kafka, "6" or "7" for ES. > > > > > > > > > In this proposal, the "version" is the previous > > > > > > "connector.version". > > > > > > > The > > > > > > > > > ""connector.property-version" is not introduced in new > > > > > > design. > > > > > > > > > > > > > > > > > > > how to keep the old capability which can evolve connector > > > > > > properties > > > > > > > > > The "connector.property-version" is an optional key in the > > > > > > old design > > > > > > > and > > > > > > > > > is never bumped up. > > > > > > > > > I'm not sure how "connector.property-version" should work > > > > > > in the > > > > > > > initial > > > > > > > > > design. Maybe @Timo Walther <[hidden email] > > > > > > <mailto:[hidden email]>> has more knowledge on > > > > > > > > > this. > > > > > > > > > But for the new properties, every options should be > > > > > > expressed as > > > > > > > > > `ConfigOption` which provides `withDeprecatedKeys(...)` > > > > > > method to > > > > > > > easily > > > > > > > > > support evolving keys. > > > > > > > > > > > > > > > > > > > a single keys instead of two, e.g. "kafka-0.11", > > > > > > "kafka-universal"? > > > > > > > > > There are several benefit to use separate "version" key I > > > > can > > > > > > see: > > > > > > > > > 1) it's more readable to separete them into different keys, > > > > > > because > > > > > > > they > > > > > > > > > are orthogonal concepts. > > > > > > > > > 2) the planner can give all the availble versions in the > > > > > > exception > > > > > > > > message, > > > > > > > > > if user uses a wrong version (this is often reported in > > > > > > user ML). > > > > > > > > > 3) If we use "kafka-0.11" as connector identifier, we may > > > > > > have to > > > > > > > write a > > > > > > > > > full documentation for each version, because they are > > > > > > different > > > > > > > > > "connector"? > > > > > > > > > IMO, for 0.11, 0.11, etc... kafka, they are actually > > > > > > the same > > > > > > > > connector > > > > > > > > > but with different "client jar" version, > > > > > > > > > they share all the same supported property keys and > > > > > > should reside > > > > > > > > > together. > > > > > > > > > 4) IMO, the future vision is version-free. At some point in > > > > > > the future, > > > > > > > > we > > > > > > > > > may don't need users to specify kafka version anymore, and > > > > > > make > > > > > > > > > "version=universal" as optional or removed in the future. > > > > > > This is can > > > > > > > be > > > > > > > > > done easily if they are separate keys. > > > > > > > > > > > > > > > > > > Best, > > > > > > > > > Jark > > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, 30 Mar 2020 at 14:45, Kurt Young <[hidden email] > > > > > > <mailto:[hidden email]>> wrote: > > > > > > > > > > > > > > > > > > > Hi Jark, > > > > > > > > > > > > > > > > > > > > Thanks for the proposal, I'm +1 to the general idea. > > > > > > However I have a > > > > > > > > > > question about "version", > > > > > > > > > > in the old design, the version seems to be aimed for > > > > > > tracking > > > > > > > property > > > > > > > > > > version, with different > > > > > > > > > > version, we could evolve these step by step without > > > > > > breaking backward > > > > > > > > > > compatibility. But in this > > > > > > > > > > design, version is representing external system's > > > > > > version, like > > > > > > > "0.11" > > > > > > > > > for > > > > > > > > > > kafka, "6" or "7" for ES. > > > > > > > > > > I'm not sure if this is necessary, what's the benefit of > > > > > > using two > > > > > > > keys > > > > > > > > > > instead of one, like "kafka-0.11" > > > > > > > > > > or "ES6" directly? And how about the old capability which > > > > > > could let > > > > > > > us > > > > > > > > > > evolving connector properties? > > > > > > > > > > > > > > > > > > > > Best, > > > > > > > > > > Kurt > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Mar 30, 2020 at 2:36 PM LakeShen > > > > > > <[hidden email] <mailto:[hidden email]>> > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > Hi Jark, > > > > > > > > > > > > > > > > > > > > > > I am really looking forward to this feature. I think > > > > this > > > > > > feature > > > > > > > > > > > could simplify flink sql code,and at the same time , > > > > > > > > > > > it could make the developer more easlier to config the > > > > > > flink sql > > > > > > > WITH > > > > > > > > > > > options. > > > > > > > > > > > > > > > > > > > > > > Now when I am using flink sql to write flink task , > > > > > > sometimes I > > > > > > > think > > > > > > > > > the > > > > > > > > > > > WITH options is too long for user. > > > > > > > > > > > For example,I config the kafka source connector > > > > > > parameter,for > > > > > > > > consumer > > > > > > > > > > > group and brokers parameter: > > > > > > > > > > > > > > > > > > > > > > 'connector.properties.0.key' = 'group.id > > > > > > <http://group.id>' > > > > > > > > > > > > , 'connector.properties.0.value' = 'xxx' > > > > > > > > > > > > , 'connector.properties.1.key' = > > > > > > 'bootstrap.servers' > > > > > > > > > > > > , 'connector.properties.1.value' = 'xxxxx' > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I can understand this config , but for the flink fresh > > > > > > man,maybe it > > > > > > > > > > > is confused for him. > > > > > > > > > > > In my thought, I am really looking forward to this > > > > > > feature,thank > > > > > > > you > > > > > > > > to > > > > > > > > > > > propose this feature. > > > > > > > > > > > > > > > > > > > > > > Best wishes, > > > > > > > > > > > LakeShen > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Jark Wu <[hidden email] <mailto:[hidden email]>> 于 > > > > > > 2020年3月30日周一 下午2:02写道: > > > > > > > > > > > > > > > > > > > > > > > Hi everyone, > > > > > > > > > > > > > > > > > > > > > > > > I want to start a discussion about further improve > and > > > > > > simplify > > > > > > > our > > > > > > > > > > > current > > > > > > > > > > > > connector porperty keys, aka WITH options. Currently, > > > > > > we have a > > > > > > > > > > > > 'connector.' prefix for many properties, but they are > > > > > > verbose, > > > > > > > and > > > > > > > > we > > > > > > > > > > > see a > > > > > > > > > > > > big inconsistency between the properties when > > > > designing > > > > > > FLIP-107. > > > > > > > > > > > > > > > > > > > > > > > > So we propose to remove all the 'connector.' prefix > > > > and > > > > > > rename > > > > > > > > > > > > 'connector.type' to 'connector', 'format.type' to > > > > > > 'format'. So a > > > > > > > > new > > > > > > > > > > > Kafka > > > > > > > > > > > > DDL may look like this: > > > > > > > > > > > > > > > > > > > > > > > > CREATE TABLE kafka_table ( > > > > > > > > > > > > ... > > > > > > > > > > > > ) WITH ( > > > > > > > > > > > > 'connector' = 'kafka', > > > > > > > > > > > > 'version' = '0.10', > > > > > > > > > > > > 'topic' = 'test-topic', > > > > > > > > > > > > 'startup-mode' = 'earliest-offset', > > > > > > > > > > > > 'properties.bootstrap.servers' = 'localhost:9092', > > > > > > > > > > > > 'properties.group.id <http://properties.group.id>' = > > > > > > 'testGroup', > > > > > > > > > > > > 'format' = 'json', > > > > > > > > > > > > 'format.fail-on-missing-field' = 'false' > > > > > > > > > > > > ); > > > > > > > > > > > > > > > > > > > > > > > > The new connector property key set will come together > > > > > > with new > > > > > > > > > Factory > > > > > > > > > > > > inferface which is proposed in FLIP-95. Old > properties > > > > > > are still > > > > > > > > > > > compatible > > > > > > > > > > > > with their existing implementation. New properties > > > > > > are only > > > > > > > > available > > > > > > > > > > in > > > > > > > > > > > > new DynamicTableFactory implementations. > > > > > > > > > > > > > > > > > > > > > > > > You can access the detailed FLIP here: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-122%3A+New+Connector+Property+Keys+for+New+Factory > > > > > > > > > > > > > > > > > > > > > > > > Best, > > > > > > > > > > > > Jark > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > Best, Jingsong Lee > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > > > Benchao Li > > > > > > School of Electronics Engineering and Computer Science, Peking > > > > > > University > > > > > > Tel:+86-15650713730 > > > > > > Email:[hidden email] > > > > > > <mailto:[hidden email]>;[hidden email] > > > > > > <mailto:[hidden email]> > > > > > > > > > > > > > > > > > > > |
Hi everyone,
Thanks for the great feedbacks so far. I updated the FLIP documentation according to the discussion. Changes include: - remove "version" key, and merge it into "connector" - add "scan", "lookup", "sink" prefix to some property keys if they are only used in that case. - add a "New Property Key" section to list all the previous property keys and new property keys. We use "scan" and "lookup" instead of "source" prefix because we should distinguish them and they aligns to FLIP-95 ScanTableSource and LookupTableSource. I also colored red for some major change of property keys in the FLIP. I will list some of them here too: kafka: connector.startup-mode => scan.startup.mode connector.specific-offsets => scan.startup.specific-offsets connector.startup-timestamp-millis => scan.startup.timestamp-millis connector.sink-partitioner & connector.sink-partitioner-class => sink.partitioner elasticsearch: connector.key-delimiter => document-id.key-delimiter # make it explicit that it is used for document id connector.key-null-literal => document-id.key-null-literal # and it also can be used for es sources in the future connector.bulk-flush.back-off.type => sink.bulk-flush.back-off.strategy jdbc: connector.table => table-name Welcome further feedbacks! Best, Jark On Tue, 31 Mar 2020 at 14:45, Jark Wu <[hidden email]> wrote: > Hi Kurt, > > I also prefer "-" as version delimiter now. I didn't remove the "_" > proposal by mistake, that's why I sent another email last night :) > Regarding to "property-version", I also think we shouldn't let users to > learn about this. And ConfigOption provides a good ability > to support deprecated keys and auto-generate documentation for deprecated > keys. > > Hi Danny, > > Regarding to “connector.properties.*”: > In FLIP-95, the Factory#requiredOptions() and Factory#optionalOptions() > inferfaces are only used for generation of documentation. > It does not influence the discovery and validation of a factory. The > validation logic is defined by connectors > in createDynamicTableSource/Sink(). > So you don't have to provide an option for "connector.properties.*". But I > think we should make ConfigOption support wildcard in the long term for a > full story. > > I don't think we should inline all the "connector.properties.*", > otherwise, it will be very tricky for users to configure the properties. > Regarding to FLIP-113, I suggest to provide some ConfigOptions for > commonly used kafka properties and put them in the supportedHintOptions(), > e.g. "connector.properties.group.id", > "connector.properties.fetch.min.bytes". > > Best, > Jark > > > > > > On Tue, 31 Mar 2020 at 12:04, Danny Chan <[hidden email]> wrote: > >> Thanks Jark for bring up this discussion, +1 for this idea, I believe the >> user has suffered from the verbose property key for long time. >> >> Just one question, how do we handle the keys with wildcard, such as the >> “connector.properties.*” in Kafka connector which would then hand-over to >> Kafka client directly. As what suggested in FLIP-95, we use a ConfigOption >> to describe the “supported properties”, then I have to concerns: >> >> • For the new keys, do we still need to put multi-lines there the such >> key, such as “connector.properties.abc” “connector.properties.def”, or >> should we inline them, such as “some-key-prefix” = “k1=v1, k2=v2 ..." >> • Should the ConfigOption support the wildcard ? (If we plan to support >> the current multi-line style) >> >> >> Best, >> Danny Chan >> 在 2020年3月31日 +0800 AM12:37,Jark Wu <[hidden email]>,写道: >> > Hi all, >> > >> > Thanks for the feedbacks. >> > >> > It seems that we have a conclusion to put the version into the factory >> > identifier. I'm also fine with this. >> > If we have this outcome, the interface of Factory#factoryVersion is not >> > needed anymore, this can simplify the learning cost of new factory. >> > We may need to update FLIP-95 and re-vote for it? cc @Timo Walther >> > <[hidden email]> >> > >> > kafka => kafka for 0.11+ versions, we don't suffix "-universal", because >> > the meaning of "universal" not easy to understand. >> > kafka-0.11 => kafka for 0.11 version >> > kafka-0.10 => kafka for 0.10 version >> > elasticsearch-6 => elasticsearch for 6.x versions >> > elasticsearch-7 => elasticsearch for 7.x versions >> > hbase-1.4 => hbase for 1.4.x versions >> > jdbc >> > filesystem >> > >> > We use "-" as the version delimiter to make them to be more consistent. >> > This is not forces, users can also use other delimiters or without >> > delimiter. >> > But this can be a guilde in the Javadoc of Factory, to make the >> connector >> > ecosystem to be more consistent. >> > >> > What do you think? >> > >> > ------------------------ >> > >> > Regarding "connector.property-version": >> > >> > Hi @Dawid Wysakowicz <[hidden email]> , the new fatories are >> > designed not support to read current properties. >> > All the current properties are routed to the old factories if they are >> > using "connector.type". Otherwise, properties are routed to new >> factories. >> > >> > If I understand correctly, the "connector.property-version" is attched >> > implicitly by system, not manually set by users. >> > For example, the framework should add "connector.property-version=1" to >> > properties when processing DDL statement. >> > I'm fine to add a "connector.property-version=1" when processing DDL >> > statement, but I think it's also fine if we don't, >> > because this can be done in the future if need and the default version >> can >> > be 1. >> > >> > Best, >> > Jark >> > >> > On Tue, 31 Mar 2020 at 00:36, Jark Wu <[hidden email]> wrote: >> > >> > > Hi all, >> > > >> > > Thanks for the feedbacks. >> > > >> > > It seems that we have a conclusion to put the version into the factory >> > > identifier. I'm also fine with this. >> > > If we have this outcome, the interface of Factory#factoryVersion is >> not >> > > needed anymore, this can simplify the learning cost of new factory. >> > > We may need to update FLIP-95 and re-vote for it? cc @Timo Walther >> > > <[hidden email]> >> > > >> > > Btw, I would like to use "_" instead of "-" as the version delimiter, >> > > because "-" looks like minus and may confuse users, e.g. >> "elasticsearch-6". >> > > This is not forced, but should be a guilde in the Javadoc of Factory. >> > > I propose to use the following identifiers for existing connectors, >> > > >> > > kafka => kafka for 0.11+ versions, we don't suffix "-universal", >> because >> > > the meaning of "universal" not easy to understand. >> > > kafka-0.11 => kafka for 0.11 version >> > > kafka-0.10 => kafka for 0.10 version >> > > elasticsearch-6 => elasticsearch for 6.x versions >> > > elasticsearch-7 => elasticsearch for 7.x versions >> > > hbase-1.4 => hbase for 1.4.x versions >> > > jdbc >> > > filesystem >> > > >> > > We use "-" as the version delimiter to make them to be more >> consistent. >> > > This is not forces, users can also use other delimiters or without >> > > delimiter. >> > > But this can be a guilde in the Javadoc of Factory, to make the >> connector >> > > ecosystem to be more consistent. >> > > >> > > What do you think? >> > > >> > > ------------------------ >> > > >> > > Regarding "connector.property-version": >> > > >> > > Hi @Dawid Wysakowicz <[hidden email]> , the new fatories are >> > > designed not support to read current properties. >> > > All the current properties are routed to the old factories if they are >> > > using "connector.type". Otherwise, properties are routed to new >> factories. >> > > >> > > If I understand correctly, the "connector.property-version" is attched >> > > implicitly by system, not manually set by users. >> > > For example, the framework should add "connector.property-version=1" >> to >> > > properties when processing DDL statement. >> > > I'm fine to add a "connector.property-version=1" when processing DDL >> > > statement, but I think it's also fine if we don't, >> > > because this can be done in the future if need and the default >> version can >> > > be 1. >> > > >> > > Best, >> > > Jark >> > > >> > > >> > > >> > > >> > > >> > > On Mon, 30 Mar 2020 at 23:21, Dawid Wysakowicz < >> [hidden email]> >> > > wrote: >> > > >> > > > Hi all, >> > > > >> > > > I like the overall design of the FLIP. >> > > > >> > > > As for the withstanding concerns. I kind of like the approach to >> put the >> > > > version into the factory identifier. I think it's the cleanest way >> to >> > > > say that this version actually applies to the connector itself and >> not >> > > > to the system it connects to. BTW, I think the outcome of this >> > > > discussion will affect interfaces described in FLIP-95. If we put >> the >> > > > version into the functionIdentifier, do we need >> Factory#factoryVersion? >> > > > >> > > > I also think it does make sense to have a versioning for the >> properties >> > > > as well. Are we able to read all the current properties with the new >> > > > factories? I think we could use the "connector.property-version" to >> > > > alternate between different Factory interfaces to support the old >> set of >> > > > properties. Otherwise the new factories need to understand both set >> of >> > > > properties, don't they? >> > > > >> > > > Best, >> > > > >> > > > Dawid >> > > > >> > > > On 30/03/2020 17:07, Timo Walther wrote: >> > > > > Hi Jark, >> > > > > >> > > > > thanks for the FLIP. I don't have a strong opinion on >> > > > > DynamicTableFactory#factoryVersion() for distiguishing connector >> > > > > versions. We can also include it in the factory identifier >> itself. For >> > > > > Kafka, I would then just use `kafka` because the naming >> "universal" >> > > > > was just a helper solution at that time. >> > > > > >> > > > > What we need is a good guide for how to design the options. We >> should >> > > > > include that in the JavaDoc of factory interface itself, once it >> is >> > > > > in-place. I like the difference between source and sink in >> properties. >> > > > > >> > > > > Regarding "connector.property-version": >> > > > > >> > > > > It was meant for backwards compatibility along the dimension of >> > > > > "property design" compared to "connector version". However, since >> the >> > > > > connector is now responsible for parsing all properties, it is >> not as >> > > > > necessary as it was before. >> > > > > >> > > > > However, a change of the property design as it is done in FLIP-107 >> > > > > becomes more difficult without a property version and versioning >> is >> > > > > always a good idea in API world. >> > > > > >> > > > > Regards, >> > > > > Timo >> > > > > >> > > > > >> > > > > On 30.03.20 16:38, Benchao Li wrote: >> > > > > > Hi Jark, >> > > > > > >> > > > > > Thanks for starting the discussion. The FLIP LGTM generally. >> > > > > > >> > > > > > Regarding connector version VS put version into connector's >> name, >> > > > > > I favor the latter personally, using one property to locate a >> > > > > > connector can make the error log more precise. >> > > > > > From the users' side, using one property to match a connector >> will >> > > > > > be easier. Especially we have many connectors, >> > > > > > and some of the need version property required, and some of >> them not. >> > > > > > >> > > > > > Regarding Jingsong's suggestion, >> > > > > > IMO, it's a very good complement to the FLIP. Distinguishing >> > > > > > properties for source and sink can be very useful, and >> > > > > > also this will make the connector property more precise. >> > > > > > We are also sick of this for now, we cannot know whether a DDL >> is a >> > > > > > source or sink unless we look through all queries where >> > > > > > the table is used. >> > > > > > Even more, some of the required properties are only required for >> > > > > > source, bug we cannot leave it blank for sink, and vice versa. >> > > > > > I think we can also add a type for dimension tables except >> source and >> > > > > > sink. >> > > > > > >> > > > > > Kurt Young <[hidden email] <mailto:[hidden email]>> >> 于2020年3月30日 >> > > > > > 周一 下午8:16写道: >> > > > > > >> > > > > > > It's not possible for the framework to throw such exception. >> > > > > > Because the >> > > > > > framework doesn't know what versions do the connector support. >> > > > > > >> > > > > > Not really, we can list all valid connectors framework could >> > > > > > found. E.g. >> > > > > > user mistyped 'kafka-0.x', the error message will looks like: >> > > > > > >> > > > > > we don't have any connector named "kafka-0.x", but we have: >> > > > > > FileSystem >> > > > > > Kafka-0.10 >> > > > > > Kafka-0.11 >> > > > > > ElasticSearch6 >> > > > > > ElasticSearch7 >> > > > > > >> > > > > > Best, >> > > > > > Kurt >> > > > > > >> > > > > > >> > > > > > On Mon, Mar 30, 2020 at 5:11 PM Jark Wu <[hidden email] >> > > > > > <mailto:[hidden email]>> wrote: >> > > > > > >> > > > > > > Hi Kurt, >> > > > > > > >> > > > > > > > 2) Lists all available connectors seems also quite >> > > > > > straightforward, e.g >> > > > > > > user provided a wrong "kafka-0.8", we tell user we have >> > > > > > candidates of >> > > > > > > "kakfa-0.11", "kafka-universal" >> > > > > > > It's not possible for the framework to throw such exception. >> > > > > > Because the >> > > > > > > framework doesn't know what versions do the connector support. >> > > > > > All the >> > > > > > > version information is a blackbox in the identifier. But with >> > > > > > > `Factory#factoryVersion()` interface, we can know all the >> > > > > > supported >> > > > > > > versions. >> > > > > > > >> > > > > > > > 3) I don't think so. We can still treat it as the same >> > > > > > connector but with >> > > > > > > different versions. >> > > > > > > That's true but that's weird. Because from the plain DDL >> > > > > > definition, they >> > > > > > > look like different connectors with different "connector" >> > > > > > value, e.g. >> > > > > > > 'connector=kafka-0.8', 'connector=kafka-0.10'. >> > > > > > > >> > > > > > > > If users don't set any version, we will use >> "kafka-universal" >> > > > > > instead. >> > > > > > > The behavior is inconsistent IMO. >> > > > > > > That is a long term vision when there is no kafka clusters >> > > > > > with <0.11 >> > > > > > > version. >> > > > > > > At that point, "universal" is the only supported version in >> > > > Flink >> > > > > > and the >> > > > > > > "version" key can be optional. >> > > > > > > >> > > > > > > --------------------------------------------- >> > > > > > > >> > > > > > > Hi Jingsong, >> > > > > > > >> > > > > > > > "version" vs "kafka.version" >> > > > > > > I though about it. But if we prefix "kafka" to version, we >> > > > should >> > > > > > prefix >> > > > > > > "kafka" for all other property keys, because they are all >> kafka >> > > > > > specific >> > > > > > > options. >> > > > > > > However, that will make the property set verbose, see rejected >> > > > > > option#2 in >> > > > > > > the FLIP. >> > > > > > > >> > > > > > > > explicitly separate options for source and sink >> > > > > > > That's a good topic. It's good to have a guideline for the new >> > > > > > property >> > > > > > > keys. >> > > > > > > I'm fine to prefix with a "source"/"sink" for some connector >> > > > > > keys. >> > > > > > > Actually, we already do this in some connectors, e.g. jdbc and >> > > > > > hbase. >> > > > > > > >> > > > > > > Best, >> > > > > > > Jark >> > > > > > > >> > > > > > > On Mon, 30 Mar 2020 at 16:36, Jingsong Li < >> > > > [hidden email] >> > > > > > <mailto:[hidden email]>> wrote: >> > > > > > > >> > > > > > > > Thanks Jark for the proposal. >> > > > > > > > >> > > > > > > > +1 to the general idea. >> > > > > > > > >> > > > > > > > For "version", what about "kafka.version"? It is obvious to >> > > > > > know its >> > > > > > > > meaning. >> > > > > > > > >> > > > > > > > And I'd like to start a new topic: >> > > > > > > > Should we need to explicitly separate source from sink? >> > > > > > > > With the development of batch and streaming, more and more >> > > > > > connectors >> > > > > > > have >> > > > > > > > both source and sink. >> > > > > > > > >> > > > > > > > So should we set a rule for table properties: >> > > > > > > > - properties for both source and sink: without prefix, like >> > > > > > "topic" >> > > > > > > > - properties for source only: with "source." prefix, like >> > > > > > > > "source.startup-mode" >> > > > > > > > - properties for sink only: with "sink." prefix, like >> > > > > > "sink.partitioner" >> > > > > > > > >> > > > > > > > What do you think? >> > > > > > > > >> > > > > > > > Best, >> > > > > > > > Jingsong Lee >> > > > > > > > >> > > > > > > > On Mon, Mar 30, 2020 at 3:56 PM Jark Wu <[hidden email] >> > > > > > <mailto:[hidden email]>> wrote: >> > > > > > > > >> > > > > > > > > Hi Kurt, >> > > > > > > > > >> > > > > > > > > That's good questions. >> > > > > > > > > >> > > > > > > > > > the meaning of "version" >> > > > > > > > > There are two versions in the old design. One is property >> > > > > > version >> > > > > > > > > "connector.property-version" which can be used for >> backward >> > > > > > > > compatibility. >> > > > > > > > > The other one is "connector.version" which defines the >> > > > > > version of >> > > > > > > > external >> > > > > > > > > system, e.g. 0.11" for kafka, "6" or "7" for ES. >> > > > > > > > > In this proposal, the "version" is the previous >> > > > > > "connector.version". >> > > > > > > The >> > > > > > > > > ""connector.property-version" is not introduced in new >> > > > > > design. >> > > > > > > > > >> > > > > > > > > > how to keep the old capability which can evolve >> connector >> > > > > > properties >> > > > > > > > > The "connector.property-version" is an optional key in the >> > > > > > old design >> > > > > > > and >> > > > > > > > > is never bumped up. >> > > > > > > > > I'm not sure how "connector.property-version" should work >> > > > > > in the >> > > > > > > initial >> > > > > > > > > design. Maybe @Timo Walther <[hidden email] >> > > > > > <mailto:[hidden email]>> has more knowledge on >> > > > > > > > > this. >> > > > > > > > > But for the new properties, every options should be >> > > > > > expressed as >> > > > > > > > > `ConfigOption` which provides `withDeprecatedKeys(...)` >> > > > > > method to >> > > > > > > easily >> > > > > > > > > support evolving keys. >> > > > > > > > > >> > > > > > > > > > a single keys instead of two, e.g. "kafka-0.11", >> > > > > > "kafka-universal"? >> > > > > > > > > There are several benefit to use separate "version" key I >> > > > can >> > > > > > see: >> > > > > > > > > 1) it's more readable to separete them into different >> keys, >> > > > > > because >> > > > > > > they >> > > > > > > > > are orthogonal concepts. >> > > > > > > > > 2) the planner can give all the availble versions in the >> > > > > > exception >> > > > > > > > message, >> > > > > > > > > if user uses a wrong version (this is often reported in >> > > > > > user ML). >> > > > > > > > > 3) If we use "kafka-0.11" as connector identifier, we may >> > > > > > have to >> > > > > > > write a >> > > > > > > > > full documentation for each version, because they are >> > > > > > different >> > > > > > > > > "connector"? >> > > > > > > > > IMO, for 0.11, 0.11, etc... kafka, they are actually >> > > > > > the same >> > > > > > > > connector >> > > > > > > > > but with different "client jar" version, >> > > > > > > > > they share all the same supported property keys and >> > > > > > should reside >> > > > > > > > > together. >> > > > > > > > > 4) IMO, the future vision is version-free. At some point >> in >> > > > > > the future, >> > > > > > > > we >> > > > > > > > > may don't need users to specify kafka version anymore, and >> > > > > > make >> > > > > > > > > "version=universal" as optional or removed in the future. >> > > > > > This is can >> > > > > > > be >> > > > > > > > > done easily if they are separate keys. >> > > > > > > > > >> > > > > > > > > Best, >> > > > > > > > > Jark >> > > > > > > > > >> > > > > > > > > >> > > > > > > > > On Mon, 30 Mar 2020 at 14:45, Kurt Young < >> [hidden email] >> > > > > > <mailto:[hidden email]>> wrote: >> > > > > > > > > >> > > > > > > > > > Hi Jark, >> > > > > > > > > > >> > > > > > > > > > Thanks for the proposal, I'm +1 to the general idea. >> > > > > > However I have a >> > > > > > > > > > question about "version", >> > > > > > > > > > in the old design, the version seems to be aimed for >> > > > > > tracking >> > > > > > > property >> > > > > > > > > > version, with different >> > > > > > > > > > version, we could evolve these step by step without >> > > > > > breaking backward >> > > > > > > > > > compatibility. But in this >> > > > > > > > > > design, version is representing external system's >> > > > > > version, like >> > > > > > > "0.11" >> > > > > > > > > for >> > > > > > > > > > kafka, "6" or "7" for ES. >> > > > > > > > > > I'm not sure if this is necessary, what's the benefit of >> > > > > > using two >> > > > > > > keys >> > > > > > > > > > instead of one, like "kafka-0.11" >> > > > > > > > > > or "ES6" directly? And how about the old capability >> which >> > > > > > could let >> > > > > > > us >> > > > > > > > > > evolving connector properties? >> > > > > > > > > > >> > > > > > > > > > Best, >> > > > > > > > > > Kurt >> > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > On Mon, Mar 30, 2020 at 2:36 PM LakeShen >> > > > > > <[hidden email] <mailto:[hidden email]>> >> > > > > > > > > > wrote: >> > > > > > > > > > >> > > > > > > > > > > Hi Jark, >> > > > > > > > > > > >> > > > > > > > > > > I am really looking forward to this feature. I think >> > > > this >> > > > > > feature >> > > > > > > > > > > could simplify flink sql code,and at the same time , >> > > > > > > > > > > it could make the developer more easlier to config the >> > > > > > flink sql >> > > > > > > WITH >> > > > > > > > > > > options. >> > > > > > > > > > > >> > > > > > > > > > > Now when I am using flink sql to write flink task , >> > > > > > sometimes I >> > > > > > > think >> > > > > > > > > the >> > > > > > > > > > > WITH options is too long for user. >> > > > > > > > > > > For example,I config the kafka source connector >> > > > > > parameter,for >> > > > > > > > consumer >> > > > > > > > > > > group and brokers parameter: >> > > > > > > > > > > >> > > > > > > > > > > 'connector.properties.0.key' = 'group.id >> > > > > > <http://group.id>' >> > > > > > > > > > > > , 'connector.properties.0.value' = 'xxx' >> > > > > > > > > > > > , 'connector.properties.1.key' = >> > > > > > 'bootstrap.servers' >> > > > > > > > > > > > , 'connector.properties.1.value' = 'xxxxx' >> > > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > I can understand this config , but for the flink fresh >> > > > > > man,maybe it >> > > > > > > > > > > is confused for him. >> > > > > > > > > > > In my thought, I am really looking forward to this >> > > > > > feature,thank >> > > > > > > you >> > > > > > > > to >> > > > > > > > > > > propose this feature. >> > > > > > > > > > > >> > > > > > > > > > > Best wishes, >> > > > > > > > > > > LakeShen >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > Jark Wu <[hidden email] <mailto:[hidden email]>> >> 于 >> > > > > > 2020年3月30日周一 下午2:02写道: >> > > > > > > > > > > >> > > > > > > > > > > > Hi everyone, >> > > > > > > > > > > > >> > > > > > > > > > > > I want to start a discussion about further improve >> and >> > > > > > simplify >> > > > > > > our >> > > > > > > > > > > current >> > > > > > > > > > > > connector porperty keys, aka WITH options. >> Currently, >> > > > > > we have a >> > > > > > > > > > > > 'connector.' prefix for many properties, but they >> are >> > > > > > verbose, >> > > > > > > and >> > > > > > > > we >> > > > > > > > > > > see a >> > > > > > > > > > > > big inconsistency between the properties when >> > > > designing >> > > > > > FLIP-107. >> > > > > > > > > > > > >> > > > > > > > > > > > So we propose to remove all the 'connector.' prefix >> > > > and >> > > > > > rename >> > > > > > > > > > > > 'connector.type' to 'connector', 'format.type' to >> > > > > > 'format'. So a >> > > > > > > > new >> > > > > > > > > > > Kafka >> > > > > > > > > > > > DDL may look like this: >> > > > > > > > > > > > >> > > > > > > > > > > > CREATE TABLE kafka_table ( >> > > > > > > > > > > > ... >> > > > > > > > > > > > ) WITH ( >> > > > > > > > > > > > 'connector' = 'kafka', >> > > > > > > > > > > > 'version' = '0.10', >> > > > > > > > > > > > 'topic' = 'test-topic', >> > > > > > > > > > > > 'startup-mode' = 'earliest-offset', >> > > > > > > > > > > > 'properties.bootstrap.servers' = 'localhost:9092', >> > > > > > > > > > > > 'properties.group.id <http://properties.group.id>' >> = >> > > > > > 'testGroup', >> > > > > > > > > > > > 'format' = 'json', >> > > > > > > > > > > > 'format.fail-on-missing-field' = 'false' >> > > > > > > > > > > > ); >> > > > > > > > > > > > >> > > > > > > > > > > > The new connector property key set will come >> together >> > > > > > with new >> > > > > > > > > Factory >> > > > > > > > > > > > inferface which is proposed in FLIP-95. Old >> properties >> > > > > > are still >> > > > > > > > > > > compatible >> > > > > > > > > > > > with their existing implementation. New properties >> > > > > > are only >> > > > > > > > available >> > > > > > > > > > in >> > > > > > > > > > > > new DynamicTableFactory implementations. >> > > > > > > > > > > > >> > > > > > > > > > > > You can access the detailed FLIP here: >> > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > > >> > > > >> https://cwiki.apache.org/confluence/display/FLINK/FLIP-122%3A+New+Connector+Property+Keys+for+New+Factory >> > > > > > > > > > > > >> > > > > > > > > > > > Best, >> > > > > > > > > > > > Jark >> > > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > >> > > > > > > > >> > > > > > > > >> > > > > > > > -- >> > > > > > > > Best, Jingsong Lee >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > > >> > > > > > >> > > > > > -- >> > > > > > >> > > > > > Benchao Li >> > > > > > School of Electronics Engineering and Computer Science, Peking >> > > > > > University >> > > > > > Tel:+86-15650713730 >> > > > > > Email:[hidden email] >> > > > > > <mailto:[hidden email]>;[hidden email] >> > > > > > <mailto:[hidden email]> >> > > > > > >> > > > > >> > > > >> > > >> > |
Hi,
As for the `connector.property-version`. I have doubts that having deprecated keys mechanism is enough to fully support backwards compatibility. I do think we are changing not just the key names, but structure of the keys as well as the discovery mechanism and I think it is really hard to foresee all the interconnections. I don't think users need to learn about the `connector.property-version` too much. By default it would always be set to the newest version. Users could still set this property to an old version if they don't want to migrate the properties, but go fully against the old structure. I think such mechanism could be useful. I understand a similar thing can be achieved with renaming/changing the version in the factoryIdentifier. 'kafka' can use the new properties, 'kafka-v1' the old properties. I think though having a dedicated mechanism could make it a tad easier. I am not strict on that though, just wanted to express it. Best, Dawid On 31/03/2020 08:58, Jark Wu wrote: > Hi everyone, > > Thanks for the great feedbacks so far. > > I updated the FLIP documentation according to the discussion. Changes > include: > - remove "version" key, and merge it into "connector" > - add "scan", "lookup", "sink" prefix to some property keys if they are > only used in that case. > - add a "New Property Key" section to list all the previous property keys > and new property keys. > > We use "scan" and "lookup" instead of "source" prefix because we should > distinguish them and they aligns to FLIP-95 ScanTableSource and > LookupTableSource. > I also colored red for some major change of property keys in the FLIP. I > will list some of them here too: > > kafka: > connector.startup-mode => scan.startup.mode > connector.specific-offsets => scan.startup.specific-offsets > connector.startup-timestamp-millis => scan.startup.timestamp-millis > connector.sink-partitioner & connector.sink-partitioner-class => > sink.partitioner > > elasticsearch: > connector.key-delimiter => document-id.key-delimiter # make it > explicit that it is used for document id > connector.key-null-literal => document-id.key-null-literal # and > it also can be used for es sources in the future > connector.bulk-flush.back-off.type => sink.bulk-flush.back-off.strategy > > jdbc: > connector.table => table-name > > Welcome further feedbacks! > > Best, > Jark > > > On Tue, 31 Mar 2020 at 14:45, Jark Wu <[hidden email]> wrote: > >> Hi Kurt, >> >> I also prefer "-" as version delimiter now. I didn't remove the "_" >> proposal by mistake, that's why I sent another email last night :) >> Regarding to "property-version", I also think we shouldn't let users to >> learn about this. And ConfigOption provides a good ability >> to support deprecated keys and auto-generate documentation for deprecated >> keys. >> >> Hi Danny, >> >> Regarding to “connector.properties.*”: >> In FLIP-95, the Factory#requiredOptions() and Factory#optionalOptions() >> inferfaces are only used for generation of documentation. >> It does not influence the discovery and validation of a factory. The >> validation logic is defined by connectors >> in createDynamicTableSource/Sink(). >> So you don't have to provide an option for "connector.properties.*". But I >> think we should make ConfigOption support wildcard in the long term for a >> full story. >> >> I don't think we should inline all the "connector.properties.*", >> otherwise, it will be very tricky for users to configure the properties. >> Regarding to FLIP-113, I suggest to provide some ConfigOptions for >> commonly used kafka properties and put them in the supportedHintOptions(), >> e.g. "connector.properties.group.id", >> "connector.properties.fetch.min.bytes". >> >> Best, >> Jark >> >> >> >> >> >> On Tue, 31 Mar 2020 at 12:04, Danny Chan <[hidden email]> wrote: >> >>> Thanks Jark for bring up this discussion, +1 for this idea, I believe the >>> user has suffered from the verbose property key for long time. >>> >>> Just one question, how do we handle the keys with wildcard, such as the >>> “connector.properties.*” in Kafka connector which would then hand-over to >>> Kafka client directly. As what suggested in FLIP-95, we use a ConfigOption >>> to describe the “supported properties”, then I have to concerns: >>> >>> • For the new keys, do we still need to put multi-lines there the such >>> key, such as “connector.properties.abc” “connector.properties.def”, or >>> should we inline them, such as “some-key-prefix” = “k1=v1, k2=v2 ..." >>> • Should the ConfigOption support the wildcard ? (If we plan to support >>> the current multi-line style) >>> >>> >>> Best, >>> Danny Chan >>> 在 2020年3月31日 +0800 AM12:37,Jark Wu <[hidden email]>,写道: >>>> Hi all, >>>> >>>> Thanks for the feedbacks. >>>> >>>> It seems that we have a conclusion to put the version into the factory >>>> identifier. I'm also fine with this. >>>> If we have this outcome, the interface of Factory#factoryVersion is not >>>> needed anymore, this can simplify the learning cost of new factory. >>>> We may need to update FLIP-95 and re-vote for it? cc @Timo Walther >>>> <[hidden email]> >>>> >>>> kafka => kafka for 0.11+ versions, we don't suffix "-universal", because >>>> the meaning of "universal" not easy to understand. >>>> kafka-0.11 => kafka for 0.11 version >>>> kafka-0.10 => kafka for 0.10 version >>>> elasticsearch-6 => elasticsearch for 6.x versions >>>> elasticsearch-7 => elasticsearch for 7.x versions >>>> hbase-1.4 => hbase for 1.4.x versions >>>> jdbc >>>> filesystem >>>> >>>> We use "-" as the version delimiter to make them to be more consistent. >>>> This is not forces, users can also use other delimiters or without >>>> delimiter. >>>> But this can be a guilde in the Javadoc of Factory, to make the >>> connector >>>> ecosystem to be more consistent. >>>> >>>> What do you think? >>>> >>>> ------------------------ >>>> >>>> Regarding "connector.property-version": >>>> >>>> Hi @Dawid Wysakowicz <[hidden email]> , the new fatories are >>>> designed not support to read current properties. >>>> All the current properties are routed to the old factories if they are >>>> using "connector.type". Otherwise, properties are routed to new >>> factories. >>>> If I understand correctly, the "connector.property-version" is attched >>>> implicitly by system, not manually set by users. >>>> For example, the framework should add "connector.property-version=1" to >>>> properties when processing DDL statement. >>>> I'm fine to add a "connector.property-version=1" when processing DDL >>>> statement, but I think it's also fine if we don't, >>>> because this can be done in the future if need and the default version >>> can >>>> be 1. >>>> >>>> Best, >>>> Jark >>>> >>>> On Tue, 31 Mar 2020 at 00:36, Jark Wu <[hidden email]> wrote: >>>> >>>>> Hi all, >>>>> >>>>> Thanks for the feedbacks. >>>>> >>>>> It seems that we have a conclusion to put the version into the factory >>>>> identifier. I'm also fine with this. >>>>> If we have this outcome, the interface of Factory#factoryVersion is >>> not >>>>> needed anymore, this can simplify the learning cost of new factory. >>>>> We may need to update FLIP-95 and re-vote for it? cc @Timo Walther >>>>> <[hidden email]> >>>>> >>>>> Btw, I would like to use "_" instead of "-" as the version delimiter, >>>>> because "-" looks like minus and may confuse users, e.g. >>> "elasticsearch-6". >>>>> This is not forced, but should be a guilde in the Javadoc of Factory. >>>>> I propose to use the following identifiers for existing connectors, >>>>> >>>>> kafka => kafka for 0.11+ versions, we don't suffix "-universal", >>> because >>>>> the meaning of "universal" not easy to understand. >>>>> kafka-0.11 => kafka for 0.11 version >>>>> kafka-0.10 => kafka for 0.10 version >>>>> elasticsearch-6 => elasticsearch for 6.x versions >>>>> elasticsearch-7 => elasticsearch for 7.x versions >>>>> hbase-1.4 => hbase for 1.4.x versions >>>>> jdbc >>>>> filesystem >>>>> >>>>> We use "-" as the version delimiter to make them to be more >>> consistent. >>>>> This is not forces, users can also use other delimiters or without >>>>> delimiter. >>>>> But this can be a guilde in the Javadoc of Factory, to make the >>> connector >>>>> ecosystem to be more consistent. >>>>> >>>>> What do you think? >>>>> >>>>> ------------------------ >>>>> >>>>> Regarding "connector.property-version": >>>>> >>>>> Hi @Dawid Wysakowicz <[hidden email]> , the new fatories are >>>>> designed not support to read current properties. >>>>> All the current properties are routed to the old factories if they are >>>>> using "connector.type". Otherwise, properties are routed to new >>> factories. >>>>> If I understand correctly, the "connector.property-version" is attched >>>>> implicitly by system, not manually set by users. >>>>> For example, the framework should add "connector.property-version=1" >>> to >>>>> properties when processing DDL statement. >>>>> I'm fine to add a "connector.property-version=1" when processing DDL >>>>> statement, but I think it's also fine if we don't, >>>>> because this can be done in the future if need and the default >>> version can >>>>> be 1. >>>>> >>>>> Best, >>>>> Jark >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> On Mon, 30 Mar 2020 at 23:21, Dawid Wysakowicz < >>> [hidden email]> >>>>> wrote: >>>>> >>>>>> Hi all, >>>>>> >>>>>> I like the overall design of the FLIP. >>>>>> >>>>>> As for the withstanding concerns. I kind of like the approach to >>> put the >>>>>> version into the factory identifier. I think it's the cleanest way >>> to >>>>>> say that this version actually applies to the connector itself and >>> not >>>>>> to the system it connects to. BTW, I think the outcome of this >>>>>> discussion will affect interfaces described in FLIP-95. If we put >>> the >>>>>> version into the functionIdentifier, do we need >>> Factory#factoryVersion? >>>>>> I also think it does make sense to have a versioning for the >>> properties >>>>>> as well. Are we able to read all the current properties with the new >>>>>> factories? I think we could use the "connector.property-version" to >>>>>> alternate between different Factory interfaces to support the old >>> set of >>>>>> properties. Otherwise the new factories need to understand both set >>> of >>>>>> properties, don't they? >>>>>> >>>>>> Best, >>>>>> >>>>>> Dawid >>>>>> >>>>>> On 30/03/2020 17:07, Timo Walther wrote: >>>>>>> Hi Jark, >>>>>>> >>>>>>> thanks for the FLIP. I don't have a strong opinion on >>>>>>> DynamicTableFactory#factoryVersion() for distiguishing connector >>>>>>> versions. We can also include it in the factory identifier >>> itself. For >>>>>>> Kafka, I would then just use `kafka` because the naming >>> "universal" >>>>>>> was just a helper solution at that time. >>>>>>> >>>>>>> What we need is a good guide for how to design the options. We >>> should >>>>>>> include that in the JavaDoc of factory interface itself, once it >>> is >>>>>>> in-place. I like the difference between source and sink in >>> properties. >>>>>>> Regarding "connector.property-version": >>>>>>> >>>>>>> It was meant for backwards compatibility along the dimension of >>>>>>> "property design" compared to "connector version". However, since >>> the >>>>>>> connector is now responsible for parsing all properties, it is >>> not as >>>>>>> necessary as it was before. >>>>>>> >>>>>>> However, a change of the property design as it is done in FLIP-107 >>>>>>> becomes more difficult without a property version and versioning >>> is >>>>>>> always a good idea in API world. >>>>>>> >>>>>>> Regards, >>>>>>> Timo >>>>>>> >>>>>>> >>>>>>> On 30.03.20 16:38, Benchao Li wrote: >>>>>>>> Hi Jark, >>>>>>>> >>>>>>>> Thanks for starting the discussion. The FLIP LGTM generally. >>>>>>>> >>>>>>>> Regarding connector version VS put version into connector's >>> name, >>>>>>>> I favor the latter personally, using one property to locate a >>>>>>>> connector can make the error log more precise. >>>>>>>> From the users' side, using one property to match a connector >>> will >>>>>>>> be easier. Especially we have many connectors, >>>>>>>> and some of the need version property required, and some of >>> them not. >>>>>>>> Regarding Jingsong's suggestion, >>>>>>>> IMO, it's a very good complement to the FLIP. Distinguishing >>>>>>>> properties for source and sink can be very useful, and >>>>>>>> also this will make the connector property more precise. >>>>>>>> We are also sick of this for now, we cannot know whether a DDL >>> is a >>>>>>>> source or sink unless we look through all queries where >>>>>>>> the table is used. >>>>>>>> Even more, some of the required properties are only required for >>>>>>>> source, bug we cannot leave it blank for sink, and vice versa. >>>>>>>> I think we can also add a type for dimension tables except >>> source and >>>>>>>> sink. >>>>>>>> >>>>>>>> Kurt Young <[hidden email] <mailto:[hidden email]>> >>> 于2020年3月30日 >>>>>>>> 周一 下午8:16写道: >>>>>>>> >>>>>>>>> It's not possible for the framework to throw such exception. >>>>>>>> Because the >>>>>>>> framework doesn't know what versions do the connector support. >>>>>>>> >>>>>>>> Not really, we can list all valid connectors framework could >>>>>>>> found. E.g. >>>>>>>> user mistyped 'kafka-0.x', the error message will looks like: >>>>>>>> >>>>>>>> we don't have any connector named "kafka-0.x", but we have: >>>>>>>> FileSystem >>>>>>>> Kafka-0.10 >>>>>>>> Kafka-0.11 >>>>>>>> ElasticSearch6 >>>>>>>> ElasticSearch7 >>>>>>>> >>>>>>>> Best, >>>>>>>> Kurt >>>>>>>> >>>>>>>> >>>>>>>> On Mon, Mar 30, 2020 at 5:11 PM Jark Wu <[hidden email] >>>>>>>> <mailto:[hidden email]>> wrote: >>>>>>>> >>>>>>>>> Hi Kurt, >>>>>>>>> >>>>>>>>>> 2) Lists all available connectors seems also quite >>>>>>>> straightforward, e.g >>>>>>>>> user provided a wrong "kafka-0.8", we tell user we have >>>>>>>> candidates of >>>>>>>>> "kakfa-0.11", "kafka-universal" >>>>>>>>> It's not possible for the framework to throw such exception. >>>>>>>> Because the >>>>>>>>> framework doesn't know what versions do the connector support. >>>>>>>> All the >>>>>>>>> version information is a blackbox in the identifier. But with >>>>>>>>> `Factory#factoryVersion()` interface, we can know all the >>>>>>>> supported >>>>>>>>> versions. >>>>>>>>> >>>>>>>>>> 3) I don't think so. We can still treat it as the same >>>>>>>> connector but with >>>>>>>>> different versions. >>>>>>>>> That's true but that's weird. Because from the plain DDL >>>>>>>> definition, they >>>>>>>>> look like different connectors with different "connector" >>>>>>>> value, e.g. >>>>>>>>> 'connector=kafka-0.8', 'connector=kafka-0.10'. >>>>>>>>> >>>>>>>>>> If users don't set any version, we will use >>> "kafka-universal" >>>>>>>> instead. >>>>>>>>> The behavior is inconsistent IMO. >>>>>>>>> That is a long term vision when there is no kafka clusters >>>>>>>> with <0.11 >>>>>>>>> version. >>>>>>>>> At that point, "universal" is the only supported version in >>>>>> Flink >>>>>>>> and the >>>>>>>>> "version" key can be optional. >>>>>>>>> >>>>>>>>> --------------------------------------------- >>>>>>>>> >>>>>>>>> Hi Jingsong, >>>>>>>>> >>>>>>>>>> "version" vs "kafka.version" >>>>>>>>> I though about it. But if we prefix "kafka" to version, we >>>>>> should >>>>>>>> prefix >>>>>>>>> "kafka" for all other property keys, because they are all >>> kafka >>>>>>>> specific >>>>>>>>> options. >>>>>>>>> However, that will make the property set verbose, see rejected >>>>>>>> option#2 in >>>>>>>>> the FLIP. >>>>>>>>> >>>>>>>>>> explicitly separate options for source and sink >>>>>>>>> That's a good topic. It's good to have a guideline for the new >>>>>>>> property >>>>>>>>> keys. >>>>>>>>> I'm fine to prefix with a "source"/"sink" for some connector >>>>>>>> keys. >>>>>>>>> Actually, we already do this in some connectors, e.g. jdbc and >>>>>>>> hbase. >>>>>>>>> Best, >>>>>>>>> Jark >>>>>>>>> >>>>>>>>> On Mon, 30 Mar 2020 at 16:36, Jingsong Li < >>>>>> [hidden email] >>>>>>>> <mailto:[hidden email]>> wrote: >>>>>>>>>> Thanks Jark for the proposal. >>>>>>>>>> >>>>>>>>>> +1 to the general idea. >>>>>>>>>> >>>>>>>>>> For "version", what about "kafka.version"? It is obvious to >>>>>>>> know its >>>>>>>>>> meaning. >>>>>>>>>> >>>>>>>>>> And I'd like to start a new topic: >>>>>>>>>> Should we need to explicitly separate source from sink? >>>>>>>>>> With the development of batch and streaming, more and more >>>>>>>> connectors >>>>>>>>> have >>>>>>>>>> both source and sink. >>>>>>>>>> >>>>>>>>>> So should we set a rule for table properties: >>>>>>>>>> - properties for both source and sink: without prefix, like >>>>>>>> "topic" >>>>>>>>>> - properties for source only: with "source." prefix, like >>>>>>>>>> "source.startup-mode" >>>>>>>>>> - properties for sink only: with "sink." prefix, like >>>>>>>> "sink.partitioner" >>>>>>>>>> What do you think? >>>>>>>>>> >>>>>>>>>> Best, >>>>>>>>>> Jingsong Lee >>>>>>>>>> >>>>>>>>>> On Mon, Mar 30, 2020 at 3:56 PM Jark Wu <[hidden email] >>>>>>>> <mailto:[hidden email]>> wrote: >>>>>>>>>>> Hi Kurt, >>>>>>>>>>> >>>>>>>>>>> That's good questions. >>>>>>>>>>> >>>>>>>>>>>> the meaning of "version" >>>>>>>>>>> There are two versions in the old design. One is property >>>>>>>> version >>>>>>>>>>> "connector.property-version" which can be used for >>> backward >>>>>>>>>> compatibility. >>>>>>>>>>> The other one is "connector.version" which defines the >>>>>>>> version of >>>>>>>>>> external >>>>>>>>>>> system, e.g. 0.11" for kafka, "6" or "7" for ES. >>>>>>>>>>> In this proposal, the "version" is the previous >>>>>>>> "connector.version". >>>>>>>>> The >>>>>>>>>>> ""connector.property-version" is not introduced in new >>>>>>>> design. >>>>>>>>>>>> how to keep the old capability which can evolve >>> connector >>>>>>>> properties >>>>>>>>>>> The "connector.property-version" is an optional key in the >>>>>>>> old design >>>>>>>>> and >>>>>>>>>>> is never bumped up. >>>>>>>>>>> I'm not sure how "connector.property-version" should work >>>>>>>> in the >>>>>>>>> initial >>>>>>>>>>> design. Maybe @Timo Walther <[hidden email] >>>>>>>> <mailto:[hidden email]>> has more knowledge on >>>>>>>>>>> this. >>>>>>>>>>> But for the new properties, every options should be >>>>>>>> expressed as >>>>>>>>>>> `ConfigOption` which provides `withDeprecatedKeys(...)` >>>>>>>> method to >>>>>>>>> easily >>>>>>>>>>> support evolving keys. >>>>>>>>>>> >>>>>>>>>>>> a single keys instead of two, e.g. "kafka-0.11", >>>>>>>> "kafka-universal"? >>>>>>>>>>> There are several benefit to use separate "version" key I >>>>>> can >>>>>>>> see: >>>>>>>>>>> 1) it's more readable to separete them into different >>> keys, >>>>>>>> because >>>>>>>>> they >>>>>>>>>>> are orthogonal concepts. >>>>>>>>>>> 2) the planner can give all the availble versions in the >>>>>>>> exception >>>>>>>>>> message, >>>>>>>>>>> if user uses a wrong version (this is often reported in >>>>>>>> user ML). >>>>>>>>>>> 3) If we use "kafka-0.11" as connector identifier, we may >>>>>>>> have to >>>>>>>>> write a >>>>>>>>>>> full documentation for each version, because they are >>>>>>>> different >>>>>>>>>>> "connector"? >>>>>>>>>>> IMO, for 0.11, 0.11, etc... kafka, they are actually >>>>>>>> the same >>>>>>>>>> connector >>>>>>>>>>> but with different "client jar" version, >>>>>>>>>>> they share all the same supported property keys and >>>>>>>> should reside >>>>>>>>>>> together. >>>>>>>>>>> 4) IMO, the future vision is version-free. At some point >>> in >>>>>>>> the future, >>>>>>>>>> we >>>>>>>>>>> may don't need users to specify kafka version anymore, and >>>>>>>> make >>>>>>>>>>> "version=universal" as optional or removed in the future. >>>>>>>> This is can >>>>>>>>> be >>>>>>>>>>> done easily if they are separate keys. >>>>>>>>>>> >>>>>>>>>>> Best, >>>>>>>>>>> Jark >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Mon, 30 Mar 2020 at 14:45, Kurt Young < >>> [hidden email] >>>>>>>> <mailto:[hidden email]>> wrote: >>>>>>>>>>>> Hi Jark, >>>>>>>>>>>> >>>>>>>>>>>> Thanks for the proposal, I'm +1 to the general idea. >>>>>>>> However I have a >>>>>>>>>>>> question about "version", >>>>>>>>>>>> in the old design, the version seems to be aimed for >>>>>>>> tracking >>>>>>>>> property >>>>>>>>>>>> version, with different >>>>>>>>>>>> version, we could evolve these step by step without >>>>>>>> breaking backward >>>>>>>>>>>> compatibility. But in this >>>>>>>>>>>> design, version is representing external system's >>>>>>>> version, like >>>>>>>>> "0.11" >>>>>>>>>>> for >>>>>>>>>>>> kafka, "6" or "7" for ES. >>>>>>>>>>>> I'm not sure if this is necessary, what's the benefit of >>>>>>>> using two >>>>>>>>> keys >>>>>>>>>>>> instead of one, like "kafka-0.11" >>>>>>>>>>>> or "ES6" directly? And how about the old capability >>> which >>>>>>>> could let >>>>>>>>> us >>>>>>>>>>>> evolving connector properties? >>>>>>>>>>>> >>>>>>>>>>>> Best, >>>>>>>>>>>> Kurt >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On Mon, Mar 30, 2020 at 2:36 PM LakeShen >>>>>>>> <[hidden email] <mailto:[hidden email]>> >>>>>>>>>>>> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> Hi Jark, >>>>>>>>>>>>> >>>>>>>>>>>>> I am really looking forward to this feature. I think >>>>>> this >>>>>>>> feature >>>>>>>>>>>>> could simplify flink sql code,and at the same time , >>>>>>>>>>>>> it could make the developer more easlier to config the >>>>>>>> flink sql >>>>>>>>> WITH >>>>>>>>>>>>> options. >>>>>>>>>>>>> >>>>>>>>>>>>> Now when I am using flink sql to write flink task , >>>>>>>> sometimes I >>>>>>>>> think >>>>>>>>>>> the >>>>>>>>>>>>> WITH options is too long for user. >>>>>>>>>>>>> For example,I config the kafka source connector >>>>>>>> parameter,for >>>>>>>>>> consumer >>>>>>>>>>>>> group and brokers parameter: >>>>>>>>>>>>> >>>>>>>>>>>>> 'connector.properties.0.key' = 'group.id >>>>>>>> <http://group.id>' >>>>>>>>>>>>>> , 'connector.properties.0.value' = 'xxx' >>>>>>>>>>>>>> , 'connector.properties.1.key' = >>>>>>>> 'bootstrap.servers' >>>>>>>>>>>>>> , 'connector.properties.1.value' = 'xxxxx' >>>>>>>>>>>>>> >>>>>>>>>>>>> I can understand this config , but for the flink fresh >>>>>>>> man,maybe it >>>>>>>>>>>>> is confused for him. >>>>>>>>>>>>> In my thought, I am really looking forward to this >>>>>>>> feature,thank >>>>>>>>> you >>>>>>>>>> to >>>>>>>>>>>>> propose this feature. >>>>>>>>>>>>> >>>>>>>>>>>>> Best wishes, >>>>>>>>>>>>> LakeShen >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Jark Wu <[hidden email] <mailto:[hidden email]>> >>> 于 >>>>>>>> 2020年3月30日周一 下午2:02写道: >>>>>>>>>>>>>> Hi everyone, >>>>>>>>>>>>>> >>>>>>>>>>>>>> I want to start a discussion about further improve >>> and >>>>>>>> simplify >>>>>>>>> our >>>>>>>>>>>>> current >>>>>>>>>>>>>> connector porperty keys, aka WITH options. >>> Currently, >>>>>>>> we have a >>>>>>>>>>>>>> 'connector.' prefix for many properties, but they >>> are >>>>>>>> verbose, >>>>>>>>> and >>>>>>>>>> we >>>>>>>>>>>>> see a >>>>>>>>>>>>>> big inconsistency between the properties when >>>>>> designing >>>>>>>> FLIP-107. >>>>>>>>>>>>>> So we propose to remove all the 'connector.' prefix >>>>>> and >>>>>>>> rename >>>>>>>>>>>>>> 'connector.type' to 'connector', 'format.type' to >>>>>>>> 'format'. So a >>>>>>>>>> new >>>>>>>>>>>>> Kafka >>>>>>>>>>>>>> DDL may look like this: >>>>>>>>>>>>>> >>>>>>>>>>>>>> CREATE TABLE kafka_table ( >>>>>>>>>>>>>> ... >>>>>>>>>>>>>> ) WITH ( >>>>>>>>>>>>>> 'connector' = 'kafka', >>>>>>>>>>>>>> 'version' = '0.10', >>>>>>>>>>>>>> 'topic' = 'test-topic', >>>>>>>>>>>>>> 'startup-mode' = 'earliest-offset', >>>>>>>>>>>>>> 'properties.bootstrap.servers' = 'localhost:9092', >>>>>>>>>>>>>> 'properties.group.id <http://properties.group.id>' >>> = >>>>>>>> 'testGroup', >>>>>>>>>>>>>> 'format' = 'json', >>>>>>>>>>>>>> 'format.fail-on-missing-field' = 'false' >>>>>>>>>>>>>> ); >>>>>>>>>>>>>> >>>>>>>>>>>>>> The new connector property key set will come >>> together >>>>>>>> with new >>>>>>>>>>> Factory >>>>>>>>>>>>>> inferface which is proposed in FLIP-95. Old >>> properties >>>>>>>> are still >>>>>>>>>>>>> compatible >>>>>>>>>>>>>> with their existing implementation. New properties >>>>>>>> are only >>>>>>>>>> available >>>>>>>>>>>> in >>>>>>>>>>>>>> new DynamicTableFactory implementations. >>>>>>>>>>>>>> >>>>>>>>>>>>>> You can access the detailed FLIP here: >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>> >>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-122%3A+New+Connector+Property+Keys+for+New+Factory >>>>>>>>>>>>>> Best, >>>>>>>>>>>>>> Jark >>>>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> Best, Jingsong Lee >>>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> >>>>>>>> Benchao Li >>>>>>>> School of Electronics Engineering and Computer Science, Peking >>>>>>>> University >>>>>>>> Tel:+86-15650713730 >>>>>>>> Email:[hidden email] >>>>>>>> <mailto:[hidden email]>;[hidden email] >>>>>>>> <mailto:[hidden email]> >>>>>>>> signature.asc (849 bytes) Download Attachment |
Thanks for the update, Jark. +1 for your proposal.
Some minor feedback from my side: sink.bulk-flush -> sink.buffer-flush? zookeeper.znode.parent -> zookeeper.znode-parent? for consistency. username -> secrect.username? password -> secrect.password? How about we prefix options that should not show up in logs with "secret."? I agree with Dawid, we should not mix factory identifier with updates to the property design. I would introduce a general `property-version` option that users can add to their DDL if they want to ensure backwards compatibility. All factories would get this option by default, they don't need to list it explicitly. Regards, Timo On 31.03.20 09:21, Dawid Wysakowicz wrote: > Hi, > > As for the `connector.property-version`. I have doubts that having > deprecated keys mechanism is enough to fully support backwards > compatibility. I do think we are changing not just the key names, but > structure of the keys as well as the discovery mechanism and I think it > is really hard to foresee all the interconnections. I don't think users > need to learn about the `connector.property-version` too much. By > default it would always be set to the newest version. Users could still > set this property to an old version if they don't want to migrate the > properties, but go fully against the old structure. I think such > mechanism could be useful. > > I understand a similar thing can be achieved with renaming/changing the > version in the factoryIdentifier. 'kafka' can use the new properties, > 'kafka-v1' the old properties. I think though having a dedicated > mechanism could make it a tad easier. I am not strict on that though, > just wanted to express it. > > Best, > > Dawid > > On 31/03/2020 08:58, Jark Wu wrote: >> Hi everyone, >> >> Thanks for the great feedbacks so far. >> >> I updated the FLIP documentation according to the discussion. Changes >> include: >> - remove "version" key, and merge it into "connector" >> - add "scan", "lookup", "sink" prefix to some property keys if they are >> only used in that case. >> - add a "New Property Key" section to list all the previous property keys >> and new property keys. >> >> We use "scan" and "lookup" instead of "source" prefix because we should >> distinguish them and they aligns to FLIP-95 ScanTableSource and >> LookupTableSource. >> I also colored red for some major change of property keys in the FLIP. I >> will list some of them here too: >> >> kafka: >> connector.startup-mode => scan.startup.mode >> connector.specific-offsets => scan.startup.specific-offsets >> connector.startup-timestamp-millis => scan.startup.timestamp-millis >> connector.sink-partitioner & connector.sink-partitioner-class => >> sink.partitioner >> >> elasticsearch: >> connector.key-delimiter => document-id.key-delimiter # make it >> explicit that it is used for document id >> connector.key-null-literal => document-id.key-null-literal # and >> it also can be used for es sources in the future >> connector.bulk-flush.back-off.type => sink.bulk-flush.back-off.strategy >> >> jdbc: >> connector.table => table-name >> >> Welcome further feedbacks! >> >> Best, >> Jark >> >> >> On Tue, 31 Mar 2020 at 14:45, Jark Wu <[hidden email]> wrote: >> >>> Hi Kurt, >>> >>> I also prefer "-" as version delimiter now. I didn't remove the "_" >>> proposal by mistake, that's why I sent another email last night :) >>> Regarding to "property-version", I also think we shouldn't let users to >>> learn about this. And ConfigOption provides a good ability >>> to support deprecated keys and auto-generate documentation for deprecated >>> keys. >>> >>> Hi Danny, >>> >>> Regarding to “connector.properties.*”: >>> In FLIP-95, the Factory#requiredOptions() and Factory#optionalOptions() >>> inferfaces are only used for generation of documentation. >>> It does not influence the discovery and validation of a factory. The >>> validation logic is defined by connectors >>> in createDynamicTableSource/Sink(). >>> So you don't have to provide an option for "connector.properties.*". But I >>> think we should make ConfigOption support wildcard in the long term for a >>> full story. >>> >>> I don't think we should inline all the "connector.properties.*", >>> otherwise, it will be very tricky for users to configure the properties. >>> Regarding to FLIP-113, I suggest to provide some ConfigOptions for >>> commonly used kafka properties and put them in the supportedHintOptions(), >>> e.g. "connector.properties.group.id", >>> "connector.properties.fetch.min.bytes". >>> >>> Best, >>> Jark >>> >>> >>> >>> >>> >>> On Tue, 31 Mar 2020 at 12:04, Danny Chan <[hidden email]> wrote: >>> >>>> Thanks Jark for bring up this discussion, +1 for this idea, I believe the >>>> user has suffered from the verbose property key for long time. >>>> >>>> Just one question, how do we handle the keys with wildcard, such as the >>>> “connector.properties.*” in Kafka connector which would then hand-over to >>>> Kafka client directly. As what suggested in FLIP-95, we use a ConfigOption >>>> to describe the “supported properties”, then I have to concerns: >>>> >>>> • For the new keys, do we still need to put multi-lines there the such >>>> key, such as “connector.properties.abc” “connector.properties.def”, or >>>> should we inline them, such as “some-key-prefix” = “k1=v1, k2=v2 ..." >>>> • Should the ConfigOption support the wildcard ? (If we plan to support >>>> the current multi-line style) >>>> >>>> >>>> Best, >>>> Danny Chan >>>> 在 2020年3月31日 +0800 AM12:37,Jark Wu <[hidden email]>,写道: >>>>> Hi all, >>>>> >>>>> Thanks for the feedbacks. >>>>> >>>>> It seems that we have a conclusion to put the version into the factory >>>>> identifier. I'm also fine with this. >>>>> If we have this outcome, the interface of Factory#factoryVersion is not >>>>> needed anymore, this can simplify the learning cost of new factory. >>>>> We may need to update FLIP-95 and re-vote for it? cc @Timo Walther >>>>> <[hidden email]> >>>>> >>>>> kafka => kafka for 0.11+ versions, we don't suffix "-universal", because >>>>> the meaning of "universal" not easy to understand. >>>>> kafka-0.11 => kafka for 0.11 version >>>>> kafka-0.10 => kafka for 0.10 version >>>>> elasticsearch-6 => elasticsearch for 6.x versions >>>>> elasticsearch-7 => elasticsearch for 7.x versions >>>>> hbase-1.4 => hbase for 1.4.x versions >>>>> jdbc >>>>> filesystem >>>>> >>>>> We use "-" as the version delimiter to make them to be more consistent. >>>>> This is not forces, users can also use other delimiters or without >>>>> delimiter. >>>>> But this can be a guilde in the Javadoc of Factory, to make the >>>> connector >>>>> ecosystem to be more consistent. >>>>> >>>>> What do you think? >>>>> >>>>> ------------------------ >>>>> >>>>> Regarding "connector.property-version": >>>>> >>>>> Hi @Dawid Wysakowicz <[hidden email]> , the new fatories are >>>>> designed not support to read current properties. >>>>> All the current properties are routed to the old factories if they are >>>>> using "connector.type". Otherwise, properties are routed to new >>>> factories. >>>>> If I understand correctly, the "connector.property-version" is attched >>>>> implicitly by system, not manually set by users. >>>>> For example, the framework should add "connector.property-version=1" to >>>>> properties when processing DDL statement. >>>>> I'm fine to add a "connector.property-version=1" when processing DDL >>>>> statement, but I think it's also fine if we don't, >>>>> because this can be done in the future if need and the default version >>>> can >>>>> be 1. >>>>> >>>>> Best, >>>>> Jark >>>>> >>>>> On Tue, 31 Mar 2020 at 00:36, Jark Wu <[hidden email]> wrote: >>>>> >>>>>> Hi all, >>>>>> >>>>>> Thanks for the feedbacks. >>>>>> >>>>>> It seems that we have a conclusion to put the version into the factory >>>>>> identifier. I'm also fine with this. >>>>>> If we have this outcome, the interface of Factory#factoryVersion is >>>> not >>>>>> needed anymore, this can simplify the learning cost of new factory. >>>>>> We may need to update FLIP-95 and re-vote for it? cc @Timo Walther >>>>>> <[hidden email]> >>>>>> >>>>>> Btw, I would like to use "_" instead of "-" as the version delimiter, >>>>>> because "-" looks like minus and may confuse users, e.g. >>>> "elasticsearch-6". >>>>>> This is not forced, but should be a guilde in the Javadoc of Factory. >>>>>> I propose to use the following identifiers for existing connectors, >>>>>> >>>>>> kafka => kafka for 0.11+ versions, we don't suffix "-universal", >>>> because >>>>>> the meaning of "universal" not easy to understand. >>>>>> kafka-0.11 => kafka for 0.11 version >>>>>> kafka-0.10 => kafka for 0.10 version >>>>>> elasticsearch-6 => elasticsearch for 6.x versions >>>>>> elasticsearch-7 => elasticsearch for 7.x versions >>>>>> hbase-1.4 => hbase for 1.4.x versions >>>>>> jdbc >>>>>> filesystem >>>>>> >>>>>> We use "-" as the version delimiter to make them to be more >>>> consistent. >>>>>> This is not forces, users can also use other delimiters or without >>>>>> delimiter. >>>>>> But this can be a guilde in the Javadoc of Factory, to make the >>>> connector >>>>>> ecosystem to be more consistent. >>>>>> >>>>>> What do you think? >>>>>> >>>>>> ------------------------ >>>>>> >>>>>> Regarding "connector.property-version": >>>>>> >>>>>> Hi @Dawid Wysakowicz <[hidden email]> , the new fatories are >>>>>> designed not support to read current properties. >>>>>> All the current properties are routed to the old factories if they are >>>>>> using "connector.type". Otherwise, properties are routed to new >>>> factories. >>>>>> If I understand correctly, the "connector.property-version" is attched >>>>>> implicitly by system, not manually set by users. >>>>>> For example, the framework should add "connector.property-version=1" >>>> to >>>>>> properties when processing DDL statement. >>>>>> I'm fine to add a "connector.property-version=1" when processing DDL >>>>>> statement, but I think it's also fine if we don't, >>>>>> because this can be done in the future if need and the default >>>> version can >>>>>> be 1. >>>>>> >>>>>> Best, >>>>>> Jark >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> On Mon, 30 Mar 2020 at 23:21, Dawid Wysakowicz < >>>> [hidden email]> >>>>>> wrote: >>>>>> >>>>>>> Hi all, >>>>>>> >>>>>>> I like the overall design of the FLIP. >>>>>>> >>>>>>> As for the withstanding concerns. I kind of like the approach to >>>> put the >>>>>>> version into the factory identifier. I think it's the cleanest way >>>> to >>>>>>> say that this version actually applies to the connector itself and >>>> not >>>>>>> to the system it connects to. BTW, I think the outcome of this >>>>>>> discussion will affect interfaces described in FLIP-95. If we put >>>> the >>>>>>> version into the functionIdentifier, do we need >>>> Factory#factoryVersion? >>>>>>> I also think it does make sense to have a versioning for the >>>> properties >>>>>>> as well. Are we able to read all the current properties with the new >>>>>>> factories? I think we could use the "connector.property-version" to >>>>>>> alternate between different Factory interfaces to support the old >>>> set of >>>>>>> properties. Otherwise the new factories need to understand both set >>>> of >>>>>>> properties, don't they? >>>>>>> >>>>>>> Best, >>>>>>> >>>>>>> Dawid >>>>>>> >>>>>>> On 30/03/2020 17:07, Timo Walther wrote: >>>>>>>> Hi Jark, >>>>>>>> >>>>>>>> thanks for the FLIP. I don't have a strong opinion on >>>>>>>> DynamicTableFactory#factoryVersion() for distiguishing connector >>>>>>>> versions. We can also include it in the factory identifier >>>> itself. For >>>>>>>> Kafka, I would then just use `kafka` because the naming >>>> "universal" >>>>>>>> was just a helper solution at that time. >>>>>>>> >>>>>>>> What we need is a good guide for how to design the options. We >>>> should >>>>>>>> include that in the JavaDoc of factory interface itself, once it >>>> is >>>>>>>> in-place. I like the difference between source and sink in >>>> properties. >>>>>>>> Regarding "connector.property-version": >>>>>>>> >>>>>>>> It was meant for backwards compatibility along the dimension of >>>>>>>> "property design" compared to "connector version". However, since >>>> the >>>>>>>> connector is now responsible for parsing all properties, it is >>>> not as >>>>>>>> necessary as it was before. >>>>>>>> >>>>>>>> However, a change of the property design as it is done in FLIP-107 >>>>>>>> becomes more difficult without a property version and versioning >>>> is >>>>>>>> always a good idea in API world. >>>>>>>> >>>>>>>> Regards, >>>>>>>> Timo >>>>>>>> >>>>>>>> >>>>>>>> On 30.03.20 16:38, Benchao Li wrote: >>>>>>>>> Hi Jark, >>>>>>>>> >>>>>>>>> Thanks for starting the discussion. The FLIP LGTM generally. >>>>>>>>> >>>>>>>>> Regarding connector version VS put version into connector's >>>> name, >>>>>>>>> I favor the latter personally, using one property to locate a >>>>>>>>> connector can make the error log more precise. >>>>>>>>> From the users' side, using one property to match a connector >>>> will >>>>>>>>> be easier. Especially we have many connectors, >>>>>>>>> and some of the need version property required, and some of >>>> them not. >>>>>>>>> Regarding Jingsong's suggestion, >>>>>>>>> IMO, it's a very good complement to the FLIP. Distinguishing >>>>>>>>> properties for source and sink can be very useful, and >>>>>>>>> also this will make the connector property more precise. >>>>>>>>> We are also sick of this for now, we cannot know whether a DDL >>>> is a >>>>>>>>> source or sink unless we look through all queries where >>>>>>>>> the table is used. >>>>>>>>> Even more, some of the required properties are only required for >>>>>>>>> source, bug we cannot leave it blank for sink, and vice versa. >>>>>>>>> I think we can also add a type for dimension tables except >>>> source and >>>>>>>>> sink. >>>>>>>>> >>>>>>>>> Kurt Young <[hidden email] <mailto:[hidden email]>> >>>> 于2020年3月30日 >>>>>>>>> 周一 下午8:16写道: >>>>>>>>> >>>>>>>>>> It's not possible for the framework to throw such exception. >>>>>>>>> Because the >>>>>>>>> framework doesn't know what versions do the connector support. >>>>>>>>> >>>>>>>>> Not really, we can list all valid connectors framework could >>>>>>>>> found. E.g. >>>>>>>>> user mistyped 'kafka-0.x', the error message will looks like: >>>>>>>>> >>>>>>>>> we don't have any connector named "kafka-0.x", but we have: >>>>>>>>> FileSystem >>>>>>>>> Kafka-0.10 >>>>>>>>> Kafka-0.11 >>>>>>>>> ElasticSearch6 >>>>>>>>> ElasticSearch7 >>>>>>>>> >>>>>>>>> Best, >>>>>>>>> Kurt >>>>>>>>> >>>>>>>>> >>>>>>>>> On Mon, Mar 30, 2020 at 5:11 PM Jark Wu <[hidden email] >>>>>>>>> <mailto:[hidden email]>> wrote: >>>>>>>>> >>>>>>>>>> Hi Kurt, >>>>>>>>>> >>>>>>>>>>> 2) Lists all available connectors seems also quite >>>>>>>>> straightforward, e.g >>>>>>>>>> user provided a wrong "kafka-0.8", we tell user we have >>>>>>>>> candidates of >>>>>>>>>> "kakfa-0.11", "kafka-universal" >>>>>>>>>> It's not possible for the framework to throw such exception. >>>>>>>>> Because the >>>>>>>>>> framework doesn't know what versions do the connector support. >>>>>>>>> All the >>>>>>>>>> version information is a blackbox in the identifier. But with >>>>>>>>>> `Factory#factoryVersion()` interface, we can know all the >>>>>>>>> supported >>>>>>>>>> versions. >>>>>>>>>> >>>>>>>>>>> 3) I don't think so. We can still treat it as the same >>>>>>>>> connector but with >>>>>>>>>> different versions. >>>>>>>>>> That's true but that's weird. Because from the plain DDL >>>>>>>>> definition, they >>>>>>>>>> look like different connectors with different "connector" >>>>>>>>> value, e.g. >>>>>>>>>> 'connector=kafka-0.8', 'connector=kafka-0.10'. >>>>>>>>>> >>>>>>>>>>> If users don't set any version, we will use >>>> "kafka-universal" >>>>>>>>> instead. >>>>>>>>>> The behavior is inconsistent IMO. >>>>>>>>>> That is a long term vision when there is no kafka clusters >>>>>>>>> with <0.11 >>>>>>>>>> version. >>>>>>>>>> At that point, "universal" is the only supported version in >>>>>>> Flink >>>>>>>>> and the >>>>>>>>>> "version" key can be optional. >>>>>>>>>> >>>>>>>>>> --------------------------------------------- >>>>>>>>>> >>>>>>>>>> Hi Jingsong, >>>>>>>>>> >>>>>>>>>>> "version" vs "kafka.version" >>>>>>>>>> I though about it. But if we prefix "kafka" to version, we >>>>>>> should >>>>>>>>> prefix >>>>>>>>>> "kafka" for all other property keys, because they are all >>>> kafka >>>>>>>>> specific >>>>>>>>>> options. >>>>>>>>>> However, that will make the property set verbose, see rejected >>>>>>>>> option#2 in >>>>>>>>>> the FLIP. >>>>>>>>>> >>>>>>>>>>> explicitly separate options for source and sink >>>>>>>>>> That's a good topic. It's good to have a guideline for the new >>>>>>>>> property >>>>>>>>>> keys. >>>>>>>>>> I'm fine to prefix with a "source"/"sink" for some connector >>>>>>>>> keys. >>>>>>>>>> Actually, we already do this in some connectors, e.g. jdbc and >>>>>>>>> hbase. >>>>>>>>>> Best, >>>>>>>>>> Jark >>>>>>>>>> >>>>>>>>>> On Mon, 30 Mar 2020 at 16:36, Jingsong Li < >>>>>>> [hidden email] >>>>>>>>> <mailto:[hidden email]>> wrote: >>>>>>>>>>> Thanks Jark for the proposal. >>>>>>>>>>> >>>>>>>>>>> +1 to the general idea. >>>>>>>>>>> >>>>>>>>>>> For "version", what about "kafka.version"? It is obvious to >>>>>>>>> know its >>>>>>>>>>> meaning. >>>>>>>>>>> >>>>>>>>>>> And I'd like to start a new topic: >>>>>>>>>>> Should we need to explicitly separate source from sink? >>>>>>>>>>> With the development of batch and streaming, more and more >>>>>>>>> connectors >>>>>>>>>> have >>>>>>>>>>> both source and sink. >>>>>>>>>>> >>>>>>>>>>> So should we set a rule for table properties: >>>>>>>>>>> - properties for both source and sink: without prefix, like >>>>>>>>> "topic" >>>>>>>>>>> - properties for source only: with "source." prefix, like >>>>>>>>>>> "source.startup-mode" >>>>>>>>>>> - properties for sink only: with "sink." prefix, like >>>>>>>>> "sink.partitioner" >>>>>>>>>>> What do you think? >>>>>>>>>>> >>>>>>>>>>> Best, >>>>>>>>>>> Jingsong Lee >>>>>>>>>>> >>>>>>>>>>> On Mon, Mar 30, 2020 at 3:56 PM Jark Wu <[hidden email] >>>>>>>>> <mailto:[hidden email]>> wrote: >>>>>>>>>>>> Hi Kurt, >>>>>>>>>>>> >>>>>>>>>>>> That's good questions. >>>>>>>>>>>> >>>>>>>>>>>>> the meaning of "version" >>>>>>>>>>>> There are two versions in the old design. One is property >>>>>>>>> version >>>>>>>>>>>> "connector.property-version" which can be used for >>>> backward >>>>>>>>>>> compatibility. >>>>>>>>>>>> The other one is "connector.version" which defines the >>>>>>>>> version of >>>>>>>>>>> external >>>>>>>>>>>> system, e.g. 0.11" for kafka, "6" or "7" for ES. >>>>>>>>>>>> In this proposal, the "version" is the previous >>>>>>>>> "connector.version". >>>>>>>>>> The >>>>>>>>>>>> ""connector.property-version" is not introduced in new >>>>>>>>> design. >>>>>>>>>>>>> how to keep the old capability which can evolve >>>> connector >>>>>>>>> properties >>>>>>>>>>>> The "connector.property-version" is an optional key in the >>>>>>>>> old design >>>>>>>>>> and >>>>>>>>>>>> is never bumped up. >>>>>>>>>>>> I'm not sure how "connector.property-version" should work >>>>>>>>> in the >>>>>>>>>> initial >>>>>>>>>>>> design. Maybe @Timo Walther <[hidden email] >>>>>>>>> <mailto:[hidden email]>> has more knowledge on >>>>>>>>>>>> this. >>>>>>>>>>>> But for the new properties, every options should be >>>>>>>>> expressed as >>>>>>>>>>>> `ConfigOption` which provides `withDeprecatedKeys(...)` >>>>>>>>> method to >>>>>>>>>> easily >>>>>>>>>>>> support evolving keys. >>>>>>>>>>>> >>>>>>>>>>>>> a single keys instead of two, e.g. "kafka-0.11", >>>>>>>>> "kafka-universal"? >>>>>>>>>>>> There are several benefit to use separate "version" key I >>>>>>> can >>>>>>>>> see: >>>>>>>>>>>> 1) it's more readable to separete them into different >>>> keys, >>>>>>>>> because >>>>>>>>>> they >>>>>>>>>>>> are orthogonal concepts. >>>>>>>>>>>> 2) the planner can give all the availble versions in the >>>>>>>>> exception >>>>>>>>>>> message, >>>>>>>>>>>> if user uses a wrong version (this is often reported in >>>>>>>>> user ML). >>>>>>>>>>>> 3) If we use "kafka-0.11" as connector identifier, we may >>>>>>>>> have to >>>>>>>>>> write a >>>>>>>>>>>> full documentation for each version, because they are >>>>>>>>> different >>>>>>>>>>>> "connector"? >>>>>>>>>>>> IMO, for 0.11, 0.11, etc... kafka, they are actually >>>>>>>>> the same >>>>>>>>>>> connector >>>>>>>>>>>> but with different "client jar" version, >>>>>>>>>>>> they share all the same supported property keys and >>>>>>>>> should reside >>>>>>>>>>>> together. >>>>>>>>>>>> 4) IMO, the future vision is version-free. At some point >>>> in >>>>>>>>> the future, >>>>>>>>>>> we >>>>>>>>>>>> may don't need users to specify kafka version anymore, and >>>>>>>>> make >>>>>>>>>>>> "version=universal" as optional or removed in the future. >>>>>>>>> This is can >>>>>>>>>> be >>>>>>>>>>>> done easily if they are separate keys. >>>>>>>>>>>> >>>>>>>>>>>> Best, >>>>>>>>>>>> Jark >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On Mon, 30 Mar 2020 at 14:45, Kurt Young < >>>> [hidden email] >>>>>>>>> <mailto:[hidden email]>> wrote: >>>>>>>>>>>>> Hi Jark, >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks for the proposal, I'm +1 to the general idea. >>>>>>>>> However I have a >>>>>>>>>>>>> question about "version", >>>>>>>>>>>>> in the old design, the version seems to be aimed for >>>>>>>>> tracking >>>>>>>>>> property >>>>>>>>>>>>> version, with different >>>>>>>>>>>>> version, we could evolve these step by step without >>>>>>>>> breaking backward >>>>>>>>>>>>> compatibility. But in this >>>>>>>>>>>>> design, version is representing external system's >>>>>>>>> version, like >>>>>>>>>> "0.11" >>>>>>>>>>>> for >>>>>>>>>>>>> kafka, "6" or "7" for ES. >>>>>>>>>>>>> I'm not sure if this is necessary, what's the benefit of >>>>>>>>> using two >>>>>>>>>> keys >>>>>>>>>>>>> instead of one, like "kafka-0.11" >>>>>>>>>>>>> or "ES6" directly? And how about the old capability >>>> which >>>>>>>>> could let >>>>>>>>>> us >>>>>>>>>>>>> evolving connector properties? >>>>>>>>>>>>> >>>>>>>>>>>>> Best, >>>>>>>>>>>>> Kurt >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On Mon, Mar 30, 2020 at 2:36 PM LakeShen >>>>>>>>> <[hidden email] <mailto:[hidden email]>> >>>>>>>>>>>>> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> Hi Jark, >>>>>>>>>>>>>> >>>>>>>>>>>>>> I am really looking forward to this feature. I think >>>>>>> this >>>>>>>>> feature >>>>>>>>>>>>>> could simplify flink sql code,and at the same time , >>>>>>>>>>>>>> it could make the developer more easlier to config the >>>>>>>>> flink sql >>>>>>>>>> WITH >>>>>>>>>>>>>> options. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Now when I am using flink sql to write flink task , >>>>>>>>> sometimes I >>>>>>>>>> think >>>>>>>>>>>> the >>>>>>>>>>>>>> WITH options is too long for user. >>>>>>>>>>>>>> For example,I config the kafka source connector >>>>>>>>> parameter,for >>>>>>>>>>> consumer >>>>>>>>>>>>>> group and brokers parameter: >>>>>>>>>>>>>> >>>>>>>>>>>>>> 'connector.properties.0.key' = 'group.id >>>>>>>>> <http://group.id>' >>>>>>>>>>>>>>> , 'connector.properties.0.value' = 'xxx' >>>>>>>>>>>>>>> , 'connector.properties.1.key' = >>>>>>>>> 'bootstrap.servers' >>>>>>>>>>>>>>> , 'connector.properties.1.value' = 'xxxxx' >>>>>>>>>>>>>>> >>>>>>>>>>>>>> I can understand this config , but for the flink fresh >>>>>>>>> man,maybe it >>>>>>>>>>>>>> is confused for him. >>>>>>>>>>>>>> In my thought, I am really looking forward to this >>>>>>>>> feature,thank >>>>>>>>>> you >>>>>>>>>>> to >>>>>>>>>>>>>> propose this feature. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Best wishes, >>>>>>>>>>>>>> LakeShen >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Jark Wu <[hidden email] <mailto:[hidden email]>> >>>> 于 >>>>>>>>> 2020年3月30日周一 下午2:02写道: >>>>>>>>>>>>>>> Hi everyone, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I want to start a discussion about further improve >>>> and >>>>>>>>> simplify >>>>>>>>>> our >>>>>>>>>>>>>> current >>>>>>>>>>>>>>> connector porperty keys, aka WITH options. >>>> Currently, >>>>>>>>> we have a >>>>>>>>>>>>>>> 'connector.' prefix for many properties, but they >>>> are >>>>>>>>> verbose, >>>>>>>>>> and >>>>>>>>>>> we >>>>>>>>>>>>>> see a >>>>>>>>>>>>>>> big inconsistency between the properties when >>>>>>> designing >>>>>>>>> FLIP-107. >>>>>>>>>>>>>>> So we propose to remove all the 'connector.' prefix >>>>>>> and >>>>>>>>> rename >>>>>>>>>>>>>>> 'connector.type' to 'connector', 'format.type' to >>>>>>>>> 'format'. So a >>>>>>>>>>> new >>>>>>>>>>>>>> Kafka >>>>>>>>>>>>>>> DDL may look like this: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> CREATE TABLE kafka_table ( >>>>>>>>>>>>>>> ... >>>>>>>>>>>>>>> ) WITH ( >>>>>>>>>>>>>>> 'connector' = 'kafka', >>>>>>>>>>>>>>> 'version' = '0.10', >>>>>>>>>>>>>>> 'topic' = 'test-topic', >>>>>>>>>>>>>>> 'startup-mode' = 'earliest-offset', >>>>>>>>>>>>>>> 'properties.bootstrap.servers' = 'localhost:9092', >>>>>>>>>>>>>>> 'properties.group.id <http://properties.group.id>' >>>> = >>>>>>>>> 'testGroup', >>>>>>>>>>>>>>> 'format' = 'json', >>>>>>>>>>>>>>> 'format.fail-on-missing-field' = 'false' >>>>>>>>>>>>>>> ); >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> The new connector property key set will come >>>> together >>>>>>>>> with new >>>>>>>>>>>> Factory >>>>>>>>>>>>>>> inferface which is proposed in FLIP-95. Old >>>> properties >>>>>>>>> are still >>>>>>>>>>>>>> compatible >>>>>>>>>>>>>>> with their existing implementation. New properties >>>>>>>>> are only >>>>>>>>>>> available >>>>>>>>>>>>> in >>>>>>>>>>>>>>> new DynamicTableFactory implementations. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> You can access the detailed FLIP here: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>> >>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-122%3A+New+Connector+Property+Keys+for+New+Factory >>>>>>>>>>>>>>> Best, >>>>>>>>>>>>>>> Jark >>>>>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> -- >>>>>>>>>>> Best, Jingsong Lee >>>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> >>>>>>>>> Benchao Li >>>>>>>>> School of Electronics Engineering and Computer Science, Peking >>>>>>>>> University >>>>>>>>> Tel:+86-15650713730 >>>>>>>>> Email:[hidden email] >>>>>>>>> <mailto:[hidden email]>;[hidden email] >>>>>>>>> <mailto:[hidden email]> >>>>>>>>> > |
Free forum by Nabble | Edit this page |