[DISCUSS] FLIP-122: New Connector Property Keys for New Factory

classic Classic list List threaded Threaded
25 messages Options
12
Reply | Threaded
Open this post in threaded view
|

[DISCUSS] FLIP-122: New Connector Property Keys for New Factory

Jark Wu-2
Hi everyone,

I want to start a discussion about further improve and simplify our current
connector porperty keys, aka WITH options. Currently, we have a
'connector.' prefix for many properties, but they are verbose, and we see a
big inconsistency between the properties when designing FLIP-107.

So we propose to remove all the 'connector.' prefix and rename
'connector.type' to 'connector', 'format.type' to 'format'. So a new Kafka
DDL may look like this:

CREATE TABLE kafka_table (
 ...
) WITH (
 'connector' = 'kafka',
 'version' = '0.10',
 'topic' = 'test-topic',
 'startup-mode' = 'earliest-offset',
 'properties.bootstrap.servers' = 'localhost:9092',
 'properties.group.id' = 'testGroup',
 'format' = 'json',
 'format.fail-on-missing-field' = 'false'
);

The new connector property key set will come together with new Factory
inferface which is proposed in FLIP-95. Old properties are still compatible
with their existing implementation. New properties are only available in
new DynamicTableFactory implementations.

You can access the detailed FLIP here:
https://cwiki.apache.org/confluence/display/FLINK/FLIP-122%3A+New+Connector+Property+Keys+for+New+Factory

Best,
Jark
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-122: New Connector Property Keys for New Factory

LakeShen
Hi Jark,

I am really looking forward to this feature. I think this feature
could simplify flink sql code,and at the same time ,
it could make the developer more easlier to config the flink sql WITH
options.

Now when I am using flink sql to write flink task , sometimes I think the
WITH options is too long for user.
For example,I config the kafka source connector parameter,for consumer
group and brokers parameter:

          'connector.properties.0.key' = 'group.id'
>          , 'connector.properties.0.value' = 'xxx'
>          , 'connector.properties.1.key' = 'bootstrap.servers'
>          , 'connector.properties.1.value' = 'xxxxx'
>

I can understand this config , but for the flink fresh man,maybe it
is confused for him.
In my thought, I am really looking forward to this feature,thank you to
propose this feature.

Best wishes,
LakeShen


Jark Wu <[hidden email]> 于2020年3月30日周一 下午2:02写道:

> Hi everyone,
>
> I want to start a discussion about further improve and simplify our current
> connector porperty keys, aka WITH options. Currently, we have a
> 'connector.' prefix for many properties, but they are verbose, and we see a
> big inconsistency between the properties when designing FLIP-107.
>
> So we propose to remove all the 'connector.' prefix and rename
> 'connector.type' to 'connector', 'format.type' to 'format'. So a new Kafka
> DDL may look like this:
>
> CREATE TABLE kafka_table (
>  ...
> ) WITH (
>  'connector' = 'kafka',
>  'version' = '0.10',
>  'topic' = 'test-topic',
>  'startup-mode' = 'earliest-offset',
>  'properties.bootstrap.servers' = 'localhost:9092',
>  'properties.group.id' = 'testGroup',
>  'format' = 'json',
>  'format.fail-on-missing-field' = 'false'
> );
>
> The new connector property key set will come together with new Factory
> inferface which is proposed in FLIP-95. Old properties are still compatible
> with their existing implementation. New properties are only available in
> new DynamicTableFactory implementations.
>
> You can access the detailed FLIP here:
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-122%3A+New+Connector+Property+Keys+for+New+Factory
>
> Best,
> Jark
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-122: New Connector Property Keys for New Factory

Kurt Young
Hi Jark,

Thanks for the proposal, I'm +1 to the general idea. However I have a
question about "version",
in the old design, the version seems to be aimed for tracking property
version, with different
version, we could evolve these step by step without breaking backward
compatibility. But in this
design, version is representing external system's version, like "0.11" for
kafka, "6" or "7" for ES.
I'm not sure if this is necessary, what's the benefit of using two keys
instead of one, like "kafka-0.11"
or "ES6" directly? And how about the old capability which could let us
evolving connector properties?

Best,
Kurt


On Mon, Mar 30, 2020 at 2:36 PM LakeShen <[hidden email]> wrote:

> Hi Jark,
>
> I am really looking forward to this feature. I think this feature
> could simplify flink sql code,and at the same time ,
> it could make the developer more easlier to config the flink sql WITH
> options.
>
> Now when I am using flink sql to write flink task , sometimes I think the
> WITH options is too long for user.
> For example,I config the kafka source connector parameter,for consumer
> group and brokers parameter:
>
>           'connector.properties.0.key' = 'group.id'
> >          , 'connector.properties.0.value' = 'xxx'
> >          , 'connector.properties.1.key' = 'bootstrap.servers'
> >          , 'connector.properties.1.value' = 'xxxxx'
> >
>
> I can understand this config , but for the flink fresh man,maybe it
> is confused for him.
> In my thought, I am really looking forward to this feature,thank you to
> propose this feature.
>
> Best wishes,
> LakeShen
>
>
> Jark Wu <[hidden email]> 于2020年3月30日周一 下午2:02写道:
>
> > Hi everyone,
> >
> > I want to start a discussion about further improve and simplify our
> current
> > connector porperty keys, aka WITH options. Currently, we have a
> > 'connector.' prefix for many properties, but they are verbose, and we
> see a
> > big inconsistency between the properties when designing FLIP-107.
> >
> > So we propose to remove all the 'connector.' prefix and rename
> > 'connector.type' to 'connector', 'format.type' to 'format'. So a new
> Kafka
> > DDL may look like this:
> >
> > CREATE TABLE kafka_table (
> >  ...
> > ) WITH (
> >  'connector' = 'kafka',
> >  'version' = '0.10',
> >  'topic' = 'test-topic',
> >  'startup-mode' = 'earliest-offset',
> >  'properties.bootstrap.servers' = 'localhost:9092',
> >  'properties.group.id' = 'testGroup',
> >  'format' = 'json',
> >  'format.fail-on-missing-field' = 'false'
> > );
> >
> > The new connector property key set will come together with new Factory
> > inferface which is proposed in FLIP-95. Old properties are still
> compatible
> > with their existing implementation. New properties are only available in
> > new DynamicTableFactory implementations.
> >
> > You can access the detailed FLIP here:
> >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-122%3A+New+Connector+Property+Keys+for+New+Factory
> >
> > Best,
> > Jark
> >
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-122: New Connector Property Keys for New Factory

Jark Wu-2
Hi Kurt,

That's good questions.

> the meaning of "version"
There are two versions in the old design. One is property version
"connector.property-version" which can be used for backward compatibility.
The other one is "connector.version" which defines the version of external
system, e.g. 0.11" for kafka, "6" or "7" for ES.
In this proposal, the "version" is the previous "connector.version". The
""connector.property-version" is not introduced in new design.

> how to keep the old capability which can evolve connector properties
The "connector.property-version" is an optional key in the old design and
is never bumped up.
I'm not sure how "connector.property-version" should work in the initial
design. Maybe @Timo Walther <[hidden email]> has more knowledge on
this.
But for the new properties, every options should be expressed as
`ConfigOption` which provides `withDeprecatedKeys(...)` method to easily
support evolving keys.

> a single keys instead of two, e.g. "kafka-0.11", "kafka-universal"?
There are several benefit to use separate "version" key I can see:
1) it's more readable to separete them into different keys, because they
are orthogonal concepts.
2) the planner can give all the availble versions in the exception message,
if user uses a wrong version (this is often reported in user ML).
3) If we use "kafka-0.11" as connector identifier, we may have to write a
full documentation for each version, because they are different "connector"?
    IMO, for 0.11, 0.11, etc... kafka, they are actually the same connector
but with different "client jar" version,
    they share all the same supported property keys and should reside
together.
4) IMO, the future vision is version-free. At some point in the future, we
may don't need users to specify kafka version anymore, and make
"version=universal" as optional or removed in the future. This is can be
done easily if they are separate keys.

Best,
Jark


On Mon, 30 Mar 2020 at 14:45, Kurt Young <[hidden email]> wrote:

> Hi Jark,
>
> Thanks for the proposal, I'm +1 to the general idea. However I have a
> question about "version",
> in the old design, the version seems to be aimed for tracking property
> version, with different
> version, we could evolve these step by step without breaking backward
> compatibility. But in this
> design, version is representing external system's version, like "0.11" for
> kafka, "6" or "7" for ES.
> I'm not sure if this is necessary, what's the benefit of using two keys
> instead of one, like "kafka-0.11"
> or "ES6" directly? And how about the old capability which could let us
> evolving connector properties?
>
> Best,
> Kurt
>
>
> On Mon, Mar 30, 2020 at 2:36 PM LakeShen <[hidden email]>
> wrote:
>
> > Hi Jark,
> >
> > I am really looking forward to this feature. I think this feature
> > could simplify flink sql code,and at the same time ,
> > it could make the developer more easlier to config the flink sql WITH
> > options.
> >
> > Now when I am using flink sql to write flink task , sometimes I think the
> > WITH options is too long for user.
> > For example,I config the kafka source connector parameter,for consumer
> > group and brokers parameter:
> >
> >           'connector.properties.0.key' = 'group.id'
> > >          , 'connector.properties.0.value' = 'xxx'
> > >          , 'connector.properties.1.key' = 'bootstrap.servers'
> > >          , 'connector.properties.1.value' = 'xxxxx'
> > >
> >
> > I can understand this config , but for the flink fresh man,maybe it
> > is confused for him.
> > In my thought, I am really looking forward to this feature,thank you to
> > propose this feature.
> >
> > Best wishes,
> > LakeShen
> >
> >
> > Jark Wu <[hidden email]> 于2020年3月30日周一 下午2:02写道:
> >
> > > Hi everyone,
> > >
> > > I want to start a discussion about further improve and simplify our
> > current
> > > connector porperty keys, aka WITH options. Currently, we have a
> > > 'connector.' prefix for many properties, but they are verbose, and we
> > see a
> > > big inconsistency between the properties when designing FLIP-107.
> > >
> > > So we propose to remove all the 'connector.' prefix and rename
> > > 'connector.type' to 'connector', 'format.type' to 'format'. So a new
> > Kafka
> > > DDL may look like this:
> > >
> > > CREATE TABLE kafka_table (
> > >  ...
> > > ) WITH (
> > >  'connector' = 'kafka',
> > >  'version' = '0.10',
> > >  'topic' = 'test-topic',
> > >  'startup-mode' = 'earliest-offset',
> > >  'properties.bootstrap.servers' = 'localhost:9092',
> > >  'properties.group.id' = 'testGroup',
> > >  'format' = 'json',
> > >  'format.fail-on-missing-field' = 'false'
> > > );
> > >
> > > The new connector property key set will come together with new Factory
> > > inferface which is proposed in FLIP-95. Old properties are still
> > compatible
> > > with their existing implementation. New properties are only available
> in
> > > new DynamicTableFactory implementations.
> > >
> > > You can access the detailed FLIP here:
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-122%3A+New+Connector+Property+Keys+for+New+Factory
> > >
> > > Best,
> > > Jark
> > >
> >
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-122: New Connector Property Keys for New Factory

Kurt Young
> 1) it's more readable to separete them into different keys, because they
are orthogonal concepts.
I'm not sure whether this is more clear, but using one key will be
definitely one line less.

> 2) the planner can give all the availble versions in the exception
message, if user uses a wrong version (this is often reported in user ML).
Lists all available connectors seems also quite straightforward, e.g user
provided a wrong "kafka-0.8", we tell user we have candidates of
"kakfa-0.11", "kafka-universal"

> 3) we may have to write a full documentation for each version, because
they are different "connector"?
I don't think so. We can still treat it as the same connector but with
different versions.

> 4)  At some point in the future, we may don't need users to specify kafka
version anymore.
I think this could make user confuse. For example, let's say current
"universal" kafka connector is the unified connector, which can support
all kinds of versions. Then users will have different experience regarding
to version:
If you provide "version=0.11", we will use "kafka-0.11" for connector. But
if users don't set any version, we will use "kafka-universal" instead.
The behavior is inconsistent IMO.

Best,
Kurt


On Mon, Mar 30, 2020 at 3:56 PM Jark Wu <[hidden email]> wrote:

> Hi Kurt,
>
> That's good questions.
>
> > the meaning of "version"
> There are two versions in the old design. One is property version
> "connector.property-version" which can be used for backward compatibility.
> The other one is "connector.version" which defines the version of external
> system, e.g. 0.11" for kafka, "6" or "7" for ES.
> In this proposal, the "version" is the previous "connector.version". The
> ""connector.property-version" is not introduced in new design.
>
> > how to keep the old capability which can evolve connector properties
> The "connector.property-version" is an optional key in the old design and
> is never bumped up.
> I'm not sure how "connector.property-version" should work in the initial
> design. Maybe @Timo Walther <[hidden email]> has more knowledge on
> this.
> But for the new properties, every options should be expressed as
> `ConfigOption` which provides `withDeprecatedKeys(...)` method to easily
> support evolving keys.
>
> > a single keys instead of two, e.g. "kafka-0.11", "kafka-universal"?
> There are several benefit to use separate "version" key I can see:
> 1) it's more readable to separete them into different keys, because they
> are orthogonal concepts.
> 2) the planner can give all the availble versions in the exception message,
> if user uses a wrong version (this is often reported in user ML).
> 3) If we use "kafka-0.11" as connector identifier, we may have to write a
> full documentation for each version, because they are different
> "connector"?
>     IMO, for 0.11, 0.11, etc... kafka, they are actually the same connector
> but with different "client jar" version,
>     they share all the same supported property keys and should reside
> together.
> 4) IMO, the future vision is version-free. At some point in the future, we
> may don't need users to specify kafka version anymore, and make
> "version=universal" as optional or removed in the future. This is can be
> done easily if they are separate keys.
>
> Best,
> Jark
>
>
> On Mon, 30 Mar 2020 at 14:45, Kurt Young <[hidden email]> wrote:
>
> > Hi Jark,
> >
> > Thanks for the proposal, I'm +1 to the general idea. However I have a
> > question about "version",
> > in the old design, the version seems to be aimed for tracking property
> > version, with different
> > version, we could evolve these step by step without breaking backward
> > compatibility. But in this
> > design, version is representing external system's version, like "0.11"
> for
> > kafka, "6" or "7" for ES.
> > I'm not sure if this is necessary, what's the benefit of using two keys
> > instead of one, like "kafka-0.11"
> > or "ES6" directly? And how about the old capability which could let us
> > evolving connector properties?
> >
> > Best,
> > Kurt
> >
> >
> > On Mon, Mar 30, 2020 at 2:36 PM LakeShen <[hidden email]>
> > wrote:
> >
> > > Hi Jark,
> > >
> > > I am really looking forward to this feature. I think this feature
> > > could simplify flink sql code,and at the same time ,
> > > it could make the developer more easlier to config the flink sql WITH
> > > options.
> > >
> > > Now when I am using flink sql to write flink task , sometimes I think
> the
> > > WITH options is too long for user.
> > > For example,I config the kafka source connector parameter,for consumer
> > > group and brokers parameter:
> > >
> > >           'connector.properties.0.key' = 'group.id'
> > > >          , 'connector.properties.0.value' = 'xxx'
> > > >          , 'connector.properties.1.key' = 'bootstrap.servers'
> > > >          , 'connector.properties.1.value' = 'xxxxx'
> > > >
> > >
> > > I can understand this config , but for the flink fresh man,maybe it
> > > is confused for him.
> > > In my thought, I am really looking forward to this feature,thank you to
> > > propose this feature.
> > >
> > > Best wishes,
> > > LakeShen
> > >
> > >
> > > Jark Wu <[hidden email]> 于2020年3月30日周一 下午2:02写道:
> > >
> > > > Hi everyone,
> > > >
> > > > I want to start a discussion about further improve and simplify our
> > > current
> > > > connector porperty keys, aka WITH options. Currently, we have a
> > > > 'connector.' prefix for many properties, but they are verbose, and we
> > > see a
> > > > big inconsistency between the properties when designing FLIP-107.
> > > >
> > > > So we propose to remove all the 'connector.' prefix and rename
> > > > 'connector.type' to 'connector', 'format.type' to 'format'. So a new
> > > Kafka
> > > > DDL may look like this:
> > > >
> > > > CREATE TABLE kafka_table (
> > > >  ...
> > > > ) WITH (
> > > >  'connector' = 'kafka',
> > > >  'version' = '0.10',
> > > >  'topic' = 'test-topic',
> > > >  'startup-mode' = 'earliest-offset',
> > > >  'properties.bootstrap.servers' = 'localhost:9092',
> > > >  'properties.group.id' = 'testGroup',
> > > >  'format' = 'json',
> > > >  'format.fail-on-missing-field' = 'false'
> > > > );
> > > >
> > > > The new connector property key set will come together with new
> Factory
> > > > inferface which is proposed in FLIP-95. Old properties are still
> > > compatible
> > > > with their existing implementation. New properties are only available
> > in
> > > > new DynamicTableFactory implementations.
> > > >
> > > > You can access the detailed FLIP here:
> > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-122%3A+New+Connector+Property+Keys+for+New+Factory
> > > >
> > > > Best,
> > > > Jark
> > > >
> > >
> >
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-122: New Connector Property Keys for New Factory

Jingsong Li
In reply to this post by Jark Wu-2
Thanks Jark for the proposal.

+1 to the general idea.

For "version", what about "kafka.version"? It is obvious to know its
meaning.

And I'd like to start a new topic:
Should we need to explicitly separate source from sink?
With the development of batch and streaming, more and more connectors have
both source and sink.

So should we set a rule for table properties:
- properties for both source and sink: without prefix, like "topic"
- properties for source only: with "source." prefix, like
"source.startup-mode"
- properties for sink only: with "sink." prefix, like "sink.partitioner"

What do you think?

Best,
Jingsong Lee

On Mon, Mar 30, 2020 at 3:56 PM Jark Wu <[hidden email]> wrote:

> Hi Kurt,
>
> That's good questions.
>
> > the meaning of "version"
> There are two versions in the old design. One is property version
> "connector.property-version" which can be used for backward compatibility.
> The other one is "connector.version" which defines the version of external
> system, e.g. 0.11" for kafka, "6" or "7" for ES.
> In this proposal, the "version" is the previous "connector.version". The
> ""connector.property-version" is not introduced in new design.
>
> > how to keep the old capability which can evolve connector properties
> The "connector.property-version" is an optional key in the old design and
> is never bumped up.
> I'm not sure how "connector.property-version" should work in the initial
> design. Maybe @Timo Walther <[hidden email]> has more knowledge on
> this.
> But for the new properties, every options should be expressed as
> `ConfigOption` which provides `withDeprecatedKeys(...)` method to easily
> support evolving keys.
>
> > a single keys instead of two, e.g. "kafka-0.11", "kafka-universal"?
> There are several benefit to use separate "version" key I can see:
> 1) it's more readable to separete them into different keys, because they
> are orthogonal concepts.
> 2) the planner can give all the availble versions in the exception message,
> if user uses a wrong version (this is often reported in user ML).
> 3) If we use "kafka-0.11" as connector identifier, we may have to write a
> full documentation for each version, because they are different
> "connector"?
>     IMO, for 0.11, 0.11, etc... kafka, they are actually the same connector
> but with different "client jar" version,
>     they share all the same supported property keys and should reside
> together.
> 4) IMO, the future vision is version-free. At some point in the future, we
> may don't need users to specify kafka version anymore, and make
> "version=universal" as optional or removed in the future. This is can be
> done easily if they are separate keys.
>
> Best,
> Jark
>
>
> On Mon, 30 Mar 2020 at 14:45, Kurt Young <[hidden email]> wrote:
>
> > Hi Jark,
> >
> > Thanks for the proposal, I'm +1 to the general idea. However I have a
> > question about "version",
> > in the old design, the version seems to be aimed for tracking property
> > version, with different
> > version, we could evolve these step by step without breaking backward
> > compatibility. But in this
> > design, version is representing external system's version, like "0.11"
> for
> > kafka, "6" or "7" for ES.
> > I'm not sure if this is necessary, what's the benefit of using two keys
> > instead of one, like "kafka-0.11"
> > or "ES6" directly? And how about the old capability which could let us
> > evolving connector properties?
> >
> > Best,
> > Kurt
> >
> >
> > On Mon, Mar 30, 2020 at 2:36 PM LakeShen <[hidden email]>
> > wrote:
> >
> > > Hi Jark,
> > >
> > > I am really looking forward to this feature. I think this feature
> > > could simplify flink sql code,and at the same time ,
> > > it could make the developer more easlier to config the flink sql WITH
> > > options.
> > >
> > > Now when I am using flink sql to write flink task , sometimes I think
> the
> > > WITH options is too long for user.
> > > For example,I config the kafka source connector parameter,for consumer
> > > group and brokers parameter:
> > >
> > >           'connector.properties.0.key' = 'group.id'
> > > >          , 'connector.properties.0.value' = 'xxx'
> > > >          , 'connector.properties.1.key' = 'bootstrap.servers'
> > > >          , 'connector.properties.1.value' = 'xxxxx'
> > > >
> > >
> > > I can understand this config , but for the flink fresh man,maybe it
> > > is confused for him.
> > > In my thought, I am really looking forward to this feature,thank you to
> > > propose this feature.
> > >
> > > Best wishes,
> > > LakeShen
> > >
> > >
> > > Jark Wu <[hidden email]> 于2020年3月30日周一 下午2:02写道:
> > >
> > > > Hi everyone,
> > > >
> > > > I want to start a discussion about further improve and simplify our
> > > current
> > > > connector porperty keys, aka WITH options. Currently, we have a
> > > > 'connector.' prefix for many properties, but they are verbose, and we
> > > see a
> > > > big inconsistency between the properties when designing FLIP-107.
> > > >
> > > > So we propose to remove all the 'connector.' prefix and rename
> > > > 'connector.type' to 'connector', 'format.type' to 'format'. So a new
> > > Kafka
> > > > DDL may look like this:
> > > >
> > > > CREATE TABLE kafka_table (
> > > >  ...
> > > > ) WITH (
> > > >  'connector' = 'kafka',
> > > >  'version' = '0.10',
> > > >  'topic' = 'test-topic',
> > > >  'startup-mode' = 'earliest-offset',
> > > >  'properties.bootstrap.servers' = 'localhost:9092',
> > > >  'properties.group.id' = 'testGroup',
> > > >  'format' = 'json',
> > > >  'format.fail-on-missing-field' = 'false'
> > > > );
> > > >
> > > > The new connector property key set will come together with new
> Factory
> > > > inferface which is proposed in FLIP-95. Old properties are still
> > > compatible
> > > > with their existing implementation. New properties are only available
> > in
> > > > new DynamicTableFactory implementations.
> > > >
> > > > You can access the detailed FLIP here:
> > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-122%3A+New+Connector+Property+Keys+for+New+Factory
> > > >
> > > > Best,
> > > > Jark
> > > >
> > >
> >
>


--
Best, Jingsong Lee
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-122: New Connector Property Keys for New Factory

Jark Wu-2
Hi Kurt,

> 2) Lists all available connectors seems also quite straightforward, e.g
user provided a wrong "kafka-0.8", we tell user we have candidates of
"kakfa-0.11", "kafka-universal"
It's not possible for the framework to throw such exception. Because the
framework doesn't know what versions do the connector support. All the
version information is a blackbox in the identifier. But with
`Factory#factoryVersion()` interface, we can know all the supported
versions.

> 3) I don't think so. We can still treat it as the same connector but with
different versions.
That's true but that's weird. Because from the plain DDL definition, they
look like different connectors with different "connector" value, e.g.
'connector=kafka-0.8', 'connector=kafka-0.10'.

> If users don't set any version, we will use "kafka-universal" instead.
The behavior is inconsistent IMO.
That is a long term vision when there is no kafka clusters with <0.11
version.
At that point, "universal" is the only supported version in Flink and the
"version" key can be optional.

---------------------------------------------

Hi Jingsong,

> "version" vs "kafka.version"
I though about it. But if we prefix "kafka" to version, we should prefix
"kafka" for all other property keys, because they are all kafka specific
options.
However, that will make the property set verbose, see rejected option#2 in
the FLIP.

> explicitly separate options for source and sink
That's a good topic. It's good to have a guideline for the new property
keys.
I'm fine to prefix with a "source"/"sink" for some connector keys.
Actually, we already do this in some connectors, e.g. jdbc and hbase.

Best,
Jark

On Mon, 30 Mar 2020 at 16:36, Jingsong Li <[hidden email]> wrote:

> Thanks Jark for the proposal.
>
> +1 to the general idea.
>
> For "version", what about "kafka.version"? It is obvious to know its
> meaning.
>
> And I'd like to start a new topic:
> Should we need to explicitly separate source from sink?
> With the development of batch and streaming, more and more connectors have
> both source and sink.
>
> So should we set a rule for table properties:
> - properties for both source and sink: without prefix, like "topic"
> - properties for source only: with "source." prefix, like
> "source.startup-mode"
> - properties for sink only: with "sink." prefix, like "sink.partitioner"
>
> What do you think?
>
> Best,
> Jingsong Lee
>
> On Mon, Mar 30, 2020 at 3:56 PM Jark Wu <[hidden email]> wrote:
>
> > Hi Kurt,
> >
> > That's good questions.
> >
> > > the meaning of "version"
> > There are two versions in the old design. One is property version
> > "connector.property-version" which can be used for backward
> compatibility.
> > The other one is "connector.version" which defines the version of
> external
> > system, e.g. 0.11" for kafka, "6" or "7" for ES.
> > In this proposal, the "version" is the previous "connector.version". The
> > ""connector.property-version" is not introduced in new design.
> >
> > > how to keep the old capability which can evolve connector properties
> > The "connector.property-version" is an optional key in the old design and
> > is never bumped up.
> > I'm not sure how "connector.property-version" should work in the initial
> > design. Maybe @Timo Walther <[hidden email]> has more knowledge on
> > this.
> > But for the new properties, every options should be expressed as
> > `ConfigOption` which provides `withDeprecatedKeys(...)` method to easily
> > support evolving keys.
> >
> > > a single keys instead of two, e.g. "kafka-0.11", "kafka-universal"?
> > There are several benefit to use separate "version" key I can see:
> > 1) it's more readable to separete them into different keys, because they
> > are orthogonal concepts.
> > 2) the planner can give all the availble versions in the exception
> message,
> > if user uses a wrong version (this is often reported in user ML).
> > 3) If we use "kafka-0.11" as connector identifier, we may have to write a
> > full documentation for each version, because they are different
> > "connector"?
> >     IMO, for 0.11, 0.11, etc... kafka, they are actually the same
> connector
> > but with different "client jar" version,
> >     they share all the same supported property keys and should reside
> > together.
> > 4) IMO, the future vision is version-free. At some point in the future,
> we
> > may don't need users to specify kafka version anymore, and make
> > "version=universal" as optional or removed in the future. This is can be
> > done easily if they are separate keys.
> >
> > Best,
> > Jark
> >
> >
> > On Mon, 30 Mar 2020 at 14:45, Kurt Young <[hidden email]> wrote:
> >
> > > Hi Jark,
> > >
> > > Thanks for the proposal, I'm +1 to the general idea. However I have a
> > > question about "version",
> > > in the old design, the version seems to be aimed for tracking property
> > > version, with different
> > > version, we could evolve these step by step without breaking backward
> > > compatibility. But in this
> > > design, version is representing external system's version, like "0.11"
> > for
> > > kafka, "6" or "7" for ES.
> > > I'm not sure if this is necessary, what's the benefit of using two keys
> > > instead of one, like "kafka-0.11"
> > > or "ES6" directly? And how about the old capability which could let us
> > > evolving connector properties?
> > >
> > > Best,
> > > Kurt
> > >
> > >
> > > On Mon, Mar 30, 2020 at 2:36 PM LakeShen <[hidden email]>
> > > wrote:
> > >
> > > > Hi Jark,
> > > >
> > > > I am really looking forward to this feature. I think this feature
> > > > could simplify flink sql code,and at the same time ,
> > > > it could make the developer more easlier to config the flink sql WITH
> > > > options.
> > > >
> > > > Now when I am using flink sql to write flink task , sometimes I think
> > the
> > > > WITH options is too long for user.
> > > > For example,I config the kafka source connector parameter,for
> consumer
> > > > group and brokers parameter:
> > > >
> > > >           'connector.properties.0.key' = 'group.id'
> > > > >          , 'connector.properties.0.value' = 'xxx'
> > > > >          , 'connector.properties.1.key' = 'bootstrap.servers'
> > > > >          , 'connector.properties.1.value' = 'xxxxx'
> > > > >
> > > >
> > > > I can understand this config , but for the flink fresh man,maybe it
> > > > is confused for him.
> > > > In my thought, I am really looking forward to this feature,thank you
> to
> > > > propose this feature.
> > > >
> > > > Best wishes,
> > > > LakeShen
> > > >
> > > >
> > > > Jark Wu <[hidden email]> 于2020年3月30日周一 下午2:02写道:
> > > >
> > > > > Hi everyone,
> > > > >
> > > > > I want to start a discussion about further improve and simplify our
> > > > current
> > > > > connector porperty keys, aka WITH options. Currently, we have a
> > > > > 'connector.' prefix for many properties, but they are verbose, and
> we
> > > > see a
> > > > > big inconsistency between the properties when designing FLIP-107.
> > > > >
> > > > > So we propose to remove all the 'connector.' prefix and rename
> > > > > 'connector.type' to 'connector', 'format.type' to 'format'. So a
> new
> > > > Kafka
> > > > > DDL may look like this:
> > > > >
> > > > > CREATE TABLE kafka_table (
> > > > >  ...
> > > > > ) WITH (
> > > > >  'connector' = 'kafka',
> > > > >  'version' = '0.10',
> > > > >  'topic' = 'test-topic',
> > > > >  'startup-mode' = 'earliest-offset',
> > > > >  'properties.bootstrap.servers' = 'localhost:9092',
> > > > >  'properties.group.id' = 'testGroup',
> > > > >  'format' = 'json',
> > > > >  'format.fail-on-missing-field' = 'false'
> > > > > );
> > > > >
> > > > > The new connector property key set will come together with new
> > Factory
> > > > > inferface which is proposed in FLIP-95. Old properties are still
> > > > compatible
> > > > > with their existing implementation. New properties are only
> available
> > > in
> > > > > new DynamicTableFactory implementations.
> > > > >
> > > > > You can access the detailed FLIP here:
> > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-122%3A+New+Connector+Property+Keys+for+New+Factory
> > > > >
> > > > > Best,
> > > > > Jark
> > > > >
> > > >
> > >
> >
>
>
> --
> Best, Jingsong Lee
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-122: New Connector Property Keys for New Factory

Kurt Young
> It's not possible for the framework to throw such exception. Because the
framework doesn't know what versions do the connector support.

Not really, we can list all valid connectors framework could found. E.g.
user mistyped 'kafka-0.x', the error message will looks like:

we don't have any connector named "kafka-0.x", but we have:
FileSystem
Kafka-0.10
Kafka-0.11
ElasticSearch6
ElasticSearch7

Best,
Kurt


On Mon, Mar 30, 2020 at 5:11 PM Jark Wu <[hidden email]> wrote:

> Hi Kurt,
>
> > 2) Lists all available connectors seems also quite straightforward, e.g
> user provided a wrong "kafka-0.8", we tell user we have candidates of
> "kakfa-0.11", "kafka-universal"
> It's not possible for the framework to throw such exception. Because the
> framework doesn't know what versions do the connector support. All the
> version information is a blackbox in the identifier. But with
> `Factory#factoryVersion()` interface, we can know all the supported
> versions.
>
> > 3) I don't think so. We can still treat it as the same connector but with
> different versions.
> That's true but that's weird. Because from the plain DDL definition, they
> look like different connectors with different "connector" value, e.g.
> 'connector=kafka-0.8', 'connector=kafka-0.10'.
>
> > If users don't set any version, we will use "kafka-universal" instead.
> The behavior is inconsistent IMO.
> That is a long term vision when there is no kafka clusters with <0.11
> version.
> At that point, "universal" is the only supported version in Flink and the
> "version" key can be optional.
>
> ---------------------------------------------
>
> Hi Jingsong,
>
> > "version" vs "kafka.version"
> I though about it. But if we prefix "kafka" to version, we should prefix
> "kafka" for all other property keys, because they are all kafka specific
> options.
> However, that will make the property set verbose, see rejected option#2 in
> the FLIP.
>
> > explicitly separate options for source and sink
> That's a good topic. It's good to have a guideline for the new property
> keys.
> I'm fine to prefix with a "source"/"sink" for some connector keys.
> Actually, we already do this in some connectors, e.g. jdbc and hbase.
>
> Best,
> Jark
>
> On Mon, 30 Mar 2020 at 16:36, Jingsong Li <[hidden email]> wrote:
>
> > Thanks Jark for the proposal.
> >
> > +1 to the general idea.
> >
> > For "version", what about "kafka.version"? It is obvious to know its
> > meaning.
> >
> > And I'd like to start a new topic:
> > Should we need to explicitly separate source from sink?
> > With the development of batch and streaming, more and more connectors
> have
> > both source and sink.
> >
> > So should we set a rule for table properties:
> > - properties for both source and sink: without prefix, like "topic"
> > - properties for source only: with "source." prefix, like
> > "source.startup-mode"
> > - properties for sink only: with "sink." prefix, like "sink.partitioner"
> >
> > What do you think?
> >
> > Best,
> > Jingsong Lee
> >
> > On Mon, Mar 30, 2020 at 3:56 PM Jark Wu <[hidden email]> wrote:
> >
> > > Hi Kurt,
> > >
> > > That's good questions.
> > >
> > > > the meaning of "version"
> > > There are two versions in the old design. One is property version
> > > "connector.property-version" which can be used for backward
> > compatibility.
> > > The other one is "connector.version" which defines the version of
> > external
> > > system, e.g. 0.11" for kafka, "6" or "7" for ES.
> > > In this proposal, the "version" is the previous "connector.version".
> The
> > > ""connector.property-version" is not introduced in new design.
> > >
> > > > how to keep the old capability which can evolve connector properties
> > > The "connector.property-version" is an optional key in the old design
> and
> > > is never bumped up.
> > > I'm not sure how "connector.property-version" should work in the
> initial
> > > design. Maybe @Timo Walther <[hidden email]> has more knowledge on
> > > this.
> > > But for the new properties, every options should be expressed as
> > > `ConfigOption` which provides `withDeprecatedKeys(...)` method to
> easily
> > > support evolving keys.
> > >
> > > > a single keys instead of two, e.g. "kafka-0.11", "kafka-universal"?
> > > There are several benefit to use separate "version" key I can see:
> > > 1) it's more readable to separete them into different keys, because
> they
> > > are orthogonal concepts.
> > > 2) the planner can give all the availble versions in the exception
> > message,
> > > if user uses a wrong version (this is often reported in user ML).
> > > 3) If we use "kafka-0.11" as connector identifier, we may have to
> write a
> > > full documentation for each version, because they are different
> > > "connector"?
> > >     IMO, for 0.11, 0.11, etc... kafka, they are actually the same
> > connector
> > > but with different "client jar" version,
> > >     they share all the same supported property keys and should reside
> > > together.
> > > 4) IMO, the future vision is version-free. At some point in the future,
> > we
> > > may don't need users to specify kafka version anymore, and make
> > > "version=universal" as optional or removed in the future. This is can
> be
> > > done easily if they are separate keys.
> > >
> > > Best,
> > > Jark
> > >
> > >
> > > On Mon, 30 Mar 2020 at 14:45, Kurt Young <[hidden email]> wrote:
> > >
> > > > Hi Jark,
> > > >
> > > > Thanks for the proposal, I'm +1 to the general idea. However I have a
> > > > question about "version",
> > > > in the old design, the version seems to be aimed for tracking
> property
> > > > version, with different
> > > > version, we could evolve these step by step without breaking backward
> > > > compatibility. But in this
> > > > design, version is representing external system's version, like
> "0.11"
> > > for
> > > > kafka, "6" or "7" for ES.
> > > > I'm not sure if this is necessary, what's the benefit of using two
> keys
> > > > instead of one, like "kafka-0.11"
> > > > or "ES6" directly? And how about the old capability which could let
> us
> > > > evolving connector properties?
> > > >
> > > > Best,
> > > > Kurt
> > > >
> > > >
> > > > On Mon, Mar 30, 2020 at 2:36 PM LakeShen <[hidden email]>
> > > > wrote:
> > > >
> > > > > Hi Jark,
> > > > >
> > > > > I am really looking forward to this feature. I think this feature
> > > > > could simplify flink sql code,and at the same time ,
> > > > > it could make the developer more easlier to config the flink sql
> WITH
> > > > > options.
> > > > >
> > > > > Now when I am using flink sql to write flink task , sometimes I
> think
> > > the
> > > > > WITH options is too long for user.
> > > > > For example,I config the kafka source connector parameter,for
> > consumer
> > > > > group and brokers parameter:
> > > > >
> > > > >           'connector.properties.0.key' = 'group.id'
> > > > > >          , 'connector.properties.0.value' = 'xxx'
> > > > > >          , 'connector.properties.1.key' = 'bootstrap.servers'
> > > > > >          , 'connector.properties.1.value' = 'xxxxx'
> > > > > >
> > > > >
> > > > > I can understand this config , but for the flink fresh man,maybe it
> > > > > is confused for him.
> > > > > In my thought, I am really looking forward to this feature,thank
> you
> > to
> > > > > propose this feature.
> > > > >
> > > > > Best wishes,
> > > > > LakeShen
> > > > >
> > > > >
> > > > > Jark Wu <[hidden email]> 于2020年3月30日周一 下午2:02写道:
> > > > >
> > > > > > Hi everyone,
> > > > > >
> > > > > > I want to start a discussion about further improve and simplify
> our
> > > > > current
> > > > > > connector porperty keys, aka WITH options. Currently, we have a
> > > > > > 'connector.' prefix for many properties, but they are verbose,
> and
> > we
> > > > > see a
> > > > > > big inconsistency between the properties when designing FLIP-107.
> > > > > >
> > > > > > So we propose to remove all the 'connector.' prefix and rename
> > > > > > 'connector.type' to 'connector', 'format.type' to 'format'. So a
> > new
> > > > > Kafka
> > > > > > DDL may look like this:
> > > > > >
> > > > > > CREATE TABLE kafka_table (
> > > > > >  ...
> > > > > > ) WITH (
> > > > > >  'connector' = 'kafka',
> > > > > >  'version' = '0.10',
> > > > > >  'topic' = 'test-topic',
> > > > > >  'startup-mode' = 'earliest-offset',
> > > > > >  'properties.bootstrap.servers' = 'localhost:9092',
> > > > > >  'properties.group.id' = 'testGroup',
> > > > > >  'format' = 'json',
> > > > > >  'format.fail-on-missing-field' = 'false'
> > > > > > );
> > > > > >
> > > > > > The new connector property key set will come together with new
> > > Factory
> > > > > > inferface which is proposed in FLIP-95. Old properties are still
> > > > > compatible
> > > > > > with their existing implementation. New properties are only
> > available
> > > > in
> > > > > > new DynamicTableFactory implementations.
> > > > > >
> > > > > > You can access the detailed FLIP here:
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-122%3A+New+Connector+Property+Keys+for+New+Factory
> > > > > >
> > > > > > Best,
> > > > > > Jark
> > > > > >
> > > > >
> > > >
> > >
> >
> >
> > --
> > Best, Jingsong Lee
> >
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-122: New Connector Property Keys for New Factory

Benchao Li
Hi Jark,

Thanks for starting the discussion. The FLIP LGTM generally.

Regarding connector version VS put version into connector's name,
I favor the latter personally, using one property to locate a connector can
make the error log more precise.
From the users' side, using one property to match a connector will be
easier. Especially we have many connectors,
and some of the need version property required, and some of them not.

Regarding Jingsong's suggestion,
IMO, it's a very good complement to the FLIP. Distinguishing properties for
source and sink can be very useful, and
also this will make the connector property more precise.
We are also sick of this for now, we cannot know whether a DDL is a source
or sink unless we look through all queries where
the table is used.
Even more, some of the required properties are only required for source,
bug we cannot leave it blank for sink, and vice versa.
I think we can also add a type for dimension tables except source and sink.

Kurt Young <[hidden email]> 于2020年3月30日周一 下午8:16写道:

> > It's not possible for the framework to throw such exception. Because the
> framework doesn't know what versions do the connector support.
>
> Not really, we can list all valid connectors framework could found. E.g.
> user mistyped 'kafka-0.x', the error message will looks like:
>
> we don't have any connector named "kafka-0.x", but we have:
> FileSystem
> Kafka-0.10
> Kafka-0.11
> ElasticSearch6
> ElasticSearch7
>
> Best,
> Kurt
>
>
> On Mon, Mar 30, 2020 at 5:11 PM Jark Wu <[hidden email]> wrote:
>
> > Hi Kurt,
> >
> > > 2) Lists all available connectors seems also quite straightforward, e.g
> > user provided a wrong "kafka-0.8", we tell user we have candidates of
> > "kakfa-0.11", "kafka-universal"
> > It's not possible for the framework to throw such exception. Because the
> > framework doesn't know what versions do the connector support. All the
> > version information is a blackbox in the identifier. But with
> > `Factory#factoryVersion()` interface, we can know all the supported
> > versions.
> >
> > > 3) I don't think so. We can still treat it as the same connector but
> with
> > different versions.
> > That's true but that's weird. Because from the plain DDL definition, they
> > look like different connectors with different "connector" value, e.g.
> > 'connector=kafka-0.8', 'connector=kafka-0.10'.
> >
> > > If users don't set any version, we will use "kafka-universal" instead.
> > The behavior is inconsistent IMO.
> > That is a long term vision when there is no kafka clusters with <0.11
> > version.
> > At that point, "universal" is the only supported version in Flink and the
> > "version" key can be optional.
> >
> > ---------------------------------------------
> >
> > Hi Jingsong,
> >
> > > "version" vs "kafka.version"
> > I though about it. But if we prefix "kafka" to version, we should prefix
> > "kafka" for all other property keys, because they are all kafka specific
> > options.
> > However, that will make the property set verbose, see rejected option#2
> in
> > the FLIP.
> >
> > > explicitly separate options for source and sink
> > That's a good topic. It's good to have a guideline for the new property
> > keys.
> > I'm fine to prefix with a "source"/"sink" for some connector keys.
> > Actually, we already do this in some connectors, e.g. jdbc and hbase.
> >
> > Best,
> > Jark
> >
> > On Mon, 30 Mar 2020 at 16:36, Jingsong Li <[hidden email]>
> wrote:
> >
> > > Thanks Jark for the proposal.
> > >
> > > +1 to the general idea.
> > >
> > > For "version", what about "kafka.version"? It is obvious to know its
> > > meaning.
> > >
> > > And I'd like to start a new topic:
> > > Should we need to explicitly separate source from sink?
> > > With the development of batch and streaming, more and more connectors
> > have
> > > both source and sink.
> > >
> > > So should we set a rule for table properties:
> > > - properties for both source and sink: without prefix, like "topic"
> > > - properties for source only: with "source." prefix, like
> > > "source.startup-mode"
> > > - properties for sink only: with "sink." prefix, like
> "sink.partitioner"
> > >
> > > What do you think?
> > >
> > > Best,
> > > Jingsong Lee
> > >
> > > On Mon, Mar 30, 2020 at 3:56 PM Jark Wu <[hidden email]> wrote:
> > >
> > > > Hi Kurt,
> > > >
> > > > That's good questions.
> > > >
> > > > > the meaning of "version"
> > > > There are two versions in the old design. One is property version
> > > > "connector.property-version" which can be used for backward
> > > compatibility.
> > > > The other one is "connector.version" which defines the version of
> > > external
> > > > system, e.g. 0.11" for kafka, "6" or "7" for ES.
> > > > In this proposal, the "version" is the previous "connector.version".
> > The
> > > > ""connector.property-version" is not introduced in new design.
> > > >
> > > > > how to keep the old capability which can evolve connector
> properties
> > > > The "connector.property-version" is an optional key in the old design
> > and
> > > > is never bumped up.
> > > > I'm not sure how "connector.property-version" should work in the
> > initial
> > > > design. Maybe @Timo Walther <[hidden email]> has more knowledge
> on
> > > > this.
> > > > But for the new properties, every options should be expressed as
> > > > `ConfigOption` which provides `withDeprecatedKeys(...)` method to
> > easily
> > > > support evolving keys.
> > > >
> > > > > a single keys instead of two, e.g. "kafka-0.11", "kafka-universal"?
> > > > There are several benefit to use separate "version" key I can see:
> > > > 1) it's more readable to separete them into different keys, because
> > they
> > > > are orthogonal concepts.
> > > > 2) the planner can give all the availble versions in the exception
> > > message,
> > > > if user uses a wrong version (this is often reported in user ML).
> > > > 3) If we use "kafka-0.11" as connector identifier, we may have to
> > write a
> > > > full documentation for each version, because they are different
> > > > "connector"?
> > > >     IMO, for 0.11, 0.11, etc... kafka, they are actually the same
> > > connector
> > > > but with different "client jar" version,
> > > >     they share all the same supported property keys and should reside
> > > > together.
> > > > 4) IMO, the future vision is version-free. At some point in the
> future,
> > > we
> > > > may don't need users to specify kafka version anymore, and make
> > > > "version=universal" as optional or removed in the future. This is can
> > be
> > > > done easily if they are separate keys.
> > > >
> > > > Best,
> > > > Jark
> > > >
> > > >
> > > > On Mon, 30 Mar 2020 at 14:45, Kurt Young <[hidden email]> wrote:
> > > >
> > > > > Hi Jark,
> > > > >
> > > > > Thanks for the proposal, I'm +1 to the general idea. However I
> have a
> > > > > question about "version",
> > > > > in the old design, the version seems to be aimed for tracking
> > property
> > > > > version, with different
> > > > > version, we could evolve these step by step without breaking
> backward
> > > > > compatibility. But in this
> > > > > design, version is representing external system's version, like
> > "0.11"
> > > > for
> > > > > kafka, "6" or "7" for ES.
> > > > > I'm not sure if this is necessary, what's the benefit of using two
> > keys
> > > > > instead of one, like "kafka-0.11"
> > > > > or "ES6" directly? And how about the old capability which could let
> > us
> > > > > evolving connector properties?
> > > > >
> > > > > Best,
> > > > > Kurt
> > > > >
> > > > >
> > > > > On Mon, Mar 30, 2020 at 2:36 PM LakeShen <
> [hidden email]>
> > > > > wrote:
> > > > >
> > > > > > Hi Jark,
> > > > > >
> > > > > > I am really looking forward to this feature. I think this feature
> > > > > > could simplify flink sql code,and at the same time ,
> > > > > > it could make the developer more easlier to config the flink sql
> > WITH
> > > > > > options.
> > > > > >
> > > > > > Now when I am using flink sql to write flink task , sometimes I
> > think
> > > > the
> > > > > > WITH options is too long for user.
> > > > > > For example,I config the kafka source connector parameter,for
> > > consumer
> > > > > > group and brokers parameter:
> > > > > >
> > > > > >           'connector.properties.0.key' = 'group.id'
> > > > > > >          , 'connector.properties.0.value' = 'xxx'
> > > > > > >          , 'connector.properties.1.key' = 'bootstrap.servers'
> > > > > > >          , 'connector.properties.1.value' = 'xxxxx'
> > > > > > >
> > > > > >
> > > > > > I can understand this config , but for the flink fresh man,maybe
> it
> > > > > > is confused for him.
> > > > > > In my thought, I am really looking forward to this feature,thank
> > you
> > > to
> > > > > > propose this feature.
> > > > > >
> > > > > > Best wishes,
> > > > > > LakeShen
> > > > > >
> > > > > >
> > > > > > Jark Wu <[hidden email]> 于2020年3月30日周一 下午2:02写道:
> > > > > >
> > > > > > > Hi everyone,
> > > > > > >
> > > > > > > I want to start a discussion about further improve and simplify
> > our
> > > > > > current
> > > > > > > connector porperty keys, aka WITH options. Currently, we have a
> > > > > > > 'connector.' prefix for many properties, but they are verbose,
> > and
> > > we
> > > > > > see a
> > > > > > > big inconsistency between the properties when designing
> FLIP-107.
> > > > > > >
> > > > > > > So we propose to remove all the 'connector.' prefix and rename
> > > > > > > 'connector.type' to 'connector', 'format.type' to 'format'. So
> a
> > > new
> > > > > > Kafka
> > > > > > > DDL may look like this:
> > > > > > >
> > > > > > > CREATE TABLE kafka_table (
> > > > > > >  ...
> > > > > > > ) WITH (
> > > > > > >  'connector' = 'kafka',
> > > > > > >  'version' = '0.10',
> > > > > > >  'topic' = 'test-topic',
> > > > > > >  'startup-mode' = 'earliest-offset',
> > > > > > >  'properties.bootstrap.servers' = 'localhost:9092',
> > > > > > >  'properties.group.id' = 'testGroup',
> > > > > > >  'format' = 'json',
> > > > > > >  'format.fail-on-missing-field' = 'false'
> > > > > > > );
> > > > > > >
> > > > > > > The new connector property key set will come together with new
> > > > Factory
> > > > > > > inferface which is proposed in FLIP-95. Old properties are
> still
> > > > > > compatible
> > > > > > > with their existing implementation. New properties are only
> > > available
> > > > > in
> > > > > > > new DynamicTableFactory implementations.
> > > > > > >
> > > > > > > You can access the detailed FLIP here:
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-122%3A+New+Connector+Property+Keys+for+New+Factory
> > > > > > >
> > > > > > > Best,
> > > > > > > Jark
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> > >
> > > --
> > > Best, Jingsong Lee
> > >
> >
>


--

Benchao Li
School of Electronics Engineering and Computer Science, Peking University
Tel:+86-15650713730
Email: [hidden email]; [hidden email]
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-122: New Connector Property Keys for New Factory

Timo Walther-2
Hi Jark,

thanks for the FLIP. I don't have a strong opinion on
DynamicTableFactory#factoryVersion() for distiguishing connector
versions. We can also include it in the factory identifier itself. For
Kafka, I would then just use `kafka` because the naming "universal" was
just a helper solution at that time.

What we need is a good guide for how to design the options. We should
include that in the JavaDoc of factory interface itself, once it is
in-place. I like the difference between source and sink in properties.

Regarding "connector.property-version":

It was meant for backwards compatibility along the dimension of
"property design" compared to "connector version". However, since the
connector is now responsible for parsing all properties, it is not as
necessary as it was before.

However, a change of the property design as it is done in FLIP-107
becomes more difficult without a property version and versioning is
always a good idea in API world.

Regards,
Timo


On 30.03.20 16:38, Benchao Li wrote:

> Hi Jark,
>
> Thanks for starting the discussion. The FLIP LGTM generally.
>
> Regarding connector version VS put version into connector's name,
> I favor the latter personally, using one property to locate a connector
> can make the error log more precise.
>  From the users' side, using one property to match a connector will be
> easier. Especially we have many connectors,
> and some of the need version property required, and some of them not.
>
> Regarding Jingsong's suggestion,
> IMO, it's a very good complement to the FLIP. Distinguishing properties
> for source and sink can be very useful, and
> also this will make the connector property more precise.
> We are also sick of this for now, we cannot know whether a DDL is a
> source or sink unless we look through all queries where
> the table is used.
> Even more, some of the required properties are only required for source,
> bug we cannot leave it blank for sink, and vice versa.
> I think we can also add a type for dimension tables except source and sink.
>
> Kurt Young <[hidden email] <mailto:[hidden email]>> 于2020年3月30日
> 周一 下午8:16写道:
>
>      > It's not possible for the framework to throw such exception.
>     Because the
>     framework doesn't know what versions do the connector support.
>
>     Not really, we can list all valid connectors framework could found. E.g.
>     user mistyped 'kafka-0.x', the error message will looks like:
>
>     we don't have any connector named "kafka-0.x", but we have:
>     FileSystem
>     Kafka-0.10
>     Kafka-0.11
>     ElasticSearch6
>     ElasticSearch7
>
>     Best,
>     Kurt
>
>
>     On Mon, Mar 30, 2020 at 5:11 PM Jark Wu <[hidden email]
>     <mailto:[hidden email]>> wrote:
>
>      > Hi Kurt,
>      >
>      > > 2) Lists all available connectors seems also quite
>     straightforward, e.g
>      > user provided a wrong "kafka-0.8", we tell user we have candidates of
>      > "kakfa-0.11", "kafka-universal"
>      > It's not possible for the framework to throw such exception.
>     Because the
>      > framework doesn't know what versions do the connector support.
>     All the
>      > version information is a blackbox in the identifier. But with
>      > `Factory#factoryVersion()` interface, we can know all the supported
>      > versions.
>      >
>      > > 3) I don't think so. We can still treat it as the same
>     connector but with
>      > different versions.
>      > That's true but that's weird. Because from the plain DDL
>     definition, they
>      > look like different connectors with different "connector" value, e.g.
>      > 'connector=kafka-0.8', 'connector=kafka-0.10'.
>      >
>      > > If users don't set any version, we will use "kafka-universal"
>     instead.
>      > The behavior is inconsistent IMO.
>      > That is a long term vision when there is no kafka clusters with <0.11
>      > version.
>      > At that point, "universal" is the only supported version in Flink
>     and the
>      > "version" key can be optional.
>      >
>      > ---------------------------------------------
>      >
>      > Hi Jingsong,
>      >
>      > > "version" vs "kafka.version"
>      > I though about it. But if we prefix "kafka" to version, we should
>     prefix
>      > "kafka" for all other property keys, because they are all kafka
>     specific
>      > options.
>      > However, that will make the property set verbose, see rejected
>     option#2 in
>      > the FLIP.
>      >
>      > > explicitly separate options for source and sink
>      > That's a good topic. It's good to have a guideline for the new
>     property
>      > keys.
>      > I'm fine to prefix with a "source"/"sink" for some connector keys.
>      > Actually, we already do this in some connectors, e.g. jdbc and hbase.
>      >
>      > Best,
>      > Jark
>      >
>      > On Mon, 30 Mar 2020 at 16:36, Jingsong Li <[hidden email]
>     <mailto:[hidden email]>> wrote:
>      >
>      > > Thanks Jark for the proposal.
>      > >
>      > > +1 to the general idea.
>      > >
>      > > For "version", what about "kafka.version"? It is obvious to
>     know its
>      > > meaning.
>      > >
>      > > And I'd like to start a new topic:
>      > > Should we need to explicitly separate source from sink?
>      > > With the development of batch and streaming, more and more
>     connectors
>      > have
>      > > both source and sink.
>      > >
>      > > So should we set a rule for table properties:
>      > > - properties for both source and sink: without prefix, like "topic"
>      > > - properties for source only: with "source." prefix, like
>      > > "source.startup-mode"
>      > > - properties for sink only: with "sink." prefix, like
>     "sink.partitioner"
>      > >
>      > > What do you think?
>      > >
>      > > Best,
>      > > Jingsong Lee
>      > >
>      > > On Mon, Mar 30, 2020 at 3:56 PM Jark Wu <[hidden email]
>     <mailto:[hidden email]>> wrote:
>      > >
>      > > > Hi Kurt,
>      > > >
>      > > > That's good questions.
>      > > >
>      > > > > the meaning of "version"
>      > > > There are two versions in the old design. One is property version
>      > > > "connector.property-version" which can be used for backward
>      > > compatibility.
>      > > > The other one is "connector.version" which defines the version of
>      > > external
>      > > > system, e.g. 0.11" for kafka, "6" or "7" for ES.
>      > > > In this proposal, the "version" is the previous
>     "connector.version".
>      > The
>      > > > ""connector.property-version" is not introduced in new design.
>      > > >
>      > > > > how to keep the old capability which can evolve connector
>     properties
>      > > > The "connector.property-version" is an optional key in the
>     old design
>      > and
>      > > > is never bumped up.
>      > > > I'm not sure how "connector.property-version" should work in the
>      > initial
>      > > > design. Maybe @Timo Walther <[hidden email]
>     <mailto:[hidden email]>> has more knowledge on
>      > > > this.
>      > > > But for the new properties, every options should be expressed as
>      > > > `ConfigOption` which provides `withDeprecatedKeys(...)` method to
>      > easily
>      > > > support evolving keys.
>      > > >
>      > > > > a single keys instead of two, e.g. "kafka-0.11",
>     "kafka-universal"?
>      > > > There are several benefit to use separate "version" key I can
>     see:
>      > > > 1) it's more readable to separete them into different keys,
>     because
>      > they
>      > > > are orthogonal concepts.
>      > > > 2) the planner can give all the availble versions in the
>     exception
>      > > message,
>      > > > if user uses a wrong version (this is often reported in user ML).
>      > > > 3) If we use "kafka-0.11" as connector identifier, we may have to
>      > write a
>      > > > full documentation for each version, because they are different
>      > > > "connector"?
>      > > >     IMO, for 0.11, 0.11, etc... kafka, they are actually the same
>      > > connector
>      > > > but with different "client jar" version,
>      > > >     they share all the same supported property keys and
>     should reside
>      > > > together.
>      > > > 4) IMO, the future vision is version-free. At some point in
>     the future,
>      > > we
>      > > > may don't need users to specify kafka version anymore, and make
>      > > > "version=universal" as optional or removed in the future.
>     This is can
>      > be
>      > > > done easily if they are separate keys.
>      > > >
>      > > > Best,
>      > > > Jark
>      > > >
>      > > >
>      > > > On Mon, 30 Mar 2020 at 14:45, Kurt Young <[hidden email]
>     <mailto:[hidden email]>> wrote:
>      > > >
>      > > > > Hi Jark,
>      > > > >
>      > > > > Thanks for the proposal, I'm +1 to the general idea.
>     However I have a
>      > > > > question about "version",
>      > > > > in the old design, the version seems to be aimed for tracking
>      > property
>      > > > > version, with different
>      > > > > version, we could evolve these step by step without
>     breaking backward
>      > > > > compatibility. But in this
>      > > > > design, version is representing external system's version, like
>      > "0.11"
>      > > > for
>      > > > > kafka, "6" or "7" for ES.
>      > > > > I'm not sure if this is necessary, what's the benefit of
>     using two
>      > keys
>      > > > > instead of one, like "kafka-0.11"
>      > > > > or "ES6" directly? And how about the old capability which
>     could let
>      > us
>      > > > > evolving connector properties?
>      > > > >
>      > > > > Best,
>      > > > > Kurt
>      > > > >
>      > > > >
>      > > > > On Mon, Mar 30, 2020 at 2:36 PM LakeShen
>     <[hidden email] <mailto:[hidden email]>>
>      > > > > wrote:
>      > > > >
>      > > > > > Hi Jark,
>      > > > > >
>      > > > > > I am really looking forward to this feature. I think this
>     feature
>      > > > > > could simplify flink sql code,and at the same time ,
>      > > > > > it could make the developer more easlier to config the
>     flink sql
>      > WITH
>      > > > > > options.
>      > > > > >
>      > > > > > Now when I am using flink sql to write flink task ,
>     sometimes I
>      > think
>      > > > the
>      > > > > > WITH options is too long for user.
>      > > > > > For example,I config the kafka source connector
>     parameter,for
>      > > consumer
>      > > > > > group and brokers parameter:
>      > > > > >
>      > > > > >           'connector.properties.0.key' = 'group.id
>     <http://group.id>'
>      > > > > > >          , 'connector.properties.0.value' = 'xxx'
>      > > > > > >          , 'connector.properties.1.key' =
>     'bootstrap.servers'
>      > > > > > >          , 'connector.properties.1.value' = 'xxxxx'
>      > > > > > >
>      > > > > >
>      > > > > > I can understand this config , but for the flink fresh
>     man,maybe it
>      > > > > > is confused for him.
>      > > > > > In my thought, I am really looking forward to this
>     feature,thank
>      > you
>      > > to
>      > > > > > propose this feature.
>      > > > > >
>      > > > > > Best wishes,
>      > > > > > LakeShen
>      > > > > >
>      > > > > >
>      > > > > > Jark Wu <[hidden email] <mailto:[hidden email]>> 于
>     2020年3月30日周一 下午2:02写道:
>      > > > > >
>      > > > > > > Hi everyone,
>      > > > > > >
>      > > > > > > I want to start a discussion about further improve and
>     simplify
>      > our
>      > > > > > current
>      > > > > > > connector porperty keys, aka WITH options. Currently,
>     we have a
>      > > > > > > 'connector.' prefix for many properties, but they are
>     verbose,
>      > and
>      > > we
>      > > > > > see a
>      > > > > > > big inconsistency between the properties when designing
>     FLIP-107.
>      > > > > > >
>      > > > > > > So we propose to remove all the 'connector.' prefix and
>     rename
>      > > > > > > 'connector.type' to 'connector', 'format.type' to
>     'format'. So a
>      > > new
>      > > > > > Kafka
>      > > > > > > DDL may look like this:
>      > > > > > >
>      > > > > > > CREATE TABLE kafka_table (
>      > > > > > >  ...
>      > > > > > > ) WITH (
>      > > > > > >  'connector' = 'kafka',
>      > > > > > >  'version' = '0.10',
>      > > > > > >  'topic' = 'test-topic',
>      > > > > > >  'startup-mode' = 'earliest-offset',
>      > > > > > >  'properties.bootstrap.servers' = 'localhost:9092',
>      > > > > > >  'properties.group.id <http://properties.group.id>' =
>     'testGroup',
>      > > > > > >  'format' = 'json',
>      > > > > > >  'format.fail-on-missing-field' = 'false'
>      > > > > > > );
>      > > > > > >
>      > > > > > > The new connector property key set will come together
>     with new
>      > > > Factory
>      > > > > > > inferface which is proposed in FLIP-95. Old properties
>     are still
>      > > > > > compatible
>      > > > > > > with their existing implementation. New properties are only
>      > > available
>      > > > > in
>      > > > > > > new DynamicTableFactory implementations.
>      > > > > > >
>      > > > > > > You can access the detailed FLIP here:
>      > > > > > >
>      > > > > > >
>      > > > > >
>      > > > >
>      > > >
>      > >
>      >
>     https://cwiki.apache.org/confluence/display/FLINK/FLIP-122%3A+New+Connector+Property+Keys+for+New+Factory
>      > > > > > >
>      > > > > > > Best,
>      > > > > > > Jark
>      > > > > > >
>      > > > > >
>      > > > >
>      > > >
>      > >
>      > >
>      > > --
>      > > Best, Jingsong Lee
>      > >
>      >
>
>
>
> --
>
> Benchao Li
> School of Electronics Engineering and Computer Science, Peking University
> Tel:+86-15650713730
> Email:[hidden email]  <mailto:[hidden email]>;[hidden email]  <mailto:[hidden email]>
>

Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-122: New Connector Property Keys for New Factory

dwysakowicz
Hi all,

I like the overall design of the FLIP.

As for the withstanding concerns. I kind of like the approach to put the
version into the factory identifier. I think it's the cleanest way to
say that this version actually applies to the connector itself and not
to the system it connects to. BTW, I think the outcome of this
discussion will affect interfaces described in FLIP-95. If we put the
version into the functionIdentifier, do we need Factory#factoryVersion?

I also think it does make sense to have a versioning for the properties
as well. Are we able to read all the current properties with the new
factories? I think we could use the "connector.property-version" to
alternate between different Factory interfaces to support the old set of
properties. Otherwise the new factories need to understand both set of
properties, don't they?

Best,

Dawid

On 30/03/2020 17:07, Timo Walther wrote:

> Hi Jark,
>
> thanks for the FLIP. I don't have a strong opinion on
> DynamicTableFactory#factoryVersion() for distiguishing connector
> versions. We can also include it in the factory identifier itself. For
> Kafka, I would then just use `kafka` because the naming "universal"
> was just a helper solution at that time.
>
> What we need is a good guide for how to design the options. We should
> include that in the JavaDoc of factory interface itself, once it is
> in-place. I like the difference between source and sink in properties.
>
> Regarding "connector.property-version":
>
> It was meant for backwards compatibility along the dimension of
> "property design" compared to "connector version". However, since the
> connector is now responsible for parsing all properties, it is not as
> necessary as it was before.
>
> However, a change of the property design as it is done in FLIP-107
> becomes more difficult without a property version and versioning is
> always a good idea in API world.
>
> Regards,
> Timo
>
>
> On 30.03.20 16:38, Benchao Li wrote:
>> Hi Jark,
>>
>> Thanks for starting the discussion. The FLIP LGTM generally.
>>
>> Regarding connector version VS put version into connector's name,
>> I favor the latter personally, using one property to locate a
>> connector can make the error log more precise.
>>  From the users' side, using one property to match a connector will
>> be easier. Especially we have many connectors,
>> and some of the need version property required, and some of them not.
>>
>> Regarding Jingsong's suggestion,
>> IMO, it's a very good complement to the FLIP. Distinguishing
>> properties for source and sink can be very useful, and
>> also this will make the connector property more precise.
>> We are also sick of this for now, we cannot know whether a DDL is a
>> source or sink unless we look through all queries where
>> the table is used.
>> Even more, some of the required properties are only required for
>> source, bug we cannot leave it blank for sink, and vice versa.
>> I think we can also add a type for dimension tables except source and
>> sink.
>>
>> Kurt Young <[hidden email] <mailto:[hidden email]>> 于2020年3月30日
>> 周一 下午8:16写道:
>>
>>      > It's not possible for the framework to throw such exception.
>>     Because the
>>     framework doesn't know what versions do the connector support.
>>
>>     Not really, we can list all valid connectors framework could
>> found. E.g.
>>     user mistyped 'kafka-0.x', the error message will looks like:
>>
>>     we don't have any connector named "kafka-0.x", but we have:
>>     FileSystem
>>     Kafka-0.10
>>     Kafka-0.11
>>     ElasticSearch6
>>     ElasticSearch7
>>
>>     Best,
>>     Kurt
>>
>>
>>     On Mon, Mar 30, 2020 at 5:11 PM Jark Wu <[hidden email]
>>     <mailto:[hidden email]>> wrote:
>>
>>      > Hi Kurt,
>>      >
>>      > > 2) Lists all available connectors seems also quite
>>     straightforward, e.g
>>      > user provided a wrong "kafka-0.8", we tell user we have
>> candidates of
>>      > "kakfa-0.11", "kafka-universal"
>>      > It's not possible for the framework to throw such exception.
>>     Because the
>>      > framework doesn't know what versions do the connector support.
>>     All the
>>      > version information is a blackbox in the identifier. But with
>>      > `Factory#factoryVersion()` interface, we can know all the
>> supported
>>      > versions.
>>      >
>>      > > 3) I don't think so. We can still treat it as the same
>>     connector but with
>>      > different versions.
>>      > That's true but that's weird. Because from the plain DDL
>>     definition, they
>>      > look like different connectors with different "connector"
>> value, e.g.
>>      > 'connector=kafka-0.8', 'connector=kafka-0.10'.
>>      >
>>      > > If users don't set any version, we will use "kafka-universal"
>>     instead.
>>      > The behavior is inconsistent IMO.
>>      > That is a long term vision when there is no kafka clusters
>> with <0.11
>>      > version.
>>      > At that point, "universal" is the only supported version in Flink
>>     and the
>>      > "version" key can be optional.
>>      >
>>      > ---------------------------------------------
>>      >
>>      > Hi Jingsong,
>>      >
>>      > > "version" vs "kafka.version"
>>      > I though about it. But if we prefix "kafka" to version, we should
>>     prefix
>>      > "kafka" for all other property keys, because they are all kafka
>>     specific
>>      > options.
>>      > However, that will make the property set verbose, see rejected
>>     option#2 in
>>      > the FLIP.
>>      >
>>      > > explicitly separate options for source and sink
>>      > That's a good topic. It's good to have a guideline for the new
>>     property
>>      > keys.
>>      > I'm fine to prefix with a "source"/"sink" for some connector
>> keys.
>>      > Actually, we already do this in some connectors, e.g. jdbc and
>> hbase.
>>      >
>>      > Best,
>>      > Jark
>>      >
>>      > On Mon, 30 Mar 2020 at 16:36, Jingsong Li <[hidden email]
>>     <mailto:[hidden email]>> wrote:
>>      >
>>      > > Thanks Jark for the proposal.
>>      > >
>>      > > +1 to the general idea.
>>      > >
>>      > > For "version", what about "kafka.version"? It is obvious to
>>     know its
>>      > > meaning.
>>      > >
>>      > > And I'd like to start a new topic:
>>      > > Should we need to explicitly separate source from sink?
>>      > > With the development of batch and streaming, more and more
>>     connectors
>>      > have
>>      > > both source and sink.
>>      > >
>>      > > So should we set a rule for table properties:
>>      > > - properties for both source and sink: without prefix, like
>> "topic"
>>      > > - properties for source only: with "source." prefix, like
>>      > > "source.startup-mode"
>>      > > - properties for sink only: with "sink." prefix, like
>>     "sink.partitioner"
>>      > >
>>      > > What do you think?
>>      > >
>>      > > Best,
>>      > > Jingsong Lee
>>      > >
>>      > > On Mon, Mar 30, 2020 at 3:56 PM Jark Wu <[hidden email]
>>     <mailto:[hidden email]>> wrote:
>>      > >
>>      > > > Hi Kurt,
>>      > > >
>>      > > > That's good questions.
>>      > > >
>>      > > > > the meaning of "version"
>>      > > > There are two versions in the old design. One is property
>> version
>>      > > > "connector.property-version" which can be used for backward
>>      > > compatibility.
>>      > > > The other one is "connector.version" which defines the
>> version of
>>      > > external
>>      > > > system, e.g. 0.11" for kafka, "6" or "7" for ES.
>>      > > > In this proposal, the "version" is the previous
>>     "connector.version".
>>      > The
>>      > > > ""connector.property-version" is not introduced in new
>> design.
>>      > > >
>>      > > > > how to keep the old capability which can evolve connector
>>     properties
>>      > > > The "connector.property-version" is an optional key in the
>>     old design
>>      > and
>>      > > > is never bumped up.
>>      > > > I'm not sure how "connector.property-version" should work
>> in the
>>      > initial
>>      > > > design. Maybe @Timo Walther <[hidden email]
>>     <mailto:[hidden email]>> has more knowledge on
>>      > > > this.
>>      > > > But for the new properties, every options should be
>> expressed as
>>      > > > `ConfigOption` which provides `withDeprecatedKeys(...)`
>> method to
>>      > easily
>>      > > > support evolving keys.
>>      > > >
>>      > > > > a single keys instead of two, e.g. "kafka-0.11",
>>     "kafka-universal"?
>>      > > > There are several benefit to use separate "version" key I can
>>     see:
>>      > > > 1) it's more readable to separete them into different keys,
>>     because
>>      > they
>>      > > > are orthogonal concepts.
>>      > > > 2) the planner can give all the availble versions in the
>>     exception
>>      > > message,
>>      > > > if user uses a wrong version (this is often reported in
>> user ML).
>>      > > > 3) If we use "kafka-0.11" as connector identifier, we may
>> have to
>>      > write a
>>      > > > full documentation for each version, because they are
>> different
>>      > > > "connector"?
>>      > > >     IMO, for 0.11, 0.11, etc... kafka, they are actually
>> the same
>>      > > connector
>>      > > > but with different "client jar" version,
>>      > > >     they share all the same supported property keys and
>>     should reside
>>      > > > together.
>>      > > > 4) IMO, the future vision is version-free. At some point in
>>     the future,
>>      > > we
>>      > > > may don't need users to specify kafka version anymore, and
>> make
>>      > > > "version=universal" as optional or removed in the future.
>>     This is can
>>      > be
>>      > > > done easily if they are separate keys.
>>      > > >
>>      > > > Best,
>>      > > > Jark
>>      > > >
>>      > > >
>>      > > > On Mon, 30 Mar 2020 at 14:45, Kurt Young <[hidden email]
>>     <mailto:[hidden email]>> wrote:
>>      > > >
>>      > > > > Hi Jark,
>>      > > > >
>>      > > > > Thanks for the proposal, I'm +1 to the general idea.
>>     However I have a
>>      > > > > question about "version",
>>      > > > > in the old design, the version seems to be aimed for
>> tracking
>>      > property
>>      > > > > version, with different
>>      > > > > version, we could evolve these step by step without
>>     breaking backward
>>      > > > > compatibility. But in this
>>      > > > > design, version is representing external system's
>> version, like
>>      > "0.11"
>>      > > > for
>>      > > > > kafka, "6" or "7" for ES.
>>      > > > > I'm not sure if this is necessary, what's the benefit of
>>     using two
>>      > keys
>>      > > > > instead of one, like "kafka-0.11"
>>      > > > > or "ES6" directly? And how about the old capability which
>>     could let
>>      > us
>>      > > > > evolving connector properties?
>>      > > > >
>>      > > > > Best,
>>      > > > > Kurt
>>      > > > >
>>      > > > >
>>      > > > > On Mon, Mar 30, 2020 at 2:36 PM LakeShen
>>     <[hidden email] <mailto:[hidden email]>>
>>      > > > > wrote:
>>      > > > >
>>      > > > > > Hi Jark,
>>      > > > > >
>>      > > > > > I am really looking forward to this feature. I think this
>>     feature
>>      > > > > > could simplify flink sql code,and at the same time ,
>>      > > > > > it could make the developer more easlier to config the
>>     flink sql
>>      > WITH
>>      > > > > > options.
>>      > > > > >
>>      > > > > > Now when I am using flink sql to write flink task ,
>>     sometimes I
>>      > think
>>      > > > the
>>      > > > > > WITH options is too long for user.
>>      > > > > > For example,I config the kafka source connector
>>     parameter,for
>>      > > consumer
>>      > > > > > group and brokers parameter:
>>      > > > > >
>>      > > > > >           'connector.properties.0.key' = 'group.id
>>     <http://group.id>'
>>      > > > > > >          , 'connector.properties.0.value' = 'xxx'
>>      > > > > > >          , 'connector.properties.1.key' =
>>     'bootstrap.servers'
>>      > > > > > >          , 'connector.properties.1.value' = 'xxxxx'
>>      > > > > > >
>>      > > > > >
>>      > > > > > I can understand this config , but for the flink fresh
>>     man,maybe it
>>      > > > > > is confused for him.
>>      > > > > > In my thought, I am really looking forward to this
>>     feature,thank
>>      > you
>>      > > to
>>      > > > > > propose this feature.
>>      > > > > >
>>      > > > > > Best wishes,
>>      > > > > > LakeShen
>>      > > > > >
>>      > > > > >
>>      > > > > > Jark Wu <[hidden email] <mailto:[hidden email]>> 于
>>     2020年3月30日周一 下午2:02写道:
>>      > > > > >
>>      > > > > > > Hi everyone,
>>      > > > > > >
>>      > > > > > > I want to start a discussion about further improve and
>>     simplify
>>      > our
>>      > > > > > current
>>      > > > > > > connector porperty keys, aka WITH options. Currently,
>>     we have a
>>      > > > > > > 'connector.' prefix for many properties, but they are
>>     verbose,
>>      > and
>>      > > we
>>      > > > > > see a
>>      > > > > > > big inconsistency between the properties when designing
>>     FLIP-107.
>>      > > > > > >
>>      > > > > > > So we propose to remove all the 'connector.' prefix and
>>     rename
>>      > > > > > > 'connector.type' to 'connector', 'format.type' to
>>     'format'. So a
>>      > > new
>>      > > > > > Kafka
>>      > > > > > > DDL may look like this:
>>      > > > > > >
>>      > > > > > > CREATE TABLE kafka_table (
>>      > > > > > >  ...
>>      > > > > > > ) WITH (
>>      > > > > > >  'connector' = 'kafka',
>>      > > > > > >  'version' = '0.10',
>>      > > > > > >  'topic' = 'test-topic',
>>      > > > > > >  'startup-mode' = 'earliest-offset',
>>      > > > > > >  'properties.bootstrap.servers' = 'localhost:9092',
>>      > > > > > >  'properties.group.id <http://properties.group.id>' =
>>     'testGroup',
>>      > > > > > >  'format' = 'json',
>>      > > > > > >  'format.fail-on-missing-field' = 'false'
>>      > > > > > > );
>>      > > > > > >
>>      > > > > > > The new connector property key set will come together
>>     with new
>>      > > > Factory
>>      > > > > > > inferface which is proposed in FLIP-95. Old properties
>>     are still
>>      > > > > > compatible
>>      > > > > > > with their existing implementation. New properties
>> are only
>>      > > available
>>      > > > > in
>>      > > > > > > new DynamicTableFactory implementations.
>>      > > > > > >
>>      > > > > > > You can access the detailed FLIP here:
>>      > > > > > >
>>      > > > > > >
>>      > > > > >
>>      > > > >
>>      > > >
>>      > >
>>      >
>>    
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-122%3A+New+Connector+Property+Keys+for+New+Factory
>>      > > > > > >
>>      > > > > > > Best,
>>      > > > > > > Jark
>>      > > > > > >
>>      > > > > >
>>      > > > >
>>      > > >
>>      > >
>>      > >
>>      > > --
>>      > > Best, Jingsong Lee
>>      > >
>>      >
>>
>>
>>
>> -- 
>>
>> Benchao Li
>> School of Electronics Engineering and Computer Science, Peking
>> University
>> Tel:+86-15650713730
>> Email:[hidden email] 
>> <mailto:[hidden email]>;[hidden email] 
>> <mailto:[hidden email]>
>>
>

signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-122: New Connector Property Keys for New Factory

Jark Wu-2
Hi all,

Thanks for the feedbacks.

It seems that we have a conclusion to put the version into the factory
identifier. I'm also fine with this.
If we have this outcome, the interface of Factory#factoryVersion is not
needed anymore, this can simplify the learning cost of new factory.
We may need to update FLIP-95 and re-vote for it? cc @Timo Walther
<[hidden email]>

Btw, I would like to use "_" instead of "-" as the version delimiter,
because "-" looks like minus and may confuse users, e.g. "elasticsearch-6".
This is not forced, but should be a guilde in the Javadoc of Factory.
I propose to use the following identifiers for existing connectors,

kafka  => kafka for 0.11+ versions, we don't suffix "-universal", because
the meaning of "universal" not easy to understand.
kafka-0.11 => kafka for 0.11 version
kafka-0.10 => kafka for 0.10 version
elasticsearch-6 => elasticsearch for 6.x versions
elasticsearch-7 => elasticsearch for 7.x versions
hbase-1.4 => hbase for 1.4.x versions
jdbc
filesystem

We use "-" as the version delimiter to make them to be more consistent.
This is not forces, users can also use other delimiters or without
delimiter.
But this can be a guilde in the Javadoc of Factory, to make the connector
ecosystem to be more consistent.

What do you think?

------------------------

Regarding "connector.property-version":

Hi @Dawid Wysakowicz <[hidden email]> , the new fatories are
designed not support to read current properties.
All the current properties are routed to the old factories if they are
using "connector.type". Otherwise, properties are routed to new factories.

If I understand correctly, the "connector.property-version" is attched
implicitly by system, not manually set by users.
For example, the framework should add "connector.property-version=1" to
properties when processing DDL statement.
I'm fine to add a "connector.property-version=1" when processing DDL
statement, but I think it's also fine if we don't,
because this can be done in the future if need and the default version can
be 1.

Best,
Jark





On Mon, 30 Mar 2020 at 23:21, Dawid Wysakowicz <[hidden email]>
wrote:

> Hi all,
>
> I like the overall design of the FLIP.
>
> As for the withstanding concerns. I kind of like the approach to put the
> version into the factory identifier. I think it's the cleanest way to
> say that this version actually applies to the connector itself and not
> to the system it connects to. BTW, I think the outcome of this
> discussion will affect interfaces described in FLIP-95. If we put the
> version into the functionIdentifier, do we need Factory#factoryVersion?
>
> I also think it does make sense to have a versioning for the properties
> as well. Are we able to read all the current properties with the new
> factories? I think we could use the "connector.property-version" to
> alternate between different Factory interfaces to support the old set of
> properties. Otherwise the new factories need to understand both set of
> properties, don't they?
>
> Best,
>
> Dawid
>
> On 30/03/2020 17:07, Timo Walther wrote:
> > Hi Jark,
> >
> > thanks for the FLIP. I don't have a strong opinion on
> > DynamicTableFactory#factoryVersion() for distiguishing connector
> > versions. We can also include it in the factory identifier itself. For
> > Kafka, I would then just use `kafka` because the naming "universal"
> > was just a helper solution at that time.
> >
> > What we need is a good guide for how to design the options. We should
> > include that in the JavaDoc of factory interface itself, once it is
> > in-place. I like the difference between source and sink in properties.
> >
> > Regarding "connector.property-version":
> >
> > It was meant for backwards compatibility along the dimension of
> > "property design" compared to "connector version". However, since the
> > connector is now responsible for parsing all properties, it is not as
> > necessary as it was before.
> >
> > However, a change of the property design as it is done in FLIP-107
> > becomes more difficult without a property version and versioning is
> > always a good idea in API world.
> >
> > Regards,
> > Timo
> >
> >
> > On 30.03.20 16:38, Benchao Li wrote:
> >> Hi Jark,
> >>
> >> Thanks for starting the discussion. The FLIP LGTM generally.
> >>
> >> Regarding connector version VS put version into connector's name,
> >> I favor the latter personally, using one property to locate a
> >> connector can make the error log more precise.
> >>  From the users' side, using one property to match a connector will
> >> be easier. Especially we have many connectors,
> >> and some of the need version property required, and some of them not.
> >>
> >> Regarding Jingsong's suggestion,
> >> IMO, it's a very good complement to the FLIP. Distinguishing
> >> properties for source and sink can be very useful, and
> >> also this will make the connector property more precise.
> >> We are also sick of this for now, we cannot know whether a DDL is a
> >> source or sink unless we look through all queries where
> >> the table is used.
> >> Even more, some of the required properties are only required for
> >> source, bug we cannot leave it blank for sink, and vice versa.
> >> I think we can also add a type for dimension tables except source and
> >> sink.
> >>
> >> Kurt Young <[hidden email] <mailto:[hidden email]>> 于2020年3月30日
> >> 周一 下午8:16写道:
> >>
> >>      > It's not possible for the framework to throw such exception.
> >>     Because the
> >>     framework doesn't know what versions do the connector support.
> >>
> >>     Not really, we can list all valid connectors framework could
> >> found. E.g.
> >>     user mistyped 'kafka-0.x', the error message will looks like:
> >>
> >>     we don't have any connector named "kafka-0.x", but we have:
> >>     FileSystem
> >>     Kafka-0.10
> >>     Kafka-0.11
> >>     ElasticSearch6
> >>     ElasticSearch7
> >>
> >>     Best,
> >>     Kurt
> >>
> >>
> >>     On Mon, Mar 30, 2020 at 5:11 PM Jark Wu <[hidden email]
> >>     <mailto:[hidden email]>> wrote:
> >>
> >>      > Hi Kurt,
> >>      >
> >>      > > 2) Lists all available connectors seems also quite
> >>     straightforward, e.g
> >>      > user provided a wrong "kafka-0.8", we tell user we have
> >> candidates of
> >>      > "kakfa-0.11", "kafka-universal"
> >>      > It's not possible for the framework to throw such exception.
> >>     Because the
> >>      > framework doesn't know what versions do the connector support.
> >>     All the
> >>      > version information is a blackbox in the identifier. But with
> >>      > `Factory#factoryVersion()` interface, we can know all the
> >> supported
> >>      > versions.
> >>      >
> >>      > > 3) I don't think so. We can still treat it as the same
> >>     connector but with
> >>      > different versions.
> >>      > That's true but that's weird. Because from the plain DDL
> >>     definition, they
> >>      > look like different connectors with different "connector"
> >> value, e.g.
> >>      > 'connector=kafka-0.8', 'connector=kafka-0.10'.
> >>      >
> >>      > > If users don't set any version, we will use "kafka-universal"
> >>     instead.
> >>      > The behavior is inconsistent IMO.
> >>      > That is a long term vision when there is no kafka clusters
> >> with <0.11
> >>      > version.
> >>      > At that point, "universal" is the only supported version in Flink
> >>     and the
> >>      > "version" key can be optional.
> >>      >
> >>      > ---------------------------------------------
> >>      >
> >>      > Hi Jingsong,
> >>      >
> >>      > > "version" vs "kafka.version"
> >>      > I though about it. But if we prefix "kafka" to version, we should
> >>     prefix
> >>      > "kafka" for all other property keys, because they are all kafka
> >>     specific
> >>      > options.
> >>      > However, that will make the property set verbose, see rejected
> >>     option#2 in
> >>      > the FLIP.
> >>      >
> >>      > > explicitly separate options for source and sink
> >>      > That's a good topic. It's good to have a guideline for the new
> >>     property
> >>      > keys.
> >>      > I'm fine to prefix with a "source"/"sink" for some connector
> >> keys.
> >>      > Actually, we already do this in some connectors, e.g. jdbc and
> >> hbase.
> >>      >
> >>      > Best,
> >>      > Jark
> >>      >
> >>      > On Mon, 30 Mar 2020 at 16:36, Jingsong Li <
> [hidden email]
> >>     <mailto:[hidden email]>> wrote:
> >>      >
> >>      > > Thanks Jark for the proposal.
> >>      > >
> >>      > > +1 to the general idea.
> >>      > >
> >>      > > For "version", what about "kafka.version"? It is obvious to
> >>     know its
> >>      > > meaning.
> >>      > >
> >>      > > And I'd like to start a new topic:
> >>      > > Should we need to explicitly separate source from sink?
> >>      > > With the development of batch and streaming, more and more
> >>     connectors
> >>      > have
> >>      > > both source and sink.
> >>      > >
> >>      > > So should we set a rule for table properties:
> >>      > > - properties for both source and sink: without prefix, like
> >> "topic"
> >>      > > - properties for source only: with "source." prefix, like
> >>      > > "source.startup-mode"
> >>      > > - properties for sink only: with "sink." prefix, like
> >>     "sink.partitioner"
> >>      > >
> >>      > > What do you think?
> >>      > >
> >>      > > Best,
> >>      > > Jingsong Lee
> >>      > >
> >>      > > On Mon, Mar 30, 2020 at 3:56 PM Jark Wu <[hidden email]
> >>     <mailto:[hidden email]>> wrote:
> >>      > >
> >>      > > > Hi Kurt,
> >>      > > >
> >>      > > > That's good questions.
> >>      > > >
> >>      > > > > the meaning of "version"
> >>      > > > There are two versions in the old design. One is property
> >> version
> >>      > > > "connector.property-version" which can be used for backward
> >>      > > compatibility.
> >>      > > > The other one is "connector.version" which defines the
> >> version of
> >>      > > external
> >>      > > > system, e.g. 0.11" for kafka, "6" or "7" for ES.
> >>      > > > In this proposal, the "version" is the previous
> >>     "connector.version".
> >>      > The
> >>      > > > ""connector.property-version" is not introduced in new
> >> design.
> >>      > > >
> >>      > > > > how to keep the old capability which can evolve connector
> >>     properties
> >>      > > > The "connector.property-version" is an optional key in the
> >>     old design
> >>      > and
> >>      > > > is never bumped up.
> >>      > > > I'm not sure how "connector.property-version" should work
> >> in the
> >>      > initial
> >>      > > > design. Maybe @Timo Walther <[hidden email]
> >>     <mailto:[hidden email]>> has more knowledge on
> >>      > > > this.
> >>      > > > But for the new properties, every options should be
> >> expressed as
> >>      > > > `ConfigOption` which provides `withDeprecatedKeys(...)`
> >> method to
> >>      > easily
> >>      > > > support evolving keys.
> >>      > > >
> >>      > > > > a single keys instead of two, e.g. "kafka-0.11",
> >>     "kafka-universal"?
> >>      > > > There are several benefit to use separate "version" key I can
> >>     see:
> >>      > > > 1) it's more readable to separete them into different keys,
> >>     because
> >>      > they
> >>      > > > are orthogonal concepts.
> >>      > > > 2) the planner can give all the availble versions in the
> >>     exception
> >>      > > message,
> >>      > > > if user uses a wrong version (this is often reported in
> >> user ML).
> >>      > > > 3) If we use "kafka-0.11" as connector identifier, we may
> >> have to
> >>      > write a
> >>      > > > full documentation for each version, because they are
> >> different
> >>      > > > "connector"?
> >>      > > >     IMO, for 0.11, 0.11, etc... kafka, they are actually
> >> the same
> >>      > > connector
> >>      > > > but with different "client jar" version,
> >>      > > >     they share all the same supported property keys and
> >>     should reside
> >>      > > > together.
> >>      > > > 4) IMO, the future vision is version-free. At some point in
> >>     the future,
> >>      > > we
> >>      > > > may don't need users to specify kafka version anymore, and
> >> make
> >>      > > > "version=universal" as optional or removed in the future.
> >>     This is can
> >>      > be
> >>      > > > done easily if they are separate keys.
> >>      > > >
> >>      > > > Best,
> >>      > > > Jark
> >>      > > >
> >>      > > >
> >>      > > > On Mon, 30 Mar 2020 at 14:45, Kurt Young <[hidden email]
> >>     <mailto:[hidden email]>> wrote:
> >>      > > >
> >>      > > > > Hi Jark,
> >>      > > > >
> >>      > > > > Thanks for the proposal, I'm +1 to the general idea.
> >>     However I have a
> >>      > > > > question about "version",
> >>      > > > > in the old design, the version seems to be aimed for
> >> tracking
> >>      > property
> >>      > > > > version, with different
> >>      > > > > version, we could evolve these step by step without
> >>     breaking backward
> >>      > > > > compatibility. But in this
> >>      > > > > design, version is representing external system's
> >> version, like
> >>      > "0.11"
> >>      > > > for
> >>      > > > > kafka, "6" or "7" for ES.
> >>      > > > > I'm not sure if this is necessary, what's the benefit of
> >>     using two
> >>      > keys
> >>      > > > > instead of one, like "kafka-0.11"
> >>      > > > > or "ES6" directly? And how about the old capability which
> >>     could let
> >>      > us
> >>      > > > > evolving connector properties?
> >>      > > > >
> >>      > > > > Best,
> >>      > > > > Kurt
> >>      > > > >
> >>      > > > >
> >>      > > > > On Mon, Mar 30, 2020 at 2:36 PM LakeShen
> >>     <[hidden email] <mailto:[hidden email]>>
> >>      > > > > wrote:
> >>      > > > >
> >>      > > > > > Hi Jark,
> >>      > > > > >
> >>      > > > > > I am really looking forward to this feature. I think this
> >>     feature
> >>      > > > > > could simplify flink sql code,and at the same time ,
> >>      > > > > > it could make the developer more easlier to config the
> >>     flink sql
> >>      > WITH
> >>      > > > > > options.
> >>      > > > > >
> >>      > > > > > Now when I am using flink sql to write flink task ,
> >>     sometimes I
> >>      > think
> >>      > > > the
> >>      > > > > > WITH options is too long for user.
> >>      > > > > > For example,I config the kafka source connector
> >>     parameter,for
> >>      > > consumer
> >>      > > > > > group and brokers parameter:
> >>      > > > > >
> >>      > > > > >           'connector.properties.0.key' = 'group.id
> >>     <http://group.id>'
> >>      > > > > > >          , 'connector.properties.0.value' = 'xxx'
> >>      > > > > > >          , 'connector.properties.1.key' =
> >>     'bootstrap.servers'
> >>      > > > > > >          , 'connector.properties.1.value' = 'xxxxx'
> >>      > > > > > >
> >>      > > > > >
> >>      > > > > > I can understand this config , but for the flink fresh
> >>     man,maybe it
> >>      > > > > > is confused for him.
> >>      > > > > > In my thought, I am really looking forward to this
> >>     feature,thank
> >>      > you
> >>      > > to
> >>      > > > > > propose this feature.
> >>      > > > > >
> >>      > > > > > Best wishes,
> >>      > > > > > LakeShen
> >>      > > > > >
> >>      > > > > >
> >>      > > > > > Jark Wu <[hidden email] <mailto:[hidden email]>> 于
> >>     2020年3月30日周一 下午2:02写道:
> >>      > > > > >
> >>      > > > > > > Hi everyone,
> >>      > > > > > >
> >>      > > > > > > I want to start a discussion about further improve and
> >>     simplify
> >>      > our
> >>      > > > > > current
> >>      > > > > > > connector porperty keys, aka WITH options. Currently,
> >>     we have a
> >>      > > > > > > 'connector.' prefix for many properties, but they are
> >>     verbose,
> >>      > and
> >>      > > we
> >>      > > > > > see a
> >>      > > > > > > big inconsistency between the properties when designing
> >>     FLIP-107.
> >>      > > > > > >
> >>      > > > > > > So we propose to remove all the 'connector.' prefix and
> >>     rename
> >>      > > > > > > 'connector.type' to 'connector', 'format.type' to
> >>     'format'. So a
> >>      > > new
> >>      > > > > > Kafka
> >>      > > > > > > DDL may look like this:
> >>      > > > > > >
> >>      > > > > > > CREATE TABLE kafka_table (
> >>      > > > > > >  ...
> >>      > > > > > > ) WITH (
> >>      > > > > > >  'connector' = 'kafka',
> >>      > > > > > >  'version' = '0.10',
> >>      > > > > > >  'topic' = 'test-topic',
> >>      > > > > > >  'startup-mode' = 'earliest-offset',
> >>      > > > > > >  'properties.bootstrap.servers' = 'localhost:9092',
> >>      > > > > > >  'properties.group.id <http://properties.group.id>' =
> >>     'testGroup',
> >>      > > > > > >  'format' = 'json',
> >>      > > > > > >  'format.fail-on-missing-field' = 'false'
> >>      > > > > > > );
> >>      > > > > > >
> >>      > > > > > > The new connector property key set will come together
> >>     with new
> >>      > > > Factory
> >>      > > > > > > inferface which is proposed in FLIP-95. Old properties
> >>     are still
> >>      > > > > > compatible
> >>      > > > > > > with their existing implementation. New properties
> >> are only
> >>      > > available
> >>      > > > > in
> >>      > > > > > > new DynamicTableFactory implementations.
> >>      > > > > > >
> >>      > > > > > > You can access the detailed FLIP here:
> >>      > > > > > >
> >>      > > > > > >
> >>      > > > > >
> >>      > > > >
> >>      > > >
> >>      > >
> >>      >
> >>
> >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-122%3A+New+Connector+Property+Keys+for+New+Factory
> >>      > > > > > >
> >>      > > > > > > Best,
> >>      > > > > > > Jark
> >>      > > > > > >
> >>      > > > > >
> >>      > > > >
> >>      > > >
> >>      > >
> >>      > >
> >>      > > --
> >>      > > Best, Jingsong Lee
> >>      > >
> >>      >
> >>
> >>
> >>
> >> --
> >>
> >> Benchao Li
> >> School of Electronics Engineering and Computer Science, Peking
> >> University
> >> Tel:+86-15650713730
> >> Email:[hidden email]
> >> <mailto:[hidden email]>;[hidden email]
> >> <mailto:[hidden email]>
> >>
> >
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-122: New Connector Property Keys for New Factory

Jark Wu-2
Hi all,

Thanks for the feedbacks.

It seems that we have a conclusion to put the version into the factory
identifier. I'm also fine with this.
If we have this outcome, the interface of Factory#factoryVersion is not
needed anymore, this can simplify the learning cost of new factory.
We may need to update FLIP-95 and re-vote for it? cc @Timo Walther
<[hidden email]>

kafka  => kafka for 0.11+ versions, we don't suffix "-universal", because
the meaning of "universal" not easy to understand.
kafka-0.11 => kafka for 0.11 version
kafka-0.10 => kafka for 0.10 version
elasticsearch-6 => elasticsearch for 6.x versions
elasticsearch-7 => elasticsearch for 7.x versions
hbase-1.4 => hbase for 1.4.x versions
jdbc
filesystem

We use "-" as the version delimiter to make them to be more consistent.
This is not forces, users can also use other delimiters or without
delimiter.
But this can be a guilde in the Javadoc of Factory, to make the connector
ecosystem to be more consistent.

What do you think?

------------------------

Regarding "connector.property-version":

Hi @Dawid Wysakowicz <[hidden email]> , the new fatories are
designed not support to read current properties.
All the current properties are routed to the old factories if they are
using "connector.type". Otherwise, properties are routed to new factories.

If I understand correctly, the "connector.property-version" is attched
implicitly by system, not manually set by users.
For example, the framework should add "connector.property-version=1" to
properties when processing DDL statement.
I'm fine to add a "connector.property-version=1" when processing DDL
statement, but I think it's also fine if we don't,
because this can be done in the future if need and the default version can
be 1.

Best,
Jark

On Tue, 31 Mar 2020 at 00:36, Jark Wu <[hidden email]> wrote:

> Hi all,
>
> Thanks for the feedbacks.
>
> It seems that we have a conclusion to put the version into the factory
> identifier. I'm also fine with this.
> If we have this outcome, the interface of Factory#factoryVersion is not
> needed anymore, this can simplify the learning cost of new factory.
> We may need to update FLIP-95 and re-vote for it? cc @Timo Walther
> <[hidden email]>
>
> Btw, I would like to use "_" instead of "-" as the version delimiter,
> because "-" looks like minus and may confuse users, e.g. "elasticsearch-6".
> This is not forced, but should be a guilde in the Javadoc of Factory.
> I propose to use the following identifiers for existing connectors,
>
> kafka  => kafka for 0.11+ versions, we don't suffix "-universal", because
> the meaning of "universal" not easy to understand.
> kafka-0.11 => kafka for 0.11 version
> kafka-0.10 => kafka for 0.10 version
> elasticsearch-6 => elasticsearch for 6.x versions
> elasticsearch-7 => elasticsearch for 7.x versions
> hbase-1.4 => hbase for 1.4.x versions
> jdbc
> filesystem
>
> We use "-" as the version delimiter to make them to be more consistent.
> This is not forces, users can also use other delimiters or without
> delimiter.
> But this can be a guilde in the Javadoc of Factory, to make the connector
> ecosystem to be more consistent.
>
> What do you think?
>
> ------------------------
>
> Regarding "connector.property-version":
>
> Hi @Dawid Wysakowicz <[hidden email]> , the new fatories are
> designed not support to read current properties.
> All the current properties are routed to the old factories if they are
> using "connector.type". Otherwise, properties are routed to new factories.
>
> If I understand correctly, the "connector.property-version" is attched
> implicitly by system, not manually set by users.
> For example, the framework should add "connector.property-version=1" to
> properties when processing DDL statement.
> I'm fine to add a "connector.property-version=1" when processing DDL
> statement, but I think it's also fine if we don't,
> because this can be done in the future if need and the default version can
> be 1.
>
> Best,
> Jark
>
>
>
>
>
> On Mon, 30 Mar 2020 at 23:21, Dawid Wysakowicz <[hidden email]>
> wrote:
>
>> Hi all,
>>
>> I like the overall design of the FLIP.
>>
>> As for the withstanding concerns. I kind of like the approach to put the
>> version into the factory identifier. I think it's the cleanest way to
>> say that this version actually applies to the connector itself and not
>> to the system it connects to. BTW, I think the outcome of this
>> discussion will affect interfaces described in FLIP-95. If we put the
>> version into the functionIdentifier, do we need Factory#factoryVersion?
>>
>> I also think it does make sense to have a versioning for the properties
>> as well. Are we able to read all the current properties with the new
>> factories? I think we could use the "connector.property-version" to
>> alternate between different Factory interfaces to support the old set of
>> properties. Otherwise the new factories need to understand both set of
>> properties, don't they?
>>
>> Best,
>>
>> Dawid
>>
>> On 30/03/2020 17:07, Timo Walther wrote:
>> > Hi Jark,
>> >
>> > thanks for the FLIP. I don't have a strong opinion on
>> > DynamicTableFactory#factoryVersion() for distiguishing connector
>> > versions. We can also include it in the factory identifier itself. For
>> > Kafka, I would then just use `kafka` because the naming "universal"
>> > was just a helper solution at that time.
>> >
>> > What we need is a good guide for how to design the options. We should
>> > include that in the JavaDoc of factory interface itself, once it is
>> > in-place. I like the difference between source and sink in properties.
>> >
>> > Regarding "connector.property-version":
>> >
>> > It was meant for backwards compatibility along the dimension of
>> > "property design" compared to "connector version". However, since the
>> > connector is now responsible for parsing all properties, it is not as
>> > necessary as it was before.
>> >
>> > However, a change of the property design as it is done in FLIP-107
>> > becomes more difficult without a property version and versioning is
>> > always a good idea in API world.
>> >
>> > Regards,
>> > Timo
>> >
>> >
>> > On 30.03.20 16:38, Benchao Li wrote:
>> >> Hi Jark,
>> >>
>> >> Thanks for starting the discussion. The FLIP LGTM generally.
>> >>
>> >> Regarding connector version VS put version into connector's name,
>> >> I favor the latter personally, using one property to locate a
>> >> connector can make the error log more precise.
>> >>  From the users' side, using one property to match a connector will
>> >> be easier. Especially we have many connectors,
>> >> and some of the need version property required, and some of them not.
>> >>
>> >> Regarding Jingsong's suggestion,
>> >> IMO, it's a very good complement to the FLIP. Distinguishing
>> >> properties for source and sink can be very useful, and
>> >> also this will make the connector property more precise.
>> >> We are also sick of this for now, we cannot know whether a DDL is a
>> >> source or sink unless we look through all queries where
>> >> the table is used.
>> >> Even more, some of the required properties are only required for
>> >> source, bug we cannot leave it blank for sink, and vice versa.
>> >> I think we can also add a type for dimension tables except source and
>> >> sink.
>> >>
>> >> Kurt Young <[hidden email] <mailto:[hidden email]>> 于2020年3月30日
>> >> 周一 下午8:16写道:
>> >>
>> >>      > It's not possible for the framework to throw such exception.
>> >>     Because the
>> >>     framework doesn't know what versions do the connector support.
>> >>
>> >>     Not really, we can list all valid connectors framework could
>> >> found. E.g.
>> >>     user mistyped 'kafka-0.x', the error message will looks like:
>> >>
>> >>     we don't have any connector named "kafka-0.x", but we have:
>> >>     FileSystem
>> >>     Kafka-0.10
>> >>     Kafka-0.11
>> >>     ElasticSearch6
>> >>     ElasticSearch7
>> >>
>> >>     Best,
>> >>     Kurt
>> >>
>> >>
>> >>     On Mon, Mar 30, 2020 at 5:11 PM Jark Wu <[hidden email]
>> >>     <mailto:[hidden email]>> wrote:
>> >>
>> >>      > Hi Kurt,
>> >>      >
>> >>      > > 2) Lists all available connectors seems also quite
>> >>     straightforward, e.g
>> >>      > user provided a wrong "kafka-0.8", we tell user we have
>> >> candidates of
>> >>      > "kakfa-0.11", "kafka-universal"
>> >>      > It's not possible for the framework to throw such exception.
>> >>     Because the
>> >>      > framework doesn't know what versions do the connector support.
>> >>     All the
>> >>      > version information is a blackbox in the identifier. But with
>> >>      > `Factory#factoryVersion()` interface, we can know all the
>> >> supported
>> >>      > versions.
>> >>      >
>> >>      > > 3) I don't think so. We can still treat it as the same
>> >>     connector but with
>> >>      > different versions.
>> >>      > That's true but that's weird. Because from the plain DDL
>> >>     definition, they
>> >>      > look like different connectors with different "connector"
>> >> value, e.g.
>> >>      > 'connector=kafka-0.8', 'connector=kafka-0.10'.
>> >>      >
>> >>      > > If users don't set any version, we will use "kafka-universal"
>> >>     instead.
>> >>      > The behavior is inconsistent IMO.
>> >>      > That is a long term vision when there is no kafka clusters
>> >> with <0.11
>> >>      > version.
>> >>      > At that point, "universal" is the only supported version in
>> Flink
>> >>     and the
>> >>      > "version" key can be optional.
>> >>      >
>> >>      > ---------------------------------------------
>> >>      >
>> >>      > Hi Jingsong,
>> >>      >
>> >>      > > "version" vs "kafka.version"
>> >>      > I though about it. But if we prefix "kafka" to version, we
>> should
>> >>     prefix
>> >>      > "kafka" for all other property keys, because they are all kafka
>> >>     specific
>> >>      > options.
>> >>      > However, that will make the property set verbose, see rejected
>> >>     option#2 in
>> >>      > the FLIP.
>> >>      >
>> >>      > > explicitly separate options for source and sink
>> >>      > That's a good topic. It's good to have a guideline for the new
>> >>     property
>> >>      > keys.
>> >>      > I'm fine to prefix with a "source"/"sink" for some connector
>> >> keys.
>> >>      > Actually, we already do this in some connectors, e.g. jdbc and
>> >> hbase.
>> >>      >
>> >>      > Best,
>> >>      > Jark
>> >>      >
>> >>      > On Mon, 30 Mar 2020 at 16:36, Jingsong Li <
>> [hidden email]
>> >>     <mailto:[hidden email]>> wrote:
>> >>      >
>> >>      > > Thanks Jark for the proposal.
>> >>      > >
>> >>      > > +1 to the general idea.
>> >>      > >
>> >>      > > For "version", what about "kafka.version"? It is obvious to
>> >>     know its
>> >>      > > meaning.
>> >>      > >
>> >>      > > And I'd like to start a new topic:
>> >>      > > Should we need to explicitly separate source from sink?
>> >>      > > With the development of batch and streaming, more and more
>> >>     connectors
>> >>      > have
>> >>      > > both source and sink.
>> >>      > >
>> >>      > > So should we set a rule for table properties:
>> >>      > > - properties for both source and sink: without prefix, like
>> >> "topic"
>> >>      > > - properties for source only: with "source." prefix, like
>> >>      > > "source.startup-mode"
>> >>      > > - properties for sink only: with "sink." prefix, like
>> >>     "sink.partitioner"
>> >>      > >
>> >>      > > What do you think?
>> >>      > >
>> >>      > > Best,
>> >>      > > Jingsong Lee
>> >>      > >
>> >>      > > On Mon, Mar 30, 2020 at 3:56 PM Jark Wu <[hidden email]
>> >>     <mailto:[hidden email]>> wrote:
>> >>      > >
>> >>      > > > Hi Kurt,
>> >>      > > >
>> >>      > > > That's good questions.
>> >>      > > >
>> >>      > > > > the meaning of "version"
>> >>      > > > There are two versions in the old design. One is property
>> >> version
>> >>      > > > "connector.property-version" which can be used for backward
>> >>      > > compatibility.
>> >>      > > > The other one is "connector.version" which defines the
>> >> version of
>> >>      > > external
>> >>      > > > system, e.g. 0.11" for kafka, "6" or "7" for ES.
>> >>      > > > In this proposal, the "version" is the previous
>> >>     "connector.version".
>> >>      > The
>> >>      > > > ""connector.property-version" is not introduced in new
>> >> design.
>> >>      > > >
>> >>      > > > > how to keep the old capability which can evolve connector
>> >>     properties
>> >>      > > > The "connector.property-version" is an optional key in the
>> >>     old design
>> >>      > and
>> >>      > > > is never bumped up.
>> >>      > > > I'm not sure how "connector.property-version" should work
>> >> in the
>> >>      > initial
>> >>      > > > design. Maybe @Timo Walther <[hidden email]
>> >>     <mailto:[hidden email]>> has more knowledge on
>> >>      > > > this.
>> >>      > > > But for the new properties, every options should be
>> >> expressed as
>> >>      > > > `ConfigOption` which provides `withDeprecatedKeys(...)`
>> >> method to
>> >>      > easily
>> >>      > > > support evolving keys.
>> >>      > > >
>> >>      > > > > a single keys instead of two, e.g. "kafka-0.11",
>> >>     "kafka-universal"?
>> >>      > > > There are several benefit to use separate "version" key I
>> can
>> >>     see:
>> >>      > > > 1) it's more readable to separete them into different keys,
>> >>     because
>> >>      > they
>> >>      > > > are orthogonal concepts.
>> >>      > > > 2) the planner can give all the availble versions in the
>> >>     exception
>> >>      > > message,
>> >>      > > > if user uses a wrong version (this is often reported in
>> >> user ML).
>> >>      > > > 3) If we use "kafka-0.11" as connector identifier, we may
>> >> have to
>> >>      > write a
>> >>      > > > full documentation for each version, because they are
>> >> different
>> >>      > > > "connector"?
>> >>      > > >     IMO, for 0.11, 0.11, etc... kafka, they are actually
>> >> the same
>> >>      > > connector
>> >>      > > > but with different "client jar" version,
>> >>      > > >     they share all the same supported property keys and
>> >>     should reside
>> >>      > > > together.
>> >>      > > > 4) IMO, the future vision is version-free. At some point in
>> >>     the future,
>> >>      > > we
>> >>      > > > may don't need users to specify kafka version anymore, and
>> >> make
>> >>      > > > "version=universal" as optional or removed in the future.
>> >>     This is can
>> >>      > be
>> >>      > > > done easily if they are separate keys.
>> >>      > > >
>> >>      > > > Best,
>> >>      > > > Jark
>> >>      > > >
>> >>      > > >
>> >>      > > > On Mon, 30 Mar 2020 at 14:45, Kurt Young <[hidden email]
>> >>     <mailto:[hidden email]>> wrote:
>> >>      > > >
>> >>      > > > > Hi Jark,
>> >>      > > > >
>> >>      > > > > Thanks for the proposal, I'm +1 to the general idea.
>> >>     However I have a
>> >>      > > > > question about "version",
>> >>      > > > > in the old design, the version seems to be aimed for
>> >> tracking
>> >>      > property
>> >>      > > > > version, with different
>> >>      > > > > version, we could evolve these step by step without
>> >>     breaking backward
>> >>      > > > > compatibility. But in this
>> >>      > > > > design, version is representing external system's
>> >> version, like
>> >>      > "0.11"
>> >>      > > > for
>> >>      > > > > kafka, "6" or "7" for ES.
>> >>      > > > > I'm not sure if this is necessary, what's the benefit of
>> >>     using two
>> >>      > keys
>> >>      > > > > instead of one, like "kafka-0.11"
>> >>      > > > > or "ES6" directly? And how about the old capability which
>> >>     could let
>> >>      > us
>> >>      > > > > evolving connector properties?
>> >>      > > > >
>> >>      > > > > Best,
>> >>      > > > > Kurt
>> >>      > > > >
>> >>      > > > >
>> >>      > > > > On Mon, Mar 30, 2020 at 2:36 PM LakeShen
>> >>     <[hidden email] <mailto:[hidden email]>>
>> >>      > > > > wrote:
>> >>      > > > >
>> >>      > > > > > Hi Jark,
>> >>      > > > > >
>> >>      > > > > > I am really looking forward to this feature. I think
>> this
>> >>     feature
>> >>      > > > > > could simplify flink sql code,and at the same time ,
>> >>      > > > > > it could make the developer more easlier to config the
>> >>     flink sql
>> >>      > WITH
>> >>      > > > > > options.
>> >>      > > > > >
>> >>      > > > > > Now when I am using flink sql to write flink task ,
>> >>     sometimes I
>> >>      > think
>> >>      > > > the
>> >>      > > > > > WITH options is too long for user.
>> >>      > > > > > For example,I config the kafka source connector
>> >>     parameter,for
>> >>      > > consumer
>> >>      > > > > > group and brokers parameter:
>> >>      > > > > >
>> >>      > > > > >           'connector.properties.0.key' = 'group.id
>> >>     <http://group.id>'
>> >>      > > > > > >          , 'connector.properties.0.value' = 'xxx'
>> >>      > > > > > >          , 'connector.properties.1.key' =
>> >>     'bootstrap.servers'
>> >>      > > > > > >          , 'connector.properties.1.value' = 'xxxxx'
>> >>      > > > > > >
>> >>      > > > > >
>> >>      > > > > > I can understand this config , but for the flink fresh
>> >>     man,maybe it
>> >>      > > > > > is confused for him.
>> >>      > > > > > In my thought, I am really looking forward to this
>> >>     feature,thank
>> >>      > you
>> >>      > > to
>> >>      > > > > > propose this feature.
>> >>      > > > > >
>> >>      > > > > > Best wishes,
>> >>      > > > > > LakeShen
>> >>      > > > > >
>> >>      > > > > >
>> >>      > > > > > Jark Wu <[hidden email] <mailto:[hidden email]>> 于
>> >>     2020年3月30日周一 下午2:02写道:
>> >>      > > > > >
>> >>      > > > > > > Hi everyone,
>> >>      > > > > > >
>> >>      > > > > > > I want to start a discussion about further improve and
>> >>     simplify
>> >>      > our
>> >>      > > > > > current
>> >>      > > > > > > connector porperty keys, aka WITH options. Currently,
>> >>     we have a
>> >>      > > > > > > 'connector.' prefix for many properties, but they are
>> >>     verbose,
>> >>      > and
>> >>      > > we
>> >>      > > > > > see a
>> >>      > > > > > > big inconsistency between the properties when
>> designing
>> >>     FLIP-107.
>> >>      > > > > > >
>> >>      > > > > > > So we propose to remove all the 'connector.' prefix
>> and
>> >>     rename
>> >>      > > > > > > 'connector.type' to 'connector', 'format.type' to
>> >>     'format'. So a
>> >>      > > new
>> >>      > > > > > Kafka
>> >>      > > > > > > DDL may look like this:
>> >>      > > > > > >
>> >>      > > > > > > CREATE TABLE kafka_table (
>> >>      > > > > > >  ...
>> >>      > > > > > > ) WITH (
>> >>      > > > > > >  'connector' = 'kafka',
>> >>      > > > > > >  'version' = '0.10',
>> >>      > > > > > >  'topic' = 'test-topic',
>> >>      > > > > > >  'startup-mode' = 'earliest-offset',
>> >>      > > > > > >  'properties.bootstrap.servers' = 'localhost:9092',
>> >>      > > > > > >  'properties.group.id <http://properties.group.id>' =
>> >>     'testGroup',
>> >>      > > > > > >  'format' = 'json',
>> >>      > > > > > >  'format.fail-on-missing-field' = 'false'
>> >>      > > > > > > );
>> >>      > > > > > >
>> >>      > > > > > > The new connector property key set will come together
>> >>     with new
>> >>      > > > Factory
>> >>      > > > > > > inferface which is proposed in FLIP-95. Old properties
>> >>     are still
>> >>      > > > > > compatible
>> >>      > > > > > > with their existing implementation. New properties
>> >> are only
>> >>      > > available
>> >>      > > > > in
>> >>      > > > > > > new DynamicTableFactory implementations.
>> >>      > > > > > >
>> >>      > > > > > > You can access the detailed FLIP here:
>> >>      > > > > > >
>> >>      > > > > > >
>> >>      > > > > >
>> >>      > > > >
>> >>      > > >
>> >>      > >
>> >>      >
>> >>
>> >>
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-122%3A+New+Connector+Property+Keys+for+New+Factory
>> >>      > > > > > >
>> >>      > > > > > > Best,
>> >>      > > > > > > Jark
>> >>      > > > > > >
>> >>      > > > > >
>> >>      > > > >
>> >>      > > >
>> >>      > >
>> >>      > >
>> >>      > > --
>> >>      > > Best, Jingsong Lee
>> >>      > >
>> >>      >
>> >>
>> >>
>> >>
>> >> --
>> >>
>> >> Benchao Li
>> >> School of Electronics Engineering and Computer Science, Peking
>> >> University
>> >> Tel:+86-15650713730
>> >> Email:[hidden email]
>> >> <mailto:[hidden email]>;[hidden email]
>> >> <mailto:[hidden email]>
>> >>
>> >
>>
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-122: New Connector Property Keys for New Factory

Kurt Young
Regarding to version delimiter, I'm in favor of using "-", not "_". Not
only because we are using
"-" for config keys in other module, like CoreOptions, DeploymentOptions,
but also "-" is easier
to type than "_" which doesn't need to press "shift" ;-)

Best,
Kurt


On Tue, Mar 31, 2020 at 12:37 AM Jark Wu <[hidden email]> wrote:

> Hi all,
>
> Thanks for the feedbacks.
>
> It seems that we have a conclusion to put the version into the factory
> identifier. I'm also fine with this.
> If we have this outcome, the interface of Factory#factoryVersion is not
> needed anymore, this can simplify the learning cost of new factory.
> We may need to update FLIP-95 and re-vote for it? cc @Timo Walther
> <[hidden email]>
>
> kafka  => kafka for 0.11+ versions, we don't suffix "-universal", because
> the meaning of "universal" not easy to understand.
> kafka-0.11 => kafka for 0.11 version
> kafka-0.10 => kafka for 0.10 version
> elasticsearch-6 => elasticsearch for 6.x versions
> elasticsearch-7 => elasticsearch for 7.x versions
> hbase-1.4 => hbase for 1.4.x versions
> jdbc
> filesystem
>
> We use "-" as the version delimiter to make them to be more consistent.
> This is not forces, users can also use other delimiters or without
> delimiter.
> But this can be a guilde in the Javadoc of Factory, to make the connector
> ecosystem to be more consistent.
>
> What do you think?
>
> ------------------------
>
> Regarding "connector.property-version":
>
> Hi @Dawid Wysakowicz <[hidden email]> , the new fatories are
> designed not support to read current properties.
> All the current properties are routed to the old factories if they are
> using "connector.type". Otherwise, properties are routed to new factories.
>
> If I understand correctly, the "connector.property-version" is attched
> implicitly by system, not manually set by users.
> For example, the framework should add "connector.property-version=1" to
> properties when processing DDL statement.
> I'm fine to add a "connector.property-version=1" when processing DDL
> statement, but I think it's also fine if we don't,
> because this can be done in the future if need and the default version can
> be 1.
>
> Best,
> Jark
>
> On Tue, 31 Mar 2020 at 00:36, Jark Wu <[hidden email]> wrote:
>
> > Hi all,
> >
> > Thanks for the feedbacks.
> >
> > It seems that we have a conclusion to put the version into the factory
> > identifier. I'm also fine with this.
> > If we have this outcome, the interface of Factory#factoryVersion is not
> > needed anymore, this can simplify the learning cost of new factory.
> > We may need to update FLIP-95 and re-vote for it? cc @Timo Walther
> > <[hidden email]>
> >
> > Btw, I would like to use "_" instead of "-" as the version delimiter,
> > because "-" looks like minus and may confuse users, e.g.
> "elasticsearch-6".
> > This is not forced, but should be a guilde in the Javadoc of Factory.
> > I propose to use the following identifiers for existing connectors,
> >
> > kafka  => kafka for 0.11+ versions, we don't suffix "-universal", because
> > the meaning of "universal" not easy to understand.
> > kafka-0.11 => kafka for 0.11 version
> > kafka-0.10 => kafka for 0.10 version
> > elasticsearch-6 => elasticsearch for 6.x versions
> > elasticsearch-7 => elasticsearch for 7.x versions
> > hbase-1.4 => hbase for 1.4.x versions
> > jdbc
> > filesystem
> >
> > We use "-" as the version delimiter to make them to be more consistent.
> > This is not forces, users can also use other delimiters or without
> > delimiter.
> > But this can be a guilde in the Javadoc of Factory, to make the connector
> > ecosystem to be more consistent.
> >
> > What do you think?
> >
> > ------------------------
> >
> > Regarding "connector.property-version":
> >
> > Hi @Dawid Wysakowicz <[hidden email]> , the new fatories are
> > designed not support to read current properties.
> > All the current properties are routed to the old factories if they are
> > using "connector.type". Otherwise, properties are routed to new
> factories.
> >
> > If I understand correctly, the "connector.property-version" is attched
> > implicitly by system, not manually set by users.
> > For example, the framework should add "connector.property-version=1" to
> > properties when processing DDL statement.
> > I'm fine to add a "connector.property-version=1" when processing DDL
> > statement, but I think it's also fine if we don't,
> > because this can be done in the future if need and the default version
> can
> > be 1.
> >
> > Best,
> > Jark
> >
> >
> >
> >
> >
> > On Mon, 30 Mar 2020 at 23:21, Dawid Wysakowicz <[hidden email]>
> > wrote:
> >
> >> Hi all,
> >>
> >> I like the overall design of the FLIP.
> >>
> >> As for the withstanding concerns. I kind of like the approach to put the
> >> version into the factory identifier. I think it's the cleanest way to
> >> say that this version actually applies to the connector itself and not
> >> to the system it connects to. BTW, I think the outcome of this
> >> discussion will affect interfaces described in FLIP-95. If we put the
> >> version into the functionIdentifier, do we need Factory#factoryVersion?
> >>
> >> I also think it does make sense to have a versioning for the properties
> >> as well. Are we able to read all the current properties with the new
> >> factories? I think we could use the "connector.property-version" to
> >> alternate between different Factory interfaces to support the old set of
> >> properties. Otherwise the new factories need to understand both set of
> >> properties, don't they?
> >>
> >> Best,
> >>
> >> Dawid
> >>
> >> On 30/03/2020 17:07, Timo Walther wrote:
> >> > Hi Jark,
> >> >
> >> > thanks for the FLIP. I don't have a strong opinion on
> >> > DynamicTableFactory#factoryVersion() for distiguishing connector
> >> > versions. We can also include it in the factory identifier itself. For
> >> > Kafka, I would then just use `kafka` because the naming "universal"
> >> > was just a helper solution at that time.
> >> >
> >> > What we need is a good guide for how to design the options. We should
> >> > include that in the JavaDoc of factory interface itself, once it is
> >> > in-place. I like the difference between source and sink in properties.
> >> >
> >> > Regarding "connector.property-version":
> >> >
> >> > It was meant for backwards compatibility along the dimension of
> >> > "property design" compared to "connector version". However, since the
> >> > connector is now responsible for parsing all properties, it is not as
> >> > necessary as it was before.
> >> >
> >> > However, a change of the property design as it is done in FLIP-107
> >> > becomes more difficult without a property version and versioning is
> >> > always a good idea in API world.
> >> >
> >> > Regards,
> >> > Timo
> >> >
> >> >
> >> > On 30.03.20 16:38, Benchao Li wrote:
> >> >> Hi Jark,
> >> >>
> >> >> Thanks for starting the discussion. The FLIP LGTM generally.
> >> >>
> >> >> Regarding connector version VS put version into connector's name,
> >> >> I favor the latter personally, using one property to locate a
> >> >> connector can make the error log more precise.
> >> >>  From the users' side, using one property to match a connector will
> >> >> be easier. Especially we have many connectors,
> >> >> and some of the need version property required, and some of them not.
> >> >>
> >> >> Regarding Jingsong's suggestion,
> >> >> IMO, it's a very good complement to the FLIP. Distinguishing
> >> >> properties for source and sink can be very useful, and
> >> >> also this will make the connector property more precise.
> >> >> We are also sick of this for now, we cannot know whether a DDL is a
> >> >> source or sink unless we look through all queries where
> >> >> the table is used.
> >> >> Even more, some of the required properties are only required for
> >> >> source, bug we cannot leave it blank for sink, and vice versa.
> >> >> I think we can also add a type for dimension tables except source and
> >> >> sink.
> >> >>
> >> >> Kurt Young <[hidden email] <mailto:[hidden email]>> 于2020年3月30日
> >> >> 周一 下午8:16写道:
> >> >>
> >> >>      > It's not possible for the framework to throw such exception.
> >> >>     Because the
> >> >>     framework doesn't know what versions do the connector support.
> >> >>
> >> >>     Not really, we can list all valid connectors framework could
> >> >> found. E.g.
> >> >>     user mistyped 'kafka-0.x', the error message will looks like:
> >> >>
> >> >>     we don't have any connector named "kafka-0.x", but we have:
> >> >>     FileSystem
> >> >>     Kafka-0.10
> >> >>     Kafka-0.11
> >> >>     ElasticSearch6
> >> >>     ElasticSearch7
> >> >>
> >> >>     Best,
> >> >>     Kurt
> >> >>
> >> >>
> >> >>     On Mon, Mar 30, 2020 at 5:11 PM Jark Wu <[hidden email]
> >> >>     <mailto:[hidden email]>> wrote:
> >> >>
> >> >>      > Hi Kurt,
> >> >>      >
> >> >>      > > 2) Lists all available connectors seems also quite
> >> >>     straightforward, e.g
> >> >>      > user provided a wrong "kafka-0.8", we tell user we have
> >> >> candidates of
> >> >>      > "kakfa-0.11", "kafka-universal"
> >> >>      > It's not possible for the framework to throw such exception.
> >> >>     Because the
> >> >>      > framework doesn't know what versions do the connector support.
> >> >>     All the
> >> >>      > version information is a blackbox in the identifier. But with
> >> >>      > `Factory#factoryVersion()` interface, we can know all the
> >> >> supported
> >> >>      > versions.
> >> >>      >
> >> >>      > > 3) I don't think so. We can still treat it as the same
> >> >>     connector but with
> >> >>      > different versions.
> >> >>      > That's true but that's weird. Because from the plain DDL
> >> >>     definition, they
> >> >>      > look like different connectors with different "connector"
> >> >> value, e.g.
> >> >>      > 'connector=kafka-0.8', 'connector=kafka-0.10'.
> >> >>      >
> >> >>      > > If users don't set any version, we will use
> "kafka-universal"
> >> >>     instead.
> >> >>      > The behavior is inconsistent IMO.
> >> >>      > That is a long term vision when there is no kafka clusters
> >> >> with <0.11
> >> >>      > version.
> >> >>      > At that point, "universal" is the only supported version in
> >> Flink
> >> >>     and the
> >> >>      > "version" key can be optional.
> >> >>      >
> >> >>      > ---------------------------------------------
> >> >>      >
> >> >>      > Hi Jingsong,
> >> >>      >
> >> >>      > > "version" vs "kafka.version"
> >> >>      > I though about it. But if we prefix "kafka" to version, we
> >> should
> >> >>     prefix
> >> >>      > "kafka" for all other property keys, because they are all
> kafka
> >> >>     specific
> >> >>      > options.
> >> >>      > However, that will make the property set verbose, see rejected
> >> >>     option#2 in
> >> >>      > the FLIP.
> >> >>      >
> >> >>      > > explicitly separate options for source and sink
> >> >>      > That's a good topic. It's good to have a guideline for the new
> >> >>     property
> >> >>      > keys.
> >> >>      > I'm fine to prefix with a "source"/"sink" for some connector
> >> >> keys.
> >> >>      > Actually, we already do this in some connectors, e.g. jdbc and
> >> >> hbase.
> >> >>      >
> >> >>      > Best,
> >> >>      > Jark
> >> >>      >
> >> >>      > On Mon, 30 Mar 2020 at 16:36, Jingsong Li <
> >> [hidden email]
> >> >>     <mailto:[hidden email]>> wrote:
> >> >>      >
> >> >>      > > Thanks Jark for the proposal.
> >> >>      > >
> >> >>      > > +1 to the general idea.
> >> >>      > >
> >> >>      > > For "version", what about "kafka.version"? It is obvious to
> >> >>     know its
> >> >>      > > meaning.
> >> >>      > >
> >> >>      > > And I'd like to start a new topic:
> >> >>      > > Should we need to explicitly separate source from sink?
> >> >>      > > With the development of batch and streaming, more and more
> >> >>     connectors
> >> >>      > have
> >> >>      > > both source and sink.
> >> >>      > >
> >> >>      > > So should we set a rule for table properties:
> >> >>      > > - properties for both source and sink: without prefix, like
> >> >> "topic"
> >> >>      > > - properties for source only: with "source." prefix, like
> >> >>      > > "source.startup-mode"
> >> >>      > > - properties for sink only: with "sink." prefix, like
> >> >>     "sink.partitioner"
> >> >>      > >
> >> >>      > > What do you think?
> >> >>      > >
> >> >>      > > Best,
> >> >>      > > Jingsong Lee
> >> >>      > >
> >> >>      > > On Mon, Mar 30, 2020 at 3:56 PM Jark Wu <[hidden email]
> >> >>     <mailto:[hidden email]>> wrote:
> >> >>      > >
> >> >>      > > > Hi Kurt,
> >> >>      > > >
> >> >>      > > > That's good questions.
> >> >>      > > >
> >> >>      > > > > the meaning of "version"
> >> >>      > > > There are two versions in the old design. One is property
> >> >> version
> >> >>      > > > "connector.property-version" which can be used for
> backward
> >> >>      > > compatibility.
> >> >>      > > > The other one is "connector.version" which defines the
> >> >> version of
> >> >>      > > external
> >> >>      > > > system, e.g. 0.11" for kafka, "6" or "7" for ES.
> >> >>      > > > In this proposal, the "version" is the previous
> >> >>     "connector.version".
> >> >>      > The
> >> >>      > > > ""connector.property-version" is not introduced in new
> >> >> design.
> >> >>      > > >
> >> >>      > > > > how to keep the old capability which can evolve
> connector
> >> >>     properties
> >> >>      > > > The "connector.property-version" is an optional key in the
> >> >>     old design
> >> >>      > and
> >> >>      > > > is never bumped up.
> >> >>      > > > I'm not sure how "connector.property-version" should work
> >> >> in the
> >> >>      > initial
> >> >>      > > > design. Maybe @Timo Walther <[hidden email]
> >> >>     <mailto:[hidden email]>> has more knowledge on
> >> >>      > > > this.
> >> >>      > > > But for the new properties, every options should be
> >> >> expressed as
> >> >>      > > > `ConfigOption` which provides `withDeprecatedKeys(...)`
> >> >> method to
> >> >>      > easily
> >> >>      > > > support evolving keys.
> >> >>      > > >
> >> >>      > > > > a single keys instead of two, e.g. "kafka-0.11",
> >> >>     "kafka-universal"?
> >> >>      > > > There are several benefit to use separate "version" key I
> >> can
> >> >>     see:
> >> >>      > > > 1) it's more readable to separete them into different
> keys,
> >> >>     because
> >> >>      > they
> >> >>      > > > are orthogonal concepts.
> >> >>      > > > 2) the planner can give all the availble versions in the
> >> >>     exception
> >> >>      > > message,
> >> >>      > > > if user uses a wrong version (this is often reported in
> >> >> user ML).
> >> >>      > > > 3) If we use "kafka-0.11" as connector identifier, we may
> >> >> have to
> >> >>      > write a
> >> >>      > > > full documentation for each version, because they are
> >> >> different
> >> >>      > > > "connector"?
> >> >>      > > >     IMO, for 0.11, 0.11, etc... kafka, they are actually
> >> >> the same
> >> >>      > > connector
> >> >>      > > > but with different "client jar" version,
> >> >>      > > >     they share all the same supported property keys and
> >> >>     should reside
> >> >>      > > > together.
> >> >>      > > > 4) IMO, the future vision is version-free. At some point
> in
> >> >>     the future,
> >> >>      > > we
> >> >>      > > > may don't need users to specify kafka version anymore, and
> >> >> make
> >> >>      > > > "version=universal" as optional or removed in the future.
> >> >>     This is can
> >> >>      > be
> >> >>      > > > done easily if they are separate keys.
> >> >>      > > >
> >> >>      > > > Best,
> >> >>      > > > Jark
> >> >>      > > >
> >> >>      > > >
> >> >>      > > > On Mon, 30 Mar 2020 at 14:45, Kurt Young <
> [hidden email]
> >> >>     <mailto:[hidden email]>> wrote:
> >> >>      > > >
> >> >>      > > > > Hi Jark,
> >> >>      > > > >
> >> >>      > > > > Thanks for the proposal, I'm +1 to the general idea.
> >> >>     However I have a
> >> >>      > > > > question about "version",
> >> >>      > > > > in the old design, the version seems to be aimed for
> >> >> tracking
> >> >>      > property
> >> >>      > > > > version, with different
> >> >>      > > > > version, we could evolve these step by step without
> >> >>     breaking backward
> >> >>      > > > > compatibility. But in this
> >> >>      > > > > design, version is representing external system's
> >> >> version, like
> >> >>      > "0.11"
> >> >>      > > > for
> >> >>      > > > > kafka, "6" or "7" for ES.
> >> >>      > > > > I'm not sure if this is necessary, what's the benefit of
> >> >>     using two
> >> >>      > keys
> >> >>      > > > > instead of one, like "kafka-0.11"
> >> >>      > > > > or "ES6" directly? And how about the old capability
> which
> >> >>     could let
> >> >>      > us
> >> >>      > > > > evolving connector properties?
> >> >>      > > > >
> >> >>      > > > > Best,
> >> >>      > > > > Kurt
> >> >>      > > > >
> >> >>      > > > >
> >> >>      > > > > On Mon, Mar 30, 2020 at 2:36 PM LakeShen
> >> >>     <[hidden email] <mailto:[hidden email]>>
> >> >>      > > > > wrote:
> >> >>      > > > >
> >> >>      > > > > > Hi Jark,
> >> >>      > > > > >
> >> >>      > > > > > I am really looking forward to this feature. I think
> >> this
> >> >>     feature
> >> >>      > > > > > could simplify flink sql code,and at the same time ,
> >> >>      > > > > > it could make the developer more easlier to config the
> >> >>     flink sql
> >> >>      > WITH
> >> >>      > > > > > options.
> >> >>      > > > > >
> >> >>      > > > > > Now when I am using flink sql to write flink task ,
> >> >>     sometimes I
> >> >>      > think
> >> >>      > > > the
> >> >>      > > > > > WITH options is too long for user.
> >> >>      > > > > > For example,I config the kafka source connector
> >> >>     parameter,for
> >> >>      > > consumer
> >> >>      > > > > > group and brokers parameter:
> >> >>      > > > > >
> >> >>      > > > > >           'connector.properties.0.key' = 'group.id
> >> >>     <http://group.id>'
> >> >>      > > > > > >          , 'connector.properties.0.value' = 'xxx'
> >> >>      > > > > > >          , 'connector.properties.1.key' =
> >> >>     'bootstrap.servers'
> >> >>      > > > > > >          , 'connector.properties.1.value' = 'xxxxx'
> >> >>      > > > > > >
> >> >>      > > > > >
> >> >>      > > > > > I can understand this config , but for the flink fresh
> >> >>     man,maybe it
> >> >>      > > > > > is confused for him.
> >> >>      > > > > > In my thought, I am really looking forward to this
> >> >>     feature,thank
> >> >>      > you
> >> >>      > > to
> >> >>      > > > > > propose this feature.
> >> >>      > > > > >
> >> >>      > > > > > Best wishes,
> >> >>      > > > > > LakeShen
> >> >>      > > > > >
> >> >>      > > > > >
> >> >>      > > > > > Jark Wu <[hidden email] <mailto:[hidden email]>>
> 于
> >> >>     2020年3月30日周一 下午2:02写道:
> >> >>      > > > > >
> >> >>      > > > > > > Hi everyone,
> >> >>      > > > > > >
> >> >>      > > > > > > I want to start a discussion about further improve
> and
> >> >>     simplify
> >> >>      > our
> >> >>      > > > > > current
> >> >>      > > > > > > connector porperty keys, aka WITH options.
> Currently,
> >> >>     we have a
> >> >>      > > > > > > 'connector.' prefix for many properties, but they
> are
> >> >>     verbose,
> >> >>      > and
> >> >>      > > we
> >> >>      > > > > > see a
> >> >>      > > > > > > big inconsistency between the properties when
> >> designing
> >> >>     FLIP-107.
> >> >>      > > > > > >
> >> >>      > > > > > > So we propose to remove all the 'connector.' prefix
> >> and
> >> >>     rename
> >> >>      > > > > > > 'connector.type' to 'connector', 'format.type' to
> >> >>     'format'. So a
> >> >>      > > new
> >> >>      > > > > > Kafka
> >> >>      > > > > > > DDL may look like this:
> >> >>      > > > > > >
> >> >>      > > > > > > CREATE TABLE kafka_table (
> >> >>      > > > > > >  ...
> >> >>      > > > > > > ) WITH (
> >> >>      > > > > > >  'connector' = 'kafka',
> >> >>      > > > > > >  'version' = '0.10',
> >> >>      > > > > > >  'topic' = 'test-topic',
> >> >>      > > > > > >  'startup-mode' = 'earliest-offset',
> >> >>      > > > > > >  'properties.bootstrap.servers' = 'localhost:9092',
> >> >>      > > > > > >  'properties.group.id <http://properties.group.id>'
> =
> >> >>     'testGroup',
> >> >>      > > > > > >  'format' = 'json',
> >> >>      > > > > > >  'format.fail-on-missing-field' = 'false'
> >> >>      > > > > > > );
> >> >>      > > > > > >
> >> >>      > > > > > > The new connector property key set will come
> together
> >> >>     with new
> >> >>      > > > Factory
> >> >>      > > > > > > inferface which is proposed in FLIP-95. Old
> properties
> >> >>     are still
> >> >>      > > > > > compatible
> >> >>      > > > > > > with their existing implementation. New properties
> >> >> are only
> >> >>      > > available
> >> >>      > > > > in
> >> >>      > > > > > > new DynamicTableFactory implementations.
> >> >>      > > > > > >
> >> >>      > > > > > > You can access the detailed FLIP here:
> >> >>      > > > > > >
> >> >>      > > > > > >
> >> >>      > > > > >
> >> >>      > > > >
> >> >>      > > >
> >> >>      > >
> >> >>      >
> >> >>
> >> >>
> >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-122%3A+New+Connector+Property+Keys+for+New+Factory
> >> >>      > > > > > >
> >> >>      > > > > > > Best,
> >> >>      > > > > > > Jark
> >> >>      > > > > > >
> >> >>      > > > > >
> >> >>      > > > >
> >> >>      > > >
> >> >>      > >
> >> >>      > >
> >> >>      > > --
> >> >>      > > Best, Jingsong Lee
> >> >>      > >
> >> >>      >
> >> >>
> >> >>
> >> >>
> >> >> --
> >> >>
> >> >> Benchao Li
> >> >> School of Electronics Engineering and Computer Science, Peking
> >> >> University
> >> >> Tel:+86-15650713730
> >> >> Email:[hidden email]
> >> >> <mailto:[hidden email]>;[hidden email]
> >> >> <mailto:[hidden email]>
> >> >>
> >> >
> >>
> >
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-122: New Connector Property Keys for New Factory

Kurt Young
Regarding to "property-version", even this is brought up by myself, but I'm
actually not a fan of this.
As a user, I think this property is too much for me to learn, and I
probably won't use this anyway.
If framework can follow a good procedure to upgrade properties, show some
error messages (but still
can work) when user used some deprecated properties, and then finally drop
them 2-3 versions later.
I think this is good enough.

Best,
Kurt


On Tue, Mar 31, 2020 at 9:07 AM Kurt Young <[hidden email]> wrote:

> Regarding to version delimiter, I'm in favor of using "-", not "_". Not
> only because we are using
> "-" for config keys in other module, like CoreOptions, DeploymentOptions,
> but also "-" is easier
> to type than "_" which doesn't need to press "shift" ;-)
>
> Best,
> Kurt
>
>
> On Tue, Mar 31, 2020 at 12:37 AM Jark Wu <[hidden email]> wrote:
>
>> Hi all,
>>
>> Thanks for the feedbacks.
>>
>> It seems that we have a conclusion to put the version into the factory
>> identifier. I'm also fine with this.
>> If we have this outcome, the interface of Factory#factoryVersion is not
>> needed anymore, this can simplify the learning cost of new factory.
>> We may need to update FLIP-95 and re-vote for it? cc @Timo Walther
>> <[hidden email]>
>>
>> kafka  => kafka for 0.11+ versions, we don't suffix "-universal", because
>> the meaning of "universal" not easy to understand.
>> kafka-0.11 => kafka for 0.11 version
>> kafka-0.10 => kafka for 0.10 version
>> elasticsearch-6 => elasticsearch for 6.x versions
>> elasticsearch-7 => elasticsearch for 7.x versions
>> hbase-1.4 => hbase for 1.4.x versions
>> jdbc
>> filesystem
>>
>> We use "-" as the version delimiter to make them to be more consistent.
>> This is not forces, users can also use other delimiters or without
>> delimiter.
>> But this can be a guilde in the Javadoc of Factory, to make the connector
>> ecosystem to be more consistent.
>>
>> What do you think?
>>
>> ------------------------
>>
>> Regarding "connector.property-version":
>>
>> Hi @Dawid Wysakowicz <[hidden email]> , the new fatories are
>> designed not support to read current properties.
>> All the current properties are routed to the old factories if they are
>> using "connector.type". Otherwise, properties are routed to new factories.
>>
>> If I understand correctly, the "connector.property-version" is attched
>> implicitly by system, not manually set by users.
>> For example, the framework should add "connector.property-version=1" to
>> properties when processing DDL statement.
>> I'm fine to add a "connector.property-version=1" when processing DDL
>> statement, but I think it's also fine if we don't,
>> because this can be done in the future if need and the default version can
>> be 1.
>>
>> Best,
>> Jark
>>
>> On Tue, 31 Mar 2020 at 00:36, Jark Wu <[hidden email]> wrote:
>>
>> > Hi all,
>> >
>> > Thanks for the feedbacks.
>> >
>> > It seems that we have a conclusion to put the version into the factory
>> > identifier. I'm also fine with this.
>> > If we have this outcome, the interface of Factory#factoryVersion is not
>> > needed anymore, this can simplify the learning cost of new factory.
>> > We may need to update FLIP-95 and re-vote for it? cc @Timo Walther
>> > <[hidden email]>
>> >
>> > Btw, I would like to use "_" instead of "-" as the version delimiter,
>> > because "-" looks like minus and may confuse users, e.g.
>> "elasticsearch-6".
>> > This is not forced, but should be a guilde in the Javadoc of Factory.
>> > I propose to use the following identifiers for existing connectors,
>> >
>> > kafka  => kafka for 0.11+ versions, we don't suffix "-universal",
>> because
>> > the meaning of "universal" not easy to understand.
>> > kafka-0.11 => kafka for 0.11 version
>> > kafka-0.10 => kafka for 0.10 version
>> > elasticsearch-6 => elasticsearch for 6.x versions
>> > elasticsearch-7 => elasticsearch for 7.x versions
>> > hbase-1.4 => hbase for 1.4.x versions
>> > jdbc
>> > filesystem
>> >
>> > We use "-" as the version delimiter to make them to be more consistent.
>> > This is not forces, users can also use other delimiters or without
>> > delimiter.
>> > But this can be a guilde in the Javadoc of Factory, to make the
>> connector
>> > ecosystem to be more consistent.
>> >
>> > What do you think?
>> >
>> > ------------------------
>> >
>> > Regarding "connector.property-version":
>> >
>> > Hi @Dawid Wysakowicz <[hidden email]> , the new fatories are
>> > designed not support to read current properties.
>> > All the current properties are routed to the old factories if they are
>> > using "connector.type". Otherwise, properties are routed to new
>> factories.
>> >
>> > If I understand correctly, the "connector.property-version" is attched
>> > implicitly by system, not manually set by users.
>> > For example, the framework should add "connector.property-version=1" to
>> > properties when processing DDL statement.
>> > I'm fine to add a "connector.property-version=1" when processing DDL
>> > statement, but I think it's also fine if we don't,
>> > because this can be done in the future if need and the default version
>> can
>> > be 1.
>> >
>> > Best,
>> > Jark
>> >
>> >
>> >
>> >
>> >
>> > On Mon, 30 Mar 2020 at 23:21, Dawid Wysakowicz <[hidden email]>
>> > wrote:
>> >
>> >> Hi all,
>> >>
>> >> I like the overall design of the FLIP.
>> >>
>> >> As for the withstanding concerns. I kind of like the approach to put
>> the
>> >> version into the factory identifier. I think it's the cleanest way to
>> >> say that this version actually applies to the connector itself and not
>> >> to the system it connects to. BTW, I think the outcome of this
>> >> discussion will affect interfaces described in FLIP-95. If we put the
>> >> version into the functionIdentifier, do we need Factory#factoryVersion?
>> >>
>> >> I also think it does make sense to have a versioning for the properties
>> >> as well. Are we able to read all the current properties with the new
>> >> factories? I think we could use the "connector.property-version" to
>> >> alternate between different Factory interfaces to support the old set
>> of
>> >> properties. Otherwise the new factories need to understand both set of
>> >> properties, don't they?
>> >>
>> >> Best,
>> >>
>> >> Dawid
>> >>
>> >> On 30/03/2020 17:07, Timo Walther wrote:
>> >> > Hi Jark,
>> >> >
>> >> > thanks for the FLIP. I don't have a strong opinion on
>> >> > DynamicTableFactory#factoryVersion() for distiguishing connector
>> >> > versions. We can also include it in the factory identifier itself.
>> For
>> >> > Kafka, I would then just use `kafka` because the naming "universal"
>> >> > was just a helper solution at that time.
>> >> >
>> >> > What we need is a good guide for how to design the options. We should
>> >> > include that in the JavaDoc of factory interface itself, once it is
>> >> > in-place. I like the difference between source and sink in
>> properties.
>> >> >
>> >> > Regarding "connector.property-version":
>> >> >
>> >> > It was meant for backwards compatibility along the dimension of
>> >> > "property design" compared to "connector version". However, since the
>> >> > connector is now responsible for parsing all properties, it is not as
>> >> > necessary as it was before.
>> >> >
>> >> > However, a change of the property design as it is done in FLIP-107
>> >> > becomes more difficult without a property version and versioning is
>> >> > always a good idea in API world.
>> >> >
>> >> > Regards,
>> >> > Timo
>> >> >
>> >> >
>> >> > On 30.03.20 16:38, Benchao Li wrote:
>> >> >> Hi Jark,
>> >> >>
>> >> >> Thanks for starting the discussion. The FLIP LGTM generally.
>> >> >>
>> >> >> Regarding connector version VS put version into connector's name,
>> >> >> I favor the latter personally, using one property to locate a
>> >> >> connector can make the error log more precise.
>> >> >>  From the users' side, using one property to match a connector will
>> >> >> be easier. Especially we have many connectors,
>> >> >> and some of the need version property required, and some of them
>> not.
>> >> >>
>> >> >> Regarding Jingsong's suggestion,
>> >> >> IMO, it's a very good complement to the FLIP. Distinguishing
>> >> >> properties for source and sink can be very useful, and
>> >> >> also this will make the connector property more precise.
>> >> >> We are also sick of this for now, we cannot know whether a DDL is a
>> >> >> source or sink unless we look through all queries where
>> >> >> the table is used.
>> >> >> Even more, some of the required properties are only required for
>> >> >> source, bug we cannot leave it blank for sink, and vice versa.
>> >> >> I think we can also add a type for dimension tables except source
>> and
>> >> >> sink.
>> >> >>
>> >> >> Kurt Young <[hidden email] <mailto:[hidden email]>> 于2020年3月30日
>> >> >> 周一 下午8:16写道:
>> >> >>
>> >> >>      > It's not possible for the framework to throw such exception.
>> >> >>     Because the
>> >> >>     framework doesn't know what versions do the connector support.
>> >> >>
>> >> >>     Not really, we can list all valid connectors framework could
>> >> >> found. E.g.
>> >> >>     user mistyped 'kafka-0.x', the error message will looks like:
>> >> >>
>> >> >>     we don't have any connector named "kafka-0.x", but we have:
>> >> >>     FileSystem
>> >> >>     Kafka-0.10
>> >> >>     Kafka-0.11
>> >> >>     ElasticSearch6
>> >> >>     ElasticSearch7
>> >> >>
>> >> >>     Best,
>> >> >>     Kurt
>> >> >>
>> >> >>
>> >> >>     On Mon, Mar 30, 2020 at 5:11 PM Jark Wu <[hidden email]
>> >> >>     <mailto:[hidden email]>> wrote:
>> >> >>
>> >> >>      > Hi Kurt,
>> >> >>      >
>> >> >>      > > 2) Lists all available connectors seems also quite
>> >> >>     straightforward, e.g
>> >> >>      > user provided a wrong "kafka-0.8", we tell user we have
>> >> >> candidates of
>> >> >>      > "kakfa-0.11", "kafka-universal"
>> >> >>      > It's not possible for the framework to throw such exception.
>> >> >>     Because the
>> >> >>      > framework doesn't know what versions do the connector
>> support.
>> >> >>     All the
>> >> >>      > version information is a blackbox in the identifier. But with
>> >> >>      > `Factory#factoryVersion()` interface, we can know all the
>> >> >> supported
>> >> >>      > versions.
>> >> >>      >
>> >> >>      > > 3) I don't think so. We can still treat it as the same
>> >> >>     connector but with
>> >> >>      > different versions.
>> >> >>      > That's true but that's weird. Because from the plain DDL
>> >> >>     definition, they
>> >> >>      > look like different connectors with different "connector"
>> >> >> value, e.g.
>> >> >>      > 'connector=kafka-0.8', 'connector=kafka-0.10'.
>> >> >>      >
>> >> >>      > > If users don't set any version, we will use
>> "kafka-universal"
>> >> >>     instead.
>> >> >>      > The behavior is inconsistent IMO.
>> >> >>      > That is a long term vision when there is no kafka clusters
>> >> >> with <0.11
>> >> >>      > version.
>> >> >>      > At that point, "universal" is the only supported version in
>> >> Flink
>> >> >>     and the
>> >> >>      > "version" key can be optional.
>> >> >>      >
>> >> >>      > ---------------------------------------------
>> >> >>      >
>> >> >>      > Hi Jingsong,
>> >> >>      >
>> >> >>      > > "version" vs "kafka.version"
>> >> >>      > I though about it. But if we prefix "kafka" to version, we
>> >> should
>> >> >>     prefix
>> >> >>      > "kafka" for all other property keys, because they are all
>> kafka
>> >> >>     specific
>> >> >>      > options.
>> >> >>      > However, that will make the property set verbose, see
>> rejected
>> >> >>     option#2 in
>> >> >>      > the FLIP.
>> >> >>      >
>> >> >>      > > explicitly separate options for source and sink
>> >> >>      > That's a good topic. It's good to have a guideline for the
>> new
>> >> >>     property
>> >> >>      > keys.
>> >> >>      > I'm fine to prefix with a "source"/"sink" for some connector
>> >> >> keys.
>> >> >>      > Actually, we already do this in some connectors, e.g. jdbc
>> and
>> >> >> hbase.
>> >> >>      >
>> >> >>      > Best,
>> >> >>      > Jark
>> >> >>      >
>> >> >>      > On Mon, 30 Mar 2020 at 16:36, Jingsong Li <
>> >> [hidden email]
>> >> >>     <mailto:[hidden email]>> wrote:
>> >> >>      >
>> >> >>      > > Thanks Jark for the proposal.
>> >> >>      > >
>> >> >>      > > +1 to the general idea.
>> >> >>      > >
>> >> >>      > > For "version", what about "kafka.version"? It is obvious to
>> >> >>     know its
>> >> >>      > > meaning.
>> >> >>      > >
>> >> >>      > > And I'd like to start a new topic:
>> >> >>      > > Should we need to explicitly separate source from sink?
>> >> >>      > > With the development of batch and streaming, more and more
>> >> >>     connectors
>> >> >>      > have
>> >> >>      > > both source and sink.
>> >> >>      > >
>> >> >>      > > So should we set a rule for table properties:
>> >> >>      > > - properties for both source and sink: without prefix, like
>> >> >> "topic"
>> >> >>      > > - properties for source only: with "source." prefix, like
>> >> >>      > > "source.startup-mode"
>> >> >>      > > - properties for sink only: with "sink." prefix, like
>> >> >>     "sink.partitioner"
>> >> >>      > >
>> >> >>      > > What do you think?
>> >> >>      > >
>> >> >>      > > Best,
>> >> >>      > > Jingsong Lee
>> >> >>      > >
>> >> >>      > > On Mon, Mar 30, 2020 at 3:56 PM Jark Wu <[hidden email]
>> >> >>     <mailto:[hidden email]>> wrote:
>> >> >>      > >
>> >> >>      > > > Hi Kurt,
>> >> >>      > > >
>> >> >>      > > > That's good questions.
>> >> >>      > > >
>> >> >>      > > > > the meaning of "version"
>> >> >>      > > > There are two versions in the old design. One is property
>> >> >> version
>> >> >>      > > > "connector.property-version" which can be used for
>> backward
>> >> >>      > > compatibility.
>> >> >>      > > > The other one is "connector.version" which defines the
>> >> >> version of
>> >> >>      > > external
>> >> >>      > > > system, e.g. 0.11" for kafka, "6" or "7" for ES.
>> >> >>      > > > In this proposal, the "version" is the previous
>> >> >>     "connector.version".
>> >> >>      > The
>> >> >>      > > > ""connector.property-version" is not introduced in new
>> >> >> design.
>> >> >>      > > >
>> >> >>      > > > > how to keep the old capability which can evolve
>> connector
>> >> >>     properties
>> >> >>      > > > The "connector.property-version" is an optional key in
>> the
>> >> >>     old design
>> >> >>      > and
>> >> >>      > > > is never bumped up.
>> >> >>      > > > I'm not sure how "connector.property-version" should work
>> >> >> in the
>> >> >>      > initial
>> >> >>      > > > design. Maybe @Timo Walther <[hidden email]
>> >> >>     <mailto:[hidden email]>> has more knowledge on
>> >> >>      > > > this.
>> >> >>      > > > But for the new properties, every options should be
>> >> >> expressed as
>> >> >>      > > > `ConfigOption` which provides `withDeprecatedKeys(...)`
>> >> >> method to
>> >> >>      > easily
>> >> >>      > > > support evolving keys.
>> >> >>      > > >
>> >> >>      > > > > a single keys instead of two, e.g. "kafka-0.11",
>> >> >>     "kafka-universal"?
>> >> >>      > > > There are several benefit to use separate "version" key I
>> >> can
>> >> >>     see:
>> >> >>      > > > 1) it's more readable to separete them into different
>> keys,
>> >> >>     because
>> >> >>      > they
>> >> >>      > > > are orthogonal concepts.
>> >> >>      > > > 2) the planner can give all the availble versions in the
>> >> >>     exception
>> >> >>      > > message,
>> >> >>      > > > if user uses a wrong version (this is often reported in
>> >> >> user ML).
>> >> >>      > > > 3) If we use "kafka-0.11" as connector identifier, we may
>> >> >> have to
>> >> >>      > write a
>> >> >>      > > > full documentation for each version, because they are
>> >> >> different
>> >> >>      > > > "connector"?
>> >> >>      > > >     IMO, for 0.11, 0.11, etc... kafka, they are actually
>> >> >> the same
>> >> >>      > > connector
>> >> >>      > > > but with different "client jar" version,
>> >> >>      > > >     they share all the same supported property keys and
>> >> >>     should reside
>> >> >>      > > > together.
>> >> >>      > > > 4) IMO, the future vision is version-free. At some point
>> in
>> >> >>     the future,
>> >> >>      > > we
>> >> >>      > > > may don't need users to specify kafka version anymore,
>> and
>> >> >> make
>> >> >>      > > > "version=universal" as optional or removed in the future.
>> >> >>     This is can
>> >> >>      > be
>> >> >>      > > > done easily if they are separate keys.
>> >> >>      > > >
>> >> >>      > > > Best,
>> >> >>      > > > Jark
>> >> >>      > > >
>> >> >>      > > >
>> >> >>      > > > On Mon, 30 Mar 2020 at 14:45, Kurt Young <
>> [hidden email]
>> >> >>     <mailto:[hidden email]>> wrote:
>> >> >>      > > >
>> >> >>      > > > > Hi Jark,
>> >> >>      > > > >
>> >> >>      > > > > Thanks for the proposal, I'm +1 to the general idea.
>> >> >>     However I have a
>> >> >>      > > > > question about "version",
>> >> >>      > > > > in the old design, the version seems to be aimed for
>> >> >> tracking
>> >> >>      > property
>> >> >>      > > > > version, with different
>> >> >>      > > > > version, we could evolve these step by step without
>> >> >>     breaking backward
>> >> >>      > > > > compatibility. But in this
>> >> >>      > > > > design, version is representing external system's
>> >> >> version, like
>> >> >>      > "0.11"
>> >> >>      > > > for
>> >> >>      > > > > kafka, "6" or "7" for ES.
>> >> >>      > > > > I'm not sure if this is necessary, what's the benefit
>> of
>> >> >>     using two
>> >> >>      > keys
>> >> >>      > > > > instead of one, like "kafka-0.11"
>> >> >>      > > > > or "ES6" directly? And how about the old capability
>> which
>> >> >>     could let
>> >> >>      > us
>> >> >>      > > > > evolving connector properties?
>> >> >>      > > > >
>> >> >>      > > > > Best,
>> >> >>      > > > > Kurt
>> >> >>      > > > >
>> >> >>      > > > >
>> >> >>      > > > > On Mon, Mar 30, 2020 at 2:36 PM LakeShen
>> >> >>     <[hidden email] <mailto:[hidden email]>>
>> >> >>      > > > > wrote:
>> >> >>      > > > >
>> >> >>      > > > > > Hi Jark,
>> >> >>      > > > > >
>> >> >>      > > > > > I am really looking forward to this feature. I think
>> >> this
>> >> >>     feature
>> >> >>      > > > > > could simplify flink sql code,and at the same time ,
>> >> >>      > > > > > it could make the developer more easlier to config
>> the
>> >> >>     flink sql
>> >> >>      > WITH
>> >> >>      > > > > > options.
>> >> >>      > > > > >
>> >> >>      > > > > > Now when I am using flink sql to write flink task ,
>> >> >>     sometimes I
>> >> >>      > think
>> >> >>      > > > the
>> >> >>      > > > > > WITH options is too long for user.
>> >> >>      > > > > > For example,I config the kafka source connector
>> >> >>     parameter,for
>> >> >>      > > consumer
>> >> >>      > > > > > group and brokers parameter:
>> >> >>      > > > > >
>> >> >>      > > > > >           'connector.properties.0.key' = 'group.id
>> >> >>     <http://group.id>'
>> >> >>      > > > > > >          , 'connector.properties.0.value' = 'xxx'
>> >> >>      > > > > > >          , 'connector.properties.1.key' =
>> >> >>     'bootstrap.servers'
>> >> >>      > > > > > >          , 'connector.properties.1.value' = 'xxxxx'
>> >> >>      > > > > > >
>> >> >>      > > > > >
>> >> >>      > > > > > I can understand this config , but for the flink
>> fresh
>> >> >>     man,maybe it
>> >> >>      > > > > > is confused for him.
>> >> >>      > > > > > In my thought, I am really looking forward to this
>> >> >>     feature,thank
>> >> >>      > you
>> >> >>      > > to
>> >> >>      > > > > > propose this feature.
>> >> >>      > > > > >
>> >> >>      > > > > > Best wishes,
>> >> >>      > > > > > LakeShen
>> >> >>      > > > > >
>> >> >>      > > > > >
>> >> >>      > > > > > Jark Wu <[hidden email] <mailto:[hidden email]>>
>> 于
>> >> >>     2020年3月30日周一 下午2:02写道:
>> >> >>      > > > > >
>> >> >>      > > > > > > Hi everyone,
>> >> >>      > > > > > >
>> >> >>      > > > > > > I want to start a discussion about further improve
>> and
>> >> >>     simplify
>> >> >>      > our
>> >> >>      > > > > > current
>> >> >>      > > > > > > connector porperty keys, aka WITH options.
>> Currently,
>> >> >>     we have a
>> >> >>      > > > > > > 'connector.' prefix for many properties, but they
>> are
>> >> >>     verbose,
>> >> >>      > and
>> >> >>      > > we
>> >> >>      > > > > > see a
>> >> >>      > > > > > > big inconsistency between the properties when
>> >> designing
>> >> >>     FLIP-107.
>> >> >>      > > > > > >
>> >> >>      > > > > > > So we propose to remove all the 'connector.' prefix
>> >> and
>> >> >>     rename
>> >> >>      > > > > > > 'connector.type' to 'connector', 'format.type' to
>> >> >>     'format'. So a
>> >> >>      > > new
>> >> >>      > > > > > Kafka
>> >> >>      > > > > > > DDL may look like this:
>> >> >>      > > > > > >
>> >> >>      > > > > > > CREATE TABLE kafka_table (
>> >> >>      > > > > > >  ...
>> >> >>      > > > > > > ) WITH (
>> >> >>      > > > > > >  'connector' = 'kafka',
>> >> >>      > > > > > >  'version' = '0.10',
>> >> >>      > > > > > >  'topic' = 'test-topic',
>> >> >>      > > > > > >  'startup-mode' = 'earliest-offset',
>> >> >>      > > > > > >  'properties.bootstrap.servers' = 'localhost:9092',
>> >> >>      > > > > > >  'properties.group.id <http://properties.group.id>'
>> =
>> >> >>     'testGroup',
>> >> >>      > > > > > >  'format' = 'json',
>> >> >>      > > > > > >  'format.fail-on-missing-field' = 'false'
>> >> >>      > > > > > > );
>> >> >>      > > > > > >
>> >> >>      > > > > > > The new connector property key set will come
>> together
>> >> >>     with new
>> >> >>      > > > Factory
>> >> >>      > > > > > > inferface which is proposed in FLIP-95. Old
>> properties
>> >> >>     are still
>> >> >>      > > > > > compatible
>> >> >>      > > > > > > with their existing implementation. New properties
>> >> >> are only
>> >> >>      > > available
>> >> >>      > > > > in
>> >> >>      > > > > > > new DynamicTableFactory implementations.
>> >> >>      > > > > > >
>> >> >>      > > > > > > You can access the detailed FLIP here:
>> >> >>      > > > > > >
>> >> >>      > > > > > >
>> >> >>      > > > > >
>> >> >>      > > > >
>> >> >>      > > >
>> >> >>      > >
>> >> >>      >
>> >> >>
>> >> >>
>> >>
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-122%3A+New+Connector+Property+Keys+for+New+Factory
>> >> >>      > > > > > >
>> >> >>      > > > > > > Best,
>> >> >>      > > > > > > Jark
>> >> >>      > > > > > >
>> >> >>      > > > > >
>> >> >>      > > > >
>> >> >>      > > >
>> >> >>      > >
>> >> >>      > >
>> >> >>      > > --
>> >> >>      > > Best, Jingsong Lee
>> >> >>      > >
>> >> >>      >
>> >> >>
>> >> >>
>> >> >>
>> >> >> --
>> >> >>
>> >> >> Benchao Li
>> >> >> School of Electronics Engineering and Computer Science, Peking
>> >> >> University
>> >> >> Tel:+86-15650713730
>> >> >> Email:[hidden email]
>> >> >> <mailto:[hidden email]>;[hidden email]
>> >> >> <mailto:[hidden email]>
>> >> >>
>> >> >
>> >>
>> >
>>
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-122: New Connector Property Keys for New Factory

Danny Chan
In reply to this post by Jark Wu-2
Thanks Jark for bring up this discussion, +1 for this idea, I believe the user has suffered from the verbose property key for long time.

Just one question, how do we handle the keys with wildcard, such as the “connector.properties.*” in Kafka connector which would then hand-over to Kafka client directly. As what suggested in FLIP-95, we use a ConfigOption to describe the “supported properties”, then I have to concerns:

• For the new keys, do we still need to put multi-lines there the such key, such as “connector.properties.abc” “connector.properties.def”, or should we inline them, such as “some-key-prefix” = “k1=v1, k2=v2 ..."
• Should the ConfigOption support the wildcard ? (If we plan to support the current multi-line style)


Best,
Danny Chan
在 2020年3月31日 +0800 AM12:37,Jark Wu <[hidden email]>,写道:

> Hi all,
>
> Thanks for the feedbacks.
>
> It seems that we have a conclusion to put the version into the factory
> identifier. I'm also fine with this.
> If we have this outcome, the interface of Factory#factoryVersion is not
> needed anymore, this can simplify the learning cost of new factory.
> We may need to update FLIP-95 and re-vote for it? cc @Timo Walther
> <[hidden email]>
>
> kafka => kafka for 0.11+ versions, we don't suffix "-universal", because
> the meaning of "universal" not easy to understand.
> kafka-0.11 => kafka for 0.11 version
> kafka-0.10 => kafka for 0.10 version
> elasticsearch-6 => elasticsearch for 6.x versions
> elasticsearch-7 => elasticsearch for 7.x versions
> hbase-1.4 => hbase for 1.4.x versions
> jdbc
> filesystem
>
> We use "-" as the version delimiter to make them to be more consistent.
> This is not forces, users can also use other delimiters or without
> delimiter.
> But this can be a guilde in the Javadoc of Factory, to make the connector
> ecosystem to be more consistent.
>
> What do you think?
>
> ------------------------
>
> Regarding "connector.property-version":
>
> Hi @Dawid Wysakowicz <[hidden email]> , the new fatories are
> designed not support to read current properties.
> All the current properties are routed to the old factories if they are
> using "connector.type". Otherwise, properties are routed to new factories.
>
> If I understand correctly, the "connector.property-version" is attched
> implicitly by system, not manually set by users.
> For example, the framework should add "connector.property-version=1" to
> properties when processing DDL statement.
> I'm fine to add a "connector.property-version=1" when processing DDL
> statement, but I think it's also fine if we don't,
> because this can be done in the future if need and the default version can
> be 1.
>
> Best,
> Jark
>
> On Tue, 31 Mar 2020 at 00:36, Jark Wu <[hidden email]> wrote:
>
> > Hi all,
> >
> > Thanks for the feedbacks.
> >
> > It seems that we have a conclusion to put the version into the factory
> > identifier. I'm also fine with this.
> > If we have this outcome, the interface of Factory#factoryVersion is not
> > needed anymore, this can simplify the learning cost of new factory.
> > We may need to update FLIP-95 and re-vote for it? cc @Timo Walther
> > <[hidden email]>
> >
> > Btw, I would like to use "_" instead of "-" as the version delimiter,
> > because "-" looks like minus and may confuse users, e.g. "elasticsearch-6".
> > This is not forced, but should be a guilde in the Javadoc of Factory.
> > I propose to use the following identifiers for existing connectors,
> >
> > kafka => kafka for 0.11+ versions, we don't suffix "-universal", because
> > the meaning of "universal" not easy to understand.
> > kafka-0.11 => kafka for 0.11 version
> > kafka-0.10 => kafka for 0.10 version
> > elasticsearch-6 => elasticsearch for 6.x versions
> > elasticsearch-7 => elasticsearch for 7.x versions
> > hbase-1.4 => hbase for 1.4.x versions
> > jdbc
> > filesystem
> >
> > We use "-" as the version delimiter to make them to be more consistent.
> > This is not forces, users can also use other delimiters or without
> > delimiter.
> > But this can be a guilde in the Javadoc of Factory, to make the connector
> > ecosystem to be more consistent.
> >
> > What do you think?
> >
> > ------------------------
> >
> > Regarding "connector.property-version":
> >
> > Hi @Dawid Wysakowicz <[hidden email]> , the new fatories are
> > designed not support to read current properties.
> > All the current properties are routed to the old factories if they are
> > using "connector.type". Otherwise, properties are routed to new factories.
> >
> > If I understand correctly, the "connector.property-version" is attched
> > implicitly by system, not manually set by users.
> > For example, the framework should add "connector.property-version=1" to
> > properties when processing DDL statement.
> > I'm fine to add a "connector.property-version=1" when processing DDL
> > statement, but I think it's also fine if we don't,
> > because this can be done in the future if need and the default version can
> > be 1.
> >
> > Best,
> > Jark
> >
> >
> >
> >
> >
> > On Mon, 30 Mar 2020 at 23:21, Dawid Wysakowicz <[hidden email]>
> > wrote:
> >
> > > Hi all,
> > >
> > > I like the overall design of the FLIP.
> > >
> > > As for the withstanding concerns. I kind of like the approach to put the
> > > version into the factory identifier. I think it's the cleanest way to
> > > say that this version actually applies to the connector itself and not
> > > to the system it connects to. BTW, I think the outcome of this
> > > discussion will affect interfaces described in FLIP-95. If we put the
> > > version into the functionIdentifier, do we need Factory#factoryVersion?
> > >
> > > I also think it does make sense to have a versioning for the properties
> > > as well. Are we able to read all the current properties with the new
> > > factories? I think we could use the "connector.property-version" to
> > > alternate between different Factory interfaces to support the old set of
> > > properties. Otherwise the new factories need to understand both set of
> > > properties, don't they?
> > >
> > > Best,
> > >
> > > Dawid
> > >
> > > On 30/03/2020 17:07, Timo Walther wrote:
> > > > Hi Jark,
> > > >
> > > > thanks for the FLIP. I don't have a strong opinion on
> > > > DynamicTableFactory#factoryVersion() for distiguishing connector
> > > > versions. We can also include it in the factory identifier itself. For
> > > > Kafka, I would then just use `kafka` because the naming "universal"
> > > > was just a helper solution at that time.
> > > >
> > > > What we need is a good guide for how to design the options. We should
> > > > include that in the JavaDoc of factory interface itself, once it is
> > > > in-place. I like the difference between source and sink in properties.
> > > >
> > > > Regarding "connector.property-version":
> > > >
> > > > It was meant for backwards compatibility along the dimension of
> > > > "property design" compared to "connector version". However, since the
> > > > connector is now responsible for parsing all properties, it is not as
> > > > necessary as it was before.
> > > >
> > > > However, a change of the property design as it is done in FLIP-107
> > > > becomes more difficult without a property version and versioning is
> > > > always a good idea in API world.
> > > >
> > > > Regards,
> > > > Timo
> > > >
> > > >
> > > > On 30.03.20 16:38, Benchao Li wrote:
> > > > > Hi Jark,
> > > > >
> > > > > Thanks for starting the discussion. The FLIP LGTM generally.
> > > > >
> > > > > Regarding connector version VS put version into connector's name,
> > > > > I favor the latter personally, using one property to locate a
> > > > > connector can make the error log more precise.
> > > > > From the users' side, using one property to match a connector will
> > > > > be easier. Especially we have many connectors,
> > > > > and some of the need version property required, and some of them not.
> > > > >
> > > > > Regarding Jingsong's suggestion,
> > > > > IMO, it's a very good complement to the FLIP. Distinguishing
> > > > > properties for source and sink can be very useful, and
> > > > > also this will make the connector property more precise.
> > > > > We are also sick of this for now, we cannot know whether a DDL is a
> > > > > source or sink unless we look through all queries where
> > > > > the table is used.
> > > > > Even more, some of the required properties are only required for
> > > > > source, bug we cannot leave it blank for sink, and vice versa.
> > > > > I think we can also add a type for dimension tables except source and
> > > > > sink.
> > > > >
> > > > > Kurt Young <[hidden email] <mailto:[hidden email]>> 于2020年3月30日
> > > > > 周一 下午8:16写道:
> > > > >
> > > > > > It's not possible for the framework to throw such exception.
> > > > > Because the
> > > > > framework doesn't know what versions do the connector support.
> > > > >
> > > > > Not really, we can list all valid connectors framework could
> > > > > found. E.g.
> > > > > user mistyped 'kafka-0.x', the error message will looks like:
> > > > >
> > > > > we don't have any connector named "kafka-0.x", but we have:
> > > > > FileSystem
> > > > > Kafka-0.10
> > > > > Kafka-0.11
> > > > > ElasticSearch6
> > > > > ElasticSearch7
> > > > >
> > > > > Best,
> > > > > Kurt
> > > > >
> > > > >
> > > > > On Mon, Mar 30, 2020 at 5:11 PM Jark Wu <[hidden email]
> > > > > <mailto:[hidden email]>> wrote:
> > > > >
> > > > > > Hi Kurt,
> > > > > >
> > > > > > > 2) Lists all available connectors seems also quite
> > > > > straightforward, e.g
> > > > > > user provided a wrong "kafka-0.8", we tell user we have
> > > > > candidates of
> > > > > > "kakfa-0.11", "kafka-universal"
> > > > > > It's not possible for the framework to throw such exception.
> > > > > Because the
> > > > > > framework doesn't know what versions do the connector support.
> > > > > All the
> > > > > > version information is a blackbox in the identifier. But with
> > > > > > `Factory#factoryVersion()` interface, we can know all the
> > > > > supported
> > > > > > versions.
> > > > > >
> > > > > > > 3) I don't think so. We can still treat it as the same
> > > > > connector but with
> > > > > > different versions.
> > > > > > That's true but that's weird. Because from the plain DDL
> > > > > definition, they
> > > > > > look like different connectors with different "connector"
> > > > > value, e.g.
> > > > > > 'connector=kafka-0.8', 'connector=kafka-0.10'.
> > > > > >
> > > > > > > If users don't set any version, we will use "kafka-universal"
> > > > > instead.
> > > > > > The behavior is inconsistent IMO.
> > > > > > That is a long term vision when there is no kafka clusters
> > > > > with <0.11
> > > > > > version.
> > > > > > At that point, "universal" is the only supported version in
> > > Flink
> > > > > and the
> > > > > > "version" key can be optional.
> > > > > >
> > > > > > ---------------------------------------------
> > > > > >
> > > > > > Hi Jingsong,
> > > > > >
> > > > > > > "version" vs "kafka.version"
> > > > > > I though about it. But if we prefix "kafka" to version, we
> > > should
> > > > > prefix
> > > > > > "kafka" for all other property keys, because they are all kafka
> > > > > specific
> > > > > > options.
> > > > > > However, that will make the property set verbose, see rejected
> > > > > option#2 in
> > > > > > the FLIP.
> > > > > >
> > > > > > > explicitly separate options for source and sink
> > > > > > That's a good topic. It's good to have a guideline for the new
> > > > > property
> > > > > > keys.
> > > > > > I'm fine to prefix with a "source"/"sink" for some connector
> > > > > keys.
> > > > > > Actually, we already do this in some connectors, e.g. jdbc and
> > > > > hbase.
> > > > > >
> > > > > > Best,
> > > > > > Jark
> > > > > >
> > > > > > On Mon, 30 Mar 2020 at 16:36, Jingsong Li <
> > > [hidden email]
> > > > > <mailto:[hidden email]>> wrote:
> > > > > >
> > > > > > > Thanks Jark for the proposal.
> > > > > > >
> > > > > > > +1 to the general idea.
> > > > > > >
> > > > > > > For "version", what about "kafka.version"? It is obvious to
> > > > > know its
> > > > > > > meaning.
> > > > > > >
> > > > > > > And I'd like to start a new topic:
> > > > > > > Should we need to explicitly separate source from sink?
> > > > > > > With the development of batch and streaming, more and more
> > > > > connectors
> > > > > > have
> > > > > > > both source and sink.
> > > > > > >
> > > > > > > So should we set a rule for table properties:
> > > > > > > - properties for both source and sink: without prefix, like
> > > > > "topic"
> > > > > > > - properties for source only: with "source." prefix, like
> > > > > > > "source.startup-mode"
> > > > > > > - properties for sink only: with "sink." prefix, like
> > > > > "sink.partitioner"
> > > > > > >
> > > > > > > What do you think?
> > > > > > >
> > > > > > > Best,
> > > > > > > Jingsong Lee
> > > > > > >
> > > > > > > On Mon, Mar 30, 2020 at 3:56 PM Jark Wu <[hidden email]
> > > > > <mailto:[hidden email]>> wrote:
> > > > > > >
> > > > > > > > Hi Kurt,
> > > > > > > >
> > > > > > > > That's good questions.
> > > > > > > >
> > > > > > > > > the meaning of "version"
> > > > > > > > There are two versions in the old design. One is property
> > > > > version
> > > > > > > > "connector.property-version" which can be used for backward
> > > > > > > compatibility.
> > > > > > > > The other one is "connector.version" which defines the
> > > > > version of
> > > > > > > external
> > > > > > > > system, e.g. 0.11" for kafka, "6" or "7" for ES.
> > > > > > > > In this proposal, the "version" is the previous
> > > > > "connector.version".
> > > > > > The
> > > > > > > > ""connector.property-version" is not introduced in new
> > > > > design.
> > > > > > > >
> > > > > > > > > how to keep the old capability which can evolve connector
> > > > > properties
> > > > > > > > The "connector.property-version" is an optional key in the
> > > > > old design
> > > > > > and
> > > > > > > > is never bumped up.
> > > > > > > > I'm not sure how "connector.property-version" should work
> > > > > in the
> > > > > > initial
> > > > > > > > design. Maybe @Timo Walther <[hidden email]
> > > > > <mailto:[hidden email]>> has more knowledge on
> > > > > > > > this.
> > > > > > > > But for the new properties, every options should be
> > > > > expressed as
> > > > > > > > `ConfigOption` which provides `withDeprecatedKeys(...)`
> > > > > method to
> > > > > > easily
> > > > > > > > support evolving keys.
> > > > > > > >
> > > > > > > > > a single keys instead of two, e.g. "kafka-0.11",
> > > > > "kafka-universal"?
> > > > > > > > There are several benefit to use separate "version" key I
> > > can
> > > > > see:
> > > > > > > > 1) it's more readable to separete them into different keys,
> > > > > because
> > > > > > they
> > > > > > > > are orthogonal concepts.
> > > > > > > > 2) the planner can give all the availble versions in the
> > > > > exception
> > > > > > > message,
> > > > > > > > if user uses a wrong version (this is often reported in
> > > > > user ML).
> > > > > > > > 3) If we use "kafka-0.11" as connector identifier, we may
> > > > > have to
> > > > > > write a
> > > > > > > > full documentation for each version, because they are
> > > > > different
> > > > > > > > "connector"?
> > > > > > > > IMO, for 0.11, 0.11, etc... kafka, they are actually
> > > > > the same
> > > > > > > connector
> > > > > > > > but with different "client jar" version,
> > > > > > > > they share all the same supported property keys and
> > > > > should reside
> > > > > > > > together.
> > > > > > > > 4) IMO, the future vision is version-free. At some point in
> > > > > the future,
> > > > > > > we
> > > > > > > > may don't need users to specify kafka version anymore, and
> > > > > make
> > > > > > > > "version=universal" as optional or removed in the future.
> > > > > This is can
> > > > > > be
> > > > > > > > done easily if they are separate keys.
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Jark
> > > > > > > >
> > > > > > > >
> > > > > > > > On Mon, 30 Mar 2020 at 14:45, Kurt Young <[hidden email]
> > > > > <mailto:[hidden email]>> wrote:
> > > > > > > >
> > > > > > > > > Hi Jark,
> > > > > > > > >
> > > > > > > > > Thanks for the proposal, I'm +1 to the general idea.
> > > > > However I have a
> > > > > > > > > question about "version",
> > > > > > > > > in the old design, the version seems to be aimed for
> > > > > tracking
> > > > > > property
> > > > > > > > > version, with different
> > > > > > > > > version, we could evolve these step by step without
> > > > > breaking backward
> > > > > > > > > compatibility. But in this
> > > > > > > > > design, version is representing external system's
> > > > > version, like
> > > > > > "0.11"
> > > > > > > > for
> > > > > > > > > kafka, "6" or "7" for ES.
> > > > > > > > > I'm not sure if this is necessary, what's the benefit of
> > > > > using two
> > > > > > keys
> > > > > > > > > instead of one, like "kafka-0.11"
> > > > > > > > > or "ES6" directly? And how about the old capability which
> > > > > could let
> > > > > > us
> > > > > > > > > evolving connector properties?
> > > > > > > > >
> > > > > > > > > Best,
> > > > > > > > > Kurt
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Mon, Mar 30, 2020 at 2:36 PM LakeShen
> > > > > <[hidden email] <mailto:[hidden email]>>
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Hi Jark,
> > > > > > > > > >
> > > > > > > > > > I am really looking forward to this feature. I think
> > > this
> > > > > feature
> > > > > > > > > > could simplify flink sql code,and at the same time ,
> > > > > > > > > > it could make the developer more easlier to config the
> > > > > flink sql
> > > > > > WITH
> > > > > > > > > > options.
> > > > > > > > > >
> > > > > > > > > > Now when I am using flink sql to write flink task ,
> > > > > sometimes I
> > > > > > think
> > > > > > > > the
> > > > > > > > > > WITH options is too long for user.
> > > > > > > > > > For example,I config the kafka source connector
> > > > > parameter,for
> > > > > > > consumer
> > > > > > > > > > group and brokers parameter:
> > > > > > > > > >
> > > > > > > > > > 'connector.properties.0.key' = 'group.id
> > > > > <http://group.id>'
> > > > > > > > > > > , 'connector.properties.0.value' = 'xxx'
> > > > > > > > > > > , 'connector.properties.1.key' =
> > > > > 'bootstrap.servers'
> > > > > > > > > > > , 'connector.properties.1.value' = 'xxxxx'
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > I can understand this config , but for the flink fresh
> > > > > man,maybe it
> > > > > > > > > > is confused for him.
> > > > > > > > > > In my thought, I am really looking forward to this
> > > > > feature,thank
> > > > > > you
> > > > > > > to
> > > > > > > > > > propose this feature.
> > > > > > > > > >
> > > > > > > > > > Best wishes,
> > > > > > > > > > LakeShen
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Jark Wu <[hidden email] <mailto:[hidden email]>> 于
> > > > > 2020年3月30日周一 下午2:02写道:
> > > > > > > > > >
> > > > > > > > > > > Hi everyone,
> > > > > > > > > > >
> > > > > > > > > > > I want to start a discussion about further improve and
> > > > > simplify
> > > > > > our
> > > > > > > > > > current
> > > > > > > > > > > connector porperty keys, aka WITH options. Currently,
> > > > > we have a
> > > > > > > > > > > 'connector.' prefix for many properties, but they are
> > > > > verbose,
> > > > > > and
> > > > > > > we
> > > > > > > > > > see a
> > > > > > > > > > > big inconsistency between the properties when
> > > designing
> > > > > FLIP-107.
> > > > > > > > > > >
> > > > > > > > > > > So we propose to remove all the 'connector.' prefix
> > > and
> > > > > rename
> > > > > > > > > > > 'connector.type' to 'connector', 'format.type' to
> > > > > 'format'. So a
> > > > > > > new
> > > > > > > > > > Kafka
> > > > > > > > > > > DDL may look like this:
> > > > > > > > > > >
> > > > > > > > > > > CREATE TABLE kafka_table (
> > > > > > > > > > > ...
> > > > > > > > > > > ) WITH (
> > > > > > > > > > > 'connector' = 'kafka',
> > > > > > > > > > > 'version' = '0.10',
> > > > > > > > > > > 'topic' = 'test-topic',
> > > > > > > > > > > 'startup-mode' = 'earliest-offset',
> > > > > > > > > > > 'properties.bootstrap.servers' = 'localhost:9092',
> > > > > > > > > > > 'properties.group.id <http://properties.group.id>' =
> > > > > 'testGroup',
> > > > > > > > > > > 'format' = 'json',
> > > > > > > > > > > 'format.fail-on-missing-field' = 'false'
> > > > > > > > > > > );
> > > > > > > > > > >
> > > > > > > > > > > The new connector property key set will come together
> > > > > with new
> > > > > > > > Factory
> > > > > > > > > > > inferface which is proposed in FLIP-95. Old properties
> > > > > are still
> > > > > > > > > > compatible
> > > > > > > > > > > with their existing implementation. New properties
> > > > > are only
> > > > > > > available
> > > > > > > > > in
> > > > > > > > > > > new DynamicTableFactory implementations.
> > > > > > > > > > >
> > > > > > > > > > > You can access the detailed FLIP here:
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-122%3A+New+Connector+Property+Keys+for+New+Factory
> > > > > > > > > > >
> > > > > > > > > > > Best,
> > > > > > > > > > > Jark
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > Best, Jingsong Lee
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > >
> > > > > Benchao Li
> > > > > School of Electronics Engineering and Computer Science, Peking
> > > > > University
> > > > > Tel:+86-15650713730
> > > > > Email:[hidden email]
> > > > > <mailto:[hidden email]>;[hidden email]
> > > > > <mailto:[hidden email]>
> > > > >
> > > >
> > >
> >
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-122: New Connector Property Keys for New Factory

Jark Wu-2
Hi Kurt,

I also prefer "-" as version delimiter now. I didn't remove the "_"
proposal by mistake, that's why I sent another email last night :)
Regarding to "property-version", I also think we shouldn't let users to
learn about this. And ConfigOption provides a good ability
to support deprecated keys and auto-generate documentation for deprecated
keys.

Hi Danny,

Regarding to “connector.properties.*”:
In FLIP-95, the Factory#requiredOptions() and Factory#optionalOptions()
inferfaces are only used for generation of documentation.
It does not influence the discovery and validation of a factory. The
validation logic is defined by connectors
in createDynamicTableSource/Sink().
So you don't have to provide an option for "connector.properties.*". But I
think we should make ConfigOption support wildcard in the long term for a
full story.

I don't think we should inline all the "connector.properties.*", otherwise,
it will be very tricky for users to configure the properties.
Regarding to FLIP-113, I suggest to provide some ConfigOptions for commonly
used kafka properties and put them in the supportedHintOptions(),
e.g. "connector.properties.group.id",
"connector.properties.fetch.min.bytes".

Best,
Jark





On Tue, 31 Mar 2020 at 12:04, Danny Chan <[hidden email]> wrote:

> Thanks Jark for bring up this discussion, +1 for this idea, I believe the
> user has suffered from the verbose property key for long time.
>
> Just one question, how do we handle the keys with wildcard, such as the
> “connector.properties.*” in Kafka connector which would then hand-over to
> Kafka client directly. As what suggested in FLIP-95, we use a ConfigOption
> to describe the “supported properties”, then I have to concerns:
>
> • For the new keys, do we still need to put multi-lines there the such
> key, such as “connector.properties.abc” “connector.properties.def”, or
> should we inline them, such as “some-key-prefix” = “k1=v1, k2=v2 ..."
> • Should the ConfigOption support the wildcard ? (If we plan to support
> the current multi-line style)
>
>
> Best,
> Danny Chan
> 在 2020年3月31日 +0800 AM12:37,Jark Wu <[hidden email]>,写道:
> > Hi all,
> >
> > Thanks for the feedbacks.
> >
> > It seems that we have a conclusion to put the version into the factory
> > identifier. I'm also fine with this.
> > If we have this outcome, the interface of Factory#factoryVersion is not
> > needed anymore, this can simplify the learning cost of new factory.
> > We may need to update FLIP-95 and re-vote for it? cc @Timo Walther
> > <[hidden email]>
> >
> > kafka => kafka for 0.11+ versions, we don't suffix "-universal", because
> > the meaning of "universal" not easy to understand.
> > kafka-0.11 => kafka for 0.11 version
> > kafka-0.10 => kafka for 0.10 version
> > elasticsearch-6 => elasticsearch for 6.x versions
> > elasticsearch-7 => elasticsearch for 7.x versions
> > hbase-1.4 => hbase for 1.4.x versions
> > jdbc
> > filesystem
> >
> > We use "-" as the version delimiter to make them to be more consistent.
> > This is not forces, users can also use other delimiters or without
> > delimiter.
> > But this can be a guilde in the Javadoc of Factory, to make the connector
> > ecosystem to be more consistent.
> >
> > What do you think?
> >
> > ------------------------
> >
> > Regarding "connector.property-version":
> >
> > Hi @Dawid Wysakowicz <[hidden email]> , the new fatories are
> > designed not support to read current properties.
> > All the current properties are routed to the old factories if they are
> > using "connector.type". Otherwise, properties are routed to new
> factories.
> >
> > If I understand correctly, the "connector.property-version" is attched
> > implicitly by system, not manually set by users.
> > For example, the framework should add "connector.property-version=1" to
> > properties when processing DDL statement.
> > I'm fine to add a "connector.property-version=1" when processing DDL
> > statement, but I think it's also fine if we don't,
> > because this can be done in the future if need and the default version
> can
> > be 1.
> >
> > Best,
> > Jark
> >
> > On Tue, 31 Mar 2020 at 00:36, Jark Wu <[hidden email]> wrote:
> >
> > > Hi all,
> > >
> > > Thanks for the feedbacks.
> > >
> > > It seems that we have a conclusion to put the version into the factory
> > > identifier. I'm also fine with this.
> > > If we have this outcome, the interface of Factory#factoryVersion is not
> > > needed anymore, this can simplify the learning cost of new factory.
> > > We may need to update FLIP-95 and re-vote for it? cc @Timo Walther
> > > <[hidden email]>
> > >
> > > Btw, I would like to use "_" instead of "-" as the version delimiter,
> > > because "-" looks like minus and may confuse users, e.g.
> "elasticsearch-6".
> > > This is not forced, but should be a guilde in the Javadoc of Factory.
> > > I propose to use the following identifiers for existing connectors,
> > >
> > > kafka => kafka for 0.11+ versions, we don't suffix "-universal",
> because
> > > the meaning of "universal" not easy to understand.
> > > kafka-0.11 => kafka for 0.11 version
> > > kafka-0.10 => kafka for 0.10 version
> > > elasticsearch-6 => elasticsearch for 6.x versions
> > > elasticsearch-7 => elasticsearch for 7.x versions
> > > hbase-1.4 => hbase for 1.4.x versions
> > > jdbc
> > > filesystem
> > >
> > > We use "-" as the version delimiter to make them to be more consistent.
> > > This is not forces, users can also use other delimiters or without
> > > delimiter.
> > > But this can be a guilde in the Javadoc of Factory, to make the
> connector
> > > ecosystem to be more consistent.
> > >
> > > What do you think?
> > >
> > > ------------------------
> > >
> > > Regarding "connector.property-version":
> > >
> > > Hi @Dawid Wysakowicz <[hidden email]> , the new fatories are
> > > designed not support to read current properties.
> > > All the current properties are routed to the old factories if they are
> > > using "connector.type". Otherwise, properties are routed to new
> factories.
> > >
> > > If I understand correctly, the "connector.property-version" is attched
> > > implicitly by system, not manually set by users.
> > > For example, the framework should add "connector.property-version=1" to
> > > properties when processing DDL statement.
> > > I'm fine to add a "connector.property-version=1" when processing DDL
> > > statement, but I think it's also fine if we don't,
> > > because this can be done in the future if need and the default version
> can
> > > be 1.
> > >
> > > Best,
> > > Jark
> > >
> > >
> > >
> > >
> > >
> > > On Mon, 30 Mar 2020 at 23:21, Dawid Wysakowicz <[hidden email]
> >
> > > wrote:
> > >
> > > > Hi all,
> > > >
> > > > I like the overall design of the FLIP.
> > > >
> > > > As for the withstanding concerns. I kind of like the approach to put
> the
> > > > version into the factory identifier. I think it's the cleanest way to
> > > > say that this version actually applies to the connector itself and
> not
> > > > to the system it connects to. BTW, I think the outcome of this
> > > > discussion will affect interfaces described in FLIP-95. If we put the
> > > > version into the functionIdentifier, do we need
> Factory#factoryVersion?
> > > >
> > > > I also think it does make sense to have a versioning for the
> properties
> > > > as well. Are we able to read all the current properties with the new
> > > > factories? I think we could use the "connector.property-version" to
> > > > alternate between different Factory interfaces to support the old
> set of
> > > > properties. Otherwise the new factories need to understand both set
> of
> > > > properties, don't they?
> > > >
> > > > Best,
> > > >
> > > > Dawid
> > > >
> > > > On 30/03/2020 17:07, Timo Walther wrote:
> > > > > Hi Jark,
> > > > >
> > > > > thanks for the FLIP. I don't have a strong opinion on
> > > > > DynamicTableFactory#factoryVersion() for distiguishing connector
> > > > > versions. We can also include it in the factory identifier itself.
> For
> > > > > Kafka, I would then just use `kafka` because the naming "universal"
> > > > > was just a helper solution at that time.
> > > > >
> > > > > What we need is a good guide for how to design the options. We
> should
> > > > > include that in the JavaDoc of factory interface itself, once it is
> > > > > in-place. I like the difference between source and sink in
> properties.
> > > > >
> > > > > Regarding "connector.property-version":
> > > > >
> > > > > It was meant for backwards compatibility along the dimension of
> > > > > "property design" compared to "connector version". However, since
> the
> > > > > connector is now responsible for parsing all properties, it is not
> as
> > > > > necessary as it was before.
> > > > >
> > > > > However, a change of the property design as it is done in FLIP-107
> > > > > becomes more difficult without a property version and versioning is
> > > > > always a good idea in API world.
> > > > >
> > > > > Regards,
> > > > > Timo
> > > > >
> > > > >
> > > > > On 30.03.20 16:38, Benchao Li wrote:
> > > > > > Hi Jark,
> > > > > >
> > > > > > Thanks for starting the discussion. The FLIP LGTM generally.
> > > > > >
> > > > > > Regarding connector version VS put version into connector's name,
> > > > > > I favor the latter personally, using one property to locate a
> > > > > > connector can make the error log more precise.
> > > > > > From the users' side, using one property to match a connector
> will
> > > > > > be easier. Especially we have many connectors,
> > > > > > and some of the need version property required, and some of them
> not.
> > > > > >
> > > > > > Regarding Jingsong's suggestion,
> > > > > > IMO, it's a very good complement to the FLIP. Distinguishing
> > > > > > properties for source and sink can be very useful, and
> > > > > > also this will make the connector property more precise.
> > > > > > We are also sick of this for now, we cannot know whether a DDL
> is a
> > > > > > source or sink unless we look through all queries where
> > > > > > the table is used.
> > > > > > Even more, some of the required properties are only required for
> > > > > > source, bug we cannot leave it blank for sink, and vice versa.
> > > > > > I think we can also add a type for dimension tables except
> source and
> > > > > > sink.
> > > > > >
> > > > > > Kurt Young <[hidden email] <mailto:[hidden email]>>
> 于2020年3月30日
> > > > > > 周一 下午8:16写道:
> > > > > >
> > > > > > > It's not possible for the framework to throw such exception.
> > > > > > Because the
> > > > > > framework doesn't know what versions do the connector support.
> > > > > >
> > > > > > Not really, we can list all valid connectors framework could
> > > > > > found. E.g.
> > > > > > user mistyped 'kafka-0.x', the error message will looks like:
> > > > > >
> > > > > > we don't have any connector named "kafka-0.x", but we have:
> > > > > > FileSystem
> > > > > > Kafka-0.10
> > > > > > Kafka-0.11
> > > > > > ElasticSearch6
> > > > > > ElasticSearch7
> > > > > >
> > > > > > Best,
> > > > > > Kurt
> > > > > >
> > > > > >
> > > > > > On Mon, Mar 30, 2020 at 5:11 PM Jark Wu <[hidden email]
> > > > > > <mailto:[hidden email]>> wrote:
> > > > > >
> > > > > > > Hi Kurt,
> > > > > > >
> > > > > > > > 2) Lists all available connectors seems also quite
> > > > > > straightforward, e.g
> > > > > > > user provided a wrong "kafka-0.8", we tell user we have
> > > > > > candidates of
> > > > > > > "kakfa-0.11", "kafka-universal"
> > > > > > > It's not possible for the framework to throw such exception.
> > > > > > Because the
> > > > > > > framework doesn't know what versions do the connector support.
> > > > > > All the
> > > > > > > version information is a blackbox in the identifier. But with
> > > > > > > `Factory#factoryVersion()` interface, we can know all the
> > > > > > supported
> > > > > > > versions.
> > > > > > >
> > > > > > > > 3) I don't think so. We can still treat it as the same
> > > > > > connector but with
> > > > > > > different versions.
> > > > > > > That's true but that's weird. Because from the plain DDL
> > > > > > definition, they
> > > > > > > look like different connectors with different "connector"
> > > > > > value, e.g.
> > > > > > > 'connector=kafka-0.8', 'connector=kafka-0.10'.
> > > > > > >
> > > > > > > > If users don't set any version, we will use "kafka-universal"
> > > > > > instead.
> > > > > > > The behavior is inconsistent IMO.
> > > > > > > That is a long term vision when there is no kafka clusters
> > > > > > with <0.11
> > > > > > > version.
> > > > > > > At that point, "universal" is the only supported version in
> > > > Flink
> > > > > > and the
> > > > > > > "version" key can be optional.
> > > > > > >
> > > > > > > ---------------------------------------------
> > > > > > >
> > > > > > > Hi Jingsong,
> > > > > > >
> > > > > > > > "version" vs "kafka.version"
> > > > > > > I though about it. But if we prefix "kafka" to version, we
> > > > should
> > > > > > prefix
> > > > > > > "kafka" for all other property keys, because they are all kafka
> > > > > > specific
> > > > > > > options.
> > > > > > > However, that will make the property set verbose, see rejected
> > > > > > option#2 in
> > > > > > > the FLIP.
> > > > > > >
> > > > > > > > explicitly separate options for source and sink
> > > > > > > That's a good topic. It's good to have a guideline for the new
> > > > > > property
> > > > > > > keys.
> > > > > > > I'm fine to prefix with a "source"/"sink" for some connector
> > > > > > keys.
> > > > > > > Actually, we already do this in some connectors, e.g. jdbc and
> > > > > > hbase.
> > > > > > >
> > > > > > > Best,
> > > > > > > Jark
> > > > > > >
> > > > > > > On Mon, 30 Mar 2020 at 16:36, Jingsong Li <
> > > > [hidden email]
> > > > > > <mailto:[hidden email]>> wrote:
> > > > > > >
> > > > > > > > Thanks Jark for the proposal.
> > > > > > > >
> > > > > > > > +1 to the general idea.
> > > > > > > >
> > > > > > > > For "version", what about "kafka.version"? It is obvious to
> > > > > > know its
> > > > > > > > meaning.
> > > > > > > >
> > > > > > > > And I'd like to start a new topic:
> > > > > > > > Should we need to explicitly separate source from sink?
> > > > > > > > With the development of batch and streaming, more and more
> > > > > > connectors
> > > > > > > have
> > > > > > > > both source and sink.
> > > > > > > >
> > > > > > > > So should we set a rule for table properties:
> > > > > > > > - properties for both source and sink: without prefix, like
> > > > > > "topic"
> > > > > > > > - properties for source only: with "source." prefix, like
> > > > > > > > "source.startup-mode"
> > > > > > > > - properties for sink only: with "sink." prefix, like
> > > > > > "sink.partitioner"
> > > > > > > >
> > > > > > > > What do you think?
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Jingsong Lee
> > > > > > > >
> > > > > > > > On Mon, Mar 30, 2020 at 3:56 PM Jark Wu <[hidden email]
> > > > > > <mailto:[hidden email]>> wrote:
> > > > > > > >
> > > > > > > > > Hi Kurt,
> > > > > > > > >
> > > > > > > > > That's good questions.
> > > > > > > > >
> > > > > > > > > > the meaning of "version"
> > > > > > > > > There are two versions in the old design. One is property
> > > > > > version
> > > > > > > > > "connector.property-version" which can be used for backward
> > > > > > > > compatibility.
> > > > > > > > > The other one is "connector.version" which defines the
> > > > > > version of
> > > > > > > > external
> > > > > > > > > system, e.g. 0.11" for kafka, "6" or "7" for ES.
> > > > > > > > > In this proposal, the "version" is the previous
> > > > > > "connector.version".
> > > > > > > The
> > > > > > > > > ""connector.property-version" is not introduced in new
> > > > > > design.
> > > > > > > > >
> > > > > > > > > > how to keep the old capability which can evolve connector
> > > > > > properties
> > > > > > > > > The "connector.property-version" is an optional key in the
> > > > > > old design
> > > > > > > and
> > > > > > > > > is never bumped up.
> > > > > > > > > I'm not sure how "connector.property-version" should work
> > > > > > in the
> > > > > > > initial
> > > > > > > > > design. Maybe @Timo Walther <[hidden email]
> > > > > > <mailto:[hidden email]>> has more knowledge on
> > > > > > > > > this.
> > > > > > > > > But for the new properties, every options should be
> > > > > > expressed as
> > > > > > > > > `ConfigOption` which provides `withDeprecatedKeys(...)`
> > > > > > method to
> > > > > > > easily
> > > > > > > > > support evolving keys.
> > > > > > > > >
> > > > > > > > > > a single keys instead of two, e.g. "kafka-0.11",
> > > > > > "kafka-universal"?
> > > > > > > > > There are several benefit to use separate "version" key I
> > > > can
> > > > > > see:
> > > > > > > > > 1) it's more readable to separete them into different keys,
> > > > > > because
> > > > > > > they
> > > > > > > > > are orthogonal concepts.
> > > > > > > > > 2) the planner can give all the availble versions in the
> > > > > > exception
> > > > > > > > message,
> > > > > > > > > if user uses a wrong version (this is often reported in
> > > > > > user ML).
> > > > > > > > > 3) If we use "kafka-0.11" as connector identifier, we may
> > > > > > have to
> > > > > > > write a
> > > > > > > > > full documentation for each version, because they are
> > > > > > different
> > > > > > > > > "connector"?
> > > > > > > > > IMO, for 0.11, 0.11, etc... kafka, they are actually
> > > > > > the same
> > > > > > > > connector
> > > > > > > > > but with different "client jar" version,
> > > > > > > > > they share all the same supported property keys and
> > > > > > should reside
> > > > > > > > > together.
> > > > > > > > > 4) IMO, the future vision is version-free. At some point in
> > > > > > the future,
> > > > > > > > we
> > > > > > > > > may don't need users to specify kafka version anymore, and
> > > > > > make
> > > > > > > > > "version=universal" as optional or removed in the future.
> > > > > > This is can
> > > > > > > be
> > > > > > > > > done easily if they are separate keys.
> > > > > > > > >
> > > > > > > > > Best,
> > > > > > > > > Jark
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Mon, 30 Mar 2020 at 14:45, Kurt Young <[hidden email]
> > > > > > <mailto:[hidden email]>> wrote:
> > > > > > > > >
> > > > > > > > > > Hi Jark,
> > > > > > > > > >
> > > > > > > > > > Thanks for the proposal, I'm +1 to the general idea.
> > > > > > However I have a
> > > > > > > > > > question about "version",
> > > > > > > > > > in the old design, the version seems to be aimed for
> > > > > > tracking
> > > > > > > property
> > > > > > > > > > version, with different
> > > > > > > > > > version, we could evolve these step by step without
> > > > > > breaking backward
> > > > > > > > > > compatibility. But in this
> > > > > > > > > > design, version is representing external system's
> > > > > > version, like
> > > > > > > "0.11"
> > > > > > > > > for
> > > > > > > > > > kafka, "6" or "7" for ES.
> > > > > > > > > > I'm not sure if this is necessary, what's the benefit of
> > > > > > using two
> > > > > > > keys
> > > > > > > > > > instead of one, like "kafka-0.11"
> > > > > > > > > > or "ES6" directly? And how about the old capability which
> > > > > > could let
> > > > > > > us
> > > > > > > > > > evolving connector properties?
> > > > > > > > > >
> > > > > > > > > > Best,
> > > > > > > > > > Kurt
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Mon, Mar 30, 2020 at 2:36 PM LakeShen
> > > > > > <[hidden email] <mailto:[hidden email]>>
> > > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi Jark,
> > > > > > > > > > >
> > > > > > > > > > > I am really looking forward to this feature. I think
> > > > this
> > > > > > feature
> > > > > > > > > > > could simplify flink sql code,and at the same time ,
> > > > > > > > > > > it could make the developer more easlier to config the
> > > > > > flink sql
> > > > > > > WITH
> > > > > > > > > > > options.
> > > > > > > > > > >
> > > > > > > > > > > Now when I am using flink sql to write flink task ,
> > > > > > sometimes I
> > > > > > > think
> > > > > > > > > the
> > > > > > > > > > > WITH options is too long for user.
> > > > > > > > > > > For example,I config the kafka source connector
> > > > > > parameter,for
> > > > > > > > consumer
> > > > > > > > > > > group and brokers parameter:
> > > > > > > > > > >
> > > > > > > > > > > 'connector.properties.0.key' = 'group.id
> > > > > > <http://group.id>'
> > > > > > > > > > > > , 'connector.properties.0.value' = 'xxx'
> > > > > > > > > > > > , 'connector.properties.1.key' =
> > > > > > 'bootstrap.servers'
> > > > > > > > > > > > , 'connector.properties.1.value' = 'xxxxx'
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > I can understand this config , but for the flink fresh
> > > > > > man,maybe it
> > > > > > > > > > > is confused for him.
> > > > > > > > > > > In my thought, I am really looking forward to this
> > > > > > feature,thank
> > > > > > > you
> > > > > > > > to
> > > > > > > > > > > propose this feature.
> > > > > > > > > > >
> > > > > > > > > > > Best wishes,
> > > > > > > > > > > LakeShen
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Jark Wu <[hidden email] <mailto:[hidden email]>> 于
> > > > > > 2020年3月30日周一 下午2:02写道:
> > > > > > > > > > >
> > > > > > > > > > > > Hi everyone,
> > > > > > > > > > > >
> > > > > > > > > > > > I want to start a discussion about further improve
> and
> > > > > > simplify
> > > > > > > our
> > > > > > > > > > > current
> > > > > > > > > > > > connector porperty keys, aka WITH options. Currently,
> > > > > > we have a
> > > > > > > > > > > > 'connector.' prefix for many properties, but they are
> > > > > > verbose,
> > > > > > > and
> > > > > > > > we
> > > > > > > > > > > see a
> > > > > > > > > > > > big inconsistency between the properties when
> > > > designing
> > > > > > FLIP-107.
> > > > > > > > > > > >
> > > > > > > > > > > > So we propose to remove all the 'connector.' prefix
> > > > and
> > > > > > rename
> > > > > > > > > > > > 'connector.type' to 'connector', 'format.type' to
> > > > > > 'format'. So a
> > > > > > > > new
> > > > > > > > > > > Kafka
> > > > > > > > > > > > DDL may look like this:
> > > > > > > > > > > >
> > > > > > > > > > > > CREATE TABLE kafka_table (
> > > > > > > > > > > > ...
> > > > > > > > > > > > ) WITH (
> > > > > > > > > > > > 'connector' = 'kafka',
> > > > > > > > > > > > 'version' = '0.10',
> > > > > > > > > > > > 'topic' = 'test-topic',
> > > > > > > > > > > > 'startup-mode' = 'earliest-offset',
> > > > > > > > > > > > 'properties.bootstrap.servers' = 'localhost:9092',
> > > > > > > > > > > > 'properties.group.id <http://properties.group.id>' =
> > > > > > 'testGroup',
> > > > > > > > > > > > 'format' = 'json',
> > > > > > > > > > > > 'format.fail-on-missing-field' = 'false'
> > > > > > > > > > > > );
> > > > > > > > > > > >
> > > > > > > > > > > > The new connector property key set will come together
> > > > > > with new
> > > > > > > > > Factory
> > > > > > > > > > > > inferface which is proposed in FLIP-95. Old
> properties
> > > > > > are still
> > > > > > > > > > > compatible
> > > > > > > > > > > > with their existing implementation. New properties
> > > > > > are only
> > > > > > > > available
> > > > > > > > > > in
> > > > > > > > > > > > new DynamicTableFactory implementations.
> > > > > > > > > > > >
> > > > > > > > > > > > You can access the detailed FLIP here:
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > > >
> > > >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-122%3A+New+Connector+Property+Keys+for+New+Factory
> > > > > > > > > > > >
> > > > > > > > > > > > Best,
> > > > > > > > > > > > Jark
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > --
> > > > > > > > Best, Jingsong Lee
> > > > > > > >
> > > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > >
> > > > > > Benchao Li
> > > > > > School of Electronics Engineering and Computer Science, Peking
> > > > > > University
> > > > > > Tel:+86-15650713730
> > > > > > Email:[hidden email]
> > > > > > <mailto:[hidden email]>;[hidden email]
> > > > > > <mailto:[hidden email]>
> > > > > >
> > > > >
> > > >
> > >
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-122: New Connector Property Keys for New Factory

Jark Wu-2
Hi everyone,

Thanks for the great feedbacks so far.

I updated the FLIP documentation according to the discussion. Changes
include:
- remove "version" key, and merge it into "connector"
- add "scan", "lookup", "sink" prefix to some property keys if they are
only used in that case.
- add a "New Property Key" section to list all the previous property keys
and new property keys.

We use "scan" and "lookup" instead of "source" prefix because we should
distinguish them and they aligns to FLIP-95 ScanTableSource and
LookupTableSource.
I also colored red for some major change of property keys in the FLIP. I
will list some of them here too:

kafka:
connector.startup-mode => scan.startup.mode
connector.specific-offsets => scan.startup.specific-offsets
connector.startup-timestamp-millis => scan.startup.timestamp-millis
connector.sink-partitioner & connector.sink-partitioner-class =>
sink.partitioner

elasticsearch:
connector.key-delimiter => document-id.key-delimiter              # make it
explicit that it is used for document id
connector.key-null-literal => document-id.key-null-literal          # and
it also can be used for es sources in the future
connector.bulk-flush.back-off.type => sink.bulk-flush.back-off.strategy

jdbc:
connector.table => table-name

Welcome further feedbacks!

Best,
Jark


On Tue, 31 Mar 2020 at 14:45, Jark Wu <[hidden email]> wrote:

> Hi Kurt,
>
> I also prefer "-" as version delimiter now. I didn't remove the "_"
> proposal by mistake, that's why I sent another email last night :)
> Regarding to "property-version", I also think we shouldn't let users to
> learn about this. And ConfigOption provides a good ability
> to support deprecated keys and auto-generate documentation for deprecated
> keys.
>
> Hi Danny,
>
> Regarding to “connector.properties.*”:
> In FLIP-95, the Factory#requiredOptions() and Factory#optionalOptions()
> inferfaces are only used for generation of documentation.
> It does not influence the discovery and validation of a factory. The
> validation logic is defined by connectors
> in createDynamicTableSource/Sink().
> So you don't have to provide an option for "connector.properties.*". But I
> think we should make ConfigOption support wildcard in the long term for a
> full story.
>
> I don't think we should inline all the "connector.properties.*",
> otherwise, it will be very tricky for users to configure the properties.
> Regarding to FLIP-113, I suggest to provide some ConfigOptions for
> commonly used kafka properties and put them in the supportedHintOptions(),
> e.g. "connector.properties.group.id",
> "connector.properties.fetch.min.bytes".
>
> Best,
> Jark
>
>
>
>
>
> On Tue, 31 Mar 2020 at 12:04, Danny Chan <[hidden email]> wrote:
>
>> Thanks Jark for bring up this discussion, +1 for this idea, I believe the
>> user has suffered from the verbose property key for long time.
>>
>> Just one question, how do we handle the keys with wildcard, such as the
>> “connector.properties.*” in Kafka connector which would then hand-over to
>> Kafka client directly. As what suggested in FLIP-95, we use a ConfigOption
>> to describe the “supported properties”, then I have to concerns:
>>
>> • For the new keys, do we still need to put multi-lines there the such
>> key, such as “connector.properties.abc” “connector.properties.def”, or
>> should we inline them, such as “some-key-prefix” = “k1=v1, k2=v2 ..."
>> • Should the ConfigOption support the wildcard ? (If we plan to support
>> the current multi-line style)
>>
>>
>> Best,
>> Danny Chan
>> 在 2020年3月31日 +0800 AM12:37,Jark Wu <[hidden email]>,写道:
>> > Hi all,
>> >
>> > Thanks for the feedbacks.
>> >
>> > It seems that we have a conclusion to put the version into the factory
>> > identifier. I'm also fine with this.
>> > If we have this outcome, the interface of Factory#factoryVersion is not
>> > needed anymore, this can simplify the learning cost of new factory.
>> > We may need to update FLIP-95 and re-vote for it? cc @Timo Walther
>> > <[hidden email]>
>> >
>> > kafka => kafka for 0.11+ versions, we don't suffix "-universal", because
>> > the meaning of "universal" not easy to understand.
>> > kafka-0.11 => kafka for 0.11 version
>> > kafka-0.10 => kafka for 0.10 version
>> > elasticsearch-6 => elasticsearch for 6.x versions
>> > elasticsearch-7 => elasticsearch for 7.x versions
>> > hbase-1.4 => hbase for 1.4.x versions
>> > jdbc
>> > filesystem
>> >
>> > We use "-" as the version delimiter to make them to be more consistent.
>> > This is not forces, users can also use other delimiters or without
>> > delimiter.
>> > But this can be a guilde in the Javadoc of Factory, to make the
>> connector
>> > ecosystem to be more consistent.
>> >
>> > What do you think?
>> >
>> > ------------------------
>> >
>> > Regarding "connector.property-version":
>> >
>> > Hi @Dawid Wysakowicz <[hidden email]> , the new fatories are
>> > designed not support to read current properties.
>> > All the current properties are routed to the old factories if they are
>> > using "connector.type". Otherwise, properties are routed to new
>> factories.
>> >
>> > If I understand correctly, the "connector.property-version" is attched
>> > implicitly by system, not manually set by users.
>> > For example, the framework should add "connector.property-version=1" to
>> > properties when processing DDL statement.
>> > I'm fine to add a "connector.property-version=1" when processing DDL
>> > statement, but I think it's also fine if we don't,
>> > because this can be done in the future if need and the default version
>> can
>> > be 1.
>> >
>> > Best,
>> > Jark
>> >
>> > On Tue, 31 Mar 2020 at 00:36, Jark Wu <[hidden email]> wrote:
>> >
>> > > Hi all,
>> > >
>> > > Thanks for the feedbacks.
>> > >
>> > > It seems that we have a conclusion to put the version into the factory
>> > > identifier. I'm also fine with this.
>> > > If we have this outcome, the interface of Factory#factoryVersion is
>> not
>> > > needed anymore, this can simplify the learning cost of new factory.
>> > > We may need to update FLIP-95 and re-vote for it? cc @Timo Walther
>> > > <[hidden email]>
>> > >
>> > > Btw, I would like to use "_" instead of "-" as the version delimiter,
>> > > because "-" looks like minus and may confuse users, e.g.
>> "elasticsearch-6".
>> > > This is not forced, but should be a guilde in the Javadoc of Factory.
>> > > I propose to use the following identifiers for existing connectors,
>> > >
>> > > kafka => kafka for 0.11+ versions, we don't suffix "-universal",
>> because
>> > > the meaning of "universal" not easy to understand.
>> > > kafka-0.11 => kafka for 0.11 version
>> > > kafka-0.10 => kafka for 0.10 version
>> > > elasticsearch-6 => elasticsearch for 6.x versions
>> > > elasticsearch-7 => elasticsearch for 7.x versions
>> > > hbase-1.4 => hbase for 1.4.x versions
>> > > jdbc
>> > > filesystem
>> > >
>> > > We use "-" as the version delimiter to make them to be more
>> consistent.
>> > > This is not forces, users can also use other delimiters or without
>> > > delimiter.
>> > > But this can be a guilde in the Javadoc of Factory, to make the
>> connector
>> > > ecosystem to be more consistent.
>> > >
>> > > What do you think?
>> > >
>> > > ------------------------
>> > >
>> > > Regarding "connector.property-version":
>> > >
>> > > Hi @Dawid Wysakowicz <[hidden email]> , the new fatories are
>> > > designed not support to read current properties.
>> > > All the current properties are routed to the old factories if they are
>> > > using "connector.type". Otherwise, properties are routed to new
>> factories.
>> > >
>> > > If I understand correctly, the "connector.property-version" is attched
>> > > implicitly by system, not manually set by users.
>> > > For example, the framework should add "connector.property-version=1"
>> to
>> > > properties when processing DDL statement.
>> > > I'm fine to add a "connector.property-version=1" when processing DDL
>> > > statement, but I think it's also fine if we don't,
>> > > because this can be done in the future if need and the default
>> version can
>> > > be 1.
>> > >
>> > > Best,
>> > > Jark
>> > >
>> > >
>> > >
>> > >
>> > >
>> > > On Mon, 30 Mar 2020 at 23:21, Dawid Wysakowicz <
>> [hidden email]>
>> > > wrote:
>> > >
>> > > > Hi all,
>> > > >
>> > > > I like the overall design of the FLIP.
>> > > >
>> > > > As for the withstanding concerns. I kind of like the approach to
>> put the
>> > > > version into the factory identifier. I think it's the cleanest way
>> to
>> > > > say that this version actually applies to the connector itself and
>> not
>> > > > to the system it connects to. BTW, I think the outcome of this
>> > > > discussion will affect interfaces described in FLIP-95. If we put
>> the
>> > > > version into the functionIdentifier, do we need
>> Factory#factoryVersion?
>> > > >
>> > > > I also think it does make sense to have a versioning for the
>> properties
>> > > > as well. Are we able to read all the current properties with the new
>> > > > factories? I think we could use the "connector.property-version" to
>> > > > alternate between different Factory interfaces to support the old
>> set of
>> > > > properties. Otherwise the new factories need to understand both set
>> of
>> > > > properties, don't they?
>> > > >
>> > > > Best,
>> > > >
>> > > > Dawid
>> > > >
>> > > > On 30/03/2020 17:07, Timo Walther wrote:
>> > > > > Hi Jark,
>> > > > >
>> > > > > thanks for the FLIP. I don't have a strong opinion on
>> > > > > DynamicTableFactory#factoryVersion() for distiguishing connector
>> > > > > versions. We can also include it in the factory identifier
>> itself. For
>> > > > > Kafka, I would then just use `kafka` because the naming
>> "universal"
>> > > > > was just a helper solution at that time.
>> > > > >
>> > > > > What we need is a good guide for how to design the options. We
>> should
>> > > > > include that in the JavaDoc of factory interface itself, once it
>> is
>> > > > > in-place. I like the difference between source and sink in
>> properties.
>> > > > >
>> > > > > Regarding "connector.property-version":
>> > > > >
>> > > > > It was meant for backwards compatibility along the dimension of
>> > > > > "property design" compared to "connector version". However, since
>> the
>> > > > > connector is now responsible for parsing all properties, it is
>> not as
>> > > > > necessary as it was before.
>> > > > >
>> > > > > However, a change of the property design as it is done in FLIP-107
>> > > > > becomes more difficult without a property version and versioning
>> is
>> > > > > always a good idea in API world.
>> > > > >
>> > > > > Regards,
>> > > > > Timo
>> > > > >
>> > > > >
>> > > > > On 30.03.20 16:38, Benchao Li wrote:
>> > > > > > Hi Jark,
>> > > > > >
>> > > > > > Thanks for starting the discussion. The FLIP LGTM generally.
>> > > > > >
>> > > > > > Regarding connector version VS put version into connector's
>> name,
>> > > > > > I favor the latter personally, using one property to locate a
>> > > > > > connector can make the error log more precise.
>> > > > > > From the users' side, using one property to match a connector
>> will
>> > > > > > be easier. Especially we have many connectors,
>> > > > > > and some of the need version property required, and some of
>> them not.
>> > > > > >
>> > > > > > Regarding Jingsong's suggestion,
>> > > > > > IMO, it's a very good complement to the FLIP. Distinguishing
>> > > > > > properties for source and sink can be very useful, and
>> > > > > > also this will make the connector property more precise.
>> > > > > > We are also sick of this for now, we cannot know whether a DDL
>> is a
>> > > > > > source or sink unless we look through all queries where
>> > > > > > the table is used.
>> > > > > > Even more, some of the required properties are only required for
>> > > > > > source, bug we cannot leave it blank for sink, and vice versa.
>> > > > > > I think we can also add a type for dimension tables except
>> source and
>> > > > > > sink.
>> > > > > >
>> > > > > > Kurt Young <[hidden email] <mailto:[hidden email]>>
>> 于2020年3月30日
>> > > > > > 周一 下午8:16写道:
>> > > > > >
>> > > > > > > It's not possible for the framework to throw such exception.
>> > > > > > Because the
>> > > > > > framework doesn't know what versions do the connector support.
>> > > > > >
>> > > > > > Not really, we can list all valid connectors framework could
>> > > > > > found. E.g.
>> > > > > > user mistyped 'kafka-0.x', the error message will looks like:
>> > > > > >
>> > > > > > we don't have any connector named "kafka-0.x", but we have:
>> > > > > > FileSystem
>> > > > > > Kafka-0.10
>> > > > > > Kafka-0.11
>> > > > > > ElasticSearch6
>> > > > > > ElasticSearch7
>> > > > > >
>> > > > > > Best,
>> > > > > > Kurt
>> > > > > >
>> > > > > >
>> > > > > > On Mon, Mar 30, 2020 at 5:11 PM Jark Wu <[hidden email]
>> > > > > > <mailto:[hidden email]>> wrote:
>> > > > > >
>> > > > > > > Hi Kurt,
>> > > > > > >
>> > > > > > > > 2) Lists all available connectors seems also quite
>> > > > > > straightforward, e.g
>> > > > > > > user provided a wrong "kafka-0.8", we tell user we have
>> > > > > > candidates of
>> > > > > > > "kakfa-0.11", "kafka-universal"
>> > > > > > > It's not possible for the framework to throw such exception.
>> > > > > > Because the
>> > > > > > > framework doesn't know what versions do the connector support.
>> > > > > > All the
>> > > > > > > version information is a blackbox in the identifier. But with
>> > > > > > > `Factory#factoryVersion()` interface, we can know all the
>> > > > > > supported
>> > > > > > > versions.
>> > > > > > >
>> > > > > > > > 3) I don't think so. We can still treat it as the same
>> > > > > > connector but with
>> > > > > > > different versions.
>> > > > > > > That's true but that's weird. Because from the plain DDL
>> > > > > > definition, they
>> > > > > > > look like different connectors with different "connector"
>> > > > > > value, e.g.
>> > > > > > > 'connector=kafka-0.8', 'connector=kafka-0.10'.
>> > > > > > >
>> > > > > > > > If users don't set any version, we will use
>> "kafka-universal"
>> > > > > > instead.
>> > > > > > > The behavior is inconsistent IMO.
>> > > > > > > That is a long term vision when there is no kafka clusters
>> > > > > > with <0.11
>> > > > > > > version.
>> > > > > > > At that point, "universal" is the only supported version in
>> > > > Flink
>> > > > > > and the
>> > > > > > > "version" key can be optional.
>> > > > > > >
>> > > > > > > ---------------------------------------------
>> > > > > > >
>> > > > > > > Hi Jingsong,
>> > > > > > >
>> > > > > > > > "version" vs "kafka.version"
>> > > > > > > I though about it. But if we prefix "kafka" to version, we
>> > > > should
>> > > > > > prefix
>> > > > > > > "kafka" for all other property keys, because they are all
>> kafka
>> > > > > > specific
>> > > > > > > options.
>> > > > > > > However, that will make the property set verbose, see rejected
>> > > > > > option#2 in
>> > > > > > > the FLIP.
>> > > > > > >
>> > > > > > > > explicitly separate options for source and sink
>> > > > > > > That's a good topic. It's good to have a guideline for the new
>> > > > > > property
>> > > > > > > keys.
>> > > > > > > I'm fine to prefix with a "source"/"sink" for some connector
>> > > > > > keys.
>> > > > > > > Actually, we already do this in some connectors, e.g. jdbc and
>> > > > > > hbase.
>> > > > > > >
>> > > > > > > Best,
>> > > > > > > Jark
>> > > > > > >
>> > > > > > > On Mon, 30 Mar 2020 at 16:36, Jingsong Li <
>> > > > [hidden email]
>> > > > > > <mailto:[hidden email]>> wrote:
>> > > > > > >
>> > > > > > > > Thanks Jark for the proposal.
>> > > > > > > >
>> > > > > > > > +1 to the general idea.
>> > > > > > > >
>> > > > > > > > For "version", what about "kafka.version"? It is obvious to
>> > > > > > know its
>> > > > > > > > meaning.
>> > > > > > > >
>> > > > > > > > And I'd like to start a new topic:
>> > > > > > > > Should we need to explicitly separate source from sink?
>> > > > > > > > With the development of batch and streaming, more and more
>> > > > > > connectors
>> > > > > > > have
>> > > > > > > > both source and sink.
>> > > > > > > >
>> > > > > > > > So should we set a rule for table properties:
>> > > > > > > > - properties for both source and sink: without prefix, like
>> > > > > > "topic"
>> > > > > > > > - properties for source only: with "source." prefix, like
>> > > > > > > > "source.startup-mode"
>> > > > > > > > - properties for sink only: with "sink." prefix, like
>> > > > > > "sink.partitioner"
>> > > > > > > >
>> > > > > > > > What do you think?
>> > > > > > > >
>> > > > > > > > Best,
>> > > > > > > > Jingsong Lee
>> > > > > > > >
>> > > > > > > > On Mon, Mar 30, 2020 at 3:56 PM Jark Wu <[hidden email]
>> > > > > > <mailto:[hidden email]>> wrote:
>> > > > > > > >
>> > > > > > > > > Hi Kurt,
>> > > > > > > > >
>> > > > > > > > > That's good questions.
>> > > > > > > > >
>> > > > > > > > > > the meaning of "version"
>> > > > > > > > > There are two versions in the old design. One is property
>> > > > > > version
>> > > > > > > > > "connector.property-version" which can be used for
>> backward
>> > > > > > > > compatibility.
>> > > > > > > > > The other one is "connector.version" which defines the
>> > > > > > version of
>> > > > > > > > external
>> > > > > > > > > system, e.g. 0.11" for kafka, "6" or "7" for ES.
>> > > > > > > > > In this proposal, the "version" is the previous
>> > > > > > "connector.version".
>> > > > > > > The
>> > > > > > > > > ""connector.property-version" is not introduced in new
>> > > > > > design.
>> > > > > > > > >
>> > > > > > > > > > how to keep the old capability which can evolve
>> connector
>> > > > > > properties
>> > > > > > > > > The "connector.property-version" is an optional key in the
>> > > > > > old design
>> > > > > > > and
>> > > > > > > > > is never bumped up.
>> > > > > > > > > I'm not sure how "connector.property-version" should work
>> > > > > > in the
>> > > > > > > initial
>> > > > > > > > > design. Maybe @Timo Walther <[hidden email]
>> > > > > > <mailto:[hidden email]>> has more knowledge on
>> > > > > > > > > this.
>> > > > > > > > > But for the new properties, every options should be
>> > > > > > expressed as
>> > > > > > > > > `ConfigOption` which provides `withDeprecatedKeys(...)`
>> > > > > > method to
>> > > > > > > easily
>> > > > > > > > > support evolving keys.
>> > > > > > > > >
>> > > > > > > > > > a single keys instead of two, e.g. "kafka-0.11",
>> > > > > > "kafka-universal"?
>> > > > > > > > > There are several benefit to use separate "version" key I
>> > > > can
>> > > > > > see:
>> > > > > > > > > 1) it's more readable to separete them into different
>> keys,
>> > > > > > because
>> > > > > > > they
>> > > > > > > > > are orthogonal concepts.
>> > > > > > > > > 2) the planner can give all the availble versions in the
>> > > > > > exception
>> > > > > > > > message,
>> > > > > > > > > if user uses a wrong version (this is often reported in
>> > > > > > user ML).
>> > > > > > > > > 3) If we use "kafka-0.11" as connector identifier, we may
>> > > > > > have to
>> > > > > > > write a
>> > > > > > > > > full documentation for each version, because they are
>> > > > > > different
>> > > > > > > > > "connector"?
>> > > > > > > > > IMO, for 0.11, 0.11, etc... kafka, they are actually
>> > > > > > the same
>> > > > > > > > connector
>> > > > > > > > > but with different "client jar" version,
>> > > > > > > > > they share all the same supported property keys and
>> > > > > > should reside
>> > > > > > > > > together.
>> > > > > > > > > 4) IMO, the future vision is version-free. At some point
>> in
>> > > > > > the future,
>> > > > > > > > we
>> > > > > > > > > may don't need users to specify kafka version anymore, and
>> > > > > > make
>> > > > > > > > > "version=universal" as optional or removed in the future.
>> > > > > > This is can
>> > > > > > > be
>> > > > > > > > > done easily if they are separate keys.
>> > > > > > > > >
>> > > > > > > > > Best,
>> > > > > > > > > Jark
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > > On Mon, 30 Mar 2020 at 14:45, Kurt Young <
>> [hidden email]
>> > > > > > <mailto:[hidden email]>> wrote:
>> > > > > > > > >
>> > > > > > > > > > Hi Jark,
>> > > > > > > > > >
>> > > > > > > > > > Thanks for the proposal, I'm +1 to the general idea.
>> > > > > > However I have a
>> > > > > > > > > > question about "version",
>> > > > > > > > > > in the old design, the version seems to be aimed for
>> > > > > > tracking
>> > > > > > > property
>> > > > > > > > > > version, with different
>> > > > > > > > > > version, we could evolve these step by step without
>> > > > > > breaking backward
>> > > > > > > > > > compatibility. But in this
>> > > > > > > > > > design, version is representing external system's
>> > > > > > version, like
>> > > > > > > "0.11"
>> > > > > > > > > for
>> > > > > > > > > > kafka, "6" or "7" for ES.
>> > > > > > > > > > I'm not sure if this is necessary, what's the benefit of
>> > > > > > using two
>> > > > > > > keys
>> > > > > > > > > > instead of one, like "kafka-0.11"
>> > > > > > > > > > or "ES6" directly? And how about the old capability
>> which
>> > > > > > could let
>> > > > > > > us
>> > > > > > > > > > evolving connector properties?
>> > > > > > > > > >
>> > > > > > > > > > Best,
>> > > > > > > > > > Kurt
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > > On Mon, Mar 30, 2020 at 2:36 PM LakeShen
>> > > > > > <[hidden email] <mailto:[hidden email]>>
>> > > > > > > > > > wrote:
>> > > > > > > > > >
>> > > > > > > > > > > Hi Jark,
>> > > > > > > > > > >
>> > > > > > > > > > > I am really looking forward to this feature. I think
>> > > > this
>> > > > > > feature
>> > > > > > > > > > > could simplify flink sql code,and at the same time ,
>> > > > > > > > > > > it could make the developer more easlier to config the
>> > > > > > flink sql
>> > > > > > > WITH
>> > > > > > > > > > > options.
>> > > > > > > > > > >
>> > > > > > > > > > > Now when I am using flink sql to write flink task ,
>> > > > > > sometimes I
>> > > > > > > think
>> > > > > > > > > the
>> > > > > > > > > > > WITH options is too long for user.
>> > > > > > > > > > > For example,I config the kafka source connector
>> > > > > > parameter,for
>> > > > > > > > consumer
>> > > > > > > > > > > group and brokers parameter:
>> > > > > > > > > > >
>> > > > > > > > > > > 'connector.properties.0.key' = 'group.id
>> > > > > > <http://group.id>'
>> > > > > > > > > > > > , 'connector.properties.0.value' = 'xxx'
>> > > > > > > > > > > > , 'connector.properties.1.key' =
>> > > > > > 'bootstrap.servers'
>> > > > > > > > > > > > , 'connector.properties.1.value' = 'xxxxx'
>> > > > > > > > > > > >
>> > > > > > > > > > >
>> > > > > > > > > > > I can understand this config , but for the flink fresh
>> > > > > > man,maybe it
>> > > > > > > > > > > is confused for him.
>> > > > > > > > > > > In my thought, I am really looking forward to this
>> > > > > > feature,thank
>> > > > > > > you
>> > > > > > > > to
>> > > > > > > > > > > propose this feature.
>> > > > > > > > > > >
>> > > > > > > > > > > Best wishes,
>> > > > > > > > > > > LakeShen
>> > > > > > > > > > >
>> > > > > > > > > > >
>> > > > > > > > > > > Jark Wu <[hidden email] <mailto:[hidden email]>>
>> 于
>> > > > > > 2020年3月30日周一 下午2:02写道:
>> > > > > > > > > > >
>> > > > > > > > > > > > Hi everyone,
>> > > > > > > > > > > >
>> > > > > > > > > > > > I want to start a discussion about further improve
>> and
>> > > > > > simplify
>> > > > > > > our
>> > > > > > > > > > > current
>> > > > > > > > > > > > connector porperty keys, aka WITH options.
>> Currently,
>> > > > > > we have a
>> > > > > > > > > > > > 'connector.' prefix for many properties, but they
>> are
>> > > > > > verbose,
>> > > > > > > and
>> > > > > > > > we
>> > > > > > > > > > > see a
>> > > > > > > > > > > > big inconsistency between the properties when
>> > > > designing
>> > > > > > FLIP-107.
>> > > > > > > > > > > >
>> > > > > > > > > > > > So we propose to remove all the 'connector.' prefix
>> > > > and
>> > > > > > rename
>> > > > > > > > > > > > 'connector.type' to 'connector', 'format.type' to
>> > > > > > 'format'. So a
>> > > > > > > > new
>> > > > > > > > > > > Kafka
>> > > > > > > > > > > > DDL may look like this:
>> > > > > > > > > > > >
>> > > > > > > > > > > > CREATE TABLE kafka_table (
>> > > > > > > > > > > > ...
>> > > > > > > > > > > > ) WITH (
>> > > > > > > > > > > > 'connector' = 'kafka',
>> > > > > > > > > > > > 'version' = '0.10',
>> > > > > > > > > > > > 'topic' = 'test-topic',
>> > > > > > > > > > > > 'startup-mode' = 'earliest-offset',
>> > > > > > > > > > > > 'properties.bootstrap.servers' = 'localhost:9092',
>> > > > > > > > > > > > 'properties.group.id <http://properties.group.id>'
>> =
>> > > > > > 'testGroup',
>> > > > > > > > > > > > 'format' = 'json',
>> > > > > > > > > > > > 'format.fail-on-missing-field' = 'false'
>> > > > > > > > > > > > );
>> > > > > > > > > > > >
>> > > > > > > > > > > > The new connector property key set will come
>> together
>> > > > > > with new
>> > > > > > > > > Factory
>> > > > > > > > > > > > inferface which is proposed in FLIP-95. Old
>> properties
>> > > > > > are still
>> > > > > > > > > > > compatible
>> > > > > > > > > > > > with their existing implementation. New properties
>> > > > > > are only
>> > > > > > > > available
>> > > > > > > > > > in
>> > > > > > > > > > > > new DynamicTableFactory implementations.
>> > > > > > > > > > > >
>> > > > > > > > > > > > You can access the detailed FLIP here:
>> > > > > > > > > > > >
>> > > > > > > > > > > >
>> > > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > >
>> > > > > > > >
>> > > > > > >
>> > > > > >
>> > > > > >
>> > > >
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-122%3A+New+Connector+Property+Keys+for+New+Factory
>> > > > > > > > > > > >
>> > > > > > > > > > > > Best,
>> > > > > > > > > > > > Jark
>> > > > > > > > > > > >
>> > > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > >
>> > > > > > > >
>> > > > > > > >
>> > > > > > > > --
>> > > > > > > > Best, Jingsong Lee
>> > > > > > > >
>> > > > > > >
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > > --
>> > > > > >
>> > > > > > Benchao Li
>> > > > > > School of Electronics Engineering and Computer Science, Peking
>> > > > > > University
>> > > > > > Tel:+86-15650713730
>> > > > > > Email:[hidden email]
>> > > > > > <mailto:[hidden email]>;[hidden email]
>> > > > > > <mailto:[hidden email]>
>> > > > > >
>> > > > >
>> > > >
>> > >
>>
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-122: New Connector Property Keys for New Factory

dwysakowicz
Hi,

As for the `connector.property-version`. I have doubts that having
deprecated keys mechanism is enough to fully support backwards
compatibility. I do think we are changing not just the key names, but
structure of the keys as well as the discovery mechanism and I think it
is really hard to foresee all the interconnections. I don't think users
need to learn about the `connector.property-version` too much. By
default it would always be set to the newest version. Users could still
set this property to an old version if they don't want to migrate the
properties, but go fully against the old structure. I think such
mechanism could be useful.

I understand a similar thing can be achieved with renaming/changing the
version in the factoryIdentifier. 'kafka' can use the new properties,
'kafka-v1' the old properties. I think though having a dedicated
mechanism could make it a tad easier. I am not strict on that though,
just wanted to express it.

Best,

Dawid

On 31/03/2020 08:58, Jark Wu wrote:

> Hi everyone,
>
> Thanks for the great feedbacks so far.
>
> I updated the FLIP documentation according to the discussion. Changes
> include:
> - remove "version" key, and merge it into "connector"
> - add "scan", "lookup", "sink" prefix to some property keys if they are
> only used in that case.
> - add a "New Property Key" section to list all the previous property keys
> and new property keys.
>
> We use "scan" and "lookup" instead of "source" prefix because we should
> distinguish them and they aligns to FLIP-95 ScanTableSource and
> LookupTableSource.
> I also colored red for some major change of property keys in the FLIP. I
> will list some of them here too:
>
> kafka:
> connector.startup-mode => scan.startup.mode
> connector.specific-offsets => scan.startup.specific-offsets
> connector.startup-timestamp-millis => scan.startup.timestamp-millis
> connector.sink-partitioner & connector.sink-partitioner-class =>
> sink.partitioner
>
> elasticsearch:
> connector.key-delimiter => document-id.key-delimiter              # make it
> explicit that it is used for document id
> connector.key-null-literal => document-id.key-null-literal          # and
> it also can be used for es sources in the future
> connector.bulk-flush.back-off.type => sink.bulk-flush.back-off.strategy
>
> jdbc:
> connector.table => table-name
>
> Welcome further feedbacks!
>
> Best,
> Jark
>
>
> On Tue, 31 Mar 2020 at 14:45, Jark Wu <[hidden email]> wrote:
>
>> Hi Kurt,
>>
>> I also prefer "-" as version delimiter now. I didn't remove the "_"
>> proposal by mistake, that's why I sent another email last night :)
>> Regarding to "property-version", I also think we shouldn't let users to
>> learn about this. And ConfigOption provides a good ability
>> to support deprecated keys and auto-generate documentation for deprecated
>> keys.
>>
>> Hi Danny,
>>
>> Regarding to “connector.properties.*”:
>> In FLIP-95, the Factory#requiredOptions() and Factory#optionalOptions()
>> inferfaces are only used for generation of documentation.
>> It does not influence the discovery and validation of a factory. The
>> validation logic is defined by connectors
>> in createDynamicTableSource/Sink().
>> So you don't have to provide an option for "connector.properties.*". But I
>> think we should make ConfigOption support wildcard in the long term for a
>> full story.
>>
>> I don't think we should inline all the "connector.properties.*",
>> otherwise, it will be very tricky for users to configure the properties.
>> Regarding to FLIP-113, I suggest to provide some ConfigOptions for
>> commonly used kafka properties and put them in the supportedHintOptions(),
>> e.g. "connector.properties.group.id",
>> "connector.properties.fetch.min.bytes".
>>
>> Best,
>> Jark
>>
>>
>>
>>
>>
>> On Tue, 31 Mar 2020 at 12:04, Danny Chan <[hidden email]> wrote:
>>
>>> Thanks Jark for bring up this discussion, +1 for this idea, I believe the
>>> user has suffered from the verbose property key for long time.
>>>
>>> Just one question, how do we handle the keys with wildcard, such as the
>>> “connector.properties.*” in Kafka connector which would then hand-over to
>>> Kafka client directly. As what suggested in FLIP-95, we use a ConfigOption
>>> to describe the “supported properties”, then I have to concerns:
>>>
>>> • For the new keys, do we still need to put multi-lines there the such
>>> key, such as “connector.properties.abc” “connector.properties.def”, or
>>> should we inline them, such as “some-key-prefix” = “k1=v1, k2=v2 ..."
>>> • Should the ConfigOption support the wildcard ? (If we plan to support
>>> the current multi-line style)
>>>
>>>
>>> Best,
>>> Danny Chan
>>> 在 2020年3月31日 +0800 AM12:37,Jark Wu <[hidden email]>,写道:
>>>> Hi all,
>>>>
>>>> Thanks for the feedbacks.
>>>>
>>>> It seems that we have a conclusion to put the version into the factory
>>>> identifier. I'm also fine with this.
>>>> If we have this outcome, the interface of Factory#factoryVersion is not
>>>> needed anymore, this can simplify the learning cost of new factory.
>>>> We may need to update FLIP-95 and re-vote for it? cc @Timo Walther
>>>> <[hidden email]>
>>>>
>>>> kafka => kafka for 0.11+ versions, we don't suffix "-universal", because
>>>> the meaning of "universal" not easy to understand.
>>>> kafka-0.11 => kafka for 0.11 version
>>>> kafka-0.10 => kafka for 0.10 version
>>>> elasticsearch-6 => elasticsearch for 6.x versions
>>>> elasticsearch-7 => elasticsearch for 7.x versions
>>>> hbase-1.4 => hbase for 1.4.x versions
>>>> jdbc
>>>> filesystem
>>>>
>>>> We use "-" as the version delimiter to make them to be more consistent.
>>>> This is not forces, users can also use other delimiters or without
>>>> delimiter.
>>>> But this can be a guilde in the Javadoc of Factory, to make the
>>> connector
>>>> ecosystem to be more consistent.
>>>>
>>>> What do you think?
>>>>
>>>> ------------------------
>>>>
>>>> Regarding "connector.property-version":
>>>>
>>>> Hi @Dawid Wysakowicz <[hidden email]> , the new fatories are
>>>> designed not support to read current properties.
>>>> All the current properties are routed to the old factories if they are
>>>> using "connector.type". Otherwise, properties are routed to new
>>> factories.
>>>> If I understand correctly, the "connector.property-version" is attched
>>>> implicitly by system, not manually set by users.
>>>> For example, the framework should add "connector.property-version=1" to
>>>> properties when processing DDL statement.
>>>> I'm fine to add a "connector.property-version=1" when processing DDL
>>>> statement, but I think it's also fine if we don't,
>>>> because this can be done in the future if need and the default version
>>> can
>>>> be 1.
>>>>
>>>> Best,
>>>> Jark
>>>>
>>>> On Tue, 31 Mar 2020 at 00:36, Jark Wu <[hidden email]> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> Thanks for the feedbacks.
>>>>>
>>>>> It seems that we have a conclusion to put the version into the factory
>>>>> identifier. I'm also fine with this.
>>>>> If we have this outcome, the interface of Factory#factoryVersion is
>>> not
>>>>> needed anymore, this can simplify the learning cost of new factory.
>>>>> We may need to update FLIP-95 and re-vote for it? cc @Timo Walther
>>>>> <[hidden email]>
>>>>>
>>>>> Btw, I would like to use "_" instead of "-" as the version delimiter,
>>>>> because "-" looks like minus and may confuse users, e.g.
>>> "elasticsearch-6".
>>>>> This is not forced, but should be a guilde in the Javadoc of Factory.
>>>>> I propose to use the following identifiers for existing connectors,
>>>>>
>>>>> kafka => kafka for 0.11+ versions, we don't suffix "-universal",
>>> because
>>>>> the meaning of "universal" not easy to understand.
>>>>> kafka-0.11 => kafka for 0.11 version
>>>>> kafka-0.10 => kafka for 0.10 version
>>>>> elasticsearch-6 => elasticsearch for 6.x versions
>>>>> elasticsearch-7 => elasticsearch for 7.x versions
>>>>> hbase-1.4 => hbase for 1.4.x versions
>>>>> jdbc
>>>>> filesystem
>>>>>
>>>>> We use "-" as the version delimiter to make them to be more
>>> consistent.
>>>>> This is not forces, users can also use other delimiters or without
>>>>> delimiter.
>>>>> But this can be a guilde in the Javadoc of Factory, to make the
>>> connector
>>>>> ecosystem to be more consistent.
>>>>>
>>>>> What do you think?
>>>>>
>>>>> ------------------------
>>>>>
>>>>> Regarding "connector.property-version":
>>>>>
>>>>> Hi @Dawid Wysakowicz <[hidden email]> , the new fatories are
>>>>> designed not support to read current properties.
>>>>> All the current properties are routed to the old factories if they are
>>>>> using "connector.type". Otherwise, properties are routed to new
>>> factories.
>>>>> If I understand correctly, the "connector.property-version" is attched
>>>>> implicitly by system, not manually set by users.
>>>>> For example, the framework should add "connector.property-version=1"
>>> to
>>>>> properties when processing DDL statement.
>>>>> I'm fine to add a "connector.property-version=1" when processing DDL
>>>>> statement, but I think it's also fine if we don't,
>>>>> because this can be done in the future if need and the default
>>> version can
>>>>> be 1.
>>>>>
>>>>> Best,
>>>>> Jark
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Mon, 30 Mar 2020 at 23:21, Dawid Wysakowicz <
>>> [hidden email]>
>>>>> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I like the overall design of the FLIP.
>>>>>>
>>>>>> As for the withstanding concerns. I kind of like the approach to
>>> put the
>>>>>> version into the factory identifier. I think it's the cleanest way
>>> to
>>>>>> say that this version actually applies to the connector itself and
>>> not
>>>>>> to the system it connects to. BTW, I think the outcome of this
>>>>>> discussion will affect interfaces described in FLIP-95. If we put
>>> the
>>>>>> version into the functionIdentifier, do we need
>>> Factory#factoryVersion?
>>>>>> I also think it does make sense to have a versioning for the
>>> properties
>>>>>> as well. Are we able to read all the current properties with the new
>>>>>> factories? I think we could use the "connector.property-version" to
>>>>>> alternate between different Factory interfaces to support the old
>>> set of
>>>>>> properties. Otherwise the new factories need to understand both set
>>> of
>>>>>> properties, don't they?
>>>>>>
>>>>>> Best,
>>>>>>
>>>>>> Dawid
>>>>>>
>>>>>> On 30/03/2020 17:07, Timo Walther wrote:
>>>>>>> Hi Jark,
>>>>>>>
>>>>>>> thanks for the FLIP. I don't have a strong opinion on
>>>>>>> DynamicTableFactory#factoryVersion() for distiguishing connector
>>>>>>> versions. We can also include it in the factory identifier
>>> itself. For
>>>>>>> Kafka, I would then just use `kafka` because the naming
>>> "universal"
>>>>>>> was just a helper solution at that time.
>>>>>>>
>>>>>>> What we need is a good guide for how to design the options. We
>>> should
>>>>>>> include that in the JavaDoc of factory interface itself, once it
>>> is
>>>>>>> in-place. I like the difference between source and sink in
>>> properties.
>>>>>>> Regarding "connector.property-version":
>>>>>>>
>>>>>>> It was meant for backwards compatibility along the dimension of
>>>>>>> "property design" compared to "connector version". However, since
>>> the
>>>>>>> connector is now responsible for parsing all properties, it is
>>> not as
>>>>>>> necessary as it was before.
>>>>>>>
>>>>>>> However, a change of the property design as it is done in FLIP-107
>>>>>>> becomes more difficult without a property version and versioning
>>> is
>>>>>>> always a good idea in API world.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Timo
>>>>>>>
>>>>>>>
>>>>>>> On 30.03.20 16:38, Benchao Li wrote:
>>>>>>>> Hi Jark,
>>>>>>>>
>>>>>>>> Thanks for starting the discussion. The FLIP LGTM generally.
>>>>>>>>
>>>>>>>> Regarding connector version VS put version into connector's
>>> name,
>>>>>>>> I favor the latter personally, using one property to locate a
>>>>>>>> connector can make the error log more precise.
>>>>>>>> From the users' side, using one property to match a connector
>>> will
>>>>>>>> be easier. Especially we have many connectors,
>>>>>>>> and some of the need version property required, and some of
>>> them not.
>>>>>>>> Regarding Jingsong's suggestion,
>>>>>>>> IMO, it's a very good complement to the FLIP. Distinguishing
>>>>>>>> properties for source and sink can be very useful, and
>>>>>>>> also this will make the connector property more precise.
>>>>>>>> We are also sick of this for now, we cannot know whether a DDL
>>> is a
>>>>>>>> source or sink unless we look through all queries where
>>>>>>>> the table is used.
>>>>>>>> Even more, some of the required properties are only required for
>>>>>>>> source, bug we cannot leave it blank for sink, and vice versa.
>>>>>>>> I think we can also add a type for dimension tables except
>>> source and
>>>>>>>> sink.
>>>>>>>>
>>>>>>>> Kurt Young <[hidden email] <mailto:[hidden email]>>
>>> 于2020年3月30日
>>>>>>>> 周一 下午8:16写道:
>>>>>>>>
>>>>>>>>> It's not possible for the framework to throw such exception.
>>>>>>>> Because the
>>>>>>>> framework doesn't know what versions do the connector support.
>>>>>>>>
>>>>>>>> Not really, we can list all valid connectors framework could
>>>>>>>> found. E.g.
>>>>>>>> user mistyped 'kafka-0.x', the error message will looks like:
>>>>>>>>
>>>>>>>> we don't have any connector named "kafka-0.x", but we have:
>>>>>>>> FileSystem
>>>>>>>> Kafka-0.10
>>>>>>>> Kafka-0.11
>>>>>>>> ElasticSearch6
>>>>>>>> ElasticSearch7
>>>>>>>>
>>>>>>>> Best,
>>>>>>>> Kurt
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Mar 30, 2020 at 5:11 PM Jark Wu <[hidden email]
>>>>>>>> <mailto:[hidden email]>> wrote:
>>>>>>>>
>>>>>>>>> Hi Kurt,
>>>>>>>>>
>>>>>>>>>> 2) Lists all available connectors seems also quite
>>>>>>>> straightforward, e.g
>>>>>>>>> user provided a wrong "kafka-0.8", we tell user we have
>>>>>>>> candidates of
>>>>>>>>> "kakfa-0.11", "kafka-universal"
>>>>>>>>> It's not possible for the framework to throw such exception.
>>>>>>>> Because the
>>>>>>>>> framework doesn't know what versions do the connector support.
>>>>>>>> All the
>>>>>>>>> version information is a blackbox in the identifier. But with
>>>>>>>>> `Factory#factoryVersion()` interface, we can know all the
>>>>>>>> supported
>>>>>>>>> versions.
>>>>>>>>>
>>>>>>>>>> 3) I don't think so. We can still treat it as the same
>>>>>>>> connector but with
>>>>>>>>> different versions.
>>>>>>>>> That's true but that's weird. Because from the plain DDL
>>>>>>>> definition, they
>>>>>>>>> look like different connectors with different "connector"
>>>>>>>> value, e.g.
>>>>>>>>> 'connector=kafka-0.8', 'connector=kafka-0.10'.
>>>>>>>>>
>>>>>>>>>> If users don't set any version, we will use
>>> "kafka-universal"
>>>>>>>> instead.
>>>>>>>>> The behavior is inconsistent IMO.
>>>>>>>>> That is a long term vision when there is no kafka clusters
>>>>>>>> with <0.11
>>>>>>>>> version.
>>>>>>>>> At that point, "universal" is the only supported version in
>>>>>> Flink
>>>>>>>> and the
>>>>>>>>> "version" key can be optional.
>>>>>>>>>
>>>>>>>>> ---------------------------------------------
>>>>>>>>>
>>>>>>>>> Hi Jingsong,
>>>>>>>>>
>>>>>>>>>> "version" vs "kafka.version"
>>>>>>>>> I though about it. But if we prefix "kafka" to version, we
>>>>>> should
>>>>>>>> prefix
>>>>>>>>> "kafka" for all other property keys, because they are all
>>> kafka
>>>>>>>> specific
>>>>>>>>> options.
>>>>>>>>> However, that will make the property set verbose, see rejected
>>>>>>>> option#2 in
>>>>>>>>> the FLIP.
>>>>>>>>>
>>>>>>>>>> explicitly separate options for source and sink
>>>>>>>>> That's a good topic. It's good to have a guideline for the new
>>>>>>>> property
>>>>>>>>> keys.
>>>>>>>>> I'm fine to prefix with a "source"/"sink" for some connector
>>>>>>>> keys.
>>>>>>>>> Actually, we already do this in some connectors, e.g. jdbc and
>>>>>>>> hbase.
>>>>>>>>> Best,
>>>>>>>>> Jark
>>>>>>>>>
>>>>>>>>> On Mon, 30 Mar 2020 at 16:36, Jingsong Li <
>>>>>> [hidden email]
>>>>>>>> <mailto:[hidden email]>> wrote:
>>>>>>>>>> Thanks Jark for the proposal.
>>>>>>>>>>
>>>>>>>>>> +1 to the general idea.
>>>>>>>>>>
>>>>>>>>>> For "version", what about "kafka.version"? It is obvious to
>>>>>>>> know its
>>>>>>>>>> meaning.
>>>>>>>>>>
>>>>>>>>>> And I'd like to start a new topic:
>>>>>>>>>> Should we need to explicitly separate source from sink?
>>>>>>>>>> With the development of batch and streaming, more and more
>>>>>>>> connectors
>>>>>>>>> have
>>>>>>>>>> both source and sink.
>>>>>>>>>>
>>>>>>>>>> So should we set a rule for table properties:
>>>>>>>>>> - properties for both source and sink: without prefix, like
>>>>>>>> "topic"
>>>>>>>>>> - properties for source only: with "source." prefix, like
>>>>>>>>>> "source.startup-mode"
>>>>>>>>>> - properties for sink only: with "sink." prefix, like
>>>>>>>> "sink.partitioner"
>>>>>>>>>> What do you think?
>>>>>>>>>>
>>>>>>>>>> Best,
>>>>>>>>>> Jingsong Lee
>>>>>>>>>>
>>>>>>>>>> On Mon, Mar 30, 2020 at 3:56 PM Jark Wu <[hidden email]
>>>>>>>> <mailto:[hidden email]>> wrote:
>>>>>>>>>>> Hi Kurt,
>>>>>>>>>>>
>>>>>>>>>>> That's good questions.
>>>>>>>>>>>
>>>>>>>>>>>> the meaning of "version"
>>>>>>>>>>> There are two versions in the old design. One is property
>>>>>>>> version
>>>>>>>>>>> "connector.property-version" which can be used for
>>> backward
>>>>>>>>>> compatibility.
>>>>>>>>>>> The other one is "connector.version" which defines the
>>>>>>>> version of
>>>>>>>>>> external
>>>>>>>>>>> system, e.g. 0.11" for kafka, "6" or "7" for ES.
>>>>>>>>>>> In this proposal, the "version" is the previous
>>>>>>>> "connector.version".
>>>>>>>>> The
>>>>>>>>>>> ""connector.property-version" is not introduced in new
>>>>>>>> design.
>>>>>>>>>>>> how to keep the old capability which can evolve
>>> connector
>>>>>>>> properties
>>>>>>>>>>> The "connector.property-version" is an optional key in the
>>>>>>>> old design
>>>>>>>>> and
>>>>>>>>>>> is never bumped up.
>>>>>>>>>>> I'm not sure how "connector.property-version" should work
>>>>>>>> in the
>>>>>>>>> initial
>>>>>>>>>>> design. Maybe @Timo Walther <[hidden email]
>>>>>>>> <mailto:[hidden email]>> has more knowledge on
>>>>>>>>>>> this.
>>>>>>>>>>> But for the new properties, every options should be
>>>>>>>> expressed as
>>>>>>>>>>> `ConfigOption` which provides `withDeprecatedKeys(...)`
>>>>>>>> method to
>>>>>>>>> easily
>>>>>>>>>>> support evolving keys.
>>>>>>>>>>>
>>>>>>>>>>>> a single keys instead of two, e.g. "kafka-0.11",
>>>>>>>> "kafka-universal"?
>>>>>>>>>>> There are several benefit to use separate "version" key I
>>>>>> can
>>>>>>>> see:
>>>>>>>>>>> 1) it's more readable to separete them into different
>>> keys,
>>>>>>>> because
>>>>>>>>> they
>>>>>>>>>>> are orthogonal concepts.
>>>>>>>>>>> 2) the planner can give all the availble versions in the
>>>>>>>> exception
>>>>>>>>>> message,
>>>>>>>>>>> if user uses a wrong version (this is often reported in
>>>>>>>> user ML).
>>>>>>>>>>> 3) If we use "kafka-0.11" as connector identifier, we may
>>>>>>>> have to
>>>>>>>>> write a
>>>>>>>>>>> full documentation for each version, because they are
>>>>>>>> different
>>>>>>>>>>> "connector"?
>>>>>>>>>>> IMO, for 0.11, 0.11, etc... kafka, they are actually
>>>>>>>> the same
>>>>>>>>>> connector
>>>>>>>>>>> but with different "client jar" version,
>>>>>>>>>>> they share all the same supported property keys and
>>>>>>>> should reside
>>>>>>>>>>> together.
>>>>>>>>>>> 4) IMO, the future vision is version-free. At some point
>>> in
>>>>>>>> the future,
>>>>>>>>>> we
>>>>>>>>>>> may don't need users to specify kafka version anymore, and
>>>>>>>> make
>>>>>>>>>>> "version=universal" as optional or removed in the future.
>>>>>>>> This is can
>>>>>>>>> be
>>>>>>>>>>> done easily if they are separate keys.
>>>>>>>>>>>
>>>>>>>>>>> Best,
>>>>>>>>>>> Jark
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Mon, 30 Mar 2020 at 14:45, Kurt Young <
>>> [hidden email]
>>>>>>>> <mailto:[hidden email]>> wrote:
>>>>>>>>>>>> Hi Jark,
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks for the proposal, I'm +1 to the general idea.
>>>>>>>> However I have a
>>>>>>>>>>>> question about "version",
>>>>>>>>>>>> in the old design, the version seems to be aimed for
>>>>>>>> tracking
>>>>>>>>> property
>>>>>>>>>>>> version, with different
>>>>>>>>>>>> version, we could evolve these step by step without
>>>>>>>> breaking backward
>>>>>>>>>>>> compatibility. But in this
>>>>>>>>>>>> design, version is representing external system's
>>>>>>>> version, like
>>>>>>>>> "0.11"
>>>>>>>>>>> for
>>>>>>>>>>>> kafka, "6" or "7" for ES.
>>>>>>>>>>>> I'm not sure if this is necessary, what's the benefit of
>>>>>>>> using two
>>>>>>>>> keys
>>>>>>>>>>>> instead of one, like "kafka-0.11"
>>>>>>>>>>>> or "ES6" directly? And how about the old capability
>>> which
>>>>>>>> could let
>>>>>>>>> us
>>>>>>>>>>>> evolving connector properties?
>>>>>>>>>>>>
>>>>>>>>>>>> Best,
>>>>>>>>>>>> Kurt
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Mar 30, 2020 at 2:36 PM LakeShen
>>>>>>>> <[hidden email] <mailto:[hidden email]>>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Jark,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I am really looking forward to this feature. I think
>>>>>> this
>>>>>>>> feature
>>>>>>>>>>>>> could simplify flink sql code,and at the same time ,
>>>>>>>>>>>>> it could make the developer more easlier to config the
>>>>>>>> flink sql
>>>>>>>>> WITH
>>>>>>>>>>>>> options.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Now when I am using flink sql to write flink task ,
>>>>>>>> sometimes I
>>>>>>>>> think
>>>>>>>>>>> the
>>>>>>>>>>>>> WITH options is too long for user.
>>>>>>>>>>>>> For example,I config the kafka source connector
>>>>>>>> parameter,for
>>>>>>>>>> consumer
>>>>>>>>>>>>> group and brokers parameter:
>>>>>>>>>>>>>
>>>>>>>>>>>>> 'connector.properties.0.key' = 'group.id
>>>>>>>> <http://group.id>'
>>>>>>>>>>>>>> , 'connector.properties.0.value' = 'xxx'
>>>>>>>>>>>>>> , 'connector.properties.1.key' =
>>>>>>>> 'bootstrap.servers'
>>>>>>>>>>>>>> , 'connector.properties.1.value' = 'xxxxx'
>>>>>>>>>>>>>>
>>>>>>>>>>>>> I can understand this config , but for the flink fresh
>>>>>>>> man,maybe it
>>>>>>>>>>>>> is confused for him.
>>>>>>>>>>>>> In my thought, I am really looking forward to this
>>>>>>>> feature,thank
>>>>>>>>> you
>>>>>>>>>> to
>>>>>>>>>>>>> propose this feature.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Best wishes,
>>>>>>>>>>>>> LakeShen
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Jark Wu <[hidden email] <mailto:[hidden email]>>
>>> 于
>>>>>>>> 2020年3月30日周一 下午2:02写道:
>>>>>>>>>>>>>> Hi everyone,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I want to start a discussion about further improve
>>> and
>>>>>>>> simplify
>>>>>>>>> our
>>>>>>>>>>>>> current
>>>>>>>>>>>>>> connector porperty keys, aka WITH options.
>>> Currently,
>>>>>>>> we have a
>>>>>>>>>>>>>> 'connector.' prefix for many properties, but they
>>> are
>>>>>>>> verbose,
>>>>>>>>> and
>>>>>>>>>> we
>>>>>>>>>>>>> see a
>>>>>>>>>>>>>> big inconsistency between the properties when
>>>>>> designing
>>>>>>>> FLIP-107.
>>>>>>>>>>>>>> So we propose to remove all the 'connector.' prefix
>>>>>> and
>>>>>>>> rename
>>>>>>>>>>>>>> 'connector.type' to 'connector', 'format.type' to
>>>>>>>> 'format'. So a
>>>>>>>>>> new
>>>>>>>>>>>>> Kafka
>>>>>>>>>>>>>> DDL may look like this:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> CREATE TABLE kafka_table (
>>>>>>>>>>>>>> ...
>>>>>>>>>>>>>> ) WITH (
>>>>>>>>>>>>>> 'connector' = 'kafka',
>>>>>>>>>>>>>> 'version' = '0.10',
>>>>>>>>>>>>>> 'topic' = 'test-topic',
>>>>>>>>>>>>>> 'startup-mode' = 'earliest-offset',
>>>>>>>>>>>>>> 'properties.bootstrap.servers' = 'localhost:9092',
>>>>>>>>>>>>>> 'properties.group.id <http://properties.group.id>'
>>> =
>>>>>>>> 'testGroup',
>>>>>>>>>>>>>> 'format' = 'json',
>>>>>>>>>>>>>> 'format.fail-on-missing-field' = 'false'
>>>>>>>>>>>>>> );
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The new connector property key set will come
>>> together
>>>>>>>> with new
>>>>>>>>>>> Factory
>>>>>>>>>>>>>> inferface which is proposed in FLIP-95. Old
>>> properties
>>>>>>>> are still
>>>>>>>>>>>>> compatible
>>>>>>>>>>>>>> with their existing implementation. New properties
>>>>>>>> are only
>>>>>>>>>> available
>>>>>>>>>>>> in
>>>>>>>>>>>>>> new DynamicTableFactory implementations.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> You can access the detailed FLIP here:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>
>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-122%3A+New+Connector+Property+Keys+for+New+Factory
>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>> Jark
>>>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Best, Jingsong Lee
>>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>>
>>>>>>>> Benchao Li
>>>>>>>> School of Electronics Engineering and Computer Science, Peking
>>>>>>>> University
>>>>>>>> Tel:+86-15650713730
>>>>>>>> Email:[hidden email]
>>>>>>>> <mailto:[hidden email]>;[hidden email]
>>>>>>>> <mailto:[hidden email]>
>>>>>>>>


signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-122: New Connector Property Keys for New Factory

Timo Walther-2
Thanks for the update, Jark. +1 for your proposal.

Some minor feedback from my side:

sink.bulk-flush -> sink.buffer-flush?
zookeeper.znode.parent -> zookeeper.znode-parent?

for consistency.

username -> secrect.username?
password -> secrect.password?

How about we prefix options that should not show up in logs with "secret."?

I agree with Dawid, we should not mix factory identifier with updates to
the property design. I would introduce a general `property-version`
option that users can add to their DDL if they want to ensure backwards
compatibility. All factories would get this option by default, they
don't need to list it explicitly.

Regards,
Timo


On 31.03.20 09:21, Dawid Wysakowicz wrote:

> Hi,
>
> As for the `connector.property-version`. I have doubts that having
> deprecated keys mechanism is enough to fully support backwards
> compatibility. I do think we are changing not just the key names, but
> structure of the keys as well as the discovery mechanism and I think it
> is really hard to foresee all the interconnections. I don't think users
> need to learn about the `connector.property-version` too much. By
> default it would always be set to the newest version. Users could still
> set this property to an old version if they don't want to migrate the
> properties, but go fully against the old structure. I think such
> mechanism could be useful.
>
> I understand a similar thing can be achieved with renaming/changing the
> version in the factoryIdentifier. 'kafka' can use the new properties,
> 'kafka-v1' the old properties. I think though having a dedicated
> mechanism could make it a tad easier. I am not strict on that though,
> just wanted to express it.
>
> Best,
>
> Dawid
>
> On 31/03/2020 08:58, Jark Wu wrote:
>> Hi everyone,
>>
>> Thanks for the great feedbacks so far.
>>
>> I updated the FLIP documentation according to the discussion. Changes
>> include:
>> - remove "version" key, and merge it into "connector"
>> - add "scan", "lookup", "sink" prefix to some property keys if they are
>> only used in that case.
>> - add a "New Property Key" section to list all the previous property keys
>> and new property keys.
>>
>> We use "scan" and "lookup" instead of "source" prefix because we should
>> distinguish them and they aligns to FLIP-95 ScanTableSource and
>> LookupTableSource.
>> I also colored red for some major change of property keys in the FLIP. I
>> will list some of them here too:
>>
>> kafka:
>> connector.startup-mode => scan.startup.mode
>> connector.specific-offsets => scan.startup.specific-offsets
>> connector.startup-timestamp-millis => scan.startup.timestamp-millis
>> connector.sink-partitioner & connector.sink-partitioner-class =>
>> sink.partitioner
>>
>> elasticsearch:
>> connector.key-delimiter => document-id.key-delimiter              # make it
>> explicit that it is used for document id
>> connector.key-null-literal => document-id.key-null-literal          # and
>> it also can be used for es sources in the future
>> connector.bulk-flush.back-off.type => sink.bulk-flush.back-off.strategy
>>
>> jdbc:
>> connector.table => table-name
>>
>> Welcome further feedbacks!
>>
>> Best,
>> Jark
>>
>>
>> On Tue, 31 Mar 2020 at 14:45, Jark Wu <[hidden email]> wrote:
>>
>>> Hi Kurt,
>>>
>>> I also prefer "-" as version delimiter now. I didn't remove the "_"
>>> proposal by mistake, that's why I sent another email last night :)
>>> Regarding to "property-version", I also think we shouldn't let users to
>>> learn about this. And ConfigOption provides a good ability
>>> to support deprecated keys and auto-generate documentation for deprecated
>>> keys.
>>>
>>> Hi Danny,
>>>
>>> Regarding to “connector.properties.*”:
>>> In FLIP-95, the Factory#requiredOptions() and Factory#optionalOptions()
>>> inferfaces are only used for generation of documentation.
>>> It does not influence the discovery and validation of a factory. The
>>> validation logic is defined by connectors
>>> in createDynamicTableSource/Sink().
>>> So you don't have to provide an option for "connector.properties.*". But I
>>> think we should make ConfigOption support wildcard in the long term for a
>>> full story.
>>>
>>> I don't think we should inline all the "connector.properties.*",
>>> otherwise, it will be very tricky for users to configure the properties.
>>> Regarding to FLIP-113, I suggest to provide some ConfigOptions for
>>> commonly used kafka properties and put them in the supportedHintOptions(),
>>> e.g. "connector.properties.group.id",
>>> "connector.properties.fetch.min.bytes".
>>>
>>> Best,
>>> Jark
>>>
>>>
>>>
>>>
>>>
>>> On Tue, 31 Mar 2020 at 12:04, Danny Chan <[hidden email]> wrote:
>>>
>>>> Thanks Jark for bring up this discussion, +1 for this idea, I believe the
>>>> user has suffered from the verbose property key for long time.
>>>>
>>>> Just one question, how do we handle the keys with wildcard, such as the
>>>> “connector.properties.*” in Kafka connector which would then hand-over to
>>>> Kafka client directly. As what suggested in FLIP-95, we use a ConfigOption
>>>> to describe the “supported properties”, then I have to concerns:
>>>>
>>>> • For the new keys, do we still need to put multi-lines there the such
>>>> key, such as “connector.properties.abc” “connector.properties.def”, or
>>>> should we inline them, such as “some-key-prefix” = “k1=v1, k2=v2 ..."
>>>> • Should the ConfigOption support the wildcard ? (If we plan to support
>>>> the current multi-line style)
>>>>
>>>>
>>>> Best,
>>>> Danny Chan
>>>> 在 2020年3月31日 +0800 AM12:37,Jark Wu <[hidden email]>,写道:
>>>>> Hi all,
>>>>>
>>>>> Thanks for the feedbacks.
>>>>>
>>>>> It seems that we have a conclusion to put the version into the factory
>>>>> identifier. I'm also fine with this.
>>>>> If we have this outcome, the interface of Factory#factoryVersion is not
>>>>> needed anymore, this can simplify the learning cost of new factory.
>>>>> We may need to update FLIP-95 and re-vote for it? cc @Timo Walther
>>>>> <[hidden email]>
>>>>>
>>>>> kafka => kafka for 0.11+ versions, we don't suffix "-universal", because
>>>>> the meaning of "universal" not easy to understand.
>>>>> kafka-0.11 => kafka for 0.11 version
>>>>> kafka-0.10 => kafka for 0.10 version
>>>>> elasticsearch-6 => elasticsearch for 6.x versions
>>>>> elasticsearch-7 => elasticsearch for 7.x versions
>>>>> hbase-1.4 => hbase for 1.4.x versions
>>>>> jdbc
>>>>> filesystem
>>>>>
>>>>> We use "-" as the version delimiter to make them to be more consistent.
>>>>> This is not forces, users can also use other delimiters or without
>>>>> delimiter.
>>>>> But this can be a guilde in the Javadoc of Factory, to make the
>>>> connector
>>>>> ecosystem to be more consistent.
>>>>>
>>>>> What do you think?
>>>>>
>>>>> ------------------------
>>>>>
>>>>> Regarding "connector.property-version":
>>>>>
>>>>> Hi @Dawid Wysakowicz <[hidden email]> , the new fatories are
>>>>> designed not support to read current properties.
>>>>> All the current properties are routed to the old factories if they are
>>>>> using "connector.type". Otherwise, properties are routed to new
>>>> factories.
>>>>> If I understand correctly, the "connector.property-version" is attched
>>>>> implicitly by system, not manually set by users.
>>>>> For example, the framework should add "connector.property-version=1" to
>>>>> properties when processing DDL statement.
>>>>> I'm fine to add a "connector.property-version=1" when processing DDL
>>>>> statement, but I think it's also fine if we don't,
>>>>> because this can be done in the future if need and the default version
>>>> can
>>>>> be 1.
>>>>>
>>>>> Best,
>>>>> Jark
>>>>>
>>>>> On Tue, 31 Mar 2020 at 00:36, Jark Wu <[hidden email]> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> Thanks for the feedbacks.
>>>>>>
>>>>>> It seems that we have a conclusion to put the version into the factory
>>>>>> identifier. I'm also fine with this.
>>>>>> If we have this outcome, the interface of Factory#factoryVersion is
>>>> not
>>>>>> needed anymore, this can simplify the learning cost of new factory.
>>>>>> We may need to update FLIP-95 and re-vote for it? cc @Timo Walther
>>>>>> <[hidden email]>
>>>>>>
>>>>>> Btw, I would like to use "_" instead of "-" as the version delimiter,
>>>>>> because "-" looks like minus and may confuse users, e.g.
>>>> "elasticsearch-6".
>>>>>> This is not forced, but should be a guilde in the Javadoc of Factory.
>>>>>> I propose to use the following identifiers for existing connectors,
>>>>>>
>>>>>> kafka => kafka for 0.11+ versions, we don't suffix "-universal",
>>>> because
>>>>>> the meaning of "universal" not easy to understand.
>>>>>> kafka-0.11 => kafka for 0.11 version
>>>>>> kafka-0.10 => kafka for 0.10 version
>>>>>> elasticsearch-6 => elasticsearch for 6.x versions
>>>>>> elasticsearch-7 => elasticsearch for 7.x versions
>>>>>> hbase-1.4 => hbase for 1.4.x versions
>>>>>> jdbc
>>>>>> filesystem
>>>>>>
>>>>>> We use "-" as the version delimiter to make them to be more
>>>> consistent.
>>>>>> This is not forces, users can also use other delimiters or without
>>>>>> delimiter.
>>>>>> But this can be a guilde in the Javadoc of Factory, to make the
>>>> connector
>>>>>> ecosystem to be more consistent.
>>>>>>
>>>>>> What do you think?
>>>>>>
>>>>>> ------------------------
>>>>>>
>>>>>> Regarding "connector.property-version":
>>>>>>
>>>>>> Hi @Dawid Wysakowicz <[hidden email]> , the new fatories are
>>>>>> designed not support to read current properties.
>>>>>> All the current properties are routed to the old factories if they are
>>>>>> using "connector.type". Otherwise, properties are routed to new
>>>> factories.
>>>>>> If I understand correctly, the "connector.property-version" is attched
>>>>>> implicitly by system, not manually set by users.
>>>>>> For example, the framework should add "connector.property-version=1"
>>>> to
>>>>>> properties when processing DDL statement.
>>>>>> I'm fine to add a "connector.property-version=1" when processing DDL
>>>>>> statement, but I think it's also fine if we don't,
>>>>>> because this can be done in the future if need and the default
>>>> version can
>>>>>> be 1.
>>>>>>
>>>>>> Best,
>>>>>> Jark
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Mon, 30 Mar 2020 at 23:21, Dawid Wysakowicz <
>>>> [hidden email]>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>> I like the overall design of the FLIP.
>>>>>>>
>>>>>>> As for the withstanding concerns. I kind of like the approach to
>>>> put the
>>>>>>> version into the factory identifier. I think it's the cleanest way
>>>> to
>>>>>>> say that this version actually applies to the connector itself and
>>>> not
>>>>>>> to the system it connects to. BTW, I think the outcome of this
>>>>>>> discussion will affect interfaces described in FLIP-95. If we put
>>>> the
>>>>>>> version into the functionIdentifier, do we need
>>>> Factory#factoryVersion?
>>>>>>> I also think it does make sense to have a versioning for the
>>>> properties
>>>>>>> as well. Are we able to read all the current properties with the new
>>>>>>> factories? I think we could use the "connector.property-version" to
>>>>>>> alternate between different Factory interfaces to support the old
>>>> set of
>>>>>>> properties. Otherwise the new factories need to understand both set
>>>> of
>>>>>>> properties, don't they?
>>>>>>>
>>>>>>> Best,
>>>>>>>
>>>>>>> Dawid
>>>>>>>
>>>>>>> On 30/03/2020 17:07, Timo Walther wrote:
>>>>>>>> Hi Jark,
>>>>>>>>
>>>>>>>> thanks for the FLIP. I don't have a strong opinion on
>>>>>>>> DynamicTableFactory#factoryVersion() for distiguishing connector
>>>>>>>> versions. We can also include it in the factory identifier
>>>> itself. For
>>>>>>>> Kafka, I would then just use `kafka` because the naming
>>>> "universal"
>>>>>>>> was just a helper solution at that time.
>>>>>>>>
>>>>>>>> What we need is a good guide for how to design the options. We
>>>> should
>>>>>>>> include that in the JavaDoc of factory interface itself, once it
>>>> is
>>>>>>>> in-place. I like the difference between source and sink in
>>>> properties.
>>>>>>>> Regarding "connector.property-version":
>>>>>>>>
>>>>>>>> It was meant for backwards compatibility along the dimension of
>>>>>>>> "property design" compared to "connector version". However, since
>>>> the
>>>>>>>> connector is now responsible for parsing all properties, it is
>>>> not as
>>>>>>>> necessary as it was before.
>>>>>>>>
>>>>>>>> However, a change of the property design as it is done in FLIP-107
>>>>>>>> becomes more difficult without a property version and versioning
>>>> is
>>>>>>>> always a good idea in API world.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Timo
>>>>>>>>
>>>>>>>>
>>>>>>>> On 30.03.20 16:38, Benchao Li wrote:
>>>>>>>>> Hi Jark,
>>>>>>>>>
>>>>>>>>> Thanks for starting the discussion. The FLIP LGTM generally.
>>>>>>>>>
>>>>>>>>> Regarding connector version VS put version into connector's
>>>> name,
>>>>>>>>> I favor the latter personally, using one property to locate a
>>>>>>>>> connector can make the error log more precise.
>>>>>>>>>  From the users' side, using one property to match a connector
>>>> will
>>>>>>>>> be easier. Especially we have many connectors,
>>>>>>>>> and some of the need version property required, and some of
>>>> them not.
>>>>>>>>> Regarding Jingsong's suggestion,
>>>>>>>>> IMO, it's a very good complement to the FLIP. Distinguishing
>>>>>>>>> properties for source and sink can be very useful, and
>>>>>>>>> also this will make the connector property more precise.
>>>>>>>>> We are also sick of this for now, we cannot know whether a DDL
>>>> is a
>>>>>>>>> source or sink unless we look through all queries where
>>>>>>>>> the table is used.
>>>>>>>>> Even more, some of the required properties are only required for
>>>>>>>>> source, bug we cannot leave it blank for sink, and vice versa.
>>>>>>>>> I think we can also add a type for dimension tables except
>>>> source and
>>>>>>>>> sink.
>>>>>>>>>
>>>>>>>>> Kurt Young <[hidden email] <mailto:[hidden email]>>
>>>> 于2020年3月30日
>>>>>>>>> 周一 下午8:16写道:
>>>>>>>>>
>>>>>>>>>> It's not possible for the framework to throw such exception.
>>>>>>>>> Because the
>>>>>>>>> framework doesn't know what versions do the connector support.
>>>>>>>>>
>>>>>>>>> Not really, we can list all valid connectors framework could
>>>>>>>>> found. E.g.
>>>>>>>>> user mistyped 'kafka-0.x', the error message will looks like:
>>>>>>>>>
>>>>>>>>> we don't have any connector named "kafka-0.x", but we have:
>>>>>>>>> FileSystem
>>>>>>>>> Kafka-0.10
>>>>>>>>> Kafka-0.11
>>>>>>>>> ElasticSearch6
>>>>>>>>> ElasticSearch7
>>>>>>>>>
>>>>>>>>> Best,
>>>>>>>>> Kurt
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, Mar 30, 2020 at 5:11 PM Jark Wu <[hidden email]
>>>>>>>>> <mailto:[hidden email]>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Kurt,
>>>>>>>>>>
>>>>>>>>>>> 2) Lists all available connectors seems also quite
>>>>>>>>> straightforward, e.g
>>>>>>>>>> user provided a wrong "kafka-0.8", we tell user we have
>>>>>>>>> candidates of
>>>>>>>>>> "kakfa-0.11", "kafka-universal"
>>>>>>>>>> It's not possible for the framework to throw such exception.
>>>>>>>>> Because the
>>>>>>>>>> framework doesn't know what versions do the connector support.
>>>>>>>>> All the
>>>>>>>>>> version information is a blackbox in the identifier. But with
>>>>>>>>>> `Factory#factoryVersion()` interface, we can know all the
>>>>>>>>> supported
>>>>>>>>>> versions.
>>>>>>>>>>
>>>>>>>>>>> 3) I don't think so. We can still treat it as the same
>>>>>>>>> connector but with
>>>>>>>>>> different versions.
>>>>>>>>>> That's true but that's weird. Because from the plain DDL
>>>>>>>>> definition, they
>>>>>>>>>> look like different connectors with different "connector"
>>>>>>>>> value, e.g.
>>>>>>>>>> 'connector=kafka-0.8', 'connector=kafka-0.10'.
>>>>>>>>>>
>>>>>>>>>>> If users don't set any version, we will use
>>>> "kafka-universal"
>>>>>>>>> instead.
>>>>>>>>>> The behavior is inconsistent IMO.
>>>>>>>>>> That is a long term vision when there is no kafka clusters
>>>>>>>>> with <0.11
>>>>>>>>>> version.
>>>>>>>>>> At that point, "universal" is the only supported version in
>>>>>>> Flink
>>>>>>>>> and the
>>>>>>>>>> "version" key can be optional.
>>>>>>>>>>
>>>>>>>>>> ---------------------------------------------
>>>>>>>>>>
>>>>>>>>>> Hi Jingsong,
>>>>>>>>>>
>>>>>>>>>>> "version" vs "kafka.version"
>>>>>>>>>> I though about it. But if we prefix "kafka" to version, we
>>>>>>> should
>>>>>>>>> prefix
>>>>>>>>>> "kafka" for all other property keys, because they are all
>>>> kafka
>>>>>>>>> specific
>>>>>>>>>> options.
>>>>>>>>>> However, that will make the property set verbose, see rejected
>>>>>>>>> option#2 in
>>>>>>>>>> the FLIP.
>>>>>>>>>>
>>>>>>>>>>> explicitly separate options for source and sink
>>>>>>>>>> That's a good topic. It's good to have a guideline for the new
>>>>>>>>> property
>>>>>>>>>> keys.
>>>>>>>>>> I'm fine to prefix with a "source"/"sink" for some connector
>>>>>>>>> keys.
>>>>>>>>>> Actually, we already do this in some connectors, e.g. jdbc and
>>>>>>>>> hbase.
>>>>>>>>>> Best,
>>>>>>>>>> Jark
>>>>>>>>>>
>>>>>>>>>> On Mon, 30 Mar 2020 at 16:36, Jingsong Li <
>>>>>>> [hidden email]
>>>>>>>>> <mailto:[hidden email]>> wrote:
>>>>>>>>>>> Thanks Jark for the proposal.
>>>>>>>>>>>
>>>>>>>>>>> +1 to the general idea.
>>>>>>>>>>>
>>>>>>>>>>> For "version", what about "kafka.version"? It is obvious to
>>>>>>>>> know its
>>>>>>>>>>> meaning.
>>>>>>>>>>>
>>>>>>>>>>> And I'd like to start a new topic:
>>>>>>>>>>> Should we need to explicitly separate source from sink?
>>>>>>>>>>> With the development of batch and streaming, more and more
>>>>>>>>> connectors
>>>>>>>>>> have
>>>>>>>>>>> both source and sink.
>>>>>>>>>>>
>>>>>>>>>>> So should we set a rule for table properties:
>>>>>>>>>>> - properties for both source and sink: without prefix, like
>>>>>>>>> "topic"
>>>>>>>>>>> - properties for source only: with "source." prefix, like
>>>>>>>>>>> "source.startup-mode"
>>>>>>>>>>> - properties for sink only: with "sink." prefix, like
>>>>>>>>> "sink.partitioner"
>>>>>>>>>>> What do you think?
>>>>>>>>>>>
>>>>>>>>>>> Best,
>>>>>>>>>>> Jingsong Lee
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Mar 30, 2020 at 3:56 PM Jark Wu <[hidden email]
>>>>>>>>> <mailto:[hidden email]>> wrote:
>>>>>>>>>>>> Hi Kurt,
>>>>>>>>>>>>
>>>>>>>>>>>> That's good questions.
>>>>>>>>>>>>
>>>>>>>>>>>>> the meaning of "version"
>>>>>>>>>>>> There are two versions in the old design. One is property
>>>>>>>>> version
>>>>>>>>>>>> "connector.property-version" which can be used for
>>>> backward
>>>>>>>>>>> compatibility.
>>>>>>>>>>>> The other one is "connector.version" which defines the
>>>>>>>>> version of
>>>>>>>>>>> external
>>>>>>>>>>>> system, e.g. 0.11" for kafka, "6" or "7" for ES.
>>>>>>>>>>>> In this proposal, the "version" is the previous
>>>>>>>>> "connector.version".
>>>>>>>>>> The
>>>>>>>>>>>> ""connector.property-version" is not introduced in new
>>>>>>>>> design.
>>>>>>>>>>>>> how to keep the old capability which can evolve
>>>> connector
>>>>>>>>> properties
>>>>>>>>>>>> The "connector.property-version" is an optional key in the
>>>>>>>>> old design
>>>>>>>>>> and
>>>>>>>>>>>> is never bumped up.
>>>>>>>>>>>> I'm not sure how "connector.property-version" should work
>>>>>>>>> in the
>>>>>>>>>> initial
>>>>>>>>>>>> design. Maybe @Timo Walther <[hidden email]
>>>>>>>>> <mailto:[hidden email]>> has more knowledge on
>>>>>>>>>>>> this.
>>>>>>>>>>>> But for the new properties, every options should be
>>>>>>>>> expressed as
>>>>>>>>>>>> `ConfigOption` which provides `withDeprecatedKeys(...)`
>>>>>>>>> method to
>>>>>>>>>> easily
>>>>>>>>>>>> support evolving keys.
>>>>>>>>>>>>
>>>>>>>>>>>>> a single keys instead of two, e.g. "kafka-0.11",
>>>>>>>>> "kafka-universal"?
>>>>>>>>>>>> There are several benefit to use separate "version" key I
>>>>>>> can
>>>>>>>>> see:
>>>>>>>>>>>> 1) it's more readable to separete them into different
>>>> keys,
>>>>>>>>> because
>>>>>>>>>> they
>>>>>>>>>>>> are orthogonal concepts.
>>>>>>>>>>>> 2) the planner can give all the availble versions in the
>>>>>>>>> exception
>>>>>>>>>>> message,
>>>>>>>>>>>> if user uses a wrong version (this is often reported in
>>>>>>>>> user ML).
>>>>>>>>>>>> 3) If we use "kafka-0.11" as connector identifier, we may
>>>>>>>>> have to
>>>>>>>>>> write a
>>>>>>>>>>>> full documentation for each version, because they are
>>>>>>>>> different
>>>>>>>>>>>> "connector"?
>>>>>>>>>>>> IMO, for 0.11, 0.11, etc... kafka, they are actually
>>>>>>>>> the same
>>>>>>>>>>> connector
>>>>>>>>>>>> but with different "client jar" version,
>>>>>>>>>>>> they share all the same supported property keys and
>>>>>>>>> should reside
>>>>>>>>>>>> together.
>>>>>>>>>>>> 4) IMO, the future vision is version-free. At some point
>>>> in
>>>>>>>>> the future,
>>>>>>>>>>> we
>>>>>>>>>>>> may don't need users to specify kafka version anymore, and
>>>>>>>>> make
>>>>>>>>>>>> "version=universal" as optional or removed in the future.
>>>>>>>>> This is can
>>>>>>>>>> be
>>>>>>>>>>>> done easily if they are separate keys.
>>>>>>>>>>>>
>>>>>>>>>>>> Best,
>>>>>>>>>>>> Jark
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, 30 Mar 2020 at 14:45, Kurt Young <
>>>> [hidden email]
>>>>>>>>> <mailto:[hidden email]>> wrote:
>>>>>>>>>>>>> Hi Jark,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks for the proposal, I'm +1 to the general idea.
>>>>>>>>> However I have a
>>>>>>>>>>>>> question about "version",
>>>>>>>>>>>>> in the old design, the version seems to be aimed for
>>>>>>>>> tracking
>>>>>>>>>> property
>>>>>>>>>>>>> version, with different
>>>>>>>>>>>>> version, we could evolve these step by step without
>>>>>>>>> breaking backward
>>>>>>>>>>>>> compatibility. But in this
>>>>>>>>>>>>> design, version is representing external system's
>>>>>>>>> version, like
>>>>>>>>>> "0.11"
>>>>>>>>>>>> for
>>>>>>>>>>>>> kafka, "6" or "7" for ES.
>>>>>>>>>>>>> I'm not sure if this is necessary, what's the benefit of
>>>>>>>>> using two
>>>>>>>>>> keys
>>>>>>>>>>>>> instead of one, like "kafka-0.11"
>>>>>>>>>>>>> or "ES6" directly? And how about the old capability
>>>> which
>>>>>>>>> could let
>>>>>>>>>> us
>>>>>>>>>>>>> evolving connector properties?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Best,
>>>>>>>>>>>>> Kurt
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Mar 30, 2020 at 2:36 PM LakeShen
>>>>>>>>> <[hidden email] <mailto:[hidden email]>>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Jark,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I am really looking forward to this feature. I think
>>>>>>> this
>>>>>>>>> feature
>>>>>>>>>>>>>> could simplify flink sql code,and at the same time ,
>>>>>>>>>>>>>> it could make the developer more easlier to config the
>>>>>>>>> flink sql
>>>>>>>>>> WITH
>>>>>>>>>>>>>> options.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Now when I am using flink sql to write flink task ,
>>>>>>>>> sometimes I
>>>>>>>>>> think
>>>>>>>>>>>> the
>>>>>>>>>>>>>> WITH options is too long for user.
>>>>>>>>>>>>>> For example,I config the kafka source connector
>>>>>>>>> parameter,for
>>>>>>>>>>> consumer
>>>>>>>>>>>>>> group and brokers parameter:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 'connector.properties.0.key' = 'group.id
>>>>>>>>> <http://group.id>'
>>>>>>>>>>>>>>> , 'connector.properties.0.value' = 'xxx'
>>>>>>>>>>>>>>> , 'connector.properties.1.key' =
>>>>>>>>> 'bootstrap.servers'
>>>>>>>>>>>>>>> , 'connector.properties.1.value' = 'xxxxx'
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I can understand this config , but for the flink fresh
>>>>>>>>> man,maybe it
>>>>>>>>>>>>>> is confused for him.
>>>>>>>>>>>>>> In my thought, I am really looking forward to this
>>>>>>>>> feature,thank
>>>>>>>>>> you
>>>>>>>>>>> to
>>>>>>>>>>>>>> propose this feature.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Best wishes,
>>>>>>>>>>>>>> LakeShen
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Jark Wu <[hidden email] <mailto:[hidden email]>>
>>>> 于
>>>>>>>>> 2020年3月30日周一 下午2:02写道:
>>>>>>>>>>>>>>> Hi everyone,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I want to start a discussion about further improve
>>>> and
>>>>>>>>> simplify
>>>>>>>>>> our
>>>>>>>>>>>>>> current
>>>>>>>>>>>>>>> connector porperty keys, aka WITH options.
>>>> Currently,
>>>>>>>>> we have a
>>>>>>>>>>>>>>> 'connector.' prefix for many properties, but they
>>>> are
>>>>>>>>> verbose,
>>>>>>>>>> and
>>>>>>>>>>> we
>>>>>>>>>>>>>> see a
>>>>>>>>>>>>>>> big inconsistency between the properties when
>>>>>>> designing
>>>>>>>>> FLIP-107.
>>>>>>>>>>>>>>> So we propose to remove all the 'connector.' prefix
>>>>>>> and
>>>>>>>>> rename
>>>>>>>>>>>>>>> 'connector.type' to 'connector', 'format.type' to
>>>>>>>>> 'format'. So a
>>>>>>>>>>> new
>>>>>>>>>>>>>> Kafka
>>>>>>>>>>>>>>> DDL may look like this:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> CREATE TABLE kafka_table (
>>>>>>>>>>>>>>> ...
>>>>>>>>>>>>>>> ) WITH (
>>>>>>>>>>>>>>> 'connector' = 'kafka',
>>>>>>>>>>>>>>> 'version' = '0.10',
>>>>>>>>>>>>>>> 'topic' = 'test-topic',
>>>>>>>>>>>>>>> 'startup-mode' = 'earliest-offset',
>>>>>>>>>>>>>>> 'properties.bootstrap.servers' = 'localhost:9092',
>>>>>>>>>>>>>>> 'properties.group.id <http://properties.group.id>'
>>>> =
>>>>>>>>> 'testGroup',
>>>>>>>>>>>>>>> 'format' = 'json',
>>>>>>>>>>>>>>> 'format.fail-on-missing-field' = 'false'
>>>>>>>>>>>>>>> );
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The new connector property key set will come
>>>> together
>>>>>>>>> with new
>>>>>>>>>>>> Factory
>>>>>>>>>>>>>>> inferface which is proposed in FLIP-95. Old
>>>> properties
>>>>>>>>> are still
>>>>>>>>>>>>>> compatible
>>>>>>>>>>>>>>> with their existing implementation. New properties
>>>>>>>>> are only
>>>>>>>>>>> available
>>>>>>>>>>>>> in
>>>>>>>>>>>>>>> new DynamicTableFactory implementations.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> You can access the detailed FLIP here:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>
>>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-122%3A+New+Connector+Property+Keys+for+New+Factory
>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>> Jark
>>>>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Best, Jingsong Lee
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>>
>>>>>>>>> Benchao Li
>>>>>>>>> School of Electronics Engineering and Computer Science, Peking
>>>>>>>>> University
>>>>>>>>> Tel:+86-15650713730
>>>>>>>>> Email:[hidden email]
>>>>>>>>> <mailto:[hidden email]>;[hidden email]
>>>>>>>>> <mailto:[hidden email]>
>>>>>>>>>
>

12