[DISCUSS] Should max/min be part of the hierarchy of config option?

classic Classic list List threaded Threaded
8 messages Options
Reply | Threaded
Open this post in threaded view
|

[DISCUSS] Should max/min be part of the hierarchy of config option?

Yangze Guo
Hi, everyone,

I'm working on FLINK-16605 Add max limitation to the total number of
slots[1]. In the PR, I, Gary and Xintong has a discussion[2] about the
config option of this limit.
The central question is whether the "max" should be part of the
hierarchy or part of the property itself.

It means there could be two patterns:
- max-xyz
- xyz.max

Currently, there is no clear consensus on which style is better and we
could find both patterns in the current Flink. Here, I'd like to first
sort out[3]:

Config options follow the "max-xyz" pattern:
- restart-strategy.failure-rate.max-failures-per-interval
- yarn.maximum-failed-containers
- state.backend.rocksdb.compaction.level.max-size-level-base
- cluster.registration.max-timeout
- high-availability.zookeeper.client.max-retry-attempts
- rest.client.max-content-length
- rest.retry.max-attempts
- rest.server.max-content-length
- jobstore.max-capacity
- taskmanager.registration.max-backoff
- compiler.delimited-informat.max-line-samples
- compiler.delimited-informat.min-line-samples
- compiler.delimited-informat.max-sample-len
- taskmanager.runtime.max-fan
- pipeline.max-parallelism
- execution.checkpointing.max-concurrent-checkpoint
- execution.checkpointing.min-pause

Config options follow the "xyz.max" pattern:
- taskmanager.memory.jvm-overhead.max
- taskmanager.memory.jvm-overhead.min
- taskmanager.memory.network.max
- taskmanager.memory.network.min
- taskmanager.network.request-backoff.max
- env.log.max

Config options do not follow the above two patterns:
- akka.client-socket-worker-pool.pool-size-max
- akka.client-socket-worker-pool.pool-size-min
- akka.fork-join-executor.parallelism-max
- akka.fork-join-executor.parallelism-min
- akka.server-socket-worker-pool.pool-size-max
- akka.server-socket-worker-pool.pool-size-min
- containerized.heap-cutoff-min
- blob.offload.minsize

It seems more config options follow the "max-xyz" pattern. From my
side, I do not have a strong preference. But it probably make sense to
follow one of them in Flink.
If we decide to make it a naming convention and align all config
options to it, I prefer to follow the "max-xyz" pattern to minimize
the infect to user.

Looking forward to your feedback!

[1] https://issues.apache.org/jira/browse/FLINK-16605
[2] https://github.com/apache/flink/pull/11615#discussion_r412316648
[3] https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/config.html

Best,
Yangze Guo
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] Should max/min be part of the hierarchy of config option?

Robert Metzger
Thanks for starting this discussion.
I believe the different options are a lot about personal taste, there are
no objective arguments why one option is better than the other.

I agree with your proposal to simply go with the "max-xyz" pattern, as this
is the style of the majority of the current configuration options in Flink
(maybe this also means it is the taste of the majority of Flink
developers?).

I would propose to add a test or some tooling that checks that all new
configuration parameters follow this pattern, as well as tickets for Flink
2.0 to migrate the "wrong" configuration options.



On Wed, Apr 22, 2020 at 5:47 AM Yangze Guo <[hidden email]> wrote:

> Hi, everyone,
>
> I'm working on FLINK-16605 Add max limitation to the total number of
> slots[1]. In the PR, I, Gary and Xintong has a discussion[2] about the
> config option of this limit.
> The central question is whether the "max" should be part of the
> hierarchy or part of the property itself.
>
> It means there could be two patterns:
> - max-xyz
> - xyz.max
>
> Currently, there is no clear consensus on which style is better and we
> could find both patterns in the current Flink. Here, I'd like to first
> sort out[3]:
>
> Config options follow the "max-xyz" pattern:
> - restart-strategy.failure-rate.max-failures-per-interval
> - yarn.maximum-failed-containers
> - state.backend.rocksdb.compaction.level.max-size-level-base
> - cluster.registration.max-timeout
> - high-availability.zookeeper.client.max-retry-attempts
> - rest.client.max-content-length
> - rest.retry.max-attempts
> - rest.server.max-content-length
> - jobstore.max-capacity
> - taskmanager.registration.max-backoff
> - compiler.delimited-informat.max-line-samples
> - compiler.delimited-informat.min-line-samples
> - compiler.delimited-informat.max-sample-len
> - taskmanager.runtime.max-fan
> - pipeline.max-parallelism
> - execution.checkpointing.max-concurrent-checkpoint
> - execution.checkpointing.min-pause
>
> Config options follow the "xyz.max" pattern:
> - taskmanager.memory.jvm-overhead.max
> - taskmanager.memory.jvm-overhead.min
> - taskmanager.memory.network.max
> - taskmanager.memory.network.min
> - taskmanager.network.request-backoff.max
> - env.log.max
>
> Config options do not follow the above two patterns:
> - akka.client-socket-worker-pool.pool-size-max
> - akka.client-socket-worker-pool.pool-size-min
> - akka.fork-join-executor.parallelism-max
> - akka.fork-join-executor.parallelism-min
> - akka.server-socket-worker-pool.pool-size-max
> - akka.server-socket-worker-pool.pool-size-min
> - containerized.heap-cutoff-min
> - blob.offload.minsize
>
> It seems more config options follow the "max-xyz" pattern. From my
> side, I do not have a strong preference. But it probably make sense to
> follow one of them in Flink.
> If we decide to make it a naming convention and align all config
> options to it, I prefer to follow the "max-xyz" pattern to minimize
> the infect to user.
>
> Looking forward to your feedback!
>
> [1] https://issues.apache.org/jira/browse/FLINK-16605
> [2] https://github.com/apache/flink/pull/11615#discussion_r412316648
> [3]
> https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/config.html
>
> Best,
> Yangze Guo
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] Should max/min be part of the hierarchy of config option?

Till Rohrmann
Hi everyone,

as Robert said I think the problem is that we don't have strict guidelines
and every committer follows his/her personal taste. I'm actually not sure
whether we can define bullet-proof guidelines but we can definitely
make them more concrete.

In this case here, I have to admit that I have an opposing opinion/taste. I
think it would be better to use xyz.min and xyz.max. The reason is that we
configure a property of xyz which consists of the minimum and maximum
value. Differently said, min and max belong semantically together and hence
should be defined together. You can think of it as if the type of the xyz
config option would be a tuple of two integers instead of two individual
integers.

A comment concerning the existing styles of config options: I think many of
the config options which follow the max-xzy pattern are actually older
configuration options.

Cheers,
Till

On Mon, Apr 27, 2020 at 10:34 AM Robert Metzger <[hidden email]> wrote:

> Thanks for starting this discussion.
> I believe the different options are a lot about personal taste, there are
> no objective arguments why one option is better than the other.
>
> I agree with your proposal to simply go with the "max-xyz" pattern, as this
> is the style of the majority of the current configuration options in Flink
> (maybe this also means it is the taste of the majority of Flink
> developers?).
>
> I would propose to add a test or some tooling that checks that all new
> configuration parameters follow this pattern, as well as tickets for Flink
> 2.0 to migrate the "wrong" configuration options.
>
>
>
> On Wed, Apr 22, 2020 at 5:47 AM Yangze Guo <[hidden email]> wrote:
>
> > Hi, everyone,
> >
> > I'm working on FLINK-16605 Add max limitation to the total number of
> > slots[1]. In the PR, I, Gary and Xintong has a discussion[2] about the
> > config option of this limit.
> > The central question is whether the "max" should be part of the
> > hierarchy or part of the property itself.
> >
> > It means there could be two patterns:
> > - max-xyz
> > - xyz.max
> >
> > Currently, there is no clear consensus on which style is better and we
> > could find both patterns in the current Flink. Here, I'd like to first
> > sort out[3]:
> >
> > Config options follow the "max-xyz" pattern:
> > - restart-strategy.failure-rate.max-failures-per-interval
> > - yarn.maximum-failed-containers
> > - state.backend.rocksdb.compaction.level.max-size-level-base
> > - cluster.registration.max-timeout
> > - high-availability.zookeeper.client.max-retry-attempts
> > - rest.client.max-content-length
> > - rest.retry.max-attempts
> > - rest.server.max-content-length
> > - jobstore.max-capacity
> > - taskmanager.registration.max-backoff
> > - compiler.delimited-informat.max-line-samples
> > - compiler.delimited-informat.min-line-samples
> > - compiler.delimited-informat.max-sample-len
> > - taskmanager.runtime.max-fan
> > - pipeline.max-parallelism
> > - execution.checkpointing.max-concurrent-checkpoint
> > - execution.checkpointing.min-pause
> >
> > Config options follow the "xyz.max" pattern:
> > - taskmanager.memory.jvm-overhead.max
> > - taskmanager.memory.jvm-overhead.min
> > - taskmanager.memory.network.max
> > - taskmanager.memory.network.min
> > - taskmanager.network.request-backoff.max
> > - env.log.max
> >
> > Config options do not follow the above two patterns:
> > - akka.client-socket-worker-pool.pool-size-max
> > - akka.client-socket-worker-pool.pool-size-min
> > - akka.fork-join-executor.parallelism-max
> > - akka.fork-join-executor.parallelism-min
> > - akka.server-socket-worker-pool.pool-size-max
> > - akka.server-socket-worker-pool.pool-size-min
> > - containerized.heap-cutoff-min
> > - blob.offload.minsize
> >
> > It seems more config options follow the "max-xyz" pattern. From my
> > side, I do not have a strong preference. But it probably make sense to
> > follow one of them in Flink.
> > If we decide to make it a naming convention and align all config
> > options to it, I prefer to follow the "max-xyz" pattern to minimize
> > the infect to user.
> >
> > Looking forward to your feedback!
> >
> > [1] https://issues.apache.org/jira/browse/FLINK-16605
> > [2] https://github.com/apache/flink/pull/11615#discussion_r412316648
> > [3]
> >
> https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/config.html
> >
> > Best,
> > Yangze Guo
> >
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] Should max/min be part of the hierarchy of config option?

Xintong Song
+1 for Robert's idea about adding tests/tools checking the pattern of new
configuration options, and migrate the old ones in release 2.0.

Concerning the preferred pattern, I personally agree with Till's opinion. I
think 'xyz.[min|max]' somehow expresses that 'min' and 'max' are properties
of 'xyz', while 'xyz' may also have other properties. An example could be
'taskmanager.memory.network.[min|max|fraction]'.

Thank you~

Xintong Song



On Mon, Apr 27, 2020 at 6:00 PM Till Rohrmann <[hidden email]> wrote:

> Hi everyone,
>
> as Robert said I think the problem is that we don't have strict guidelines
> and every committer follows his/her personal taste. I'm actually not sure
> whether we can define bullet-proof guidelines but we can definitely
> make them more concrete.
>
> In this case here, I have to admit that I have an opposing opinion/taste. I
> think it would be better to use xyz.min and xyz.max. The reason is that we
> configure a property of xyz which consists of the minimum and maximum
> value. Differently said, min and max belong semantically together and hence
> should be defined together. You can think of it as if the type of the xyz
> config option would be a tuple of two integers instead of two individual
> integers.
>
> A comment concerning the existing styles of config options: I think many of
> the config options which follow the max-xzy pattern are actually older
> configuration options.
>
> Cheers,
> Till
>
> On Mon, Apr 27, 2020 at 10:34 AM Robert Metzger <[hidden email]>
> wrote:
>
> > Thanks for starting this discussion.
> > I believe the different options are a lot about personal taste, there are
> > no objective arguments why one option is better than the other.
> >
> > I agree with your proposal to simply go with the "max-xyz" pattern, as
> this
> > is the style of the majority of the current configuration options in
> Flink
> > (maybe this also means it is the taste of the majority of Flink
> > developers?).
> >
> > I would propose to add a test or some tooling that checks that all new
> > configuration parameters follow this pattern, as well as tickets for
> Flink
> > 2.0 to migrate the "wrong" configuration options.
> >
> >
> >
> > On Wed, Apr 22, 2020 at 5:47 AM Yangze Guo <[hidden email]> wrote:
> >
> > > Hi, everyone,
> > >
> > > I'm working on FLINK-16605 Add max limitation to the total number of
> > > slots[1]. In the PR, I, Gary and Xintong has a discussion[2] about the
> > > config option of this limit.
> > > The central question is whether the "max" should be part of the
> > > hierarchy or part of the property itself.
> > >
> > > It means there could be two patterns:
> > > - max-xyz
> > > - xyz.max
> > >
> > > Currently, there is no clear consensus on which style is better and we
> > > could find both patterns in the current Flink. Here, I'd like to first
> > > sort out[3]:
> > >
> > > Config options follow the "max-xyz" pattern:
> > > - restart-strategy.failure-rate.max-failures-per-interval
> > > - yarn.maximum-failed-containers
> > > - state.backend.rocksdb.compaction.level.max-size-level-base
> > > - cluster.registration.max-timeout
> > > - high-availability.zookeeper.client.max-retry-attempts
> > > - rest.client.max-content-length
> > > - rest.retry.max-attempts
> > > - rest.server.max-content-length
> > > - jobstore.max-capacity
> > > - taskmanager.registration.max-backoff
> > > - compiler.delimited-informat.max-line-samples
> > > - compiler.delimited-informat.min-line-samples
> > > - compiler.delimited-informat.max-sample-len
> > > - taskmanager.runtime.max-fan
> > > - pipeline.max-parallelism
> > > - execution.checkpointing.max-concurrent-checkpoint
> > > - execution.checkpointing.min-pause
> > >
> > > Config options follow the "xyz.max" pattern:
> > > - taskmanager.memory.jvm-overhead.max
> > > - taskmanager.memory.jvm-overhead.min
> > > - taskmanager.memory.network.max
> > > - taskmanager.memory.network.min
> > > - taskmanager.network.request-backoff.max
> > > - env.log.max
> > >
> > > Config options do not follow the above two patterns:
> > > - akka.client-socket-worker-pool.pool-size-max
> > > - akka.client-socket-worker-pool.pool-size-min
> > > - akka.fork-join-executor.parallelism-max
> > > - akka.fork-join-executor.parallelism-min
> > > - akka.server-socket-worker-pool.pool-size-max
> > > - akka.server-socket-worker-pool.pool-size-min
> > > - containerized.heap-cutoff-min
> > > - blob.offload.minsize
> > >
> > > It seems more config options follow the "max-xyz" pattern. From my
> > > side, I do not have a strong preference. But it probably make sense to
> > > follow one of them in Flink.
> > > If we decide to make it a naming convention and align all config
> > > options to it, I prefer to follow the "max-xyz" pattern to minimize
> > > the infect to user.
> > >
> > > Looking forward to your feedback!
> > >
> > > [1] https://issues.apache.org/jira/browse/FLINK-16605
> > > [2] https://github.com/apache/flink/pull/11615#discussion_r412316648
> > > [3]
> > >
> >
> https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/config.html
> > >
> > > Best,
> > > Yangze Guo
> > >
> >
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] Should max/min be part of the hierarchy of config option?

Chesnay Schepler-3
+1 for xyz.[min|max]; imo it becomes obvious if think of it like a yaml
file:

xyz:
     min:
     max:

opposed to

min-xyz:
max-xyz:

IIRC this would also be more in-line with the hierarchical scheme for
config options we decided on months ago.

On 27/04/2020 13:25, Xintong Song wrote:

> +1 for Robert's idea about adding tests/tools checking the pattern of new
> configuration options, and migrate the old ones in release 2.0.
>
> Concerning the preferred pattern, I personally agree with Till's opinion. I
> think 'xyz.[min|max]' somehow expresses that 'min' and 'max' are properties
> of 'xyz', while 'xyz' may also have other properties. An example could be
> 'taskmanager.memory.network.[min|max|fraction]'.
>
> Thank you~
>
> Xintong Song
>
>
>
> On Mon, Apr 27, 2020 at 6:00 PM Till Rohrmann <[hidden email]> wrote:
>
>> Hi everyone,
>>
>> as Robert said I think the problem is that we don't have strict guidelines
>> and every committer follows his/her personal taste. I'm actually not sure
>> whether we can define bullet-proof guidelines but we can definitely
>> make them more concrete.
>>
>> In this case here, I have to admit that I have an opposing opinion/taste. I
>> think it would be better to use xyz.min and xyz.max. The reason is that we
>> configure a property of xyz which consists of the minimum and maximum
>> value. Differently said, min and max belong semantically together and hence
>> should be defined together. You can think of it as if the type of the xyz
>> config option would be a tuple of two integers instead of two individual
>> integers.
>>
>> A comment concerning the existing styles of config options: I think many of
>> the config options which follow the max-xzy pattern are actually older
>> configuration options.
>>
>> Cheers,
>> Till
>>
>> On Mon, Apr 27, 2020 at 10:34 AM Robert Metzger <[hidden email]>
>> wrote:
>>
>>> Thanks for starting this discussion.
>>> I believe the different options are a lot about personal taste, there are
>>> no objective arguments why one option is better than the other.
>>>
>>> I agree with your proposal to simply go with the "max-xyz" pattern, as
>> this
>>> is the style of the majority of the current configuration options in
>> Flink
>>> (maybe this also means it is the taste of the majority of Flink
>>> developers?).
>>>
>>> I would propose to add a test or some tooling that checks that all new
>>> configuration parameters follow this pattern, as well as tickets for
>> Flink
>>> 2.0 to migrate the "wrong" configuration options.
>>>
>>>
>>>
>>> On Wed, Apr 22, 2020 at 5:47 AM Yangze Guo <[hidden email]> wrote:
>>>
>>>> Hi, everyone,
>>>>
>>>> I'm working on FLINK-16605 Add max limitation to the total number of
>>>> slots[1]. In the PR, I, Gary and Xintong has a discussion[2] about the
>>>> config option of this limit.
>>>> The central question is whether the "max" should be part of the
>>>> hierarchy or part of the property itself.
>>>>
>>>> It means there could be two patterns:
>>>> - max-xyz
>>>> - xyz.max
>>>>
>>>> Currently, there is no clear consensus on which style is better and we
>>>> could find both patterns in the current Flink. Here, I'd like to first
>>>> sort out[3]:
>>>>
>>>> Config options follow the "max-xyz" pattern:
>>>> - restart-strategy.failure-rate.max-failures-per-interval
>>>> - yarn.maximum-failed-containers
>>>> - state.backend.rocksdb.compaction.level.max-size-level-base
>>>> - cluster.registration.max-timeout
>>>> - high-availability.zookeeper.client.max-retry-attempts
>>>> - rest.client.max-content-length
>>>> - rest.retry.max-attempts
>>>> - rest.server.max-content-length
>>>> - jobstore.max-capacity
>>>> - taskmanager.registration.max-backoff
>>>> - compiler.delimited-informat.max-line-samples
>>>> - compiler.delimited-informat.min-line-samples
>>>> - compiler.delimited-informat.max-sample-len
>>>> - taskmanager.runtime.max-fan
>>>> - pipeline.max-parallelism
>>>> - execution.checkpointing.max-concurrent-checkpoint
>>>> - execution.checkpointing.min-pause
>>>>
>>>> Config options follow the "xyz.max" pattern:
>>>> - taskmanager.memory.jvm-overhead.max
>>>> - taskmanager.memory.jvm-overhead.min
>>>> - taskmanager.memory.network.max
>>>> - taskmanager.memory.network.min
>>>> - taskmanager.network.request-backoff.max
>>>> - env.log.max
>>>>
>>>> Config options do not follow the above two patterns:
>>>> - akka.client-socket-worker-pool.pool-size-max
>>>> - akka.client-socket-worker-pool.pool-size-min
>>>> - akka.fork-join-executor.parallelism-max
>>>> - akka.fork-join-executor.parallelism-min
>>>> - akka.server-socket-worker-pool.pool-size-max
>>>> - akka.server-socket-worker-pool.pool-size-min
>>>> - containerized.heap-cutoff-min
>>>> - blob.offload.minsize
>>>>
>>>> It seems more config options follow the "max-xyz" pattern. From my
>>>> side, I do not have a strong preference. But it probably make sense to
>>>> follow one of them in Flink.
>>>> If we decide to make it a naming convention and align all config
>>>> options to it, I prefer to follow the "max-xyz" pattern to minimize
>>>> the infect to user.
>>>>
>>>> Looking forward to your feedback!
>>>>
>>>> [1] https://issues.apache.org/jira/browse/FLINK-16605
>>>> [2] https://github.com/apache/flink/pull/11615#discussion_r412316648
>>>> [3]
>>>>
>> https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/config.html
>>>> Best,
>>>> Yangze Guo
>>>>

Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] Should max/min be part of the hierarchy of config option?

Flavio Pompermaier
+1 for Chesnay approach

On Mon, Apr 27, 2020 at 2:31 PM Chesnay Schepler <[hidden email]> wrote:

> +1 for xyz.[min|max]; imo it becomes obvious if think of it like a yaml
> file:
>
> xyz:
>      min:
>      max:
>
> opposed to
>
> min-xyz:
> max-xyz:
>
> IIRC this would also be more in-line with the hierarchical scheme for
> config options we decided on months ago.
>
> On 27/04/2020 13:25, Xintong Song wrote:
> > +1 for Robert's idea about adding tests/tools checking the pattern of new
> > configuration options, and migrate the old ones in release 2.0.
> >
> > Concerning the preferred pattern, I personally agree with Till's
> opinion. I
> > think 'xyz.[min|max]' somehow expresses that 'min' and 'max' are
> properties
> > of 'xyz', while 'xyz' may also have other properties. An example could be
> > 'taskmanager.memory.network.[min|max|fraction]'.
> >
> > Thank you~
> >
> > Xintong Song
> >
> >
> >
> > On Mon, Apr 27, 2020 at 6:00 PM Till Rohrmann <[hidden email]>
> wrote:
> >
> >> Hi everyone,
> >>
> >> as Robert said I think the problem is that we don't have strict
> guidelines
> >> and every committer follows his/her personal taste. I'm actually not
> sure
> >> whether we can define bullet-proof guidelines but we can definitely
> >> make them more concrete.
> >>
> >> In this case here, I have to admit that I have an opposing
> opinion/taste. I
> >> think it would be better to use xyz.min and xyz.max. The reason is that
> we
> >> configure a property of xyz which consists of the minimum and maximum
> >> value. Differently said, min and max belong semantically together and
> hence
> >> should be defined together. You can think of it as if the type of the
> xyz
> >> config option would be a tuple of two integers instead of two individual
> >> integers.
> >>
> >> A comment concerning the existing styles of config options: I think
> many of
> >> the config options which follow the max-xzy pattern are actually older
> >> configuration options.
> >>
> >> Cheers,
> >> Till
> >>
> >> On Mon, Apr 27, 2020 at 10:34 AM Robert Metzger <[hidden email]>
> >> wrote:
> >>
> >>> Thanks for starting this discussion.
> >>> I believe the different options are a lot about personal taste, there
> are
> >>> no objective arguments why one option is better than the other.
> >>>
> >>> I agree with your proposal to simply go with the "max-xyz" pattern, as
> >> this
> >>> is the style of the majority of the current configuration options in
> >> Flink
> >>> (maybe this also means it is the taste of the majority of Flink
> >>> developers?).
> >>>
> >>> I would propose to add a test or some tooling that checks that all new
> >>> configuration parameters follow this pattern, as well as tickets for
> >> Flink
> >>> 2.0 to migrate the "wrong" configuration options.
> >>>
> >>>
> >>>
> >>> On Wed, Apr 22, 2020 at 5:47 AM Yangze Guo <[hidden email]> wrote:
> >>>
> >>>> Hi, everyone,
> >>>>
> >>>> I'm working on FLINK-16605 Add max limitation to the total number of
> >>>> slots[1]. In the PR, I, Gary and Xintong has a discussion[2] about the
> >>>> config option of this limit.
> >>>> The central question is whether the "max" should be part of the
> >>>> hierarchy or part of the property itself.
> >>>>
> >>>> It means there could be two patterns:
> >>>> - max-xyz
> >>>> - xyz.max
> >>>>
> >>>> Currently, there is no clear consensus on which style is better and we
> >>>> could find both patterns in the current Flink. Here, I'd like to first
> >>>> sort out[3]:
> >>>>
> >>>> Config options follow the "max-xyz" pattern:
> >>>> - restart-strategy.failure-rate.max-failures-per-interval
> >>>> - yarn.maximum-failed-containers
> >>>> - state.backend.rocksdb.compaction.level.max-size-level-base
> >>>> - cluster.registration.max-timeout
> >>>> - high-availability.zookeeper.client.max-retry-attempts
> >>>> - rest.client.max-content-length
> >>>> - rest.retry.max-attempts
> >>>> - rest.server.max-content-length
> >>>> - jobstore.max-capacity
> >>>> - taskmanager.registration.max-backoff
> >>>> - compiler.delimited-informat.max-line-samples
> >>>> - compiler.delimited-informat.min-line-samples
> >>>> - compiler.delimited-informat.max-sample-len
> >>>> - taskmanager.runtime.max-fan
> >>>> - pipeline.max-parallelism
> >>>> - execution.checkpointing.max-concurrent-checkpoint
> >>>> - execution.checkpointing.min-pause
> >>>>
> >>>> Config options follow the "xyz.max" pattern:
> >>>> - taskmanager.memory.jvm-overhead.max
> >>>> - taskmanager.memory.jvm-overhead.min
> >>>> - taskmanager.memory.network.max
> >>>> - taskmanager.memory.network.min
> >>>> - taskmanager.network.request-backoff.max
> >>>> - env.log.max
> >>>>
> >>>> Config options do not follow the above two patterns:
> >>>> - akka.client-socket-worker-pool.pool-size-max
> >>>> - akka.client-socket-worker-pool.pool-size-min
> >>>> - akka.fork-join-executor.parallelism-max
> >>>> - akka.fork-join-executor.parallelism-min
> >>>> - akka.server-socket-worker-pool.pool-size-max
> >>>> - akka.server-socket-worker-pool.pool-size-min
> >>>> - containerized.heap-cutoff-min
> >>>> - blob.offload.minsize
> >>>>
> >>>> It seems more config options follow the "max-xyz" pattern. From my
> >>>> side, I do not have a strong preference. But it probably make sense to
> >>>> follow one of them in Flink.
> >>>> If we decide to make it a naming convention and align all config
> >>>> options to it, I prefer to follow the "max-xyz" pattern to minimize
> >>>> the infect to user.
> >>>>
> >>>> Looking forward to your feedback!
> >>>>
> >>>> [1] https://issues.apache.org/jira/browse/FLINK-16605
> >>>> [2] https://github.com/apache/flink/pull/11615#discussion_r412316648
> >>>> [3]
> >>>>
> >>
> https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/config.html
> >>>> Best,
> >>>> Yangze Guo
> >>>>
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] Should max/min be part of the hierarchy of config option?

Jark Wu-2
+1 for xyz.[min|max]

This is already mentioned in the Code Style Guideline [1].

Best,
Jark


[1]:
https://flink.apache.org/contributing/code-style-and-quality-components.html#configuration-changes

On Mon, 27 Apr 2020 at 21:33, Flavio Pompermaier <[hidden email]>
wrote:

> +1 for Chesnay approach
>
> On Mon, Apr 27, 2020 at 2:31 PM Chesnay Schepler <[hidden email]>
> wrote:
>
> > +1 for xyz.[min|max]; imo it becomes obvious if think of it like a yaml
> > file:
> >
> > xyz:
> >      min:
> >      max:
> >
> > opposed to
> >
> > min-xyz:
> > max-xyz:
> >
> > IIRC this would also be more in-line with the hierarchical scheme for
> > config options we decided on months ago.
> >
> > On 27/04/2020 13:25, Xintong Song wrote:
> > > +1 for Robert's idea about adding tests/tools checking the pattern of
> new
> > > configuration options, and migrate the old ones in release 2.0.
> > >
> > > Concerning the preferred pattern, I personally agree with Till's
> > opinion. I
> > > think 'xyz.[min|max]' somehow expresses that 'min' and 'max' are
> > properties
> > > of 'xyz', while 'xyz' may also have other properties. An example could
> be
> > > 'taskmanager.memory.network.[min|max|fraction]'.
> > >
> > > Thank you~
> > >
> > > Xintong Song
> > >
> > >
> > >
> > > On Mon, Apr 27, 2020 at 6:00 PM Till Rohrmann <[hidden email]>
> > wrote:
> > >
> > >> Hi everyone,
> > >>
> > >> as Robert said I think the problem is that we don't have strict
> > guidelines
> > >> and every committer follows his/her personal taste. I'm actually not
> > sure
> > >> whether we can define bullet-proof guidelines but we can definitely
> > >> make them more concrete.
> > >>
> > >> In this case here, I have to admit that I have an opposing
> > opinion/taste. I
> > >> think it would be better to use xyz.min and xyz.max. The reason is
> that
> > we
> > >> configure a property of xyz which consists of the minimum and maximum
> > >> value. Differently said, min and max belong semantically together and
> > hence
> > >> should be defined together. You can think of it as if the type of the
> > xyz
> > >> config option would be a tuple of two integers instead of two
> individual
> > >> integers.
> > >>
> > >> A comment concerning the existing styles of config options: I think
> > many of
> > >> the config options which follow the max-xzy pattern are actually older
> > >> configuration options.
> > >>
> > >> Cheers,
> > >> Till
> > >>
> > >> On Mon, Apr 27, 2020 at 10:34 AM Robert Metzger <[hidden email]>
> > >> wrote:
> > >>
> > >>> Thanks for starting this discussion.
> > >>> I believe the different options are a lot about personal taste, there
> > are
> > >>> no objective arguments why one option is better than the other.
> > >>>
> > >>> I agree with your proposal to simply go with the "max-xyz" pattern,
> as
> > >> this
> > >>> is the style of the majority of the current configuration options in
> > >> Flink
> > >>> (maybe this also means it is the taste of the majority of Flink
> > >>> developers?).
> > >>>
> > >>> I would propose to add a test or some tooling that checks that all
> new
> > >>> configuration parameters follow this pattern, as well as tickets for
> > >> Flink
> > >>> 2.0 to migrate the "wrong" configuration options.
> > >>>
> > >>>
> > >>>
> > >>> On Wed, Apr 22, 2020 at 5:47 AM Yangze Guo <[hidden email]>
> wrote:
> > >>>
> > >>>> Hi, everyone,
> > >>>>
> > >>>> I'm working on FLINK-16605 Add max limitation to the total number of
> > >>>> slots[1]. In the PR, I, Gary and Xintong has a discussion[2] about
> the
> > >>>> config option of this limit.
> > >>>> The central question is whether the "max" should be part of the
> > >>>> hierarchy or part of the property itself.
> > >>>>
> > >>>> It means there could be two patterns:
> > >>>> - max-xyz
> > >>>> - xyz.max
> > >>>>
> > >>>> Currently, there is no clear consensus on which style is better and
> we
> > >>>> could find both patterns in the current Flink. Here, I'd like to
> first
> > >>>> sort out[3]:
> > >>>>
> > >>>> Config options follow the "max-xyz" pattern:
> > >>>> - restart-strategy.failure-rate.max-failures-per-interval
> > >>>> - yarn.maximum-failed-containers
> > >>>> - state.backend.rocksdb.compaction.level.max-size-level-base
> > >>>> - cluster.registration.max-timeout
> > >>>> - high-availability.zookeeper.client.max-retry-attempts
> > >>>> - rest.client.max-content-length
> > >>>> - rest.retry.max-attempts
> > >>>> - rest.server.max-content-length
> > >>>> - jobstore.max-capacity
> > >>>> - taskmanager.registration.max-backoff
> > >>>> - compiler.delimited-informat.max-line-samples
> > >>>> - compiler.delimited-informat.min-line-samples
> > >>>> - compiler.delimited-informat.max-sample-len
> > >>>> - taskmanager.runtime.max-fan
> > >>>> - pipeline.max-parallelism
> > >>>> - execution.checkpointing.max-concurrent-checkpoint
> > >>>> - execution.checkpointing.min-pause
> > >>>>
> > >>>> Config options follow the "xyz.max" pattern:
> > >>>> - taskmanager.memory.jvm-overhead.max
> > >>>> - taskmanager.memory.jvm-overhead.min
> > >>>> - taskmanager.memory.network.max
> > >>>> - taskmanager.memory.network.min
> > >>>> - taskmanager.network.request-backoff.max
> > >>>> - env.log.max
> > >>>>
> > >>>> Config options do not follow the above two patterns:
> > >>>> - akka.client-socket-worker-pool.pool-size-max
> > >>>> - akka.client-socket-worker-pool.pool-size-min
> > >>>> - akka.fork-join-executor.parallelism-max
> > >>>> - akka.fork-join-executor.parallelism-min
> > >>>> - akka.server-socket-worker-pool.pool-size-max
> > >>>> - akka.server-socket-worker-pool.pool-size-min
> > >>>> - containerized.heap-cutoff-min
> > >>>> - blob.offload.minsize
> > >>>>
> > >>>> It seems more config options follow the "max-xyz" pattern. From my
> > >>>> side, I do not have a strong preference. But it probably make sense
> to
> > >>>> follow one of them in Flink.
> > >>>> If we decide to make it a naming convention and align all config
> > >>>> options to it, I prefer to follow the "max-xyz" pattern to minimize
> > >>>> the infect to user.
> > >>>>
> > >>>> Looking forward to your feedback!
> > >>>>
> > >>>> [1] https://issues.apache.org/jira/browse/FLINK-16605
> > >>>> [2]
> https://github.com/apache/flink/pull/11615#discussion_r412316648
> > >>>> [3]
> > >>>>
> > >>
> >
> https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/config.html
> > >>>> Best,
> > >>>> Yangze Guo
> > >>>>
> >
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] Should max/min be part of the hierarchy of config option?

Yangze Guo
Thank you all for the feedback.

It seems we reach a consensus:
- The naming convention would better to be xyz.[min|max].
- Adding tests/tools checking the pattern of new configuration options
- Tickets for Flink 2.0 to migrate the "wrong" configuration options.

If there is no objection in the next 72 hours, I'd like to open JIRA
tickets addressing the above consensus.

Best,
Yangze Guo

On Mon, Apr 27, 2020 at 9:52 PM Jark Wu <[hidden email]> wrote:

>
> +1 for xyz.[min|max]
>
> This is already mentioned in the Code Style Guideline [1].
>
> Best,
> Jark
>
>
> [1]:
> https://flink.apache.org/contributing/code-style-and-quality-components.html#configuration-changes
>
> On Mon, 27 Apr 2020 at 21:33, Flavio Pompermaier <[hidden email]>
> wrote:
>
> > +1 for Chesnay approach
> >
> > On Mon, Apr 27, 2020 at 2:31 PM Chesnay Schepler <[hidden email]>
> > wrote:
> >
> > > +1 for xyz.[min|max]; imo it becomes obvious if think of it like a yaml
> > > file:
> > >
> > > xyz:
> > >      min:
> > >      max:
> > >
> > > opposed to
> > >
> > > min-xyz:
> > > max-xyz:
> > >
> > > IIRC this would also be more in-line with the hierarchical scheme for
> > > config options we decided on months ago.
> > >
> > > On 27/04/2020 13:25, Xintong Song wrote:
> > > > +1 for Robert's idea about adding tests/tools checking the pattern of
> > new
> > > > configuration options, and migrate the old ones in release 2.0.
> > > >
> > > > Concerning the preferred pattern, I personally agree with Till's
> > > opinion. I
> > > > think 'xyz.[min|max]' somehow expresses that 'min' and 'max' are
> > > properties
> > > > of 'xyz', while 'xyz' may also have other properties. An example could
> > be
> > > > 'taskmanager.memory.network.[min|max|fraction]'.
> > > >
> > > > Thank you~
> > > >
> > > > Xintong Song
> > > >
> > > >
> > > >
> > > > On Mon, Apr 27, 2020 at 6:00 PM Till Rohrmann <[hidden email]>
> > > wrote:
> > > >
> > > >> Hi everyone,
> > > >>
> > > >> as Robert said I think the problem is that we don't have strict
> > > guidelines
> > > >> and every committer follows his/her personal taste. I'm actually not
> > > sure
> > > >> whether we can define bullet-proof guidelines but we can definitely
> > > >> make them more concrete.
> > > >>
> > > >> In this case here, I have to admit that I have an opposing
> > > opinion/taste. I
> > > >> think it would be better to use xyz.min and xyz.max. The reason is
> > that
> > > we
> > > >> configure a property of xyz which consists of the minimum and maximum
> > > >> value. Differently said, min and max belong semantically together and
> > > hence
> > > >> should be defined together. You can think of it as if the type of the
> > > xyz
> > > >> config option would be a tuple of two integers instead of two
> > individual
> > > >> integers.
> > > >>
> > > >> A comment concerning the existing styles of config options: I think
> > > many of
> > > >> the config options which follow the max-xzy pattern are actually older
> > > >> configuration options.
> > > >>
> > > >> Cheers,
> > > >> Till
> > > >>
> > > >> On Mon, Apr 27, 2020 at 10:34 AM Robert Metzger <[hidden email]>
> > > >> wrote:
> > > >>
> > > >>> Thanks for starting this discussion.
> > > >>> I believe the different options are a lot about personal taste, there
> > > are
> > > >>> no objective arguments why one option is better than the other.
> > > >>>
> > > >>> I agree with your proposal to simply go with the "max-xyz" pattern,
> > as
> > > >> this
> > > >>> is the style of the majority of the current configuration options in
> > > >> Flink
> > > >>> (maybe this also means it is the taste of the majority of Flink
> > > >>> developers?).
> > > >>>
> > > >>> I would propose to add a test or some tooling that checks that all
> > new
> > > >>> configuration parameters follow this pattern, as well as tickets for
> > > >> Flink
> > > >>> 2.0 to migrate the "wrong" configuration options.
> > > >>>
> > > >>>
> > > >>>
> > > >>> On Wed, Apr 22, 2020 at 5:47 AM Yangze Guo <[hidden email]>
> > wrote:
> > > >>>
> > > >>>> Hi, everyone,
> > > >>>>
> > > >>>> I'm working on FLINK-16605 Add max limitation to the total number of
> > > >>>> slots[1]. In the PR, I, Gary and Xintong has a discussion[2] about
> > the
> > > >>>> config option of this limit.
> > > >>>> The central question is whether the "max" should be part of the
> > > >>>> hierarchy or part of the property itself.
> > > >>>>
> > > >>>> It means there could be two patterns:
> > > >>>> - max-xyz
> > > >>>> - xyz.max
> > > >>>>
> > > >>>> Currently, there is no clear consensus on which style is better and
> > we
> > > >>>> could find both patterns in the current Flink. Here, I'd like to
> > first
> > > >>>> sort out[3]:
> > > >>>>
> > > >>>> Config options follow the "max-xyz" pattern:
> > > >>>> - restart-strategy.failure-rate.max-failures-per-interval
> > > >>>> - yarn.maximum-failed-containers
> > > >>>> - state.backend.rocksdb.compaction.level.max-size-level-base
> > > >>>> - cluster.registration.max-timeout
> > > >>>> - high-availability.zookeeper.client.max-retry-attempts
> > > >>>> - rest.client.max-content-length
> > > >>>> - rest.retry.max-attempts
> > > >>>> - rest.server.max-content-length
> > > >>>> - jobstore.max-capacity
> > > >>>> - taskmanager.registration.max-backoff
> > > >>>> - compiler.delimited-informat.max-line-samples
> > > >>>> - compiler.delimited-informat.min-line-samples
> > > >>>> - compiler.delimited-informat.max-sample-len
> > > >>>> - taskmanager.runtime.max-fan
> > > >>>> - pipeline.max-parallelism
> > > >>>> - execution.checkpointing.max-concurrent-checkpoint
> > > >>>> - execution.checkpointing.min-pause
> > > >>>>
> > > >>>> Config options follow the "xyz.max" pattern:
> > > >>>> - taskmanager.memory.jvm-overhead.max
> > > >>>> - taskmanager.memory.jvm-overhead.min
> > > >>>> - taskmanager.memory.network.max
> > > >>>> - taskmanager.memory.network.min
> > > >>>> - taskmanager.network.request-backoff.max
> > > >>>> - env.log.max
> > > >>>>
> > > >>>> Config options do not follow the above two patterns:
> > > >>>> - akka.client-socket-worker-pool.pool-size-max
> > > >>>> - akka.client-socket-worker-pool.pool-size-min
> > > >>>> - akka.fork-join-executor.parallelism-max
> > > >>>> - akka.fork-join-executor.parallelism-min
> > > >>>> - akka.server-socket-worker-pool.pool-size-max
> > > >>>> - akka.server-socket-worker-pool.pool-size-min
> > > >>>> - containerized.heap-cutoff-min
> > > >>>> - blob.offload.minsize
> > > >>>>
> > > >>>> It seems more config options follow the "max-xyz" pattern. From my
> > > >>>> side, I do not have a strong preference. But it probably make sense
> > to
> > > >>>> follow one of them in Flink.
> > > >>>> If we decide to make it a naming convention and align all config
> > > >>>> options to it, I prefer to follow the "max-xyz" pattern to minimize
> > > >>>> the infect to user.
> > > >>>>
> > > >>>> Looking forward to your feedback!
> > > >>>>
> > > >>>> [1] https://issues.apache.org/jira/browse/FLINK-16605
> > > >>>> [2]
> > https://github.com/apache/flink/pull/11615#discussion_r412316648
> > > >>>> [3]
> > > >>>>
> > > >>
> > >
> > https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/config.html
> > > >>>> Best,
> > > >>>> Yangze Guo
> > > >>>>
> > >
> >