[DISCUSS] FLIP-59: Enable execution configuration from Configuration object

dwysakowicz

Hi,

I wanted to propose a new, additional way of configuring execution parameters that can currently be set only on objects such as ExecutionConfig, CheckpointConfig and StreamExecutionEnvironment. This poses problems such as:

  • there is no easy way to configure them from a file
  • there is no easy way to pass a configuration from layers built on top of StreamExecutionEnvironment (e.g. when we want to configure those options from TableEnvironment)
  • they are not automatically documented

Note that there are a few concepts from FLIP-54[1] that this FLIP is based on.

I would be really grateful to know whether you think this would be a valuable addition, and for any other feedback.

Best,

Dawid

Wiki page: https://cwiki.apache.org/confluence/display/FLINK/FLIP-59%3A+Enable+execution+configuration+from+Configuration+object

Google doc: https://docs.google.com/document/d/1l8jW2NjhwHH1mVPbLvFolnL2vNvf4buUMDZWMfN_hFM/edit?usp=sharing


[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-54%3A+Evolve+ConfigOption+and+Configuration




Re: [DISCUSS] FLIP-59: Enable execution configuration from Configuration object

Gyula Fóra
Hi!

Huuuge +1 from me, this has been an operational pain for years.
This would also introduce a nice and simple way to extend it in the future
if we need.

Ship it!

Gyula

On Thu, Aug 29, 2019 at 5:05 PM Dawid Wysakowicz <[hidden email]>
wrote:

> Hi,
>
> I wanted to propose a new, additional way of configuring execution
> parameters that can currently be set only on objects such as
> ExecutionConfig, CheckpointConfig and StreamExecutionEnvironment. This
> poses problems such as:
>
>    - there is no easy way to configure them from a file
>    - there is no easy way to pass a configuration from layers built on
>    top of StreamExecutionEnvironment (e.g. when we want to configure those
>    options from TableEnvironment)
>    - they are not automatically documented
>
> Note that there are a few concepts from FLIP-54[1] that this FLIP is based
> on.
>
> I would be really grateful to know whether you think this would be a
> valuable addition, and for any other feedback.
>
> Best,
>
> Dawid
>
> Wiki page:
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-59%3A+Enable+execution+configuration+from+Configuration+object
>
> Google doc:
> https://docs.google.com/document/d/1l8jW2NjhwHH1mVPbLvFolnL2vNvf4buUMDZWMfN_hFM/edit?usp=sharing
>
>
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-54%3A+Evolve+ConfigOption+and+Configuration
>
>
>

Re: [DISCUSS] FLIP-59: Enable execution configuration from Configuration object

Gyula Fóra
What we could also add to make this a bit more generic and extensible is to
create some interfaces for reconfiguring the StreamExecutionEnvironment,
ExecutionConfig, etc., and let users specify a class that implements the
reconfiguration logic based on the Flink configuration.

This could be executed after the default behaviour that you outlined in the
FLIP.

What do you think?

Gyula

On Thu, Aug 29, 2019 at 7:21 PM Gyula Fóra <[hidden email]> wrote:

> Hi!
>
> Huuuge +1 from me, this has been an operational pain for years.
> This would also introduce a nice and simple way to extend it in the future
> if we need.
>
> Ship it!
>
> Gyula

Re: [DISCUSS] FLIP-59: Enable execution configuration from Configuration object

dwysakowicz
In reply to this post by Gyula Fóra
Hi Gyula,

Thank you for the support on those changes.

I am not sure if I understood your idea for the "reconfiguration" logic.

The configure method on those objects would take a ConfigurationReader, so
a user can provide a thin wrapper around Configuration, e.g. for filtering
certain options or changing values based on other parameters. Is that
what you had in mind?
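
A minimal sketch of such a thin wrapper, under stated assumptions: `ReaderSketch` is a hypothetical stand-in for the ConfigurationReader concept from FLIP-54 (the real interface may look different), `ExecutionSettingsSketch` stands in for an object with a configure method, and the key `execution.parallelism` is made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for ConfigurationReader; not Flink's actual interface.
interface ReaderSketch {
    Optional<String> get(String key);
}

// Map-backed reader, playing the role of a plain Configuration.
final class MapReader implements ReaderSketch {
    private final Map<String, String> values;

    MapReader(Map<String, String> values) {
        this.values = new HashMap<>(values);
    }

    @Override
    public Optional<String> get(String key) {
        return Optional.ofNullable(values.get(key));
    }
}

// Thin wrapper: only keys under the allowed prefix are visible to the caller.
final class PrefixFilteringReader implements ReaderSketch {
    private final ReaderSketch delegate;
    private final String allowedPrefix;

    PrefixFilteringReader(ReaderSketch delegate, String allowedPrefix) {
        this.delegate = delegate;
        this.allowedPrefix = allowedPrefix;
    }

    @Override
    public Optional<String> get(String key) {
        return key.startsWith(allowedPrefix) ? delegate.get(key) : Optional.empty();
    }
}

// Hypothetical config object whose configure method takes a reader.
final class ExecutionSettingsSketch {
    static final String PARALLELISM_KEY = "execution.parallelism"; // made-up key

    private int parallelism = 1; // default kept when the key is absent or filtered out

    void configure(ReaderSketch reader) {
        reader.get(PARALLELISM_KEY).map(Integer::parseInt).ifPresent(p -> parallelism = p);
    }

    int getParallelism() {
        return parallelism;
    }
}

final class ThinWrapperDemo {
    public static void main(String[] args) {
        Map<String, String> raw = new HashMap<>();
        raw.put("execution.parallelism", "4");

        ExecutionSettingsSketch settings = new ExecutionSettingsSketch();
        settings.configure(new PrefixFilteringReader(new MapReader(raw), "execution."));
        System.out.println("parallelism = " + settings.getParallelism());
    }
}
```

The object being configured only ever sees the reader, so the filtering (or any value rewriting) stays entirely in the wrapper.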

Best,

Dawid

On 29/08/2019 19:21, Gyula Fóra wrote:

> Hi!
>
> Huuuge +1 from me, this has been an operational pain for years.
> This would also introduce a nice and simple way to extend it in the future
> if we need.
>
> Ship it!
>
> Gyula

Re: [DISCUSS] FLIP-59: Enable execution configuration from Configuration object

Gyula Fóra
Hi Dawid,

Sorry, I misread one of the interfaces a little (Configuration instead of
ConfigurationReader); you are right.
I was referring to:


   - void StreamExecutionEnvironment.configure(ConfigurationReader)


This might be slightly orthogonal to the changes that you made here but
what I meant is that instead of adding methods to the
StreamExecutionEnvironment we could make this an external interface:

interface EnvironmentConfigurer {
  void configure(StreamExecutionEnvironment env, ConfigurationReader reader);
}

We could then have a default implementation of the EnvironmentConfigurer
that would understand built-in options. We could also allow users to pass
custom implementations of this, which could configure the
StreamExecutionEnvironment based on user-defined config options. This is
just a rough idea for extensibility and probably out of scope at first.
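
A rough sketch of the external-configurer idea, assuming hypothetical stand-ins throughout: `EnvironmentSketch` replaces StreamExecutionEnvironment, `ConfigReader` replaces ConfigurationReader, and all option keys are invented for illustration. The default configurer handles the built-in options and a user-supplied one runs afterwards.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for ConfigurationReader.
interface ConfigReader {
    Optional<String> get(String key);
}

// Hypothetical stand-in for StreamExecutionEnvironment with a few settable fields.
final class EnvironmentSketch {
    int parallelism = 1;
    long checkpointIntervalMs = -1L;
    String jobName = "unnamed";
}

// The external interface from the message above, with parameter names added.
interface EnvironmentConfigurer {
    void configure(EnvironmentSketch env, ConfigReader reader);
}

// Default implementation that understands the built-in options (keys are made up).
final class DefaultConfigurer implements EnvironmentConfigurer {
    @Override
    public void configure(EnvironmentSketch env, ConfigReader reader) {
        reader.get("execution.parallelism")
                .ifPresent(v -> env.parallelism = Integer.parseInt(v));
        reader.get("execution.checkpointing.interval-ms")
                .ifPresent(v -> env.checkpointIntervalMs = Long.parseLong(v));
    }
}

// User-defined implementation driven by a user-defined option.
final class JobNameConfigurer implements EnvironmentConfigurer {
    @Override
    public void configure(EnvironmentSketch env, ConfigReader reader) {
        reader.get("mycompany.job-name").ifPresent(v -> env.jobName = v);
    }
}

final class ConfigurerChain {
    // Applies the configurers in order, so later ones can override earlier ones.
    static EnvironmentSketch apply(ConfigReader reader, List<EnvironmentConfigurer> configurers) {
        EnvironmentSketch env = new EnvironmentSketch();
        for (EnvironmentConfigurer configurer : configurers) {
            configurer.configure(env, reader);
        }
        return env;
    }

    public static void main(String[] args) {
        Map<String, String> raw = new HashMap<>();
        raw.put("execution.parallelism", "8");
        raw.put("mycompany.job-name", "etl");

        EnvironmentSketch env = apply(
                key -> Optional.ofNullable(raw.get(key)),
                Arrays.asList(new DefaultConfigurer(), new JobNameConfigurer()));
        System.out.println(env.jobName + " runs with parallelism " + env.parallelism);
    }
}
```

The trade-off is that the knowledge of which fields exist now lives outside the environment class, so the configurer has to be kept in sync by hand.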

Cheers,
Gyula

On Fri, Aug 30, 2019 at 12:13 PM Dawid Wysakowicz <[hidden email]>
wrote:

> Hi Gyula,
>
> Thank you for the support on those changes.
>
> I am not sure if I understood your idea for the "reconfiguration" logic.
>
> The configure method on those objects would take a ConfigurationReader, so
> a user can provide a thin wrapper around Configuration, e.g. for filtering
> certain options or changing values based on other parameters. Is that
> what you had in mind?
>
> Best,
>
> Dawid

Re: [DISCUSS] FLIP-59: Enable execution configuration from Configuration object

dwysakowicz
Hi Gyula,

Yes, you are right, we were also considering the external configurer. The
reason we suggest the built-in method is that it is more tightly coupled
with the place where the options are actually set. Therefore our hope is that
whenever somebody e.g. adds new fields to the ExecutionConfig, they also
update the configure method. I am not entirely against your
suggestion though, if this is the preferred way in the community.

Does anyone have any comments regarding the option keys?

Best,

Dawid

On 30/08/2019 14:57, Gyula Fóra wrote:

> Hi Dawid,
>
> Sorry, I misread one of the interfaces a little (Configuration instead of
> ConfigurationReader); you are right.
> I was referring to:
>
>
>    - void StreamExecutionEnvironment.configure(ConfigurationReader)
>
>
> This might be slightly orthogonal to the changes that you made here but
> what I meant is that instead of adding methods to the
> StreamExecutionEnvironment we could make this an external interface:
>
> interface EnvironmentConfigurer {
>   void configure(StreamExecutionEnvironment env, ConfigurationReader reader);
> }
>
> We could then have a default implementation of the EnvironmentConfigurer
> that would understand built-in options. We could also allow users to pass
> custom implementations of this, which could configure the
> StreamExecutionEnvironment based on user-defined config options. This is
> just a rough idea for extensibility and probably out of scope at first.
>
> Cheers,
> Gyula

Re: [DISCUSS] FLIP-59: Enable execution configuration from Configuration object

Kostas Kloudas-5
Hi all,

Thanks for opening the discussion!

I like the idea, so +1 from my side and actually this is aligned with
our intentions for the FLIP-73 effort.

For the naming convention of the parameters introduced in the FLIP, my
proposal would be to have the full word "execution" instead of the
shorter "exec".
The reason for this is that in the context of FLIP-73, we are also
planning to introduce some new configuration parameters and the
convention we are currently using is the following:

pipeline.***: for job parameters that will not change between
executions of the same job, e.g. the jar location
executor.***: for parameters relevant to the instantiation of the
correct executor, e.g. YARN, detached, etc.
execution.***: for parameters that are relevant to a specific
execution of a given pipeline, e.g. parallelism or savepoint settings
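
To make the prefix convention concrete, here is a small sketch that groups option keys by their first dot-separated segment. The helper class `OptionGroups` and every key passed to it are hypothetical; none of them are taken from the FLIP.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class OptionGroups {
    // Groups keys by the text before the first '.', e.g. "pipeline", "executor", "execution".
    // Keys without a dot form their own group.
    static Map<String, List<String>> groupByPrefix(List<String> keys) {
        Map<String, List<String>> groups = new TreeMap<>();
        for (String key : keys) {
            int dot = key.indexOf('.');
            String prefix = dot < 0 ? key : key.substring(0, dot);
            groups.computeIfAbsent(prefix, p -> new ArrayList<>()).add(key);
        }
        return groups;
    }

    public static void main(String[] args) {
        // Illustrative keys only.
        List<String> keys = Arrays.asList(
                "pipeline.jars",
                "executor.target",
                "execution.parallelism",
                "execution.savepoint.path");
        groupByPrefix(keys).forEach((prefix, members) ->
                System.out.println(prefix + " -> " + members));
    }
}
```

Because the map is sorted, the "execution" and "executor" groups end up next to each other in the output, which is where a reader scanning the list has to look closely.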

I understand that sometimes the boundaries may not be that clear for a
parameter but I hope this will not be relevant to most of the
parameters.

I will also open a FLIP with some additional parameters, but until then
this is the scheme that we are planning to follow.

Cheers,
Kostas



On Mon, Sep 2, 2019 at 9:26 AM Dawid Wysakowicz <[hidden email]> wrote:

>
> Hi Gyula,
>
> Yes, you are right, we were also considering the external configurer. The
> reason we suggest the built-in method is that it is more tightly coupled
> with the place where the options are actually set. Therefore our hope is that
> whenever somebody e.g. adds new fields to the ExecutionConfig, they also
> update the configure method. I am not entirely against your
> suggestion though, if this is the preferred way in the community.
>
> Does anyone have any comments regarding the option keys?
>
> Best,
>
> Dawid

Re: [DISCUSS] FLIP-59: Enable execution configuration from Configuration object

Timo Walther-2
Hi Kostas,

Can we still discuss the naming of the properties? For me, having
"execution" and "executor" as prefixes might be confusing in the future
and difficult to tell apart when you scan through a list of properties.

How about `deployment` and `execution`? Or `deployer` and `exec`?

Regards,
Timo

On 16.10.19 16:31, Kostas Kloudas wrote:

> Hi all,
>
> Thanks for opening the discussion!
>
> I like the idea, so +1 from my side and actually this is aligned with
> our intentions for the FLIP-73 effort.
>
> For the naming convention of the parameters introduced in the FLIP, my
> proposal would be to have the full word "execution" instead of the
> shorter "exec".
> The reason for this is that in the context of FLIP-73, we are also
> planning to introduce some new configuration parameters and the
> convention we are currently using is the following:
>
> pipeline.***: for job parameters that will not change between
> executions of the same job, e.g. the jar location
> executor.***: for parameters relevant to the instantiation of the
> correct executor, e.g. YARN, detached, etc.
> execution.***: for parameters that are relevant to a specific
> execution of a given pipeline, e.g. parallelism or savepoint settings
>
> I understand that sometimes the boundaries may not be that clear for a
> parameter but I hope this will not be relevant to most of the
> parameters.
>
> I will also open a FLIP with some additional parameters, but until then
> this is the scheme that we are planning to follow.
>
> Cheers,
> Kostas

Re: [DISCUSS] FLIP-59: Enable execution configuration from Configuration object

Kostas Kloudas-4
Hi Timo,

I agree that distinguishing between "executor" and "execution" when
scanning through a configuration file can be difficult. These names
were mainly influenced by the fact that FLIP-73 introduced the
"Executor".
In addition, I agree that "deployment" or "deploy" sound like good
alternatives. Between the two, I would go with "deployment" (although
I like "deploy" more, as it is more imperative) for the simple
reason that we do not use verbs anywhere else (I think) in config
options.

As for "exec" vs. "execution", personally I like the longer
version as it is clearer.

So, to summarise, I would vote for "deployment", "execution", and
"pipeline" for job invariants, like the jars.
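
The resulting three-way scheme could be captured in a small enum, sketched here under the assumption that every option key starts with one of the agreed prefixes followed by a dot. `OptionScope` and the sample keys are hypothetical names used only for illustration.

```java
import java.util.Optional;

// Hypothetical enum for the three prefixes proposed above.
enum OptionScope {
    PIPELINE("pipeline"),     // job invariants, like the jars
    DEPLOYMENT("deployment"), // replaces the earlier "executor" prefix
    EXECUTION("execution");   // settings of one specific execution

    private final String prefix;

    OptionScope(String prefix) {
        this.prefix = prefix;
    }

    String prefix() {
        return prefix;
    }

    // Returns the scope whose "<prefix>." the key starts with, or empty if none matches.
    static Optional<OptionScope> of(String key) {
        for (OptionScope scope : values()) {
            if (key.startsWith(scope.prefix + ".")) {
                return Optional.of(scope);
            }
        }
        return Optional.empty();
    }
}

final class OptionScopeDemo {
    public static void main(String[] args) {
        // Illustrative keys only.
        for (String key : new String[] {"deployment.target", "execution.parallelism", "executor.target"}) {
            System.out.println(key + " -> " + OptionScope.of(key));
        }
    }
}
```

With this scheme a key under the dropped "executor" prefix no longer maps to any scope, so a leftover key of that kind would be easy to detect.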

What do you think?

Cheers,
Kostas

On Wed, Oct 16, 2019 at 5:28 PM Timo Walther <[hidden email]> wrote:

>
> Hi Kostas,
>
> Can we still discuss the naming of the properties? For me, having
> "execution" and "executor" as prefixes might be confusing in the future
> and difficult to tell apart when you scan through a list of properties.
>
> How about `deployment` and `execution`? Or `deployer` and `exec`?
>
> Regards,
> Timo

Re: [DISCUSS] FLIP-59: Enable execution configuration from Configuration object

Timo Walther-2
Sounds good to me.

Thanks,

Timo


On 17.10.19 09:30, Kostas Kloudas wrote:

> Hi Timo,
>
> I agree that distinguishing between "executor" and "execution" when
> scanning through a configuration file can be difficult. These names
> were mainly influenced by the fact that FLIP-73 introduced the
> "Executor".
> In addition, I agree that "deployment" or "deploy" sound like good
> alternatives. Between the two, I would go with "deployment" (although
> I like "deploy" more, as it is more imperative) for the simple
> reason that we do not use verbs anywhere else (I think) in config
> options.
>
> As for "exec" vs. "execution", personally I like the longer
> version as it is clearer.
>
> So, to summarise, I would vote for "deployment", "execution", and
> "pipeline" for job invariants, like the jars.
>
> What do you think?
>
> Cheers,
> Kostas
>
> On Wed, Oct 16, 2019 at 5:28 PM Timo Walther <[hidden email]> wrote:
>> Hi Kostas,
>>
>> can we still discuss the naming of the properties? For me, having
>> "execution" and "executor" as prefixes might be confusing in the future
>> and difficult to identify if you scan through a list of properties.
>>
>> How about `deployment` and `execution`? Or `deployer` and `exec`?
>>
>> Regards,
>> Timo
>>
>> On 16.10.19 16:31, Kostas Kloudas wrote:
>>> Hi all,
>>>
>>> Thanks for opening the discussion!
>>>
>>> I like the idea, so +1 from my side and actually this is aligned with
>>> our intentions for the FLIP-73 effort.
>>>
>>> For the naming convention of the parameters introduced in the FLIP, my
>>> proposal would be to have the full word "execution" instead of the
>>> shorter "exec".
>>> The reason for this is that in the context of FLIP-73, we are also
>>> planning to introduce some new configuration parameters and the
>>> convention we
>>> are currently using is the following:
>>>
>>> pipeline.***: for job parameters that will not change between
>>> executions of the same job, e.g. the jar location
>>> executor.***: for parameters relevant to the instantiation of the
>>> correct executor, e.g. YARN, detached, etc
>>> execution.***: for parameters that are relevant to a specific
>>> execution of a given pipeline, e.g. parallelism or savepoint settings
>>>
>>> I understand that sometimes the boundaries may not be that clear for a
>>> parameter but I hope this will not be relevant to most of the
>>> parameters.
>>>
>>> I will also open a FLIP with some additional parameters but until then,
>>> this is the scheme that we are planning to follow.
>>>
>>> Cheers,
>>> Kostas
>>>
>>>
>>>
>>> On Mon, Sep 2, 2019 at 9:26 AM Dawid Wysakowicz <[hidden email]> wrote:
>>>> Hi Gyula,
>>>>
>>>> Yes you are right, we were also considering the external configurer. The
>>>> reason we suggest the built in method is that it is more tightly coupled
>>>> with the place the options are actually set. Therefore our hope is that,
>>>> whenever somebody e.g. adds new fields to the ExecutionConfig he/she
>>>> also updates the configure method. I am not entirely against your
>>>> suggestion though, if this is the preferred way in the community.
>>>>
>>>> Does anyone have any comments regarding the option keys?
>>>>
>>>> Best,
>>>>
>>>> Dawid
>>>>
>>>> On 30/08/2019 14:57, Gyula Fóra wrote:
>>>>> Hi Dawid,
>>>>>
>>>>> Sorry I misread one of the interfaces a little (Configuration instead of
>>>>> ConfigurationReader), you are right.
>>>>> I was referring to:
>>>>>
>>>>>
>>>>>      -
>>>>>
>>>>>      void StreamExecutionEnvironment.configure(ConfigurationReader)
>>>>>
>>>>>
>>>>> This might be slightly orthogonal to the changes that you made here but
>>>>> what I meant is that instead of adding methods to the
>>>>> StreamExecutionEnvironment we could make this an external interface:
>>>>>
>>>>> interface EnvironmentConfigurer {
>>>>>     void configure(StreamExecutionEnvironment env, ConfigurationReader reader);
>>>>> }
>>>>>
>>>>> We could then have a default implementation of the EnvironmentConfigurer
>>>>> that would understand built in options.  We could also allow users to pass
>>>>> custom implementations of this, which could configure the
>>>>> StreamExecutionEnvironment based on user defined config options. This is
>>>>> just a rough idea for extensibility and probably out of scope at first.
>>>>>
>>>>> Cheers,
>>>>> Gyula
>>>>>
>>>>> On Fri, Aug 30, 2019 at 12:13 PM Dawid Wysakowicz <[hidden email]>
>>>>> wrote:
>>>>>
>>>>>> Hi Gyula,
>>>>>>
>>>>>> Thank you for the support on those changes.
>>>>>>
>>>>>> I am not sure if I understood your idea for the "reconfiguration" logic.
>>>>>>
>>>>>> The configure method on those objects would take a ConfigurationReader. So
>>>>>> the user can provide a thin wrapper around Configuration, e.g. for filtering
>>>>>> certain logic, changing values based on other parameters etc. Is that
>>>>>> what you had in mind?
>>>>>>
>>>>>> Best,
>>>>>>
>>>>>> Dawid
>>>>>>
>>>>>> On 29/08/2019 19:21, Gyula Fóra wrote:
>>>>>>> Hi!
>>>>>>>
>>>>>>> Huuuge +1 from me, this has been an operational pain for years.
>>>>>>> This would also introduce a nice and simple way to extend it in the
>>>>>> future
>>>>>>> if we need.
>>>>>>>
>>>>>>> Ship it!
>>>>>>>
>>>>>>> Gyula
>>>>>>>
>>>>>>> On Thu, Aug 29, 2019 at 5:05 PM Dawid Wysakowicz <[hidden email]
>>>>>>>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I wanted to propose a new, additional way of configuring execution
>>>>>>>> parameters that can currently be set only on such objects like
>>>>>>>> ExecutionConfig, CheckpointConfig and StreamExecutionEnvironment. This
>>>>>>>> poses problems such as:
>>>>>>>>
>>>>>>>>      - no easy way to configure those from a file
>>>>>>>>      - there is no easy way to pass a configuration from layers built on
>>>>>>>>      top of StreamExecutionEnvironment. (e.g. when we want to configure
>>>>>> those
>>>>>>>>      options from TableEnvironment)
>>>>>>>>      - they are not automatically documented
>>>>>>>>
>>>>>>>> Note that there are a few concepts from FLIP-54[1] that this FLIP is
>>>>>> based
>>>>>>>> on.
>>>>>>>>
>>>>>>>> Would be really grateful to know if you think this would be a valuable
>>>>>>>> addition and any other feedback.
>>>>>>>>
>>>>>>>> Best,
>>>>>>>>
>>>>>>>> Dawid
>>>>>>>>
>>>>>>>> Wiki page:
>>>>>>>>
>>>>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-59%3A+Enable+execution+configuration+from+Configuration+object
>>>>>>>> Google doc:
>>>>>>>>
>>>>>> https://docs.google.com/document/d/1l8jW2NjhwHH1mVPbLvFolnL2vNvf4buUMDZWMfN_hFM/edit?usp=sharing
>>>>>>>> [1]
>>>>>>>>
>>>>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-54%3A+Evolve+ConfigOption+and+Configuration



Re: [DISCUSS] FLIP-59: Enable execution configuration from Configuration object

Aljoscha Krettek-2
Hi,

In general, I’m also for “execution” rather than just “exec”. For some of these options, though, I’m wondering whether “pipeline.<option>” or “job.<option>” makes more sense. Over time, a lot of things have accumulated in ExecutionConfig but a lot of them are not execution-related, I think. For example, auto-type-registration would make more sense as “pipeline.auto-type-registration”. For some other options, I think we should consider not exposing them via the configuration if we don’t think that we want to have them in the long term.

I’ll try to categorise what I think:

Don’t expose:
 - defaultInputDependencyConstraint (I think this is an internal flag for the Blink runner)
 - executionMode (I think this is also Blink internals)
 - printProgressDuringExecution (I don’t know if this flag still does anything)

Maybe don’t expose:
 - defaultKryoSerializerClasses
 - setGlobalJobParameters (if we expose it, it should be “pipeline”)

pipeline/job:
 - autoTypeRegistration
 - autoWatermarkInterval
 - closureCleaner
 - disableGenericTypes
 - enableAutoGeneratedUIDs
 - forceAvro
 - forceKryo
 - setMaxParallelism
 - setParallelism
 - objectReuse (this one is hard, could be execution)
 - registeredKryoTypes
 - registeredPojoTypes
 - timeCharacteristic
 - isChainingEnabled
 - cachedFile

execution:
 - latencyTrackingInterval
 - setRestartStrategy
 - taskCancellationIntervalMillis
 - taskCancellationTimeoutMillis
 - bufferTimeout

checkpointing: (this might be “execution.checkpointing”)
 - useSnapshotCompression
 - <the other checkpointing settings in the doc>
 - defaultStateBackend
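
To illustrate how this grouping could translate into option keys, here is a minimal Java sketch. It assumes keys are formed as a category prefix plus the kebab-case setting name, as in “pipeline.auto-type-registration”. The class and method names are hypothetical, only a handful of settings are shown, and the authoritative keys are whatever ends up on the FLIP wiki page.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for illustration; not part of Flink or of the FLIP. */
final class OptionKeySketch {

    /** Converts a camelCase setting name to kebab-case, e.g. "forceKryo" becomes "force-kryo". */
    static String toKebabCase(String camelCase) {
        StringBuilder sb = new StringBuilder();
        for (char c : camelCase.toCharArray()) {
            if (Character.isUpperCase(c)) {
                // Start a new word at every upper-case letter (except at the very beginning).
                if (sb.length() > 0) {
                    sb.append('-');
                }
                sb.append(Character.toLowerCase(c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Builds a full key from a category prefix and a setting name. */
    static String key(String prefix, String settingName) {
        return prefix + "." + toKebabCase(settingName);
    }

    public static void main(String[] args) {
        // Setting name -> category prefix, following the grouping in the list above.
        Map<String, String> categories = new LinkedHashMap<>();
        categories.put("autoTypeRegistration", "pipeline");
        categories.put("forceKryo", "pipeline");
        categories.put("bufferTimeout", "execution");
        categories.put("useSnapshotCompression", "execution.checkpointing");

        categories.forEach((name, prefix) -> System.out.println(key(prefix, name)));
    }
}
```

The naive conversion splits at every capital letter, so names with consecutive capitals (such as enableAutoGeneratedUIDs) would need a hand-picked key.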

What do you think?

Best,
Aljoscha


> On 17. Oct 2019, at 09:32, Timo Walther <[hidden email]> wrote:
>
> Sounds good to me.
>
> Thanks,
>
> Timo


Re: [DISCUSS] FLIP-59: Enable execution configuration from Configuration object

dwysakowicz
Hi,

Thank you for the comments, Kostas, Timo and Aljoscha. I also like the
pipeline/execution naming. I tried to apply most of your suggestions,
Aljoscha.

There are a few cases where I did not. You mentioned a few options that
are already present, and I planned to reuse the existing options for those
(latencyTrackingInterval, setParallelism etc.).

I would still expose the two options from your "Maybe don’t expose"
section. They are currently exposed in the Table API module (the initial
motivation of this FLIP was to enable passing the config from the Table
module). Moreover, I think it is important for users to have some way to
configure the Kryo serializers.
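
For illustration only, passing a configuration down from an upper layer into a built-in configure method can be sketched as follows. This is a minimal sketch under stated assumptions: `ConfigReaderSketch` and `ExecutionSettingsSketch` are hypothetical stand-ins rather than the actual Flink classes, and the two keys serve only as examples.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical read-only view of a configuration; stands in for the reader type discussed in this thread. */
interface ConfigReaderSketch {
    Optional<String> get(String key);
}

/** Hypothetical stand-in for an object such as ExecutionConfig. */
final class ExecutionSettingsSketch {
    private boolean objectReuse = false;
    private int parallelism = 1;

    /** Overrides only the settings present in the reader; all others keep their current value. */
    void configure(ConfigReaderSketch reader) {
        reader.get("pipeline.object-reuse")   // example key, an assumption
                .map(Boolean::parseBoolean)
                .ifPresent(value -> objectReuse = value);
        reader.get("parallelism.default")     // example key, an assumption
                .map(Integer::parseInt)
                .ifPresent(value -> parallelism = value);
    }

    boolean isObjectReuse() {
        return objectReuse;
    }

    int getParallelism() {
        return parallelism;
    }

    public static void main(String[] args) {
        // An upper layer (e.g. a table environment) collects options as plain key/value pairs.
        Map<String, String> fromUpperLayer = new HashMap<>();
        fromUpperLayer.put("pipeline.object-reuse", "true");

        ExecutionSettingsSketch settings = new ExecutionSettingsSketch();
        settings.configure(key -> Optional.ofNullable(fromUpperLayer.get(key)));

        System.out.println(settings.isObjectReuse());
        System.out.println(settings.getParallelism());
    }
}
```

Keeping the configure method on the configured object itself means that whoever adds a new field has the mapping code right next to it.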

I updated the FLIP's wiki page and will start voting on it.

Best,

Dawid

On 18/10/2019 17:19, Aljoscha Krettek wrote:

> Hi,
>
> In general, I’m also for “execution” rather than just “exec”. For some of these options, though, I’m wondering whether “pipeline.<option>” or “job.<option>” makes more sense. Over time, a lot of things have accumulated in ExecutionConfig but a lot of them are not execution-related, I think. For example, auto-type-registration would make more sense as “pipeline.auto-type-registration”. For some other options, I think we should consider not exposing them via the configuration if we don’t think that we want to have them in the long term.
>
> I’ll try to categorise what I think:
>
> Don’t expose:
>  - defaultInputDependencyConstraint (I think this is an internal flag for the Blink runner)
>  - executionMode (I think this is also Blink internals)
>  - printProgressDuringExecution (I don’t know if this flag still does anything)
>
> Maybe don’t expose:
>  - defaultKryoSerializerClasses
>  - setGlobalJobParameters (if we expose it, it should be “pipeline”)
>
> pipeline/job:
>  - autoTypeRegistration
>  - autoWatermarkInterval
>  - closureCleaner
>  - disableGenericTypes
>  - enableAutoGeneratedUIDs
>  - forceAvro
>  - forceKryo
>  - setMaxParallelism
>  - setParallelism
>  - objectReuse (this one is hard, could be execution)
>  - registeredKryoTypes
>  - registeredPojoTypes
>  - timeCharacteristic
>  - isChainingEnabled
>  - cachedFile
>
> execution:
>  - latencyTrackingInterval
>  - setRestartStrategy
>  - taskCancellationIntervalMillis
>  - taskCancellationTimeoutMillis
>  - bufferTimeout
>
> checkpointing: (this might be “execution.checkpointing”)
>  - useSnapshotCompression
>  - <the other checkpointing settings in the doc>
>  - defaultStateBackend
>
> What do you think?
>
> Best,
> Aljoscha

