[DISCUSS] Flink configuration from environment variables

Ingo Bürk
Hi everyone,

in Ververica Platform we offer a feature to use environment variables in
the Flink configuration¹, e.g.

```
s3.access-key: ${S3_ACCESS_KEY}
```
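
A minimal sketch of what such value substitution could look like, assuming placeholders of the form `${NAME}` and a plain dictionary standing in for the process environment. The function name `substitute_env` and the choice to leave unknown variables untouched are illustrative assumptions, not the Ververica Platform implementation.

```python
import re

# Matches ${NAME}, where NAME looks like a typical environment variable name.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def substitute_env(value, env):
    """Replace every ${NAME} in a config value with env[NAME].

    Unknown variables are left untouched in this sketch (an assumption);
    failing fast would be an equally valid choice.
    """
    def lookup(match):
        return env.get(match.group(1), match.group(0))

    return _PLACEHOLDER.sub(lookup, value)


config = {"s3.access-key": "${S3_ACCESS_KEY}"}
env = {"S3_ACCESS_KEY": "my-access-key"}  # stand-in for os.environ
resolved = {key: substitute_env(value, env) for key, value in config.items()}
```

Passing the environment in as a dictionary keeps the sketch easy to test; a real implementation would presumably read `os.environ` or its JVM equivalent.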

We've been discussing internally whether contributing such a feature to
Flink directly would make sense and wanted to start a discussion on this
topic.

An alternative to the above would be to derive options directly from
variable names: instead of being defined in the Flink configuration as
above, an option would be set automatically if something like
$FLINK_CONFIG_S3_ACCESS_KEY was set in the environment. This is somewhat
similar to what e.g. Spring does, and faces similar challenges (dealing
with "."s etc.)
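
To illustrate the naming challenge, here is a rough sketch of the name-based approach. It assumes the `FLINK_CONFIG_` prefix from the example above and that each underscore may stand for either "." or "-"; the helper `candidate_keys` and the set `known_options` are hypothetical.

```python
PREFIX = "FLINK_CONFIG_"  # prefix taken from the example above


def candidate_keys(env_name):
    """Return the config keys one variable name could refer to.

    Each "_" may stand for "." or "-" in the original key (an assumption
    of this sketch), so the mapping is ambiguous without extra rules.
    """
    if not env_name.startswith(PREFIX):
        return []
    parts = env_name[len(PREFIX):].lower().split("_")
    keys = [parts[0]]
    for part in parts[1:]:
        keys = [key + sep + part for key in keys for sep in (".", "-")]
    return keys


keys = candidate_keys("FLINK_CONFIG_S3_ACCESS_KEY")

# One possible way to resolve the ambiguity: match against known options.
known_options = {"s3.access-key", "s3.secret-key"}  # hypothetical registry
matches = [key for key in keys if key in known_options]
```

Three name segments already yield four candidate keys, which is why a name-based scheme needs either a registry of known options or a stricter naming rule.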

Although I view both of these approaches as mostly orthogonal, supporting
both very likely wouldn't make sense, of course. So I was wondering what
your opinion is in terms of whether the project would benefit from
environment variable support for the Flink configuration, and whether there
are tendencies as to which approach to go with.

¹
https://docs.ververica.com/user_guide/application_operations/deployments/configure_flink.html#environment-variables

Best regards
Ingo

Re: [DISCUSS] Flink configuration from environment variables

Khachatryan Roman
Hi Ingo,

Thanks a lot for this proposal!

We had a related discussion recently in the context of FLINK-19520
(randomizing tests configuration) [1].
I believe other scenarios will benefit as well.

For the end users, I think substitution in configuration files is
preferable over parsing env vars in Flink code.
And for cases without such a file, we could have a default one on the
classpath with all substitutions defined (and then merge everything from
the user-supplied file).

[1] https://issues.apache.org/jira/browse/FLINK-19520

Regards,
Roman


On Mon, Jan 18, 2021 at 11:11 AM Ingo Bürk <[hidden email]> wrote:

> Hi everyone,
>
> in Ververica Platform we offer a feature to use environment variables in
> the Flink configuration¹, e.g.
>
> ```
> s3.access-key: ${S3_ACCESS_KEY}
> ```
>
> We've been discussing internally whether contributing such a feature to
> Flink directly would make sense and wanted to start a discussion on this
> topic.
>
> An alternative way to do so from the above would be parsing those directly
> based on their name, so instead of having it defined in the Flink
> configuration as above, it would get automatically set if something like
> $FLINK_CONFIG_S3_ACCESS_KEY was set in the environment. This is somewhat
> similar to what e.g. Spring does, and faces similar challenges (dealing
> with "."s etc.)
>
> Although I view both of these approaches as mostly orthogonal, supporting
> both very likely wouldn't make sense, of course. So I was wondering what
> your opinion is in terms of whether the project would benefit from
> environment variable support for the Flink configuration, and whether there
> are tendencies as to which approach to go with.
>
> ¹
>
> https://docs.ververica.com/user_guide/application_operations/deployments/configure_flink.html#environment-variables
>
> Best regards
> Ingo
>

Re: [DISCUSS] Flink configuration from environment variables

Yang Wang
Thanks for kicking off the discussion.

I think supporting environment variable substitution in the Flink
configuration YAML file is a good idea, especially for Kubernetes
environments, since we use Secret resources to store authentication
information.

But I have some questions about how to do it:
1. Will the environment variables in the Flink configuration YAML be
replaced in the client, the JobManager, the TaskManager, or all of them?
2. If users do not want some config options to be replaced, how can they
achieve that?

Best,
Yang

On Mon, Jan 18, 2021 at 8:55 PM Khachatryan Roman <[hidden email]> wrote:

> Hi Ingo,
>
> Thanks a lot for this proposal!
>
> We had a related discussion recently in the context of FLINK-19520
> (randomizing tests configuration) [1].
> I believe other scenarios will benefit as well.
>
> For the end users, I think substitution in configuration files is
> preferable over parsing env vars in Flink code.
> And for cases without such a file, we could have a default one on the
> classpath with all substitutions defined (and then merge everything from
> the user-supplied file).
>
> [1] https://issues.apache.org/jira/browse/FLINK-19520
>
> Regards,
> Roman

Re: [DISCUSS] Flink configuration from environment variables

Ingo Bürk
Hi Yang,

thanks for your questions! I'm glad to see this feature is being received
positively.

ad 1) We don't distinguish JM/TM, and I can't think of a good reason why a
user would want to do so. I'm not very experienced with Flink, however, so
please excuse me if I'm overlooking some obvious reason here. :-)
ad 2) Admittedly I don't have a good overview of all the configuration
options that exist, but of those I do know, I can't imagine someone
wanting to pass a value like "${MY_VAR}" verbatim. In Ververica Platform we
currently ignore this problem. If, however, this needs to be addressed, a
possible solution could be to allow an escaping syntax such as "\${MY_VAR}".

Another point to consider here is when exactly the substitution takes
place: on the "raw" file, or on the parsed key / value separately, and if
so, should it support both key and value? My current thinking is that
substituting only the value of the parsed entry should be sufficient.
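
A small sketch of these two ideas combined, assuming a backslash escape of the form `\${NAME}` and substitution applied only to the values of an already parsed configuration. The function names are illustrative, and the escape handling is one possible design rather than an agreed-upon one.

```python
import re

# An optional leading backslash marks an escaped placeholder.
_PLACEHOLDER = re.compile(r"(\\)?\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def substitute_value(value, env):
    def lookup(match):
        escaped, name = match.group(1), match.group(2)
        if escaped:
            # "\${NAME}" becomes the literal text "${NAME}".
            return "${" + name + "}"
        # Unknown variables are left as-is in this sketch.
        return env.get(name, match.group(0))

    return _PLACEHOLDER.sub(lookup, value)


def resolve(parsed_config, env):
    # Keys pass through unchanged; only values are substituted.
    return {key: substitute_value(value, env)
            for key, value in parsed_config.items()}


env = {"MY_VAR": "42"}
resolved = resolve({"a.key": "${MY_VAR}", "b.${MY_VAR}": r"\${MY_VAR}"}, env)
```

Working on parsed entries rather than the raw file means a substituted value cannot accidentally introduce new keys or break the YAML structure.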


Regards
Ingo

On Mon, Jan 18, 2021 at 3:48 PM Yang Wang <[hidden email]> wrote:

> Thanks for kicking off the discussion.
>
> I think supporting environment variables rendering in the Flink
> configuration yaml file is a good idea. Especially for
> the Kubernetes environment since we are using the secret resource to store
> the authentication information.
>
> But I have some questions for how to do it?
> 1. The environments in Flink configuration yaml will be replaced in client,
> JobManager, TaskManager or all of them?
> 2. If users do not want some config options to be replaced, how to
> achieve that?
>
> Best,
> Yang

Re: [DISCUSS] Flink configuration from environment variables

Steven Wu
Variable substitution (proposed here) is definitely useful.

For us, hierarchical override is more useful.  E.g., we may have the
default value of "state.checkpoints.dir=path1" defined in flink-conf.yaml.
But maybe we want to override it to "state.checkpoints.dir=path2" via
environment variable in some scenarios. Otherwise, we have to define a
corresponding shell variable (like STATE_CHECKPOINTS_DIR) for the Flink
config, which is annoying.

As Ingo pointed out, it is also annoying to handle the Java property key
naming convention (dot-separated), as dots aren't allowed in shell env var
names (all caps, separated by underscores); the shell will complain. To
avoid this, we have to bundle all env var overrides (k-v pairs) in a single
property value (JSON, base64-encoded).
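
As a rough illustration of such a bundled override, the sketch below decodes a base64-encoded JSON object from a single environment variable and layers it over the values from flink-conf.yaml. The variable name `FLINK_CONFIG_OVERRIDES` and the function `apply_overrides` are hypothetical; the actual names are not given here.

```python
import base64
import json

# Hypothetical variable name; the real one is not stated in this thread.
OVERRIDES_VAR = "FLINK_CONFIG_OVERRIDES"


def apply_overrides(file_config, env):
    """Layer env-provided overrides on top of the file-based config."""
    encoded = env.get(OVERRIDES_VAR)
    if encoded is None:
        return dict(file_config)
    overrides = json.loads(base64.b64decode(encoded).decode("utf-8"))
    # Later entries win, so overrides replace the file defaults.
    return {**file_config, **overrides}


file_config = {"state.checkpoints.dir": "path1"}
payload = json.dumps({"state.checkpoints.dir": "path2"}).encode("utf-8")
env = {OVERRIDES_VAR: base64.b64encode(payload).decode("ascii")}
effective = apply_overrides(file_config, env)
```

Because the keys travel inside the JSON payload, they can keep their dots and hyphens, and only the single carrier variable has to follow shell naming rules.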

On Mon, Jan 18, 2021 at 8:15 AM Ingo Bürk <[hidden email]> wrote:

> Hi Yang,
>
> thanks for your questions! I'm glad to see this feature is being received
> positively.
>
> ad 1) We don't distinguish JM/TM, and I can't think of a good reason why a
> user would want to do so. I'm not very experienced with Flink, however, so
> please excuse me if I'm overlooking some obvious reason here. :-)
> ad 2) Admittedly I don't have a good overview on all the configuration
> options that exist, but from those that I do know I can't imagine someone
> wanting to pass a value like "${MY_VAR}" verbatim. In Ververica Platform as
> of now we ignore this problem. If, however, this needs to be addressed, a
> possible solution could be to allow escaping syntax such as "\${MY_VAR}".
>
> Another point to consider here is when exactly the substitution takes
> place: on the "raw" file, or on the parsed key / value separately, and if
> so, should it support both key and value? My current thinking is that
> substituting only the value of the parsed entry should be sufficient.
>
>
> Regards
> Ingo

Re: [DISCUSS] Flink configuration from environment variables

Yang Wang
Hi Ingo,

Thanks for your response.

1. Not distinguishing JM/TM is reasonable, but what about the client side?
For Yarn/K8s deployments, the local flink-conf.yaml will be shipped to the
JM/TM, so I am unsure where the environment variables should be replaced.
IIUC, this is not an issue for Ververica Platform since the replacement is
always done on the JM/TM side.

2. I believe we should support skipping the substitution for specific keys.
A typical use case is "env.java.opts": if the value contains environment
variables, they are expected to be replaced exactly when the java command
is executed, not after the java process has started. Maybe escaping with
single quotes is enough.

3. Applying the substitution only to the value makes sense to me.
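
One way to picture point 2 is a per-key exclusion list, sketched below. This is only an assumption about how it could work: the set `NO_SUBSTITUTION_KEYS` is hypothetical and contains just the "env.java.opts" example, and the single-quote escaping mentioned above would be an alternative design.

```python
import re

_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")

# Keys whose values are expanded later (e.g. by the shell that launches
# the JVM) and therefore must stay untouched. Hypothetical set.
NO_SUBSTITUTION_KEYS = frozenset({"env.java.opts"})


def resolve(config, env):
    resolved = {}
    for key, value in config.items():
        if key in NO_SUBSTITUTION_KEYS:
            resolved[key] = value  # keep placeholders verbatim
        else:
            resolved[key] = _PLACEHOLDER.sub(
                lambda m: env.get(m.group(1), m.group(0)), value)
    return resolved


env = {"S3_ACCESS_KEY": "abc", "LOG_DIR": "/tmp/logs"}
config = {
    "s3.access-key": "${S3_ACCESS_KEY}",
    "env.java.opts": "-Dlog.dir=${LOG_DIR}",
}
resolved = resolve(config, env)
```

An exclusion list needs no new syntax in the file, while an escape syntax leaves the decision to the user on a per-value basis.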


Best,
Yang

On Tue, Jan 19, 2021 at 12:36 AM Steven Wu <[hidden email]> wrote:

> Variable substitution (proposed here) is definitely useful.
>
> For us, hierarchical override is more useful.  E.g., we may have the
> default value of "state.checkpoints.dir=path1" defined in flink-conf.yaml.
> But maybe we want to override it to "state.checkpoints.dir=path2" via
> environment variable in some scenarios. Otherwise, we have to define a
> corresponding shell variable (like STATE_CHECKPOINTS_DIR) for the Flink
> config, which is annoying.
>
> As Ingo pointed, it is also annoying to handle Java property key naming
> convention (dots separated), as dots aren't allowed in shell env var naming
> (All caps, separated with underscore). Shell will complain. We have to
> bundle all env var overrides (k-v pairs) in a single property value (JSON
> and base64 encode) to avoid it.

Re: [DISCUSS] Flink configuration from environment variables

Ingo Bürk
In reply to this post by Steven Wu
Hi Steven,

regarding the hierarchical override, we could even expand the substitution
solution to support shell syntax with default values like

state.checkpoints.dir: ${CHECKPOINTS_DIR:-path1}

such that if the environment variable doesn't exist, path1 will be used.
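
A minimal sketch of such default handling, assuming the shell semantics of `${NAME:-default}`, where the default applies when the variable is unset or empty. The function name `substitute` is illustrative, and leaving an unknown `${NAME}` untouched is an assumption of this sketch.

```python
import re

# Matches ${NAME} and ${NAME:-default}.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)(?::-([^}]*))?\}")


def substitute(value, env):
    def lookup(match):
        name, default = match.group(1), match.group(2)
        if default is None:
            # Plain ${NAME}: leave the placeholder as-is when unset.
            return env.get(name, match.group(0))
        # ${NAME:-default}: fall back when unset or empty, as in the shell.
        return env.get(name) or default

    return _PLACEHOLDER.sub(lookup, value)


line = "${CHECKPOINTS_DIR:-path1}"
without_env = substitute(line, {})
with_env = substitute(line, {"CHECKPOINTS_DIR": "path2"})
```

With this syntax the default lives in flink-conf.yaml next to the option it belongs to, and the environment only needs to be set in the scenarios that deviate from it.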


Regards
Ingo


On Mon, Jan 18, 2021 at 5:36 PM Steven Wu <[hidden email]> wrote:

> Variable substitution (proposed here) is definitely useful.
>
> For us, hierarchical override is more useful.  E.g., we may have the
> default value of "state.checkpoints.dir=path1" defined in flink-conf.yaml.
> But maybe we want to override it to "state.checkpoints.dir=path2" via
> environment variable in some scenarios. Otherwise, we have to define a
> corresponding shell variable (like STATE_CHECKPOINTS_DIR) for the Flink
> config, which is annoying.
>
> As Ingo pointed, it is also annoying to handle Java property key naming
> convention (dots separated), as dots aren't allowed in shell env var naming
> (All caps, separated with underscore). Shell will complain. We have to
> bundle all env var overrides (k-v pairs) in a single property value (JSON
> and base64 encode) to avoid it.

Re: [DISCUSS] Flink configuration from environment variables

Ingo Bürk
In reply to this post by Yang Wang
Hi Yang,

1. As you said I think this doesn't affect Ververica Platform, really, so
I'm more than happy to hear and follow the thoughts of people more
experienced with Flink than me.
2. I wasn't aware of env.java.opts, but that's definitely a candidate where
a user may want to "escape" it so it doesn't get substituted immediately, I
agree.


Regards
Ingo

On Tue, Jan 19, 2021 at 4:47 AM Yang Wang <[hidden email]> wrote:

> Hi Ingo,
>
> Thanks for your response.
>
> 1. Not distinguishing JM/TM is reasonable, but what about the client side.
> For Yarn/K8s deployment,
> the local flink-conf.yaml will be shipped to JM/TM. So I am just confused
> about where should the environment
> variables be replaced? IIUC, it is not an issue for Ververica Platform
> since it is always done in the JM/TM side.
>
> 2. I believe we should support not do the substitution for specific key. A
> typical use case is "env.java.opts". If the
> value contains environment variables, they are expected to be replaced
> exactly when the java command is executed,
> not after the java process is started. Maybe escaping with single quote is
> enough.
>
> 3. The substitution only takes effects on the value makes sense to me.
>
>
> Best,
> Yang

Re: [DISCUSS] Flink configuration from environment variables

Till Rohrmann
Hi everyone,

Thanks for starting this discussion Ingo. I think being able to use env
variables to change Flink's configuration will be a very useful feature.

Concerning the two approaches, I would be in favour of the second approach
($FLINK_CONFIG_S3_ACCESS_KEY) because it does not require the user to
prepare a special flink-conf.yaml with env variables inserted for every
config value they want to configure. Because of that, I think it is more
general and easier to use. Also, the user does not have to remember a
second set of names (env names) that they have to/can set.

For how to substitute the values, I think it should happen when we load the
Flink configuration. First we read the file and then overwrite values
specified via an env variable or dynamic properties in some defined order.
For env.java.opts and other options which are used for starting the JVM we
might need special handling in the bash scripts.
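
As a rough illustration of that load order, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `FLINK_CONFIG_` prefix is taken from the example name in this thread, the function names are hypothetical, and letting dynamic properties win over env variables is just one possible "defined order". None of this is existing Flink API.

```python
import os

ENV_PREFIX = "FLINK_CONFIG_"  # assumed prefix, taken from the example name in this thread


def env_name_for(key):
    """Map a config key such as 's3.access-key' to 'FLINK_CONFIG_S3_ACCESS_KEY'."""
    return ENV_PREFIX + key.upper().replace(".", "_").replace("-", "_")


def load_configuration(file_values, known_keys, dynamic_properties, environ=None):
    """Merge sources in an assumed order: file < env variables < dynamic properties."""
    environ = os.environ if environ is None else environ
    config = dict(file_values)  # 1. start from the parsed configuration file
    # 2. Overwrite from the environment. Only keys we already know about are
    #    looked up, because an env name cannot be mapped back to a dotted key
    #    unambiguously.
    for key in set(known_keys) | set(file_values):
        name = env_name_for(key)
        if name in environ:
            config[key] = environ[name]
    # 3. Dynamic properties are applied last, so they win (an assumption).
    config.update(dynamic_properties)
    return config
```

With `state.checkpoints.dir: path1` in the file and `FLINK_CONFIG_STATE_CHECKPOINTS_DIR=path2` in the environment, this sketch would yield `path2` without any placeholder in the file. Options such as env.java.opts, which are consumed before the JVM starts, are not covered by it.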

Cheers,
Till


Re: [DISCUSS] Flink configuration from environment variables

Ufuk Celebi-2
Hey all,

I think that approach 2 is more idiomatic for container deployments where it can be cumbersome to manually map flink-conf.yaml contents to env vars [1]. The precedence order outlined by Till would also cover Steven's hierarchical overwrite requirement.

I'm really excited about this feature as it will make Flink deployments a lot more ergonomic. The implementation does not seem too complicated (which makes me wonder why we didn't tackle this earlier or whether I'm missing something).

I'd also be happy to shepherd this contribution if there is consensus on the need for it and the approach. Does it make sense to formalize this decision a bit with a short FLIP?

– Ufuk

[1] In Ververica Platform, we support approach 1, because the Flink configuration is part of the specification for a single Deployment and it's minimally more convenient to have something like

flinkConfiguration:
  foo: ${BAR}

for us. I don't think this approach would feel natural when manually deploying Flink. There would be a clear migration path for our customers, so I'm not concerned about this too much.


Re: [DISCUSS] Flink configuration from environment variables

Till Rohrmann
I think a short FLIP would be awesome.

I guess this feature hasn't been implemented yet because it has not been
implemented yet ;-) I agree that this feature will improve configuration
ergonomics big time :-)

Cheers,
Till


Re: [DISCUSS] Flink configuration from environment variables

Steven Wu
Ingo,

regarding "state.checkpoints.dir: ${CHECKPOINTS_DIR:-path1}", it definitely
can work. But now users need to know that they can use the "CHECKPOINTS_DIR"
env var to override "state.checkpoints.dir". That is the inconvenience that I
am trying to avoid. "state.checkpoints.dir" is well documented on the Flink
website. Now, we need to document "CHECKPOINTS_DIR" separately.
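
For reference, the "${VAR:-default}" form could be resolved roughly as in the following Python sketch. It rests on assumptions: the `substitute` function and its pattern are purely illustrative (not Flink or Ververica Platform code), the fallback for unset or empty variables mirrors what POSIX shells do, and leaving an unresolved "${VAR}" untouched is only a guess at sensible behaviour.

```python
import os
import re

# Matches ${NAME} and ${NAME:-default}; NAME follows shell variable naming rules.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)(?::-([^}]*))?\}")


def substitute(value, environ=None):
    """Replace ${NAME} / ${NAME:-default} placeholders in a config value."""
    environ = os.environ if environ is None else environ

    def replace(match):
        name, default = match.group(1), match.group(2)
        current = environ.get(name)
        if default is None:
            # Plain ${NAME}: use the variable if it is set, otherwise keep
            # the placeholder as written (assumed behaviour).
            return current if current is not None else match.group(0)
        # ${NAME:-default}: fall back when the variable is unset or empty.
        return current if current else default

    return _PLACEHOLDER.sub(replace, value)
```

Here `substitute("${CHECKPOINTS_DIR:-path1}")` gives "path1" unless CHECKPOINTS_DIR is set to a non-empty value, which is exactly the extra name that would need documenting.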

Ideally, we want to just let the user define an env var like
"state.checkpoints.dir=some/overridden/path". But it suffers from the problem
that dots are invalid characters in shell env var names. That is the
dilemma.

That is why we bundled all the non-conforming env var overrides (those whose
variable names contain non-conforming chars) into a single base64-encoded
string (like "NON_CONFORMING_OVERRIDES_BASE64=<base64 encoded json
string>") in our deployment infrastructure and unpack them at container
startup. It is hacky. I am hoping that there is a more elegant solution.
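
To make the workaround concrete, a minimal Python sketch of the pack/unpack step might look like this. Only the variable name NON_CONFORMING_OVERRIDES_BASE64 comes from the description above; the function names and the flat JSON object layout are assumptions, not our actual deployment code.

```python
import base64
import json
import os

BUNDLE_VAR = "NON_CONFORMING_OVERRIDES_BASE64"  # name as given in the message


def pack_overrides(overrides):
    """Deployment side: encode a {config key: value} dict as base64 JSON."""
    raw = json.dumps(overrides).encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


def unpack_overrides(environ=None):
    """Container startup: decode the bundle back into a dict of overrides."""
    environ = os.environ if environ is None else environ
    encoded = environ.get(BUNDLE_VAR)
    if not encoded:
        return {}  # nothing to override
    return json.loads(base64.b64decode(encoded).decode("utf-8"))
```

The keys inside the bundle can keep their dotted names because they never become env var names themselves, but the overrides are no longer readable in the deployment spec.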

If the configuration key conforms to the shell naming convention (like
S3_ACCESS_KEY), then we don't have a problem. But it would deviate from
the Flink config naming convention (Java property style, dot separated).
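
One reason the two conventions are hard to reconcile is that a straightforward key-to-env-name mapping loses information. The Python sketch below illustrates this under assumptions: the upper-casing rule and the function names are hypothetical, and "s3.access.key" is used here only as an example of a second key spelling.

```python
def to_env_name(key):
    """Upper-case the key and turn '.' and '-' into '_' (assumed mapping rule)."""
    return key.upper().replace(".", "_").replace("-", "_")


def find_collisions(keys):
    """Group config keys by env name and return the names shared by several keys."""
    by_name = {}
    for key in keys:
        by_name.setdefault(to_env_name(key), []).append(key)
    return {name: group for name, group in by_name.items() if len(group) > 1}
```

For instance, "s3.access-key" and "s3.access.key" both map to S3_ACCESS_KEY, so the env name alone cannot tell which config key was meant.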

Thanks,
Steven


On Tue, Jan 19, 2021 at 6:56 AM Till Rohrmann <[hidden email]> wrote:

> I think a short FLIP would be awesome.
>
> I guess this feature hasn't been implemented yet because it has not been
> implemented yet ;-) I agree that this feature will improve configuration
> ergonomics big time :-)
>
> Cheers,
> Till

Re: [DISCUSS] Flink configuration from environment variables

Ingo Bürk
Hi Steven,

understood, thanks for expanding on your point. I will now prepare a FLIP
that goes for S3_ACCESS_KEY-style environment variables, since overall this
fits Flink use cases better. I think your point that users have to know how
to convert s3.access-key to S3_ACCESS_KEY is valid, but it would follow a
logical pattern that is also used by a big project like Spring.
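
To make the conversion concrete, here is a small sketch of one possible mapping. It is an assumption for illustration only and not the rule the FLIP will define: every non-alphanumeric character becomes an underscore and the result is upper-cased. The helper names are hypothetical; the optional "FLINK_CONFIG_" prefix is the one mentioned at the start of this thread.

```python
import re


def env_var_name(config_key, prefix=""):
    """Map a dotted config key to a shell-conforming env var name."""
    return prefix + re.sub(r"[^A-Za-z0-9]", "_", config_key).upper()


def overrides_from_env(known_keys, env, prefix=""):
    """Collect overrides for known config keys that are set in env."""
    found = {}
    for key in known_keys:
        name = env_var_name(key, prefix)
        if name in env:
            found[key] = env[name]
    return found
```

Note that the mapping is lossy: "s3.access-key" and "s3.access.key" both become S3_ACCESS_KEY, so this sketch goes from the known config keys to env var names rather than trying to reverse the conversion.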

I'll update this thread with the FLIP number once I have written the first
draft. Thanks everyone!


Regards
Ingo

On Tue, Jan 19, 2021 at 6:28 PM Steven Wu <[hidden email]> wrote:

> Ingo,
>
> regarding "state.checkpoints.dir: ${CHECKPOINTS_DIR:-path1}", it definitely
> can work. But now users need to know that they can use the "CHECKPOINTS_DIR"
> env var to override "state.checkpoints.dir". That is the inconvenience that I
> am trying to avoid. "state.checkpoints.dir" is well documented on the Flink
> website. Now, we need to document "CHECKPOINTS_DIR" separately.
>
> Ideally, we want to just let the user define an env var like
> "state.checkpoints.dir=some/overridden/path". But it suffers from the problem
> that dot chars are invalid characters for shell env var names. That is the
> dilemma.
>
> That is why we bundled all the non-conforming env var overrides (those whose
> variable names contain non-conforming chars) into a single base64-encoded
> string (like "NON_CONFORMING_OVERRIDES_BASE64=<base64 encoded json string>")
> in our deployment infrastructure and unpack them at container startup. It is
> hacky. I am hoping that there is a more elegant solution.
>
> If the configuration key conforms to the shell standard (like
> S3_ACCESS_KEY), then we don't have a problem. But it will deviate from
> the Flink config naming convention (Java property style, dot separated).
>
> Thanks,
> Steven

Re: [DISCUSS] Flink configuration from environment variables

Ingo Bürk
Hi,

sorry, I almost forgot, so just to update this thread: I have now started a
new thread for the actual FLIP:

http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-161-Configuration-through-environment-variables-td48094.html


Ingo

On Wed, Jan 20, 2021 at 11:10 AM Ingo Bürk <[hidden email]> wrote:

> Hi Steven,
>
> understood, thanks for expanding on your point. I will now prepare a FLIP
> that goes for S3_ACCESS_KEY-style environment variables, since overall this
> fits Flink use cases better. I think your point that users have to know how
> to convert s3.access-key to S3_ACCESS_KEY is valid, but it would follow a
> logical pattern that is also used by a big project like Spring.
>
> I'll update this thread with the FLIP number once I have written the first
> draft. Thanks everyone!
>
>
> Regards
> Ingo
>
> On Tue, Jan 19, 2021 at 6:28 PM Steven Wu <[hidden email]> wrote:
>
>> Ingo,
>>
>> regarding "state.checkpoints.dir: ${CHECKPOINTS_DIR:-path1}", it
>> definitely
>> can work. but now users need to know that we can use "CHECKPOINTS_DIR" env
>> var to override "state.checkpoints.dir". That is the inconvenience that I
>> am trying to avoid. "state.checkpoints.dir" is well documented in the
>> Flink
>> website. Now, we need to document "CHECKPOINTS_DIR" separately.
>>
>> Ideally, we want to just let the user define a env var like
>> "state.checkpoints.dir=some/overriden/path". But it suffers the problem
>> that dot chars are invalid characters for shell env var names. That is the
>> dilemma.
>>
>> That is why we bundled all the non-conformning env var overrides (where
>> variable names containing non-conforming chars) into a single base64
>> encoded string (like "NON_CONFORMING_OVERRIDES_BASE64=<base64 encoded json
>> string>"  in our deployment infrastructure and unpack them in the
>> container
>> startup. It is hacky. I am hoping that there is a more elegant solution.
>>
>> If the configuration key is conforming to shell standard (like
>> S3_ACCESS_KEY), then we don't have a problem. but it will be deviating
>> from
>> the Flink config naming convention (Java property style, dot separated).
>>
>> Thanks,
>> Steven
>>
>>
>> On Tue, Jan 19, 2021 at 6:56 AM Till Rohrmann <[hidden email]>
>> wrote:
>>
>> > I think a short FLIP would be awesome.
>> >
>> > I guess this feature hasn't been implemented yet because it has not been
>> > implemented yet ;-) I agree that this feature will improve configuration
>> > ergonomics big time :-)
>> >
>> > Cheers,
>> > Till
>> >
>> > On Tue, Jan 19, 2021 at 12:28 PM Ufuk Celebi <[hidden email]> wrote:
>> >
>> > > Hey all,
>> > >
>> > > I think that approach 2 is more idiomatic for container deployments
>> where
>> > > it can be cumbersome to manually map flink-conf.yaml contents to env
>> vars
>> > > [1]. The precedence order outlined by Till would also cover Steven's
>> > > hierarchical overwrite requirement.
>> > >
>> > > I'm really excited about this feature as it will make Flink
>> deployments a
>> > > lot more ergonomic. The implementation seems to be not too complicated
>> > > (which makes we wonder why we didn't tackle this earlier or whether
>> I'm
>> > > missing something).
>> > >
>> > > I'd also be happy to shepherd this contribution if there is consensus
>> on
>> > > the need for it and the approach. Does it make sense to formalize this
>> > > decision a bit with a short FLIP?
>> > >
>> > > – Ufuk
>> > >
>> > > [1] In Ververica Platform, we support approach 1, because the Flink
>> > > configuration is part of the specification for a single Deployment and
>> > it's
>> > > minimally more convenient to have something like
>> > >
>> > > flinkConfiguration:
>> > >   foo: ${BAR}
>> > >
>> > > for us. I don't think this approach would feel natural when manually
>> > > deploying Flink. There would be a clear migration path for our
>> customers,
>> > > so I'm not concerned about this too much.
>> > >
>> > > On Tue, Jan 19, 2021, at 10:01 AM, Till Rohrmann wrote:
>> > > > Hi everyone,
>> > > >
>> > > > Thanks for starting this discussion Ingo. I think being able to use
>> env
>> > > > variables to change Flink's configuration will be a very useful
>> > feature.
>> > > >
>> > > > Concerning the two approaches I would be in favour of the second
>> > approach
>> > > > ($FLINK_CONFIG_S3_ACCESS_KEY) because it does not require the user
>> to
>> > > > prepare a special flink-conf.yaml where he inserts env variables for
>> > > every
>> > > > config value he wants to configure. Since this is not required with
>> the
>> > > > second approach, I think it is more general and easier to use. Also,
>> > the
>> > > > user does not have to remember a second set of names (env names)
>> which
>> > he
>> > > > has to/can set.
>> > > >
>> > > > For how to substitute the values, I think it should happen when we
>> load
>> > > the
>> > > > Flink configuration. First we read the file and then overwrite
>> values
>> > > > specified via an env variable or dynamic properties in some defined
>> > > order.
>> > > > For env.java.opts and other options which are used for starting the
>> JVM
>> > > we
>> > > > might need special handling in the bash scripts.
>> > > >
>> > > > Cheers,
>> > > > Till
>> > > >
>> > > > On Tue, Jan 19, 2021 at 9:46 AM Ingo Bürk <[hidden email]>
>> wrote:
>> > > >
>> > > > > Hi Yang,
>> > > > >
>> > > > > 1. As you said I think this doesn't affect Ververica Platform,
>> > really,
>> > > so
>> > > > > I'm more than happy to hear and follow the thoughts of people more
>> > > > > experienced with Flink than me.
>> > > > > 2. I wasn't aware of env.java.opts, but that's definitely a
>> candidate
>> > > where
>> > > > > a user may want to "escape" it so it doesn't get substituted
>> > > immediately, I
>> > > > > agree.
>> > > > >
>> > > > >
>> > > > > Regards
>> > > > > Ingo
>> > > > >
>> > > > > On Tue, Jan 19, 2021 at 4:47 AM Yang Wang <[hidden email]>
>> > > wrote:
>> > > > >
>> > > > > > Hi Ingo,
>> > > > > >
>> > > > > > Thanks for your response.
>> > > > > >
>> > > > > > 1. Not distinguishing JM/TM is reasonable, but what about the
>> > client
>> > > > > side.
>> > > > > > For Yarn/K8s deployment,
>> > > > > > the local flink-conf.yaml will be shipped to JM/TM. So I am just
>> > > confused
>> > > > > > about where should the environment
>> > > > > > variables be replaced? IIUC, it is not an issue for Ververica
>> > > Platform
>> > > > > > since it is always done in the JM/TM side.
>> > > > > >
>> > > > > > 2. I believe we should support skipping the substitution for specific
>> > > > > > keys. A typical use case is "env.java.opts". If the
>> > > > > > value contains environment variables, they are expected to be
>> > > replaced
>> > > > > > exactly when the java command is executed,
>> > > > > > not after the java process is started. Maybe escaping with
>> single
>> > > quote
>> > > > > is
>> > > > > > enough.
>> > > > > >
>> > > > > > 3. Having the substitution take effect only on the value makes sense to me.
>> > > > > >
>> > > > > >
>> > > > > > Best,
>> > > > > > Yang
>> > > > > >
>> > > > > > On Tue, Jan 19, 2021 at 12:36 AM Steven Wu <[hidden email]> wrote:
>> > > > > >
>> > > > > > > Variable substitution (proposed here) is definitely useful.
>> > > > > > >
>> > > > > > > For us, hierarchical override is more useful.  E.g., we may
>> have
>> > > the
>> > > > > > > default value of "state.checkpoints.dir=path1" defined in
>> > > > > > flink-conf.yaml.
>> > > > > > > But maybe we want to override it to
>> "state.checkpoints.dir=path2"
>> > > via
>> > > > > > > environment variable in some scenarios. Otherwise, we have to
>> > > define a
>> > > > > > > corresponding shell variable (like STATE_CHECKPOINTS_DIR) for
>> the
>> > > Flink
>> > > > > > > config, which is annoying.
>> > > > > > >
>> > > > > > > As Ingo pointed out, it is also annoying to handle Java property
>> key
>> > > naming
>> > > > > > > convention (dots separated), as dots aren't allowed in shell
>> env
>> > > var
>> > > > > > naming
>> > > > > > > (All caps, separated with underscore). Shell will complain. We
>> > > have to
>> > > > > > > bundle all env var overrides (k-v pairs) in a single property
>> > value
>> > > > > (JSON
>> > > > > > > and base64 encode) to avoid it.
>> > > > > > >
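
One way to sidestep the dot problem Steven mentions is to derive the environment variable name from each known config key rather than trying to map names back to keys. This is only a sketch: the `FLINK_CONFIG_` prefix follows the example in the original proposal, while the helper names and the idea of iterating over a list of known keys are assumptions.

```python
import re


def env_name_for(key, prefix="FLINK_CONFIG_"):
    """Map a config key such as 's3.access-key' to a shell-safe name."""
    # Replace every character that is not a letter or digit with '_'.
    return prefix + re.sub(r"[^A-Za-z0-9]", "_", key).upper()


def overrides_from_env(known_keys, env):
    """Collect overrides for known keys whose derived name is set in env."""
    overrides = {}
    for key in known_keys:
        name = env_name_for(key)
        if name in env:
            overrides[key] = env[name]
    return overrides


if __name__ == "__main__":
    env = {"FLINK_CONFIG_STATE_CHECKPOINTS_DIR": "path2", "HOME": "/root"}
    print(overrides_from_env(["s3.access-key", "state.checkpoints.dir"], env))
```

Going from key to name avoids guessing whether an underscore stood for "." or "-", but two keys that differ only in those characters would collide on the same variable name, and keys unknown to the loader cannot be set this way.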
>> > > > > > > On Mon, Jan 18, 2021 at 8:15 AM Ingo Bürk <[hidden email]
>> >
>> > > wrote:
>> > > > > > >
>> > > > > > > > Hi Yang,
>> > > > > > > >
>> > > > > > > > thanks for your questions! I'm glad to see this feature is
>> > being
>> > > > > > received
>> > > > > > > > positively.
>> > > > > > > >
>> > > > > > > > ad 1) We don't distinguish JM/TM, and I can't think of a
>> good
>> > > reason
>> > > > > > why
>> > > > > > > a
>> > > > > > > > user would want to do so. I'm not very experienced with
>> Flink,
>> > > > > however,
>> > > > > > > so
>> > > > > > > > please excuse me if I'm overlooking some obvious reason
>> here.
>> > :-)
>> > > > > > > > ad 2) Admittedly I don't have a good overview on all the
>> > > > > configuration
>> > > > > > > > options that exist, but from those that I do know I can't
>> > imagine
>> > > > > > someone
>> > > > > > > > wanting to pass a value like "${MY_VAR}" verbatim. In
>> Ververica
>> > > > > > Platform
>> > > > > > > as
>> > > > > > > > of now we ignore this problem. If, however, this needs to be
>> > > > > > addressed, a
>> > > > > > > > possible solution could be to allow escaping syntax such as
>> > > > > > "\${MY_VAR}".
>> > > > > > > >
>> > > > > > > > Another point to consider here is when exactly the
>> substitution
>> > > takes
>> > > > > > > > place: on the "raw" file, or on the parsed key / value
>> > > separately,
>> > > > > and
>> > > > > > if
>> > > > > > > > so, should it support both key and value? My current
>> thinking
>> > is
>> > > that
>> > > > > > > > substituting only the value of the parsed entry should be
>> > > sufficient.
>> > > > > > > >
>> > > > > > > >
>> > > > > > > > Regards
>> > > > > > > > Ingo
>> > > > > > > >
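
A small sketch of value-only substitution with the "\${MY_VAR}" escape suggested above. Leaving unset variables untouched is an assumption, as the thread does not say what should happen in that case; the function names are hypothetical.

```python
import re

# Optional leading backslash, then ${NAME} with a shell-style variable name.
_PLACEHOLDER = re.compile(r"(\\?)\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def substitute(value, env):
    """Replace ${NAME} in a single value; \\${NAME} is kept verbatim."""
    def repl(match):
        escaped, name = match.group(1), match.group(2)
        if escaped:
            # Drop the backslash and keep the placeholder as literal text.
            return "${" + name + "}"
        # Assumption: an unset variable leaves the placeholder unchanged.
        return env.get(name, match.group(0))
    return _PLACEHOLDER.sub(repl, value)


def substitute_values(config, env):
    """Apply substitution to parsed values only; keys are never touched."""
    return {key: substitute(value, env) for key, value in config.items()}


if __name__ == "__main__":
    parsed = {"s3.access-key": "${S3_ACCESS_KEY}", "some.option": r"\${MY_VAR}"}
    print(substitute_values(parsed, {"S3_ACCESS_KEY": "abc", "MY_VAR": "x"}))
```

Working on the parsed entries rather than the raw file keeps substituted values from being reinterpreted as YAML syntax.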
>> > > > > > > > On Mon, Jan 18, 2021 at 3:48 PM Yang Wang <
>> > [hidden email]
>> > > >
>> > > > > > wrote:
>> > > > > > > >
>> > > > > > > > > Thanks for kicking off the discussion.
>> > > > > > > > >
>> > > > > > > > > I think supporting environment variables rendering in the
>> > Flink
>> > > > > > > > > configuration yaml file is a good idea. Especially for
>> > > > > > > > > the Kubernetes environment since we are using the secret
>> > > resource
>> > > > > to
>> > > > > > > > store
>> > > > > > > > > the authentication information.
>> > > > > > > > >
>> > > > > > > > > But I have some questions about how to do it:
>> > > > > > > > > 1. Will the environment variables in the Flink configuration yaml be
>> > > > > > > > > replaced in the client, JobManager, TaskManager, or all of them?
>> > > > > > > > > 2. If users do not want some config options to be replaced, how can
>> > > > > > > > > they achieve that?
>> > > > > > > > >
>> > > > > > > > > Best,
>> > > > > > > > > Yang
>> > > > > > > > >
>> > > > > > > > > On Mon, Jan 18, 2021 at 8:55 PM Khachatryan Roman <[hidden email]> wrote:
>> > > > > > > > >
>> > > > > > > > > > Hi Ingo,
>> > > > > > > > > >
>> > > > > > > > > > Thanks a lot for this proposal!
>> > > > > > > > > >
>> > > > > > > > > > We had a related discussion recently in the context of
>> > > > > FLINK-19520
>> > > > > > > > > > (randomizing tests configuration) [1].
>> > > > > > > > > > I believe other scenarios will benefit as well.
>> > > > > > > > > >
>> > > > > > > > > > For the end users, I think substitution in configuration
>> > > files is
>> > > > > > > > > > preferable over parsing env vars in Flink code.
>> > > > > > > > > > And for cases without such a file, we could have a
>> default
>> > > one on
>> > > > > > the
>> > > > > > > > > > classpath with all substitutions defined (and then merge
>> > > > > everything
>> > > > > > > > from
>> > > > > > > > > > the user-supplied file).
>> > > > > > > > > >
>> > > > > > > > > > [1] https://issues.apache.org/jira/browse/FLINK-19520
>> > > > > > > > > >
>> > > > > > > > > > Regards,
>> > > > > > > > > > Roman
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > > On Mon, Jan 18, 2021 at 11:11 AM Ingo Bürk <
>> > > [hidden email]>
>> > > > > > > wrote:
>> > > > > > > > > >
>> > > > > > > > > > > Hi everyone,
>> > > > > > > > > > >
>> > > > > > > > > > > in Ververica Platform we offer a feature to use
>> > environment
>> > > > > > > variables
>> > > > > > > > > in
>> > > > > > > > > > > the Flink configuration¹, e.g.
>> > > > > > > > > > >
>> > > > > > > > > > > ```
>> > > > > > > > > > > s3.access-key: ${S3_ACCESS_KEY}
>> > > > > > > > > > > ```
>> > > > > > > > > > >
>> > > > > > > > > > > We've been discussing internally whether contributing
>> > such
>> > > a
>> > > > > > > feature
>> > > > > > > > to
>> > > > > > > > > > > Flink directly would make sense and wanted to start a
>> > > > > discussion
>> > > > > > on
>> > > > > > > > > this
>> > > > > > > > > > > topic.
>> > > > > > > > > > >
>> > > > > > > > > > > An alternative way to do so from the above would be
>> > parsing
>> > > > > those
>> > > > > > > > > > directly
>> > > > > > > > > > > based on their name, so instead of having it defined
>> in
>> > the
>> > > > > Flink
>> > > > > > > > > > > configuration as above, it would get automatically
>> set if
>> > > > > > something
>> > > > > > > > > like
>> > > > > > > > > > > $FLINK_CONFIG_S3_ACCESS_KEY was set in the
>> environment.
>> > > This is
>> > > > > > > > > somewhat
>> > > > > > > > > > > similar to what e.g. Spring does, and faces similar
>> > > challenges
>> > > > > > > > (dealing
>> > > > > > > > > > > with "."s etc.)
>> > > > > > > > > > >
>> > > > > > > > > > > Although I view both of these approaches as mostly
>> > > orthogonal,
>> > > > > > > > > supporting
>> > > > > > > > > > > both very likely wouldn't make sense, of course. So I
>> was
>> > > > > > wondering
>> > > > > > > > > what
>> > > > > > > > > > > your opinion is in terms of whether the project would
>> > > benefit
>> > > > > > from
>> > > > > > > > > > > environment variable support for the Flink
>> configuration,
>> > > and
>> > > > > > > whether
>> > > > > > > > > > there
>> > > > > > > > > > > are tendencies as to which approach to go with.
>> > > > > > > > > > >
>> > > > > > > > > > > ¹ https://docs.ververica.com/user_guide/application_operations/deployments/configure_flink.html#environment-variables
>> > > > > > > > > > >
>> > > > > > > > > > > Best regards
>> > > > > > > > > > > Ingo
>> > > > > > > > > > >