Hi everyone,
I've now started a FLIP and am opening this discussion thread. Very much looking forward to your feedback! FLIP: https://cwiki.apache.org/confluence/x/ngtRCg The first big point I'd like to discuss is about the mechanism of "when" the EVs (environment variables) are looked up. I'll give three approaches here, the first of which is currently in the FLIP but very much open for change, and of course I'm happy to hear about different ideas entirely as well. 1) Lazy evaluation Only look up the EVs when an actual config key is requested from Configuration(#getRawValue), possibly with the addition of caching it once it has been looked up. The main benefit here is that no a-priori knowledge of available keys is required. The downside is that at no point in time we have complete knowledge of the configuration. This currently only really affects Configuration#keySet, but it does impose a limitation on future development worth considering. It also changes Configuration which is not limited to the Flink configuration, though this can easily be turned into an optional feature of Configuration. 2) Eager evaluation with full information If we centrally collect all possible Flink configuration keys in flink-core (quite a lot seem to be available already, but not all), we'd have complete information and could eagerly evaluate the environment, the precedence rules and populate the Configuration object accordingly. It also contains the implementation entirely to GlobalConfiguration only. The downside is, however, that this shifts the design a bit of having to know possible keys upfront. I'm also not sure how much effort it would be to collect all information in flink-core, or how "spread" this is across the codebase. 3) Eager evaluation through bijective mapping If we deviate from the Spring-style naming of the EVs we could potentially come up with a scheme that provides a bijection between EVs and config keys. If keys are further prefixed with something like FLINK_CONFIG_ (or anything to that extent), we could take all EVs with that prefix, map them to the corresponding config key name and eagerly populate Configuration. The main challenge is now defining this bijection, and we'd lose some "flexibility" in the naming of the EVs, so we'd end up with something like "$FLINK_CONFIG_s3__access_key", which arguably doesn't look very pretty. Happy to hear your thoughts on this! Regards Ingo |
Thanks for starting this discussion Ingo!
I like the flexibility in terms of supported EV formats the lazy evaluation gives us. However, I am a bit concerned by the fact that it is quite hard to compute the configurational ground truth w/o knowing all requested config keys. Moreover, it makes the configuration "more" mutable. One pattern which would no longer be possible with this approach is that Flink decides on some configuration values for the user and sets it in the configuration because there might always be an EV which overwrites this value later. I like to think of the configuration as something immutable (I know this not the case technically speaking) which we load/create at the beginning of a Flink process and then pass to the different components. That way, it is also quite easy to inform the user how the system is configured (e.g. displaying the effective configuration in the web UI). Moreover, there are fewer surprises if there is an exported env variable (e.g. from a previous run) which overwrites a configuration value but has been forgotten about. I don't have a very good idea for a bijective mapping but if we could come up with one, then I'd be in favour of this approach because it allows us to create an "immutable" configuration which is created at the beginning of the process. One idea for option 2) is to look up all ConfigOption instances on the classpath via reflection to create the full set of keys. Cheers, Till On Thu, Jan 21, 2021 at 8:58 AM Ingo Bürk <[hidden email]> wrote: > Hi everyone, > > I've now started a FLIP and am opening this discussion thread. Very much > looking forward to your feedback! > > FLIP: https://cwiki.apache.org/confluence/x/ngtRCg > > The first big point I'd like to discuss is about the mechanism of "when" > the EVs (environment variables) are looked up. I'll give three approaches > here, the first of which is currently in the FLIP but very much open for > change, and of course I'm happy to hear about different ideas entirely as > well. > > 1) Lazy evaluation > > Only look up the EVs when an actual config key is requested from > Configuration(#getRawValue), possibly with the addition of caching it once > it has been looked up. > The main benefit here is that no a-priori knowledge of available keys is > required. The downside is that at no point in time we have complete > knowledge of the configuration. This currently only really affects > Configuration#keySet, but it does impose a limitation on future development > worth considering. It also changes Configuration which is not limited to > the Flink configuration, though this can easily be turned into an optional > feature of Configuration. > > 2) Eager evaluation with full information > > If we centrally collect all possible Flink configuration keys in flink-core > (quite a lot seem to be available already, but not all), we'd have complete > information and could eagerly evaluate the environment, the precedence > rules and populate the Configuration object accordingly. It also contains > the implementation entirely to GlobalConfiguration only. > The downside is, however, that this shifts the design a bit of having to > know possible keys upfront. I'm also not sure how much effort it would be > to collect all information in flink-core, or how "spread" this is across > the codebase. > > 3) Eager evaluation through bijective mapping > > If we deviate from the Spring-style naming of the EVs we could potentially > come up with a scheme that provides a bijection between EVs and config > keys. If keys are further prefixed with something like FLINK_CONFIG_ (or > anything to that extent), we could take all EVs with that prefix, map them > to the corresponding config key name and eagerly populate Configuration. > The main challenge is now defining this bijection, and we'd lose some > "flexibility" in the naming of the EVs, so we'd end up with something like > "$FLINK_CONFIG_s3__access_key", which arguably doesn't look very pretty. > > Happy to hear your thoughts on this! > > > Regards > Ingo > |
In reply to this post by Ingo Bürk
Thanks for starting the discussion, Ingo!
Regarding approach 1: I like the idea of having a mapping scheme from ConfigOption to env var(s), but I'm concerned about the implications of lazy eval. I think it would be preferable to keep the Configuration object as the source of truth, requiring us to do some form of eager evaluation. Regarding approach 2: I don't think we can assume that we know all config option keys. For instance, I might write a custom high availability service or a custom FileSystem plugin that has it's own config options. It would be a pity (but maybe tolerable) if env var config would only work with Flink's core options. Regarding approach 3: What do you think about a mapping like a) stripping the FLINK_CONFIG_ prefix, b) replacing every _ with a dot, c) replacing every __ with a hyphen, d) lowercasing* everything? Some examples for options that include both dots and hyphens: akka.client-socket-worker-pool.pool-size-factor => FLINK_CONFIG_AKKA_CLIENT__SOCKET__WORKER__POOL_POOL__SIZE__FACTOR high-availability.zookeeper.quorum => FLINK_CONFIG_HIGH__AVAILABILITY_ZOOKEEPER_QUORUM It's not ideal, but easy to understand assuming that dots and hyphens are the only special characters in config keys. Regarding the lower-casing step above: ConfigOption keys seem to be case sensitive internally, but I couldn't find any user-facing documentation for this. There should be no options that depends on this behaviour. So if I'm not overlooking anything, I think it should be fine to make it case insensitive internally when accessing the raw value of a ConfigOption. In addition, I think the FLIP should mention special cases such as env.java.opts that are evaluated in the bash scripts and not in the Java code. Cheers, Ufuk On Thu, Jan 21, 2021, at 8:57 AM, Ingo Bürk wrote: > Hi everyone, > > I've now started a FLIP and am opening this discussion thread. Very much > looking forward to your feedback! > > FLIP: https://cwiki.apache.org/confluence/x/ngtRCg > > The first big point I'd like to discuss is about the mechanism of "when" > the EVs (environment variables) are looked up. I'll give three approaches > here, the first of which is currently in the FLIP but very much open for > change, and of course I'm happy to hear about different ideas entirely as > well. > > 1) Lazy evaluation > > Only look up the EVs when an actual config key is requested from > Configuration(#getRawValue), possibly with the addition of caching it once > it has been looked up. > The main benefit here is that no a-priori knowledge of available keys is > required. The downside is that at no point in time we have complete > knowledge of the configuration. This currently only really affects > Configuration#keySet, but it does impose a limitation on future development > worth considering. It also changes Configuration which is not limited to > the Flink configuration, though this can easily be turned into an optional > feature of Configuration. > > 2) Eager evaluation with full information > > If we centrally collect all possible Flink configuration keys in flink-core > (quite a lot seem to be available already, but not all), we'd have complete > information and could eagerly evaluate the environment, the precedence > rules and populate the Configuration object accordingly. It also contains > the implementation entirely to GlobalConfiguration only. > The downside is, however, that this shifts the design a bit of having to > know possible keys upfront. I'm also not sure how much effort it would be > to collect all information in flink-core, or how "spread" this is across > the codebase. > > 3) Eager evaluation through bijective mapping > > If we deviate from the Spring-style naming of the EVs we could potentially > come up with a scheme that provides a bijection between EVs and config > keys. If keys are further prefixed with something like FLINK_CONFIG_ (or > anything to that extent), we could take all EVs with that prefix, map them > to the corresponding config key name and eagerly populate Configuration. > The main challenge is now defining this bijection, and we'd lose some > "flexibility" in the naming of the EVs, so we'd end up with something like > "$FLINK_CONFIG_s3__access_key", which arguably doesn't look very pretty. > > Happy to hear your thoughts on this! > > > Regards > Ingo > |
Hi Ufuk, Till,
I definitely agree that having the Configuration be (or at least feel) immutable and complete seems like a better choice, and it is probably worth the trade-off in EV naming flexibility. Let me reshape the FLIP to propose something along the lines of solution (3). Regarding env.java.opts, what special handling is needed there? AFAICT only the rejected alternative of substituting values would've had an effect on this. Regards Ingo On Thu, Jan 21, 2021 at 11:13 AM Ufuk Celebi <[hidden email]> wrote: > Thanks for starting the discussion, Ingo! > > Regarding approach 1: > > I like the idea of having a mapping scheme from ConfigOption to env > var(s), but I'm concerned about the implications of lazy eval. I think it > would be preferable to keep the Configuration object as the source of > truth, requiring us to do some form of eager evaluation. > > Regarding approach 2: > > I don't think we can assume that we know all config option keys. For > instance, I might write a custom high availability service or a custom > FileSystem plugin that has it's own config options. It would be a pity (but > maybe tolerable) if env var config would only work with Flink's core > options. > > Regarding approach 3: > > What do you think about a mapping like > a) stripping the FLINK_CONFIG_ prefix, > b) replacing every _ with a dot, > c) replacing every __ with a hyphen, > d) lowercasing* everything? > > Some examples for options that include both dots and hyphens: > > akka.client-socket-worker-pool.pool-size-factor => > FLINK_CONFIG_AKKA_CLIENT__SOCKET__WORKER__POOL_POOL__SIZE__FACTOR > > high-availability.zookeeper.quorum => > FLINK_CONFIG_HIGH__AVAILABILITY_ZOOKEEPER_QUORUM > > It's not ideal, but easy to understand assuming that dots and hyphens are > the only special characters in config keys. > > Regarding the lower-casing step above: ConfigOption keys seem to be case > sensitive internally, but I couldn't find any user-facing documentation for > this. There should be no options that depends on this behaviour. So if I'm > not overlooking anything, I think it should be fine to make it case > insensitive internally when accessing the raw value of a ConfigOption. > > In addition, I think the FLIP should mention special cases such as > env.java.opts that are evaluated in the bash scripts and not in the Java > code. > > Cheers, > > Ufuk > > On Thu, Jan 21, 2021, at 8:57 AM, Ingo Bürk wrote: > > Hi everyone, > > > > I've now started a FLIP and am opening this discussion thread. Very much > > looking forward to your feedback! > > > > FLIP: https://cwiki.apache.org/confluence/x/ngtRCg > > > > The first big point I'd like to discuss is about the mechanism of "when" > > the EVs (environment variables) are looked up. I'll give three approaches > > here, the first of which is currently in the FLIP but very much open for > > change, and of course I'm happy to hear about different ideas entirely as > > well. > > > > 1) Lazy evaluation > > > > Only look up the EVs when an actual config key is requested from > > Configuration(#getRawValue), possibly with the addition of caching it > once > > it has been looked up. > > The main benefit here is that no a-priori knowledge of available keys is > > required. The downside is that at no point in time we have complete > > knowledge of the configuration. This currently only really affects > > Configuration#keySet, but it does impose a limitation on future > development > > worth considering. It also changes Configuration which is not limited to > > the Flink configuration, though this can easily be turned into an > optional > > feature of Configuration. > > > > 2) Eager evaluation with full information > > > > If we centrally collect all possible Flink configuration keys in > flink-core > > (quite a lot seem to be available already, but not all), we'd have > complete > > information and could eagerly evaluate the environment, the precedence > > rules and populate the Configuration object accordingly. It also contains > > the implementation entirely to GlobalConfiguration only. > > The downside is, however, that this shifts the design a bit of having to > > know possible keys upfront. I'm also not sure how much effort it would be > > to collect all information in flink-core, or how "spread" this is across > > the codebase. > > > > 3) Eager evaluation through bijective mapping > > > > If we deviate from the Spring-style naming of the EVs we could > potentially > > come up with a scheme that provides a bijection between EVs and config > > keys. If keys are further prefixed with something like FLINK_CONFIG_ (or > > anything to that extent), we could take all EVs with that prefix, map > them > > to the corresponding config key name and eagerly populate Configuration. > > The main challenge is now defining this bijection, and we'd lose some > > "flexibility" in the naming of the EVs, so we'd end up with something > like > > "$FLINK_CONFIG_s3__access_key", which arguably doesn't look very pretty. > > > > Happy to hear your thoughts on this! > > > > > > Regards > > Ingo > > > |
Hi everyone,
I've updated the FLIP (https://cwiki.apache.org/confluence/x/ngtRCg) according to these discussions. Regards Ingo On Thu, Jan 21, 2021 at 11:37 AM Ingo Bürk <[hidden email]> wrote: > Hi Ufuk, Till, > > I definitely agree that having the Configuration be (or at least feel) > immutable and complete seems like a better choice, and it is probably worth > the trade-off in EV naming flexibility. Let me reshape the FLIP to propose > something along the lines of solution (3). > > Regarding env.java.opts, what special handling is needed there? AFAICT > only the rejected alternative of substituting values would've had an effect > on this. > > > Regards > Ingo > > On Thu, Jan 21, 2021 at 11:13 AM Ufuk Celebi <[hidden email]> wrote: > >> Thanks for starting the discussion, Ingo! >> >> Regarding approach 1: >> >> I like the idea of having a mapping scheme from ConfigOption to env >> var(s), but I'm concerned about the implications of lazy eval. I think it >> would be preferable to keep the Configuration object as the source of >> truth, requiring us to do some form of eager evaluation. >> >> Regarding approach 2: >> >> I don't think we can assume that we know all config option keys. For >> instance, I might write a custom high availability service or a custom >> FileSystem plugin that has it's own config options. It would be a pity (but >> maybe tolerable) if env var config would only work with Flink's core >> options. >> >> Regarding approach 3: >> >> What do you think about a mapping like >> a) stripping the FLINK_CONFIG_ prefix, >> b) replacing every _ with a dot, >> c) replacing every __ with a hyphen, >> d) lowercasing* everything? >> >> Some examples for options that include both dots and hyphens: >> >> akka.client-socket-worker-pool.pool-size-factor => >> FLINK_CONFIG_AKKA_CLIENT__SOCKET__WORKER__POOL_POOL__SIZE__FACTOR >> >> high-availability.zookeeper.quorum => >> FLINK_CONFIG_HIGH__AVAILABILITY_ZOOKEEPER_QUORUM >> >> It's not ideal, but easy to understand assuming that dots and hyphens are >> the only special characters in config keys. >> >> Regarding the lower-casing step above: ConfigOption keys seem to be case >> sensitive internally, but I couldn't find any user-facing documentation for >> this. There should be no options that depends on this behaviour. So if I'm >> not overlooking anything, I think it should be fine to make it case >> insensitive internally when accessing the raw value of a ConfigOption. >> >> In addition, I think the FLIP should mention special cases such as >> env.java.opts that are evaluated in the bash scripts and not in the Java >> code. >> >> Cheers, >> >> Ufuk >> >> On Thu, Jan 21, 2021, at 8:57 AM, Ingo Bürk wrote: >> > Hi everyone, >> > >> > I've now started a FLIP and am opening this discussion thread. Very much >> > looking forward to your feedback! >> > >> > FLIP: https://cwiki.apache.org/confluence/x/ngtRCg >> > >> > The first big point I'd like to discuss is about the mechanism of "when" >> > the EVs (environment variables) are looked up. I'll give three >> approaches >> > here, the first of which is currently in the FLIP but very much open for >> > change, and of course I'm happy to hear about different ideas entirely >> as >> > well. >> > >> > 1) Lazy evaluation >> > >> > Only look up the EVs when an actual config key is requested from >> > Configuration(#getRawValue), possibly with the addition of caching it >> once >> > it has been looked up. >> > The main benefit here is that no a-priori knowledge of available keys is >> > required. The downside is that at no point in time we have complete >> > knowledge of the configuration. This currently only really affects >> > Configuration#keySet, but it does impose a limitation on future >> development >> > worth considering. It also changes Configuration which is not limited to >> > the Flink configuration, though this can easily be turned into an >> optional >> > feature of Configuration. >> > >> > 2) Eager evaluation with full information >> > >> > If we centrally collect all possible Flink configuration keys in >> flink-core >> > (quite a lot seem to be available already, but not all), we'd have >> complete >> > information and could eagerly evaluate the environment, the precedence >> > rules and populate the Configuration object accordingly. It also >> contains >> > the implementation entirely to GlobalConfiguration only. >> > The downside is, however, that this shifts the design a bit of having to >> > know possible keys upfront. I'm also not sure how much effort it would >> be >> > to collect all information in flink-core, or how "spread" this is across >> > the codebase. >> > >> > 3) Eager evaluation through bijective mapping >> > >> > If we deviate from the Spring-style naming of the EVs we could >> potentially >> > come up with a scheme that provides a bijection between EVs and config >> > keys. If keys are further prefixed with something like FLINK_CONFIG_ (or >> > anything to that extent), we could take all EVs with that prefix, map >> them >> > to the corresponding config key name and eagerly populate Configuration. >> > The main challenge is now defining this bijection, and we'd lose some >> > "flexibility" in the naming of the EVs, so we'd end up with something >> like >> > "$FLINK_CONFIG_s3__access_key", which arguably doesn't look very pretty. >> > >> > Happy to hear your thoughts on this! >> > >> > >> > Regards >> > Ingo >> > >> > |
In reply to this post by Ingo Bürk
LGTM. Let's see what the others think...
On Thu, Jan 21, 2021, at 11:37 AM, Ingo Bürk wrote: > Regarding env.java.opts, what special handling is needed there? AFAICT only > the rejected alternative of substituting values would've had an effect on > this. Makes sense 👍 From the FLIP: > This mapping is not strictly bijective, but cases with consecutive periods or dashes in the key name are not considered here and should not (reasonably) be allowed. We could actually enforce such restrictions in the implementation of ConfigOption, avoiding any surprises down the line. – Ufuk |
Thanks, Ufuk. I think that makes sense, so I moved it from a footnote to an
addition to prevent that in the future as well. Ingo On Fri, Jan 22, 2021 at 3:10 PM Ufuk Celebi <[hidden email]> wrote: > LGTM. Let's see what the others think... > > On Thu, Jan 21, 2021, at 11:37 AM, Ingo Bürk wrote: > > Regarding env.java.opts, what special handling is needed there? AFAICT > only > > the rejected alternative of substituting values would've had an effect on > > this. > > Makes sense 👍 > > From the FLIP: > > This mapping is not strictly bijective, but cases with consecutive > periods or dashes in the key name are not considered here and should not > (reasonably) be allowed. > > We could actually enforce such restrictions in the implementation of > ConfigOption, avoiding any surprises down the line. > > – Ufuk > |
The updated design LGTM as well. Nice work Ingo!
Cheers, Till On Fri, Jan 22, 2021 at 3:33 PM Ingo Bürk <[hidden email]> wrote: > Thanks, Ufuk. I think that makes sense, so I moved it from a footnote to an > addition to prevent that in the future as well. > > Ingo > > On Fri, Jan 22, 2021 at 3:10 PM Ufuk Celebi <[hidden email]> wrote: > > > LGTM. Let's see what the others think... > > > > On Thu, Jan 21, 2021, at 11:37 AM, Ingo Bürk wrote: > > > Regarding env.java.opts, what special handling is needed there? AFAICT > > only > > > the rejected alternative of substituting values would've had an effect > on > > > this. > > > > Makes sense 👍 > > > > From the FLIP: > > > This mapping is not strictly bijective, but cases with consecutive > > periods or dashes in the key name are not considered here and should not > > (reasonably) be allowed. > > > > We could actually enforce such restrictions in the implementation of > > ConfigOption, avoiding any surprises down the line. > > > > – Ufuk > > > |
The FLIP seems to disregard the existence of dynamic properties, and I'm
wondering why that is the case. Don't they fulfill similar roles, in that they allow config options to be set on the command-line? What use-case do they currently not support? I assume it's something along the lines of setting some environment variable for containers, but at least for those based on our docker images we already have a mechanism to support that. In any case, wouldn't a single DYNAMIC_PROPERTIES variable suffice that is automatically evaluated by all scripts? Why go through all the hassle with the environment variable names? On 1/22/2021 3:53 PM, Till Rohrmann wrote: > The updated design LGTM as well. Nice work Ingo! > > Cheers, > Till > > On Fri, Jan 22, 2021 at 3:33 PM Ingo Bürk <[hidden email]> wrote: > >> Thanks, Ufuk. I think that makes sense, so I moved it from a footnote to an >> addition to prevent that in the future as well. >> >> Ingo >> >> On Fri, Jan 22, 2021 at 3:10 PM Ufuk Celebi <[hidden email]> wrote: >> >>> LGTM. Let's see what the others think... >>> >>> On Thu, Jan 21, 2021, at 11:37 AM, Ingo Bürk wrote: >>>> Regarding env.java.opts, what special handling is needed there? AFAICT >>> only >>>> the rejected alternative of substituting values would've had an effect >> on >>>> this. >>> Makes sense 👍 >>> >>> From the FLIP: >>>> This mapping is not strictly bijective, but cases with consecutive >>> periods or dashes in the key name are not considered here and should not >>> (reasonably) be allowed. >>> >>> We could actually enforce such restrictions in the implementation of >>> ConfigOption, avoiding any surprises down the line. >>> >>> – Ufuk >>> |
Hi Chesnay,
thanks for your input! After some thought I think your proposal makes a lot of sense, i.e. if we have one single $FLINK_DYNAMIC_PROPERTIES environment variable, in a Kubernetes environment we could do something like ``` env: - name: S3_KEY valueFrom: # get from secret - name: FLINK_DYNAMIC_PROPERTIES value: -Ds3.access-key=$(S3_KEY) ``` This, much like the substitution approach we rejected previously, requires "intermediate" variables. However, they're all defined in the same scope here, and this solution certainly has the noteworthy benefit that no "magic syntax" (s3.access-key → FLINK_CONFIG_S3_ACCESS__KEY) is needed, which I believe makes for a pretty good user experience here. I think personally I prefer this approach over the approach we currently have in the FLIP, but I'm definitely happy to hear thoughts from the others on this approach! Especially maybe also regarding the test randomization point raised by Khachatryan earlier in the discussion. Regards Ingo On Fri, Jan 22, 2021 at 5:18 PM Chesnay Schepler <[hidden email]> wrote: > The FLIP seems to disregard the existence of dynamic properties, and I'm > wondering why that is the case. > Don't they fulfill similar roles, in that they allow config options to > be set on the command-line? > > What use-case do they currently not support? > I assume it's something along the lines of setting some environment > variable for containers, but at least for those based on our docker > images we already have a mechanism to support that. > > In any case, wouldn't a single DYNAMIC_PROPERTIES variable suffice that > is automatically evaluated by all scripts? > Why go through all the hassle with the environment variable names? > > On 1/22/2021 3:53 PM, Till Rohrmann wrote: > > The updated design LGTM as well. Nice work Ingo! > > > > Cheers, > > Till > > > > On Fri, Jan 22, 2021 at 3:33 PM Ingo Bürk <[hidden email]> wrote: > > > >> Thanks, Ufuk. I think that makes sense, so I moved it from a footnote > to an > >> addition to prevent that in the future as well. > >> > >> Ingo > >> > >> On Fri, Jan 22, 2021 at 3:10 PM Ufuk Celebi <[hidden email]> wrote: > >> > >>> LGTM. Let's see what the others think... > >>> > >>> On Thu, Jan 21, 2021, at 11:37 AM, Ingo Bürk wrote: > >>>> Regarding env.java.opts, what special handling is needed there? AFAICT > >>> only > >>>> the rejected alternative of substituting values would've had an effect > >> on > >>>> this. > >>> Makes sense 👍 > >>> > >>> From the FLIP: > >>>> This mapping is not strictly bijective, but cases with consecutive > >>> periods or dashes in the key name are not considered here and should > not > >>> (reasonably) be allowed. > >>> > >>> We could actually enforce such restrictions in the implementation of > >>> ConfigOption, avoiding any surprises down the line. > >>> > >>> – Ufuk > >>> > > |
The problem I see with $FLINK_PROPERTIES is that this is only supported by
Flink's Docker entrypoint.sh. In fact this variable was introduced as a temporary bridge to allow Docker users to change Flink's configuration dynamically. If we had had a proper way of configuring Flink processes via env variables, then we wouldn't have introduced it. Cheers, Till On Mon, Jan 25, 2021 at 5:01 PM Ingo Bürk <[hidden email]> wrote: > Hi Chesnay, > > thanks for your input! After some thought I think your proposal makes a > lot of sense, i.e. if we have one single $FLINK_DYNAMIC_PROPERTIES > environment variable, in a Kubernetes environment we could do something like > > ``` > env: > - name: S3_KEY > valueFrom: # get from secret > - name: FLINK_DYNAMIC_PROPERTIES > value: -Ds3.access-key=$(S3_KEY) > ``` > > This, much like the substitution approach we rejected previously, requires > "intermediate" variables. However, they're all defined in the same scope > here, and this solution certainly has the noteworthy benefit that no "magic > syntax" (s3.access-key → FLINK_CONFIG_S3_ACCESS__KEY) is needed, which I > believe makes for a pretty good user experience here. I think personally I > prefer this approach over the approach we currently have in the FLIP, but > I'm definitely happy to hear thoughts from the others on this approach! > Especially maybe also regarding the test randomization point raised > by Khachatryan earlier in the discussion. > > > Regards > Ingo > > On Fri, Jan 22, 2021 at 5:18 PM Chesnay Schepler <[hidden email]> > wrote: > >> The FLIP seems to disregard the existence of dynamic properties, and I'm >> wondering why that is the case. >> Don't they fulfill similar roles, in that they allow config options to >> be set on the command-line? >> >> What use-case do they currently not support? >> I assume it's something along the lines of setting some environment >> variable for containers, but at least for those based on our docker >> images we already have a mechanism to support that. >> >> In any case, wouldn't a single DYNAMIC_PROPERTIES variable suffice that >> is automatically evaluated by all scripts? >> Why go through all the hassle with the environment variable names? >> >> On 1/22/2021 3:53 PM, Till Rohrmann wrote: >> > The updated design LGTM as well. Nice work Ingo! >> > >> > Cheers, >> > Till >> > >> > On Fri, Jan 22, 2021 at 3:33 PM Ingo Bürk <[hidden email]> wrote: >> > >> >> Thanks, Ufuk. I think that makes sense, so I moved it from a footnote >> to an >> >> addition to prevent that in the future as well. >> >> >> >> Ingo >> >> >> >> On Fri, Jan 22, 2021 at 3:10 PM Ufuk Celebi <[hidden email]> wrote: >> >> >> >>> LGTM. Let's see what the others think... >> >>> >> >>> On Thu, Jan 21, 2021, at 11:37 AM, Ingo Bürk wrote: >> >>>> Regarding env.java.opts, what special handling is needed there? >> AFAICT >> >>> only >> >>>> the rejected alternative of substituting values would've had an >> effect >> >> on >> >>>> this. >> >>> Makes sense 👍 >> >>> >> >>> From the FLIP: >> >>>> This mapping is not strictly bijective, but cases with consecutive >> >>> periods or dashes in the key name are not considered here and should >> not >> >>> (reasonably) be allowed. >> >>> >> >>> We could actually enforce such restrictions in the implementation of >> >>> ConfigOption, avoiding any surprises down the line. >> >>> >> >>> – Ufuk >> >>> >> >> |
I agree with Till about $FLINK_PROPERTIES being only supported by a Flink
buildfile. Besides that, 1. having different ways of configuring different applications doesn't seem convenient to me. For example, Kafka and ZK configured via usual properties and Flink via concatenated one. 2. It could also be problematic to re-use the configuration from non-java apps (PyFlink?). 3. And yes, this can also be used in tests. 4. Having this logic in scripts means we have to test scripts instead of java/python I probably missed something but what are your concerns regarding the "magic syntax" (s3.access-key → FLINK_CONFIG_S3_ACCESS__KEY) you mentioned earlier? Regards, Roman On Mon, Jan 25, 2021 at 6:48 PM Till Rohrmann <[hidden email]> wrote: > The problem I see with $FLINK_PROPERTIES is that this is only supported by > Flink's Docker entrypoint.sh. In fact this variable was introduced as a > temporary bridge to allow Docker users to change Flink's configuration > dynamically. If we had had a proper way of configuring Flink processes via > env variables, then we wouldn't have introduced it. > > Cheers, > Till > > On Mon, Jan 25, 2021 at 5:01 PM Ingo Bürk <[hidden email]> wrote: > > > Hi Chesnay, > > > > thanks for your input! After some thought I think your proposal makes a > > lot of sense, i.e. if we have one single $FLINK_DYNAMIC_PROPERTIES > > environment variable, in a Kubernetes environment we could do something > like > > > > ``` > > env: > > - name: S3_KEY > > valueFrom: # get from secret > > - name: FLINK_DYNAMIC_PROPERTIES > > value: -Ds3.access-key=$(S3_KEY) > > ``` > > > > This, much like the substitution approach we rejected previously, > requires > > "intermediate" variables. However, they're all defined in the same scope > > here, and this solution certainly has the noteworthy benefit that no > "magic > > syntax" (s3.access-key → FLINK_CONFIG_S3_ACCESS__KEY) is needed, which I > > believe makes for a pretty good user experience here. I think personally > I > > prefer this approach over the approach we currently have in the FLIP, but > > I'm definitely happy to hear thoughts from the others on this approach! > > Especially maybe also regarding the test randomization point raised > > by Khachatryan earlier in the discussion. > > > > > > Regards > > Ingo > > > > On Fri, Jan 22, 2021 at 5:18 PM Chesnay Schepler <[hidden email]> > > wrote: > > > >> The FLIP seems to disregard the existence of dynamic properties, and I'm > >> wondering why that is the case. > >> Don't they fulfill similar roles, in that they allow config options to > >> be set on the command-line? > >> > >> What use-case do they currently not support? > >> I assume it's something along the lines of setting some environment > >> variable for containers, but at least for those based on our docker > >> images we already have a mechanism to support that. > >> > >> In any case, wouldn't a single DYNAMIC_PROPERTIES variable suffice that > >> is automatically evaluated by all scripts? > >> Why go through all the hassle with the environment variable names? > >> > >> On 1/22/2021 3:53 PM, Till Rohrmann wrote: > >> > The updated design LGTM as well. Nice work Ingo! > >> > > >> > Cheers, > >> > Till > >> > > >> > On Fri, Jan 22, 2021 at 3:33 PM Ingo Bürk <[hidden email]> wrote: > >> > > >> >> Thanks, Ufuk. I think that makes sense, so I moved it from a footnote > >> to an > >> >> addition to prevent that in the future as well. > >> >> > >> >> Ingo > >> >> > >> >> On Fri, Jan 22, 2021 at 3:10 PM Ufuk Celebi <[hidden email]> wrote: > >> >> > >> >>> LGTM. Let's see what the others think... > >> >>> > >> >>> On Thu, Jan 21, 2021, at 11:37 AM, Ingo Bürk wrote: > >> >>>> Regarding env.java.opts, what special handling is needed there? > >> AFAICT > >> >>> only > >> >>>> the rejected alternative of substituting values would've had an > >> effect > >> >> on > >> >>>> this. > >> >>> Makes sense 👍 > >> >>> > >> >>> From the FLIP: > >> >>>> This mapping is not strictly bijective, but cases with consecutive > >> >>> periods or dashes in the key name are not considered here and should > >> not > >> >>> (reasonably) be allowed. > >> >>> > >> >>> We could actually enforce such restrictions in the implementation of > >> >>> ConfigOption, avoiding any surprises down the line. > >> >>> > >> >>> – Ufuk > >> >>> > >> > >> > |
It appears that the discussion in FLIP-161 is primarily focussed on setting
individual configuration entries? On the other hand, $FLINK_PROPERTIES (and envsubst) were introduced to set a fragment or the complete flink-conf.yaml. It's used by at least 2 of the Flink k8s operators. In these cases the configuration is supplied by the user in the flink conf format. It would be cumbersome to translate that into a set of environment variables to then translate those back into flink-conf.yaml. I remember at the time we discussed that Flink itself could perform the transformation of the config file. That would eliminate the restriction of only being applicable to the container entry point script. Thomas On Mon, Jan 25, 2021 at 11:16 AM Khachatryan Roman < [hidden email]> wrote: > I agree with Till about $FLINK_PROPERTIES being only supported by a Flink > buildfile. > Besides that, > 1. having different ways of configuring different applications doesn't seem > convenient to me. For example, Kafka and ZK configured via usual properties > and Flink via concatenated one. > 2. It could also be problematic to re-use the configuration from non-java > apps (PyFlink?). > 3. And yes, this can also be used in tests. > 4. Having this logic in scripts means we have to test scripts instead of > java/python > > I probably missed something but what are your concerns regarding the "magic > syntax" (s3.access-key → FLINK_CONFIG_S3_ACCESS__KEY) you mentioned > earlier? > > Regards, > Roman > > > On Mon, Jan 25, 2021 at 6:48 PM Till Rohrmann <[hidden email]> > wrote: > > > The problem I see with $FLINK_PROPERTIES is that this is only supported > by > > Flink's Docker entrypoint.sh. In fact this variable was introduced as a > > temporary bridge to allow Docker users to change Flink's configuration > > dynamically. If we had had a proper way of configuring Flink processes > via > > env variables, then we wouldn't have introduced it. > > > > Cheers, > > Till > > > > On Mon, Jan 25, 2021 at 5:01 PM Ingo Bürk <[hidden email]> wrote: > > > > > Hi Chesnay, > > > > > > thanks for your input! After some thought I think your proposal makes a > > > lot of sense, i.e. if we have one single $FLINK_DYNAMIC_PROPERTIES > > > environment variable, in a Kubernetes environment we could do something > > like > > > > > > ``` > > > env: > > > - name: S3_KEY > > > valueFrom: # get from secret > > > - name: FLINK_DYNAMIC_PROPERTIES > > > value: -Ds3.access-key=$(S3_KEY) > > > ``` > > > > > > This, much like the substitution approach we rejected previously, > > requires > > > "intermediate" variables. However, they're all defined in the same > scope > > > here, and this solution certainly has the noteworthy benefit that no > > "magic > > > syntax" (s3.access-key → FLINK_CONFIG_S3_ACCESS__KEY) is needed, which > I > > > believe makes for a pretty good user experience here. I think > personally > > I > > > prefer this approach over the approach we currently have in the FLIP, > but > > > I'm definitely happy to hear thoughts from the others on this approach! > > > Especially maybe also regarding the test randomization point raised > > > by Khachatryan earlier in the discussion. > > > > > > > > > Regards > > > Ingo > > > > > > On Fri, Jan 22, 2021 at 5:18 PM Chesnay Schepler <[hidden email]> > > > wrote: > > > > > >> The FLIP seems to disregard the existence of dynamic properties, and > I'm > > >> wondering why that is the case. > > >> Don't they fulfill similar roles, in that they allow config options to > > >> be set on the command-line? > > >> > > >> What use-case do they currently not support? > > >> I assume it's something along the lines of setting some environment > > >> variable for containers, but at least for those based on our docker > > >> images we already have a mechanism to support that. > > >> > > >> In any case, wouldn't a single DYNAMIC_PROPERTIES variable suffice > that > > >> is automatically evaluated by all scripts? > > >> Why go through all the hassle with the environment variable names? > > >> > > >> On 1/22/2021 3:53 PM, Till Rohrmann wrote: > > >> > The updated design LGTM as well. Nice work Ingo! > > >> > > > >> > Cheers, > > >> > Till > > >> > > > >> > On Fri, Jan 22, 2021 at 3:33 PM Ingo Bürk <[hidden email]> > wrote: > > >> > > > >> >> Thanks, Ufuk. I think that makes sense, so I moved it from a > footnote > > >> to an > > >> >> addition to prevent that in the future as well. > > >> >> > > >> >> Ingo > > >> >> > > >> >> On Fri, Jan 22, 2021 at 3:10 PM Ufuk Celebi <[hidden email]> > wrote: > > >> >> > > >> >>> LGTM. Let's see what the others think... > > >> >>> > > >> >>> On Thu, Jan 21, 2021, at 11:37 AM, Ingo Bürk wrote: > > >> >>>> Regarding env.java.opts, what special handling is needed there? > > >> AFAICT > > >> >>> only > > >> >>>> the rejected alternative of substituting values would've had an > > >> effect > > >> >> on > > >> >>>> this. > > >> >>> Makes sense 👍 > > >> >>> > > >> >>> From the FLIP: > > >> >>>> This mapping is not strictly bijective, but cases with > consecutive > > >> >>> periods or dashes in the key name are not considered here and > should > > >> not > > >> >>> (reasonably) be allowed. > > >> >>> > > >> >>> We could actually enforce such restrictions in the implementation > of > > >> >>> ConfigOption, avoiding any surprises down the line. > > >> >>> > > >> >>> – Ufuk > > >> >>> > > >> > > >> > > > |
It is true that we haven't properly discussed whether to support bulk
config changes via a single environment variable or not, Thomas. It is also true that one could move the logic to the Flink processes (more specifically to where the config is loaded). I am somehow assuming that it is easier for (human) users to specify a single env variable for a config option instead of defining a block of Flink configuration values. I wouldn't know how to properly separate the entries of the configuration values from the top of my head (do you use spaces, commas or newline characters?). But maybe this is a wrong assumption and more of a documentation problem. I can see that for some automated tooling (e.g. K8s operators) it might be easier to not go through the translation process and simply copy what is provided to the tools (I guess `cat flink-conf.yaml`). I am interested why you haven't opted for using ConfigMaps which are mounted as the flink-conf.yaml? Do they have some limitations which did not work in your case? Cheers, Till On Tue, Jan 26, 2021 at 6:04 AM Thomas Weise <[hidden email]> wrote: > It appears that the discussion in FLIP-161 is primarily focussed on setting > individual configuration entries? > > On the other hand, $FLINK_PROPERTIES (and envsubst) were introduced to set > a fragment or the complete flink-conf.yaml. It's used by at least 2 of the > Flink k8s operators. > > In these cases the configuration is supplied by the user in the flink conf > format. It would be cumbersome to translate that into a set of environment > variables to then translate those back into flink-conf.yaml. > > I remember at the time we discussed that Flink itself could perform the > transformation of the config file. That would eliminate the restriction of > only being applicable to the container entry point script. > > Thomas > > On Mon, Jan 25, 2021 at 11:16 AM Khachatryan Roman < > [hidden email]> wrote: > > > I agree with Till about $FLINK_PROPERTIES being only supported by a Flink > > buildfile. > > Besides that, > > 1. having different ways of configuring different applications doesn't > seem > > convenient to me. For example, Kafka and ZK configured via usual > properties > > and Flink via concatenated one. > > 2. It could also be problematic to re-use the configuration from non-java > > apps (PyFlink?). > > 3. And yes, this can also be used in tests. > > 4. Having this logic in scripts means we have to test scripts instead of > > java/python > > > > I probably missed something but what are your concerns regarding the > "magic > > syntax" (s3.access-key → FLINK_CONFIG_S3_ACCESS__KEY) you mentioned > > earlier? > > > > Regards, > > Roman > > > > > > On Mon, Jan 25, 2021 at 6:48 PM Till Rohrmann <[hidden email]> > > wrote: > > > > > The problem I see with $FLINK_PROPERTIES is that this is only supported > > by > > > Flink's Docker entrypoint.sh. In fact this variable was introduced as a > > > temporary bridge to allow Docker users to change Flink's configuration > > > dynamically. If we had had a proper way of configuring Flink processes > > via > > > env variables, then we wouldn't have introduced it. > > > > > > Cheers, > > > Till > > > > > > On Mon, Jan 25, 2021 at 5:01 PM Ingo Bürk <[hidden email]> wrote: > > > > > > > Hi Chesnay, > > > > > > > > thanks for your input! After some thought I think your proposal > makes a > > > > lot of sense, i.e. if we have one single $FLINK_DYNAMIC_PROPERTIES > > > > environment variable, in a Kubernetes environment we could do > something > > > like > > > > > > > > ``` > > > > env: > > > > - name: S3_KEY > > > > valueFrom: # get from secret > > > > - name: FLINK_DYNAMIC_PROPERTIES > > > > value: -Ds3.access-key=$(S3_KEY) > > > > ``` > > > > > > > > This, much like the substitution approach we rejected previously, > > > requires > > > > "intermediate" variables. However, they're all defined in the same > > scope > > > > here, and this solution certainly has the noteworthy benefit that no > > > "magic > > > > syntax" (s3.access-key → FLINK_CONFIG_S3_ACCESS__KEY) is needed, > which > > I > > > > believe makes for a pretty good user experience here. I think > > personally > > > I > > > > prefer this approach over the approach we currently have in the FLIP, > > but > > > > I'm definitely happy to hear thoughts from the others on this > approach! > > > > Especially maybe also regarding the test randomization point raised > > > > by Khachatryan earlier in the discussion. > > > > > > > > > > > > Regards > > > > Ingo > > > > > > > > On Fri, Jan 22, 2021 at 5:18 PM Chesnay Schepler <[hidden email] > > > > > > wrote: > > > > > > > >> The FLIP seems to disregard the existence of dynamic properties, and > > I'm > > > >> wondering why that is the case. > > > >> Don't they fulfill similar roles, in that they allow config options > to > > > >> be set on the command-line? > > > >> > > > >> What use-case do they currently not support? > > > >> I assume it's something along the lines of setting some environment > > > >> variable for containers, but at least for those based on our docker > > > >> images we already have a mechanism to support that. > > > >> > > > >> In any case, wouldn't a single DYNAMIC_PROPERTIES variable suffice > > that > > > >> is automatically evaluated by all scripts? > > > >> Why go through all the hassle with the environment variable names? > > > >> > > > >> On 1/22/2021 3:53 PM, Till Rohrmann wrote: > > > >> > The updated design LGTM as well. Nice work Ingo! > > > >> > > > > >> > Cheers, > > > >> > Till > > > >> > > > > >> > On Fri, Jan 22, 2021 at 3:33 PM Ingo Bürk <[hidden email]> > > wrote: > > > >> > > > > >> >> Thanks, Ufuk. I think that makes sense, so I moved it from a > > footnote > > > >> to an > > > >> >> addition to prevent that in the future as well. > > > >> >> > > > >> >> Ingo > > > >> >> > > > >> >> On Fri, Jan 22, 2021 at 3:10 PM Ufuk Celebi <[hidden email]> > > wrote: > > > >> >> > > > >> >>> LGTM. Let's see what the others think... > > > >> >>> > > > >> >>> On Thu, Jan 21, 2021, at 11:37 AM, Ingo Bürk wrote: > > > >> >>>> Regarding env.java.opts, what special handling is needed there? > > > >> AFAICT > > > >> >>> only > > > >> >>>> the rejected alternative of substituting values would've had an > > > >> effect > > > >> >> on > > > >> >>>> this. > > > >> >>> Makes sense 👍 > > > >> >>> > > > >> >>> From the FLIP: > > > >> >>>> This mapping is not strictly bijective, but cases with > > consecutive > > > >> >>> periods or dashes in the key name are not considered here and > > should > > > >> not > > > >> >>> (reasonably) be allowed. > > > >> >>> > > > >> >>> We could actually enforce such restrictions in the > implementation > > of > > > >> >>> ConfigOption, avoiding any surprises down the line. > > > >> >>> > > > >> >>> – Ufuk > > > >> >>> > > > >> > > > >> > > > > > > |
In reply to this post by Thomas Weise
Chesnay, thanks for bringing this up. I think it's an alternative that's worthwhile to discuss, although I do think that we should stick to the current proposal in the FLIP.
My main concerns about the FLINK_PROPERTIES approach boil down to the following: * A single env var per config option is a standard language-agnostic approach to configure systems. * Having a single env var that encodes multiple config options makes it cumbersome to change individual entries. * (minor) Explicitly mapping an env var to a config key is explicit (nice!), but also results in some repetition. I don't have much experience actually deploying Flink to production, so I would definitely give higher weight to feedback from people with more experience with production Flink deployments. Cheers, Ufuk On Tue, Jan 26, 2021, at 6:03 AM, Thomas Weise wrote: > It appears that the discussion in FLIP-161 is primarily focussed on setting > individual configuration entries? > > On the other hand, $FLINK_PROPERTIES (and envsubst) were introduced to set > a fragment or the complete flink-conf.yaml. It's used by at least 2 of the > Flink k8s operators. > > In these cases the configuration is supplied by the user in the flink conf > format. It would be cumbersome to translate that into a set of environment > variables to then translate those back into flink-conf.yaml. > > I remember at the time we discussed that Flink itself could perform the > transformation of the config file. That would eliminate the restriction of > only being applicable to the container entry point script. > > Thomas > > On Mon, Jan 25, 2021 at 11:16 AM Khachatryan Roman < > [hidden email]> wrote: > > > I agree with Till about $FLINK_PROPERTIES being only supported by a Flink > > buildfile. > > Besides that, > > 1. having different ways of configuring different applications doesn't seem > > convenient to me. For example, Kafka and ZK configured via usual properties > > and Flink via concatenated one. > > 2. It could also be problematic to re-use the configuration from non-java > > apps (PyFlink?). > > 3. And yes, this can also be used in tests. > > 4. Having this logic in scripts means we have to test scripts instead of > > java/python > > > > I probably missed something but what are your concerns regarding the "magic > > syntax" (s3.access-key → FLINK_CONFIG_S3_ACCESS__KEY) you mentioned > > earlier? > > > > Regards, > > Roman > > > > > > On Mon, Jan 25, 2021 at 6:48 PM Till Rohrmann <[hidden email]> > > wrote: > > > > > The problem I see with $FLINK_PROPERTIES is that this is only supported > > by > > > Flink's Docker entrypoint.sh. In fact this variable was introduced as a > > > temporary bridge to allow Docker users to change Flink's configuration > > > dynamically. If we had had a proper way of configuring Flink processes > > via > > > env variables, then we wouldn't have introduced it. > > > > > > Cheers, > > > Till > > > > > > On Mon, Jan 25, 2021 at 5:01 PM Ingo Bürk <[hidden email]> wrote: > > > > > > > Hi Chesnay, > > > > > > > > thanks for your input! After some thought I think your proposal makes a > > > > lot of sense, i.e. if we have one single $FLINK_DYNAMIC_PROPERTIES > > > > environment variable, in a Kubernetes environment we could do something > > > like > > > > > > > > ``` > > > > env: > > > > - name: S3_KEY > > > > valueFrom: # get from secret > > > > - name: FLINK_DYNAMIC_PROPERTIES > > > > value: -Ds3.access-key=$(S3_KEY) > > > > ``` > > > > > > > > This, much like the substitution approach we rejected previously, > > > requires > > > > "intermediate" variables. However, they're all defined in the same > > scope > > > > here, and this solution certainly has the noteworthy benefit that no > > > "magic > > > > syntax" (s3.access-key → FLINK_CONFIG_S3_ACCESS__KEY) is needed, which > > I > > > > believe makes for a pretty good user experience here. I think > > personally > > > I > > > > prefer this approach over the approach we currently have in the FLIP, > > but > > > > I'm definitely happy to hear thoughts from the others on this approach! > > > > Especially maybe also regarding the test randomization point raised > > > > by Khachatryan earlier in the discussion. > > > > > > > > > > > > Regards > > > > Ingo > > > > > > > > On Fri, Jan 22, 2021 at 5:18 PM Chesnay Schepler <[hidden email]> > > > > wrote: > > > > > > > >> The FLIP seems to disregard the existence of dynamic properties, and > > I'm > > > >> wondering why that is the case. > > > >> Don't they fulfill similar roles, in that they allow config options to > > > >> be set on the command-line? > > > >> > > > >> What use-case do they currently not support? > > > >> I assume it's something along the lines of setting some environment > > > >> variable for containers, but at least for those based on our docker > > > >> images we already have a mechanism to support that. > > > >> > > > >> In any case, wouldn't a single DYNAMIC_PROPERTIES variable suffice > > that > > > >> is automatically evaluated by all scripts? > > > >> Why go through all the hassle with the environment variable names? > > > >> > > > >> On 1/22/2021 3:53 PM, Till Rohrmann wrote: > > > >> > The updated design LGTM as well. Nice work Ingo! > > > >> > > > > >> > Cheers, > > > >> > Till > > > >> > > > > >> > On Fri, Jan 22, 2021 at 3:33 PM Ingo Bürk <[hidden email]> > > wrote: > > > >> > > > > >> >> Thanks, Ufuk. I think that makes sense, so I moved it from a > > footnote > > > >> to an > > > >> >> addition to prevent that in the future as well. > > > >> >> > > > >> >> Ingo > > > >> >> > > > >> >> On Fri, Jan 22, 2021 at 3:10 PM Ufuk Celebi <[hidden email]> > > wrote: > > > >> >> > > > >> >>> LGTM. Let's see what the others think... > > > >> >>> > > > >> >>> On Thu, Jan 21, 2021, at 11:37 AM, Ingo Bürk wrote: > > > >> >>>> Regarding env.java.opts, what special handling is needed there? > > > >> AFAICT > > > >> >>> only > > > >> >>>> the rejected alternative of substituting values would've had an > > > >> effect > > > >> >> on > > > >> >>>> this. > > > >> >>> Makes sense 👍 > > > >> >>> > > > >> >>> From the FLIP: > > > >> >>>> This mapping is not strictly bijective, but cases with > > consecutive > > > >> >>> periods or dashes in the key name are not considered here and > > should > > > >> not > > > >> >>> (reasonably) be allowed. > > > >> >>> > > > >> >>> We could actually enforce such restrictions in the implementation > > of > > > >> >>> ConfigOption, avoiding any surprises down the line. > > > >> >>> > > > >> >>> – Ufuk > > > >> >>> > > > >> > > > >> > > > > > > |
I'm including Steven and Yang into this discussion (cc'd) who raised some good points in the initial DISCUSS thread. Do you have any opinion on the current discussion here?
On Tue, Jan 26, 2021, at 10:59 AM, Ufuk Celebi wrote: > Chesnay, thanks for bringing this up. I think it's an alternative that's worthwhile to discuss, although I do think that we should stick to the current proposal in the FLIP. > > My main concerns about the FLINK_PROPERTIES approach boil down to the following: > * A single env var per config option is a standard language-agnostic approach to configure systems. > * Having a single env var that encodes multiple config options makes it cumbersome to change individual entries. > * (minor) Explicitly mapping an env var to a config key is explicit (nice!), but also results in some repetition. > > I don't have much experience actually deploying Flink to production, so I would definitely give higher weight to feedback from people with more experience with production Flink deployments. > > Cheers, > > Ufuk > > On Tue, Jan 26, 2021, at 6:03 AM, Thomas Weise wrote: > > It appears that the discussion in FLIP-161 is primarily focussed on setting > > individual configuration entries? > > > > On the other hand, $FLINK_PROPERTIES (and envsubst) were introduced to set > > a fragment or the complete flink-conf.yaml. It's used by at least 2 of the > > Flink k8s operators. > > > > In these cases the configuration is supplied by the user in the flink conf > > format. It would be cumbersome to translate that into a set of environment > > variables to then translate those back into flink-conf.yaml. > > > > I remember at the time we discussed that Flink itself could perform the > > transformation of the config file. That would eliminate the restriction of > > only being applicable to the container entry point script. > > > > Thomas > > > > On Mon, Jan 25, 2021 at 11:16 AM Khachatryan Roman < > > [hidden email]> wrote: > > > > > I agree with Till about $FLINK_PROPERTIES being only supported by a Flink > > > buildfile. > > > Besides that, > > > 1. having different ways of configuring different applications doesn't seem > > > convenient to me. For example, Kafka and ZK configured via usual properties > > > and Flink via concatenated one. > > > 2. It could also be problematic to re-use the configuration from non-java > > > apps (PyFlink?). > > > 3. And yes, this can also be used in tests. > > > 4. Having this logic in scripts means we have to test scripts instead of > > > java/python > > > > > > I probably missed something but what are your concerns regarding the "magic > > > syntax" (s3.access-key → FLINK_CONFIG_S3_ACCESS__KEY) you mentioned > > > earlier? > > > > > > Regards, > > > Roman > > > > > > > > > On Mon, Jan 25, 2021 at 6:48 PM Till Rohrmann <[hidden email]> > > > wrote: > > > > > > > The problem I see with $FLINK_PROPERTIES is that this is only supported > > > by > > > > Flink's Docker entrypoint.sh. In fact this variable was introduced as a > > > > temporary bridge to allow Docker users to change Flink's configuration > > > > dynamically. If we had had a proper way of configuring Flink processes > > > via > > > > env variables, then we wouldn't have introduced it. > > > > > > > > Cheers, > > > > Till > > > > > > > > On Mon, Jan 25, 2021 at 5:01 PM Ingo Bürk <[hidden email]> wrote: > > > > > > > > > Hi Chesnay, > > > > > > > > > > thanks for your input! After some thought I think your proposal makes a > > > > > lot of sense, i.e. if we have one single $FLINK_DYNAMIC_PROPERTIES > > > > > environment variable, in a Kubernetes environment we could do something > > > > like > > > > > > > > > > ``` > > > > > env: > > > > > - name: S3_KEY > > > > > valueFrom: # get from secret > > > > > - name: FLINK_DYNAMIC_PROPERTIES > > > > > value: -Ds3.access-key=$(S3_KEY) > > > > > ``` > > > > > > > > > > This, much like the substitution approach we rejected previously, > > > > requires > > > > > "intermediate" variables. However, they're all defined in the same > > > scope > > > > > here, and this solution certainly has the noteworthy benefit that no > > > > "magic > > > > > syntax" (s3.access-key → FLINK_CONFIG_S3_ACCESS__KEY) is needed, which > > > I > > > > > believe makes for a pretty good user experience here. I think > > > personally > > > > I > > > > > prefer this approach over the approach we currently have in the FLIP, > > > but > > > > > I'm definitely happy to hear thoughts from the others on this approach! > > > > > Especially maybe also regarding the test randomization point raised > > > > > by Khachatryan earlier in the discussion. > > > > > > > > > > > > > > > Regards > > > > > Ingo > > > > > > > > > > On Fri, Jan 22, 2021 at 5:18 PM Chesnay Schepler <[hidden email]> > > > > > wrote: > > > > > > > > > >> The FLIP seems to disregard the existence of dynamic properties, and > > > I'm > > > > >> wondering why that is the case. > > > > >> Don't they fulfill similar roles, in that they allow config options to > > > > >> be set on the command-line? > > > > >> > > > > >> What use-case do they currently not support? > > > > >> I assume it's something along the lines of setting some environment > > > > >> variable for containers, but at least for those based on our docker > > > > >> images we already have a mechanism to support that. > > > > >> > > > > >> In any case, wouldn't a single DYNAMIC_PROPERTIES variable suffice > > > that > > > > >> is automatically evaluated by all scripts? > > > > >> Why go through all the hassle with the environment variable names? > > > > >> > > > > >> On 1/22/2021 3:53 PM, Till Rohrmann wrote: > > > > >> > The updated design LGTM as well. Nice work Ingo! > > > > >> > > > > > >> > Cheers, > > > > >> > Till > > > > >> > > > > > >> > On Fri, Jan 22, 2021 at 3:33 PM Ingo Bürk <[hidden email]> > > > wrote: > > > > >> > > > > > >> >> Thanks, Ufuk. I think that makes sense, so I moved it from a > > > footnote > > > > >> to an > > > > >> >> addition to prevent that in the future as well. > > > > >> >> > > > > >> >> Ingo > > > > >> >> > > > > >> >> On Fri, Jan 22, 2021 at 3:10 PM Ufuk Celebi <[hidden email]> > > > wrote: > > > > >> >> > > > > >> >>> LGTM. Let's see what the others think... > > > > >> >>> > > > > >> >>> On Thu, Jan 21, 2021, at 11:37 AM, Ingo Bürk wrote: > > > > >> >>>> Regarding env.java.opts, what special handling is needed there? > > > > >> AFAICT > > > > >> >>> only > > > > >> >>>> the rejected alternative of substituting values would've had an > > > > >> effect > > > > >> >> on > > > > >> >>>> this. > > > > >> >>> Makes sense 👍 > > > > >> >>> > > > > >> >>> From the FLIP: > > > > >> >>>> This mapping is not strictly bijective, but cases with > > > consecutive > > > > >> >>> periods or dashes in the key name are not considered here and > > > should > > > > >> not > > > > >> >>> (reasonably) be allowed. > > > > >> >>> > > > > >> >>> We could actually enforce such restrictions in the implementation > > > of > > > > >> >>> ConfigOption, avoiding any surprises down the line. > > > > >> >>> > > > > >> >>> – Ufuk > > > > >> >>> > > > > >> > > > > >> > > > > > > > > > > |
I might have joined the discussion a bit late.
My two cents: - I'm good with EVs overwriting the flink-conf.yaml. However, I'm a bit unsure about EVs overwriting the dynamic properties. - Currently, dynamic properties are expected to be the final effective configuration in many places. E.g., the active resource manager derives task managers' memory sizes from user configuration (e.g. a min-max range), and writes the derived actual sizes through dynamic properties (e.g., set min & max to the same value). Making EVs the highest priority would require changes from using dynamic properties to using EVs for all such cases. - I think dynamic properties might be easier to use than EVs, in terms of overwriting the configurations. Using dynamic properties, you can simply add `-Dkey=value` when executing `flink run`. I'm not entirely sure if there's a way to set the EVs in a equally simple way, without needing to change the image. - The naming conversions proposed in the FLIP seems not bijective to me. There could be problems if the configuration key contains underscores. - a_b -> FLINK_CONFIG_a_b - FLINK_CONFIG_a_b -> a.b Thank you~ Xintong Song On Tue, Jan 26, 2021 at 6:03 PM Ufuk Celebi <[hidden email]> wrote: > I'm including Steven and Yang into this discussion (cc'd) who raised some > good points in the initial DISCUSS thread. Do you have any opinion on the > current discussion here? > > On Tue, Jan 26, 2021, at 10:59 AM, Ufuk Celebi wrote: > > Chesnay, thanks for bringing this up. I think it's an alternative that's > worthwhile to discuss, although I do think that we should stick to the > current proposal in the FLIP. > > > > My main concerns about the FLINK_PROPERTIES approach boil down to the > following: > > * A single env var per config option is a standard language-agnostic > approach to configure systems. > > * Having a single env var that encodes multiple config options makes it > cumbersome to change individual entries. > > * (minor) Explicitly mapping an env var to a config key is explicit > (nice!), but also results in some repetition. > > > > I don't have much experience actually deploying Flink to production, so > I would definitely give higher weight to feedback from people with more > experience with production Flink deployments. > > > > Cheers, > > > > Ufuk > > > > On Tue, Jan 26, 2021, at 6:03 AM, Thomas Weise wrote: > > > It appears that the discussion in FLIP-161 is primarily focussed on > setting > > > individual configuration entries? > > > > > > On the other hand, $FLINK_PROPERTIES (and envsubst) were introduced to > set > > > a fragment or the complete flink-conf.yaml. It's used by at least 2 of > the > > > Flink k8s operators. > > > > > > In these cases the configuration is supplied by the user in the flink > conf > > > format. It would be cumbersome to translate that into a set of > environment > > > variables to then translate those back into flink-conf.yaml. > > > > > > I remember at the time we discussed that Flink itself could perform the > > > transformation of the config file. That would eliminate the > restriction of > > > only being applicable to the container entry point script. > > > > > > Thomas > > > > > > On Mon, Jan 25, 2021 at 11:16 AM Khachatryan Roman < > > > [hidden email]> wrote: > > > > > > > I agree with Till about $FLINK_PROPERTIES being only supported by a > Flink > > > > buildfile. > > > > Besides that, > > > > 1. having different ways of configuring different applications > doesn't seem > > > > convenient to me. For example, Kafka and ZK configured via usual > properties > > > > and Flink via concatenated one. > > > > 2. It could also be problematic to re-use the configuration from > non-java > > > > apps (PyFlink?). > > > > 3. And yes, this can also be used in tests. > > > > 4. Having this logic in scripts means we have to test scripts > instead of > > > > java/python > > > > > > > > I probably missed something but what are your concerns regarding the > "magic > > > > syntax" (s3.access-key → FLINK_CONFIG_S3_ACCESS__KEY) you mentioned > > > > earlier? > > > > > > > > Regards, > > > > Roman > > > > > > > > > > > > On Mon, Jan 25, 2021 at 6:48 PM Till Rohrmann <[hidden email]> > > > > wrote: > > > > > > > > > The problem I see with $FLINK_PROPERTIES is that this is only > supported > > > > by > > > > > Flink's Docker entrypoint.sh. In fact this variable was introduced > as a > > > > > temporary bridge to allow Docker users to change Flink's > configuration > > > > > dynamically. If we had had a proper way of configuring Flink > processes > > > > via > > > > > env variables, then we wouldn't have introduced it. > > > > > > > > > > Cheers, > > > > > Till > > > > > > > > > > On Mon, Jan 25, 2021 at 5:01 PM Ingo Bürk <[hidden email]> > wrote: > > > > > > > > > > > Hi Chesnay, > > > > > > > > > > > > thanks for your input! After some thought I think your proposal > makes a > > > > > > lot of sense, i.e. if we have one single > $FLINK_DYNAMIC_PROPERTIES > > > > > > environment variable, in a Kubernetes environment we could do > something > > > > > like > > > > > > > > > > > > ``` > > > > > > env: > > > > > > - name: S3_KEY > > > > > > valueFrom: # get from secret > > > > > > - name: FLINK_DYNAMIC_PROPERTIES > > > > > > value: -Ds3.access-key=$(S3_KEY) > > > > > > ``` > > > > > > > > > > > > This, much like the substitution approach we rejected previously, > > > > > requires > > > > > > "intermediate" variables. However, they're all defined in the > same > > > > scope > > > > > > here, and this solution certainly has the noteworthy benefit > that no > > > > > "magic > > > > > > syntax" (s3.access-key → FLINK_CONFIG_S3_ACCESS__KEY) is needed, > which > > > > I > > > > > > believe makes for a pretty good user experience here. I think > > > > personally > > > > > I > > > > > > prefer this approach over the approach we currently have in the > FLIP, > > > > but > > > > > > I'm definitely happy to hear thoughts from the others on this > approach! > > > > > > Especially maybe also regarding the test randomization point > raised > > > > > > by Khachatryan earlier in the discussion. > > > > > > > > > > > > > > > > > > Regards > > > > > > Ingo > > > > > > > > > > > > On Fri, Jan 22, 2021 at 5:18 PM Chesnay Schepler < > [hidden email]> > > > > > > wrote: > > > > > > > > > > > >> The FLIP seems to disregard the existence of dynamic > properties, and > > > > I'm > > > > > >> wondering why that is the case. > > > > > >> Don't they fulfill similar roles, in that they allow config > options to > > > > > >> be set on the command-line? > > > > > >> > > > > > >> What use-case do they currently not support? > > > > > >> I assume it's something along the lines of setting some > environment > > > > > >> variable for containers, but at least for those based on our > docker > > > > > >> images we already have a mechanism to support that. > > > > > >> > > > > > >> In any case, wouldn't a single DYNAMIC_PROPERTIES variable > suffice > > > > that > > > > > >> is automatically evaluated by all scripts? > > > > > >> Why go through all the hassle with the environment variable > names? > > > > > >> > > > > > >> On 1/22/2021 3:53 PM, Till Rohrmann wrote: > > > > > >> > The updated design LGTM as well. Nice work Ingo! > > > > > >> > > > > > > >> > Cheers, > > > > > >> > Till > > > > > >> > > > > > > >> > On Fri, Jan 22, 2021 at 3:33 PM Ingo Bürk <[hidden email] > > > > > > wrote: > > > > > >> > > > > > > >> >> Thanks, Ufuk. I think that makes sense, so I moved it from a > > > > footnote > > > > > >> to an > > > > > >> >> addition to prevent that in the future as well. > > > > > >> >> > > > > > >> >> Ingo > > > > > >> >> > > > > > >> >> On Fri, Jan 22, 2021 at 3:10 PM Ufuk Celebi <[hidden email]> > > > > wrote: > > > > > >> >> > > > > > >> >>> LGTM. Let's see what the others think... > > > > > >> >>> > > > > > >> >>> On Thu, Jan 21, 2021, at 11:37 AM, Ingo Bürk wrote: > > > > > >> >>>> Regarding env.java.opts, what special handling is needed > there? > > > > > >> AFAICT > > > > > >> >>> only > > > > > >> >>>> the rejected alternative of substituting values would've > had an > > > > > >> effect > > > > > >> >> on > > > > > >> >>>> this. > > > > > >> >>> Makes sense 👍 > > > > > >> >>> > > > > > >> >>> From the FLIP: > > > > > >> >>>> This mapping is not strictly bijective, but cases with > > > > consecutive > > > > > >> >>> periods or dashes in the key name are not considered here > and > > > > should > > > > > >> not > > > > > >> >>> (reasonably) be allowed. > > > > > >> >>> > > > > > >> >>> We could actually enforce such restrictions in the > implementation > > > > of > > > > > >> >>> ConfigOption, avoiding any surprises down the line. > > > > > >> >>> > > > > > >> >>> – Ufuk > > > > > >> >>> > > > > > >> > > > > > >> > > > > > > > > > > > > > > > |
@Xingtong: The assumption for the mapping was that we only have dots and hyphens in the keys. Do you have an example for a key which include underscores? If underscores are common for keys (I couldn't find any existing options that use it), it would certainly be a blocker for the discussed approach.
On Tue, Jan 26, 2021, at 11:46 AM, Xintong Song wrote: > - The naming conversions proposed in the FLIP seems not bijective to me. > There could be problems if the configuration key contains underscores. > - a_b -> FLINK_CONFIG_a_b > - FLINK_CONFIG_a_b -> a.b |
@Ufuk,
I also don't find any existing options with underscores in their keys. However, I do not find any explicit rules forbid using them either. I'm not saying this should block the FLIP. Just it would be nice to beware of this issue, and maybe ensure the assumption with test cases if we finally decide to go with these mapping rules. Thank you~ Xintong Song On Tue, Jan 26, 2021 at 7:27 PM Ufuk Celebi <[hidden email]> wrote: > @Xingtong: The assumption for the mapping was that we only have dots and > hyphens in the keys. Do you have an example for a key which include > underscores? If underscores are common for keys (I couldn't find any > existing options that use it), it would certainly be a blocker for the > discussed approach. > > On Tue, Jan 26, 2021, at 11:46 AM, Xintong Song wrote: > > - The naming conversions proposed in the FLIP seems not bijective to > me. > > There could be problems if the configuration key contains underscores. > > - a_b -> FLINK_CONFIG_a_b > > - FLINK_CONFIG_a_b -> a.b > |
I think we have to assume that some user has a custom config option that
uses underscores. That said, we can probably assume that no one uses other special characters like question marks and such, which are indeed allowed by the YAML spec. These kind of questions are precisely why I prefer the DYNAMIC_VARIABLES approach; you don't even have to worry about this stuff. The only question we'd have to answer is whether manually defined properties should take precedent or not. @Uce I can see how it could be cumbersome to modify, but at the same time you can implement whatever other approach you want on top of it: // this is a /conceptual /example for an optional setting DYNAMIC_VARIABLES="${REST_PORT_SETTING}" if _someCondition_: export REST_PORT_SETTING="-Drest.port=1234" // this is a /conceptual /example for a configurable setting DYNAMIC_VARIABLES="-Drest.port=${MY_FANCY_VARIABLE:-8200}" if _someCondition_: export MY_FANCY_VARIABLE="1234" Additionally, this makes it quite easy to audit stuff, since we can just eagerly log what DYNAMIC_VARIABLES is set to. On 1/26/2021 12:48 PM, Xintong Song wrote: > @Ufuk, > I also don't find any existing options with underscores in their keys. > However, I do not find any explicit rules forbid using them either. I'm not > saying this should block the FLIP. Just it would be nice to beware of this > issue, and maybe ensure the assumption with test cases if we finally decide > to go with these mapping rules. > > Thank you~ > > Xintong Song > > > > On Tue, Jan 26, 2021 at 7:27 PM Ufuk Celebi <[hidden email]> wrote: > >> @Xingtong: The assumption for the mapping was that we only have dots and >> hyphens in the keys. Do you have an example for a key which include >> underscores? If underscores are common for keys (I couldn't find any >> existing options that use it), it would certainly be a blocker for the >> discussed approach. >> >> On Tue, Jan 26, 2021, at 11:46 AM, Xintong Song wrote: >>> - The naming conversions proposed in the FLIP seems not bijective to >> me. >>> There could be problems if the configuration key contains underscores. >>> - a_b -> FLINK_CONFIG_a_b >>> - FLINK_CONFIG_a_b -> a.b |
Free forum by Nabble | Edit this page |