Hi all,
Now that the 1.11 release is out, it is time to plan for the next major Flink release.

Some items:

1. Dian Fu and I volunteer to be the release managers for Flink 1.12.

2. Timeline: We propose to stick to our approximate 4-month release cycle, so the release should be done by late October. Given that there’s a holiday week in China at the beginning of October, I propose to do the feature freeze on master by late September.

3. Collecting features: It would be good to have a rough overview of the features that will likely be ready to be merged by late September, and that we want in the release. Based on the discussion, we will update the Roadmap on the Flink website again!

4. Test instabilities and blockers: I would like to avoid a situation where we have many blocking issues or build instabilities at the time of the feature freeze. To achieve that, we will try to check every build instability within a week, to decide if it is a blocker (make sure to use the “test-stability” label for those tickets!). Blocker issues will need to have somebody assigned (responsible) within a week, and we want to see progress on all blocker issues (downgrade, resolution, or a good plan for how to proceed if it is more complicated).

5. Quality and stability of new features: In order to have a short feature freeze phase, we encourage developers to only merge well-tested and documented features. In our experience, the feature freeze works best if new features are complete, and the community can focus fully on addressing newly found bugs and voting on the release. By having a smooth release process, the merge window for the next release will come sooner.

Let me know what you think about our items, and share which features you want in Flink 1.12.

Best,
Robert & Dian
Hi Flink Dev Team,
Dynamic autoscaling based on the incoming data load would be a great feature.

We should be able to define rules such as: if the load increases by 20%, add extra resources. Or time-based rules such as: during these peak hours, the pipeline should scale up automatically by 50%.

This will help a lot in cost reduction.

EMR clusters provide a similar feature for Spark-based applications.

Thanks,
Prasanna.

On Wed, Jul 22, 2020 at 5:40 PM Robert Metzger <[hidden email]> wrote:
> Hi all,
> [...]
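The two kinds of rules described in this message (load-based and time-based) could be sketched roughly as follows. This is only an illustration in plain Python, not a Flink API: the function name `target_parallelism`, its parameters, and the peak-hour window are all hypothetical.

```python
import math
from datetime import time

# Hypothetical rule parameters taken from the examples in the message.
LOAD_INCREASE_THRESHOLD = 0.20  # "load increased by 20%"
PEAK_SCALE_FACTOR = 1.5         # "scale automatically by 50%"


def target_parallelism(current, baseline_load, current_load, now,
                       peak_start=time(18, 0), peak_end=time(22, 0)):
    """Suggest a parallelism (illustrative only, not a Flink API)."""
    target = current
    # Load-based rule: scale proportionally once load grew by the threshold.
    if baseline_load > 0:
        growth = (current_load - baseline_load) / baseline_load
        if growth >= LOAD_INCREASE_THRESHOLD:
            target = max(target, math.ceil(current * (1 + growth)))
    # Time-based rule: scale by a fixed factor during the peak window.
    if peak_start <= now < peak_end:
        target = max(target, math.ceil(current * PEAK_SCALE_FACTOR))
    return target
```

A real autoscaler would also need scale-down rules and some damping to avoid flapping; both are left out of this sketch.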
Thanks Robert for bringing up this discussion. This is very important to ensure that we have a smooth release process as there are only two months left before feature freeze.
It would be good to have a list of the features for 1.12 as soon as possible. Everyone is welcome to post the features that they think are important and want in 1.12.

Regards,
Dian

On Jul 23, 2020, at 12:10 AM, Prasanna kumar <[hidden email]> wrote:
> Hi Flink Dev Team,
> [...]
Hi All,
Thanks for bringing up this discussion, Robert!
Congratulations on becoming the release managers of 1.12, Dian and Robert!

----------
Here are my thoughts on the features for PyFlink in Flink 1.12:

1. Improve the usability of PyFlink
Description: Improve the usability of PyFlink, for example by helping users check the client and cluster environment, optimizing error messages, and improving the current API type hints.
Benefits: Improve the user experience.

2. PyFlink Table API DSL
Description: Support the Python Table API Expression DSL. The Expression DSL is already supported on the Java side (FLIP-55). This task aligns the Python Table API with the Java Table API.
Benefits: The Expression DSL is more user-friendly than string expressions: users can rely on IDE auto-completion when writing expressions, which increases development efficiency.

3. Python DataStream API
Description: Support DataStream applications written in Python, including stateless operations (keyBy, connect, union, map, flatMap, filter, etc.) and stateful operations (RichFunctions, ProcessFunctions, window, join).
Benefits:
1) Adding the DataStream API to PyFlink would give users a more fine-grained configuration API for tasks (such as parallelism and resource specs) and more complex data processing operations. Users have a strong demand for these, and SQL and the Table API do not support them at the moment.
2) In areas that rely little on relational operations, such as AI, users prefer transformations like map and flatMap over the Table API.

4. Support Pandas UDAF in batch GroupBy aggregation
Description: Support Pandas UDAFs in batch GroupBy aggregations of the Python Table API & SQL. Both the input and the output of the UDF are pandas.DataFrame.
Benefits:
1) Pandas UDAFs can perform more than 10x better than row-at-a-time UDAFs in certain scenarios.
2) Users can use the Pandas/NumPy API in the Python UDAF implementation if the input/output data type is pandas.DataFrame.

5. PyFlink Table API UDAF
Description: Support UDAFs in the Python Table API.
Benefits: Aggregations (stateful operations) can also be supported in PyFlink.

6. Support running PyFlink jobs on Kubernetes
Description: Support running PyFlink jobs on Kubernetes, including dependency management and so on, just like on YARN and standalone clusters.
Benefits: Kubernetes is a widely used container orchestration framework that offers more flexibility in application development and deployment.

Any comments and suggestions are welcome!

Best,
Jincheng

Dian Fu <[hidden email]> wrote on Thu, Jul 23, 2020 at 11:10 AM:
> Thanks Robert for bringing up this discussion.
> [...]
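To illustrate the benefit claimed for item 2 (PyFlink Table API DSL), here is a toy comparison of a string expression and an expression DSL built with operator overloading. It is a self-contained sketch and not PyFlink code: the `Expr` class and the `col` helper are hypothetical stand-ins written only for this example.

```python
class Expr:
    """Toy expression node (hypothetical, not the PyFlink implementation)."""

    def __init__(self, text):
        self.text = text

    def __add__(self, other):
        return Expr(f"({self.text} + {_text(other)})")

    def __gt__(self, other):
        return Expr(f"({self.text} > {_text(other)})")

    def alias(self, name):
        return Expr(f"{self.text} AS {name}")

    def __repr__(self):
        return self.text


def _text(value):
    # Render nested expressions by their text and literals via repr().
    return value.text if isinstance(value, Expr) else repr(value)


def col(name):
    """Reference a column by name."""
    return Expr(name)


# String expression: a typo is only detected when the string is parsed.
as_string = "(price + 1) AS total"

# DSL expression: methods such as alias() are visible to IDE completion,
# and a misspelled method fails immediately with AttributeError.
as_dsl = (col("price") + 1).alias("total")
print(as_dsl)
```

Both forms describe the same expression; the difference is that the DSL form is ordinary Python, so editors and linters can check it before the job is submitted.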
In reply to this post by Prasanna kumar
I'm excited to hear about this feature, very, very, very highly encouraged
Prasanna kumar <[hidden email]> wrote on Thu, Jul 23, 2020 at 12:10 AM:
> Hi Flink Dev Team,
> [...]

--
Best Regards,
Harold Miao
Hi All,
Thanks for bringing up this discussion, Robert!
Congratulations on becoming the release managers of 1.12, Dian and Robert!

----------
Here are some of my thoughts on the features for the native Kubernetes integration in Flink 1.12:

1. Support user-specified pod templates
Description: The current approach of introducing a new configuration option for each aspect of the pod specification a user might wish to customize is becoming unwieldy. We have to maintain more and more Flink-side Kubernetes configuration options, and users have to learn the gap between the declarative model used by Kubernetes and the configuration model used by Flink. It would be a great improvement to allow users to specify pod templates as the central place for all customization needs of the JobManager and TaskManager pods.
Benefits: Users can leverage many of the advanced K8s features that the Flink community does not support explicitly, such as volume mounting, DNS configuration, pod affinity/anti-affinity settings, etc.

2. Support running PyFlink on Kubernetes
Description: Support running PyFlink on Kubernetes, including session clusters and application clusters.
Benefits: Run Python applications in a containerized environment.

3. Support a built-in init container
Description: We need a built-in init container to help solve dependency management in a containerized environment, especially in application mode.
Benefits: Separate the base Flink image from dynamic dependencies.

4. Support accessing secured services via K8s Secrets
Description: Kubernetes Secrets <https://kubernetes.io/docs/concepts/configuration/secret/> can be used to provide credentials for a Flink application to access secured services. This helps people who want to use a user-specified K8s Secret through an environment variable.
Benefits: Improve the user experience.

5. Support configuring the replica count of the JobManager Deployment in ZooKeeper HA setups
Description: Make the *replica* count of the Deployment configurable in ZooKeeper HA setups.
Benefits: Achieve faster failover.

6. Support configuring a CPU limit
Description: Leverage the Kubernetes container CPU request/limit feature.
Benefits: Reduce cost.

Regards,
Canbin Zheng

Harold.Miao <[hidden email]> wrote on Thu, Jul 23, 2020 at 12:44 PM:
> I'm excited to hear about this feature, very, very, very highly encouraged
> [...]
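For item 4 (K8s Secrets), once a Secret is exposed to the pod as an environment variable, the application side can be as small as the sketch below. The variable name `DB_PASSWORD` and the helper `read_credential` are hypothetical; how the Secret is mapped to the variable depends on the Kubernetes and Flink configuration and is not shown here.

```python
import os


def read_credential(name, environ=os.environ):
    """Return the credential stored in the environment variable `name`.

    Raises a descriptive error instead of silently using an empty value.
    """
    value = environ.get(name)
    if not value:
        raise RuntimeError(f"credential {name!r} is not set; "
                           "check that the Secret is exposed as an env var")
    return value


if __name__ == "__main__":
    # DB_PASSWORD is a hypothetical variable name used for this example.
    try:
        password = read_credential("DB_PASSWORD")
        print("credential loaded, length:", len(password))
    except RuntimeError as err:
        print("startup aborted:", err)
```

Failing fast at startup keeps a missing or misnamed Secret from surfacing later as an opaque authentication error against the secured service.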
Thanks for being our release managers for the 1.12 release Dian & Robert!
Here are some features I would like to work on for this release:

# Features

## Finishing pipelined region scheduling (https://issues.apache.org/jira/browse/FLINK-16430)
With the pipelined region scheduler we want to implement a scheduler that can serve streaming and batch workloads alike while being able to run jobs under constrained resources. The latter is particularly important for bounded streaming jobs, which are currently not well supported.

## Reactive-scaling mode
Being able to react to newly available resources and rescale a running job accordingly will make Flink's operation much easier, because resources can then be controlled by an external tool (e.g. GCP autoscaling, K8s horizontal pod autoscaler, etc.). In this release we want to take a big step in this direction. As a first step we want to support the execution of jobs with a parallelism that is lower than the specified parallelism in case Flink lost a TaskManager or could not acquire enough resources.

# Maintenance/Stability

## JM / TM finished task reconciliation (https://issues.apache.org/jira/browse/FLINK-17075)
This prevents the system from going out of sync if a task state change from the TM to the JM is lost.

## Make metrics services work with Kubernetes deployments (https://issues.apache.org/jira/browse/FLINK-11127)
Invert the direction in which the MetricFetcher connects to the MetricQueryFetchers. That way it will no longer be necessary to expose a port on K8s for every TaskManager on which the MetricQueryFetcher runs. This will make the deployment of Flink clusters on K8s easier.

## Handle long-blocking operations during job submission (savepoint restore) (https://issues.apache.org/jira/browse/FLINK-16866)
Submitting a Flink job can involve interaction with external systems (blocking operations). Depending on the job, these interactions can take so long that they exceed the submission timeout, which reports a failure on the client side even though the actual submission succeeded. By decoupling the creation of the ExecutionGraph from the job submission, we can make the job submission non-blocking, which will solve this problem.

## Make IDs more intuitive to ease debugging (FLIP-118) (https://issues.apache.org/jira/browse/FLINK-15679)
By making the internal Flink IDs compositional or logging how they belong together, we can make the debugging of Flink's operations much easier.

Cheers,
Till

On Thu, Jul 23, 2020 at 7:48 AM Canbin Zheng <[hidden email]> wrote:
> Hi All,
> [...]
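The first step of the reactive-scaling mode described in this message amounts to running with whatever parallelism the available slots allow. A minimal sketch of that decision, assuming each parallel subtask needs exactly one slot; the function `effective_parallelism` and its parameters are hypothetical and not part of Flink's scheduler API.

```python
def effective_parallelism(requested, task_managers, slots_per_tm):
    """Parallelism a job could run with, given the currently available slots.

    Assumes each parallel subtask needs exactly one slot (a simplification).
    """
    available_slots = task_managers * slots_per_tm
    if available_slots < 1:
        raise ValueError("no slots available; the job cannot run")
    # Run at the requested parallelism if possible, otherwise scale down.
    return min(requested, available_slots)


# Requested parallelism 8 with 2 slots per TaskManager.
print(effective_parallelism(8, task_managers=4, slots_per_tm=2))  # 8
print(effective_parallelism(8, task_managers=3, slots_per_tm=2))  # 6, one TM lost
```

The same function covers both cases named in the message: a lost TaskManager and a cluster that never acquired enough resources in the first place.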
Hi all,
Thanks a lot for the responses so far. I've put them into this Wiki page: https://cwiki.apache.org/confluence/display/FLINK/1.12+Release to keep track of them. Ideally, post JIRA tickets for your feature, then the status will update automatically in the wiki :) Please keep posting features here, or add them to the Wiki yourself 🙏 @Prasanna kumar <[hidden email]>: Dynamic Auto Scaling is a feature request the community is well-aware of. Till has posted "Reactive-scaling mode" as a feature he's working on for the 1.12 release. This work will introduce the basic building blocks and partial support for the feature you are requesting. Proper support for dynamic scaling, while maintaining Flink's high performance (throughout, low latency) and correctness is a difficult task that needs a lot of work. It will probably take a little bit of time till this is fully available. Cheers, Robert On Thu, Jul 23, 2020 at 2:27 PM Till Rohrmann <[hidden email]> wrote: > Thanks for being our release managers for the 1.12 release Dian & Robert! > > Here are some features I would like to work on for this release: > > # Features > > ## Finishing pipelined region scheduling ( > https://issues.apache.org/jira/browse/FLINK-16430) > With the pipelined region scheduler we want to implement a scheduler which > can serve streaming as well as batch workloads alike while being able to > run jobs under constrained resources. The latter is particularly important > for bounded streaming jobs which, currently, are not well supported. > > ## Reactive-scaling mode > Being able to react to newly available resources and rescaling a running > job accordingly will make Flink's operation much easier because resources > can then be controlled by an external tool (e.g. GCP autoscaling, K8s > horizontal pod scaler, etc.). In this release we want to make a big step > towards this direction. 
> As a first step we want to support the execution of
> jobs with a parallelism which is lower than the specified parallelism in
> case Flink lost a TaskManager or could not acquire enough resources.
>
> # Maintenance/Stability
>
> ## JM / TM finished task reconciliation
> (https://issues.apache.org/jira/browse/FLINK-17075)
> This prevents the system from going out of sync if a task state change from
> the TM to the JM is lost.
>
> ## Make metrics services work with Kubernetes deployments
> (https://issues.apache.org/jira/browse/FLINK-11127)
> Invert the direction in which the MetricFetcher connects to the
> MetricQueryFetchers. That way it will no longer be necessary to expose, for
> every TaskManager on K8s, a port on which the MetricQueryFetcher runs. This
> will then make the deployment of Flink clusters on K8s easier.
>
> ## Handle long-blocking operations during job submission (savepoint
> restore) (https://issues.apache.org/jira/browse/FLINK-16866)
> Submitting a Flink job can involve interaction with external systems
> (blocking operations). Depending on the job, the interactions can take so
> long that they exceed the submission timeout, which reports a failure on the
> client side even though the actual submission succeeded. By decoupling the
> creation of the ExecutionGraph from the job submission, we can make the job
> submission non-blocking, which will solve this problem.
>
> ## Make IDs more intuitive to ease debugging (FLIP-118)
> (https://issues.apache.org/jira/browse/FLINK-15679)
> By making the internal Flink IDs compositional or logging how they belong
> together, we can make the debugging of Flink's operations much easier.
>
> Cheers,
> Till
>
>
> On Thu, Jul 23, 2020 at 7:48 AM Canbin Zheng <[hidden email]>
> wrote:
>
> > Hi All,
> >
> > Thanks for bringing up this discussion, Robert!
> > Congratulations on becoming the release managers of 1.12, Dian and Robert!
> >
> > ----------
> > Here are some of my thoughts on the features for native integration with
> > Kubernetes in Flink 1.12:
> >
> > 1. Support user-specified pod templates
> > Description:
> > The current approach of introducing new configuration options for each
> > aspect of pod specification a user might wish for is becoming unwieldy:
> > we have to maintain more and more Flink-side Kubernetes configuration
> > options, and users have to learn the gap between the declarative model
> > used by Kubernetes and the configuration model used by Flink. It would be
> > a great improvement to allow users to specify pod templates as central
> > places for all customization needs for the jobmanager and taskmanager pods.
> > Benefits:
> > Users can leverage many of the advanced K8s features that the Flink
> > community does not support explicitly, such as volume mounting, DNS
> > configuration, pod affinity/anti-affinity setting, etc.
> >
> > 2. Support running PyFlink on Kubernetes
> > Description:
> > Support running PyFlink on Kubernetes, including session cluster and
> > application cluster.
> > Benefits:
> > Running Python applications in a containerized environment.
> >
> > 3. Support built-in init-Container
> > Description:
> > We need a built-in init-Container to help solve dependency management
> > in a containerized environment, especially in the application mode.
> > Benefits:
> > Separate the base Flink image from dynamic dependencies.
> >
> > 4. Support accessing secured services via K8s secrets
> > Description:
> > Kubernetes Secrets
> > <https://kubernetes.io/docs/concepts/configuration/secret/> can be used to
> > provide credentials for a Flink application to access secured services. It
> > helps people who want to use a user-specified K8s Secret through an
> > environment variable.
> > Benefits:
> > Improve user experience.
> >
> > 5. Support configuring replica of JobManager Deployment in ZooKeeper HA
> > setups
> > Description:
> > Make the *replica* of Deployment configurable in the ZooKeeper HA
> > setups.
> > Benefits:
> > Achieve faster failover.
> >
> > 6. Support configuring a limit for the CPU requirement
> > Description:
> > Leverage the Kubernetes feature of container CPU request/limit.
> > Benefits:
> > Reduce cost.
> >
> > Regards,
> > Canbin Zheng
> >
> > Harold.Miao <[hidden email]> wrote on Thu, Jul 23, 2020 at 12:44 PM:
> >
> > > I'm excited to hear about this feature, very, very, very highly
> > > encouraged
> > >
> > > --
> > >
> > > Best Regards,
> > > Harold Miao
|
Regarding setting the feature freeze date to late September, I have some
concern that it might make the development time of 1.12 too short.

One reason for this is that we took too much time (about 1.5 months, from
mid-May to the beginning of July) for testing 1.11. That's not ideal, but
further squeezing the development time of 1.12 won't make it better.
Besides, AFAIK July and August are also a popular vacation season for
Europeans. Given that most Flink committers come from Europe, I think we
should also take this into consideration.

It's also true that the first week of October is the national holiday of
China, so I'm wondering whether the end of October could be a candidate
feature freeze date.

Best,
Kurt
|
Thanks a lot for commenting on the feature freeze date.

You are raising a few good points on the timing.
If we already have concerns regarding the deadline two months ahead of it,
then I agree that we should move it to the end of October.

We then just need to be careful not to run into the Christmas season at the
end of December.

If nobody objects within a few days, I'll update the feature freeze date in
the Wiki.
|
The end of October sounds good from my side, unless it collides with some
holidays that affect many committers.

Feature-wise, I believe we can definitely make good use of the time to wrap
up some critical threads (like finishing the FLIP-27 source efforts).

So +1 to the end of October from my side.

Best,
Stephan

On Tue, Aug 4, 2020 at 8:59 AM Robert Metzger <[hidden email]> wrote:

> Thanks a lot for commenting on the feature freeze date.
>
> You are raising a few good points on the timing.
> If we already have concerns regarding the deadline two months ahead of it,
> then I agree that we should move it to the end of October.
>
> We then just need to be careful not to run into the Christmas season at the
> end of December.
>
> If nobody objects within a few days, I'll update the feature freeze date in
> the Wiki.
>
>
> On Tue, Aug 4, 2020 at 7:52 AM Kurt Young <[hidden email]> wrote:
>
> > Regarding setting the feature freeze date to late September, I have some
> > concern that it might make the development time of 1.12 too short.
> >
> > One reason for this is that we took too much time (about 1.5 months, from
> > mid-May to the beginning of July) for testing 1.11. That's not ideal, but
> > further squeezing the development time of 1.12 won't make it better.
> > Besides, AFAIK July and August are also a popular vacation season for
> > Europeans. Given that most Flink committers come from Europe, I think we
> > should also take this into consideration.
> >
> > It's also true that the first week of October is the national holiday of
> > China, so I'm wondering whether the end of October could be a candidate
> > feature freeze date.
> >
> > Best,
> > Kurt
> >
> >
> > On Tue, Jul 28, 2020 at 2:41 AM Robert Metzger <[hidden email]>
> > wrote:
> >
> > > Hi all,
> > >
> > > Thanks a lot for the responses so far. I've put them into this Wiki page:
> > > https://cwiki.apache.org/confluence/display/FLINK/1.12+Release to keep
> > > track of them.
> > > Ideally, post JIRA tickets for your feature, then the status
> > > will update automatically in the wiki :)
> > >
> > > Please keep posting features here, or add them to the Wiki yourself 🙏
> > >
> > > @Prasanna kumar <[hidden email]>: Dynamic Auto Scaling is a feature
> > > request the community is well aware of. Till has posted
> > > "Reactive-scaling mode" as a feature he's working on for the 1.12
> > > release. This work will introduce the basic building blocks and partial
> > > support for the feature you are requesting.
> > > Proper support for dynamic scaling, while maintaining Flink's high
> > > performance (throughput, low latency) and correctness, is a difficult
> > > task that needs a lot of work. It will probably take some time until
> > > this is fully available.
> > >
> > > Cheers,
> > > Robert
> > >
> > >
> > >
> > > On Thu, Jul 23, 2020 at 2:27 PM Till Rohrmann <[hidden email]>
> > > wrote:
> > >
> > > > Thanks for being our release managers for the 1.12 release Dian &
> > > > Robert!
> > > >
> > > > Here are some features I would like to work on for this release:
> > > >
> > > > # Features
> > > >
> > > > ## Finishing pipelined region scheduling
> > > > (https://issues.apache.org/jira/browse/FLINK-16430)
> > > > With the pipelined region scheduler we want to implement a scheduler
> > > > that can serve streaming and batch workloads alike while being able to
> > > > run jobs under constrained resources. The latter is particularly
> > > > important for bounded streaming jobs which, currently, are not well
> > > > supported.
> > > >
> > > > ## Reactive-scaling mode
> > > > Being able to react to newly available resources and rescale a running
> > > > job accordingly will make Flink's operation much easier because
> > > > resources can then be controlled by an external tool (e.g. GCP
> > > > autoscaling, K8s horizontal pod scaler, etc.). In this release we want
> > > > to make a big step in this direction. As a first step we want to
> > > > support the execution of jobs with a parallelism which is lower than
> > > > the specified parallelism in case Flink lost a TaskManager or could
> > > > not acquire enough resources.
> > > >
> > > > # Maintenance/Stability
> > > >
> > > > ## JM / TM finished task reconciliation
> > > > (https://issues.apache.org/jira/browse/FLINK-17075)
> > > > This prevents the system from going out of sync if a task state change
> > > > from the TM to the JM is lost.
> > > >
> > > > ## Make metrics services work with Kubernetes deployments
> > > > (https://issues.apache.org/jira/browse/FLINK-11127)
> > > > Invert the direction in which the MetricFetcher connects to the
> > > > MetricQueryFetchers. That way it will no longer be necessary to
> > > > expose, for every TaskManager on K8s, a port on which the
> > > > MetricQueryFetcher runs. This will then make the deployment of Flink
> > > > clusters on K8s easier.
> > > >
> > > > ## Handle long-blocking operations during job submission (savepoint
> > > > restore) (https://issues.apache.org/jira/browse/FLINK-16866)
> > > > Submitting a Flink job can involve interaction with external systems
> > > > (blocking operations). Depending on the job, the interactions can take
> > > > so long that they exceed the submission timeout, which reports a
> > > > failure on the client side even though the actual submission
> > > > succeeded. By decoupling the creation of the ExecutionGraph from the
> > > > job submission, we can make the job submission non-blocking, which
> > > > will solve this problem.
> > > >
> > > > ## Make IDs more intuitive to ease debugging (FLIP-118)
> > > > (https://issues.apache.org/jira/browse/FLINK-15679)
> > > > By making the internal Flink IDs compositional or logging how they
> > > > belong together, we can make the debugging of Flink's operations much
> > > > easier.
> > > >
> > > > Cheers,
> > > > Till
> > > >
> > > >
> > > > On Thu, Jul 23, 2020 at 7:48 AM Canbin Zheng <[hidden email]>
> > > > wrote:
> > > >
> > > > > Hi All,
> > > > >
> > > > > Thanks for bringing up this discussion, Robert!
> > > > > Congratulations on becoming the release managers of 1.12, Dian and
> > > > > Robert!
> > > > >
> > > > > ----------
> > > > > Here are some of my thoughts on the features for native integration
> > > > > with Kubernetes in Flink 1.12:
> > > > >
> > > > > 1. Support user-specified pod templates
> > > > > Description:
> > > > > The current approach of introducing new configuration options for
> > > > > each aspect of pod specification a user might wish for is becoming
> > > > > unwieldy: we have to maintain more and more Flink-side Kubernetes
> > > > > configuration options, and users have to learn the gap between the
> > > > > declarative model used by Kubernetes and the configuration model
> > > > > used by Flink. It would be a great improvement to allow users to
> > > > > specify pod templates as central places for all customization needs
> > > > > for the jobmanager and taskmanager pods.
> > > > > Benefits:
> > > > > Users can leverage many of the advanced K8s features that the Flink
> > > > > community does not support explicitly, such as volume mounting, DNS
> > > > > configuration, pod affinity/anti-affinity setting, etc.
> > > > >
> > > > > 2. Support running PyFlink on Kubernetes
> > > > > Description:
> > > > > Support running PyFlink on Kubernetes, including session cluster
> > > > > and application cluster.
> > > > > Benefits: > > > > > Running python application in a containerized environment. > > > > > > > > > > 3. Support built-in init-Container > > > > > Description: > > > > > We need a built-in init-Container to help solve dependency > > > management > > > > > in a containerized environment, especially in the application mode. > > > > > Benefits: > > > > > Separate the base Flink image from dynamic dependencies. > > > > > > > > > > 4. Support accessing secured services via K8s secrets > > > > > Description: > > > > > Kubernetes Secrets > > > > > <https://kubernetes.io/docs/concepts/configuration/secret/> can be > > > used > > > > to > > > > > provide credentials for a Flink application to access secured > > services. > > > > It > > > > > helps people who want to use a user-specified K8s Secret through an > > > > > environment variable. > > > > > Benefits: > > > > > Improve user experience. > > > > > > > > > > 5. Support configuring replica of JobManager Deployment in > ZooKeeper > > HA > > > > > setups > > > > > Description: > > > > > Make the *replica* of Deployment configurable in the ZooKeeper > HA > > > > > setups. > > > > > Benefits: > > > > > Achieve faster failover. > > > > > > > > > > 6. Support to configure limit for CPU requirement > > > > > Description: > > > > > To leverage the Kubernetes feature of container request/limit > > CPU. > > > > > Benefits: > > > > > Reduce cost. > > > > > > > > > > Regards, > > > > > Canbin Zheng > > > > > > > > > > Harold.Miao <[hidden email]> 于2020年7月23日周四 下午12:44写道: > > > > > > > > > > > I'm excited to hear about this feature, very, very, very highly > > > > > encouraged > > > > > > > > > > > > > > > > > > Prasanna kumar <[hidden email]> 于2020年7月23日周四 > > > > 上午12:10写道: > > > > > > > > > > > > > Hi Flink Dev Team, > > > > > > > > > > > > > > Dynamic AutoScaling Based on the incoming data load would be a > > > great > > > > > > > feature. 
> > > > > > > > > > > > > > We should be able have some rule say If the load increased by > > 20% , > > > > add > > > > > > > extra resource should be added. > > > > > > > Or time based say during these peak hours the pipeline should > > scale > > > > > > > automatically by 50%. > > > > > > > > > > > > > > This will help a lot in cost reduction. > > > > > > > > > > > > > > EMR cluster provides a similar feature for SPARK based > > application. > > > > > > > > > > > > > > Thanks, > > > > > > > Prasanna. > > > > > > > > > > > > > > On Wed, Jul 22, 2020 at 5:40 PM Robert Metzger < > > > [hidden email]> > > > > > > > wrote: > > > > > > > > > > > > > > > Hi all, > > > > > > > > > > > > > > > > Now that the 1.11 release is out, it is time to plan for the > > next > > > > > major > > > > > > > > Flink release. > > > > > > > > > > > > > > > > Some items: > > > > > > > > > > > > > > > > 1. > > > > > > > > > > > > > > > > Dian Fu and me volunteer to be the release managers for > > Flink > > > > > 1.12. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 1. > > > > > > > > > > > > > > > > Timeline: We propose to stick to our approximate 4 month > > > release > > > > > > > cycle, > > > > > > > > thus the release should be done by late October. Given > that > > > > > there’s > > > > > > a > > > > > > > > holiday week in China at the beginning of October, I > propose > > > to > > > > do > > > > > > the > > > > > > > > feature freeze on master by late September. > > > > > > > > > > > > > > > > 2. > > > > > > > > > > > > > > > > Collecting features: It would be good to have a rough > > overview > > > > of > > > > > > the > > > > > > > > features that will likely be ready to be merged by late > > > > September, > > > > > > and > > > > > > > > that > > > > > > > > we want in the release. > > > > > > > > Based on the discussion, we will update the Roadmap on the > > > Flink > > > > > > > website > > > > > > > > again! 
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 1. > > > > > > > > > > > > > > > > Test instabilities and blockers: I would like to avoid a > > > > situation > > > > > > > where > > > > > > > > we have many blocking issues or build instabilities at the > > > time > > > > of > > > > > > the > > > > > > > > feature freeze. To achieve that, we will try to check > every > > > > build > > > > > > > > instability within a week, to decide if it is a blocker > > (make > > > > sure > > > > > > to > > > > > > > > use > > > > > > > > the “test-stability” label for those tickets!) > > > > > > > > Blocker issues will need to have somebody assigned > > > (responsible) > > > > > > > within > > > > > > > > a week, and we want to see progress on all blocker issues > > > > > > (downgrade, > > > > > > > > resolution, a good plan how to proceed if it is more > > > > complicated) > > > > > > > > > > > > > > > > 2. > > > > > > > > > > > > > > > > Quality and stability of new features: In order to have a > > > short > > > > > > > feature > > > > > > > > freeze phase, we encourage developers to only merge > > > well-tested > > > > > and > > > > > > > > documented features. In our experience, the feature freeze > > > works > > > > > > best > > > > > > > if > > > > > > > > new features are complete, and the community can focus > fully > > > on > > > > > > > > addressing > > > > > > > > newly found bugs and voting the release. > > > > > > > > By having a smooth release process, the next merge-window > > for > > > > the > > > > > > next > > > > > > > > release will come sooner. > > > > > > > > > > > > > > > > > > > > > > > > Let me know what you think about our items, and share which > > > > features > > > > > > you > > > > > > > > want in Flink 1.12. 
> > > > > > > > > > > > > > > > Best, > > > > > > > > > > > > > > > > Robert & Dian > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > > > Best Regards, > > > > > > Harold Miao > > > > > > > > > > > > > > > > > > > > > |
+1 for end of October from my side as well.
Cheers, Till |
+1 for end of October from me as well.
Cheers, Kostas
> > > > > > > > > > Based on the discussion, we will update the Roadmap on > > the > > > > > Flink > > > > > > > > > website > > > > > > > > > > again! > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 1. > > > > > > > > > > > > > > > > > > > > Test instabilities and blockers: I would like to > avoid a > > > > > > situation > > > > > > > > > where > > > > > > > > > > we have many blocking issues or build instabilities at > > the > > > > > time > > > > > > of > > > > > > > > the > > > > > > > > > > feature freeze. To achieve that, we will try to check > > > every > > > > > > build > > > > > > > > > > instability within a week, to decide if it is a > blocker > > > > (make > > > > > > sure > > > > > > > > to > > > > > > > > > > use > > > > > > > > > > the “test-stability” label for those tickets!) > > > > > > > > > > Blocker issues will need to have somebody assigned > > > > > (responsible) > > > > > > > > > within > > > > > > > > > > a week, and we want to see progress on all blocker > > issues > > > > > > > > (downgrade, > > > > > > > > > > resolution, a good plan how to proceed if it is more > > > > > > complicated) > > > > > > > > > > > > > > > > > > > > 2. > > > > > > > > > > > > > > > > > > > > Quality and stability of new features: In order to > have > > a > > > > > short > > > > > > > > > feature > > > > > > > > > > freeze phase, we encourage developers to only merge > > > > > well-tested > > > > > > > and > > > > > > > > > > documented features. In our experience, the feature > > freeze > > > > > works > > > > > > > > best > > > > > > > > > if > > > > > > > > > > new features are complete, and the community can focus > > > fully > > > > > on > > > > > > > > > > addressing > > > > > > > > > > newly found bugs and voting the release. > > > > > > > > > > By having a smooth release process, the next > > merge-window > > > > for > > > > > > the > > > > > > > > next > > > > > > > > > > release will come sooner. 
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Let me know what you think about our items, and share > which > > > > > > features > > > > > > > > you > > > > > > > > > > want in Flink 1.12. > > > > > > > > > > > > > > > > > > > > Best, > > > > > > > > > > > > > > > > > > > > Robert & Dian > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > > > > > > > Best Regards, > > > > > > > > Harold Miao > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > |
+1 for end of October from me as well.
Best, Jincheng

On Wed, Aug 5, 2020 at 4:59 PM Kostas Kloudas <[hidden email]> wrote:
> +1 for end of October from me as well.
>
> Cheers,
> Kostas
|
I'm a bit concerned about end of October, because it means we have Flink Forward, which usually means at least 1 week of little-to-no activity, and then 1 week until feature-freeze.

On 05/08/2020 11:56, jincheng sun wrote:
> +1 for end of October from me as well.
>
> Best,
> Jincheng
To achieve that, we will try to >> check >>>>> every >>>>>>>> build >>>>>>>>>>>> instability within a week, to decide if it is a >>> blocker >>>>>> (make >>>>>>>> sure >>>>>>>>>> to >>>>>>>>>>>> use >>>>>>>>>>>> the “test-stability” label for those tickets!) >>>>>>>>>>>> Blocker issues will need to have somebody assigned >>>>>>> (responsible) >>>>>>>>>>> within >>>>>>>>>>>> a week, and we want to see progress on all blocker >>>> issues >>>>>>>>>> (downgrade, >>>>>>>>>>>> resolution, a good plan how to proceed if it is more >>>>>>>> complicated) >>>>>>>>>>>> 2. >>>>>>>>>>>> >>>>>>>>>>>> Quality and stability of new features: In order to >>> have >>>> a >>>>>>> short >>>>>>>>>>> feature >>>>>>>>>>>> freeze phase, we encourage developers to only merge >>>>>>> well-tested >>>>>>>>> and >>>>>>>>>>>> documented features. In our experience, the feature >>>> freeze >>>>>>> works >>>>>>>>>> best >>>>>>>>>>> if >>>>>>>>>>>> new features are complete, and the community can >> focus >>>>> fully >>>>>>> on >>>>>>>>>>>> addressing >>>>>>>>>>>> newly found bugs and voting the release. >>>>>>>>>>>> By having a smooth release process, the next >>>> merge-window >>>>>> for >>>>>>>> the >>>>>>>>>> next >>>>>>>>>>>> release will come sooner. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Let me know what you think about our items, and share >>> which >>>>>>>> features >>>>>>>>>> you >>>>>>>>>>>> want in Flink 1.12. >>>>>>>>>>>> >>>>>>>>>>>> Best, >>>>>>>>>>>> >>>>>>>>>>>> Robert & Dian >>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> >>>>>>>>>> Best Regards, >>>>>>>>>> Harold Miao >>>>>>>>>> |
Thanks all for your opinions.
@Chesnay: That is a risk, but I hope the people responsible for individual FLIPs plan accordingly. Extending the time until the feature freeze should not mean that we are extending the scope of the release. Ideally, features are done before the feature freeze, and contributors use the remaining time until the freeze for additional testing and documentation polishing. This Flink Forward will be virtual, so there should be less disruption than with a physical conference and all the travelling. Do you have a different proposal for the timing?
I'm currently considering splitting the feature freeze and the release branch creation. Similar to Linux kernel development, we could have a "merge window" and a stabilization phase. At the end of the stabilization phase, we cut the release branch and open the next merge window. (I'll start a separate thread on this towards the end of this release cycle, if I still like the idea then.) |
+1 for extending feature freeze date to end of October.
Feature development in the master branch could be unblocked through creating the release branch, but every coin has its two sides (smile) Best Regards, Yu On Wed, 5 Aug 2020 at 20:12, Robert Metzger <[hidden email]> wrote: > Thanks all for your opinion. > > @Chesnay: That is a risk, but I hope the people responsible for individual > FLIPs plan accordingly. Extending the time till the feature freeze should > not mean that we are extending the scope of the release. > Ideally, features are done before FF, and they use the time till the freeze > for additional testing and documentation polishing. > This FF will be virtual, there should be less disruption than a physical > conference with all the travelling. > Do you have a different proposal for the timing? > > > I'm currently considering splitting the feature freeze and the release > branch creation. Similar to the Linux kernel development, we could have a > "merge window" and a stabilization phase. At the end of the stabilization > phase, we cut the release branch and open the next merge window (I'll start > a separate thread regarding this towards the end of this release cycle, if > I still like the idea then) > > > On Wed, Aug 5, 2020 at 12:04 PM Chesnay Schepler <[hidden email]> > wrote: > > > I'm a bit concerned about end of October, because it means we have Flink > > forward, which usually means at least 1 week of little-to-no activity, > > and then 1 week until feature-freeze. > > > > On 05/08/2020 11:56, jincheng sun wrote: > > > +1 for end of October from me as well. > > > > > > Best, > > > Jincheng > > > > > > > > > Kostas Kloudas <[hidden email]> 于2020年8月5日周三 下午4:59写道: > > > > > >> +1 for end of October from me as well. > > >> > > >> Cheers, > > >> Kostas > > >> > > >> On Wed, Aug 5, 2020 at 9:59 AM Till Rohrmann <[hidden email]> > > wrote: > > >> > > >>> +1 for end of October from my side as well. 
> > >>> > > >>> Cheers, > > >>> Till > > >>> > > >>> On Tue, Aug 4, 2020 at 9:46 PM Stephan Ewen <[hidden email]> > wrote: > > >>> > > >>>> The end of October sounds good from my side, unless it collides with > > >> some > > >>>> holidays that affect many committers. > > >>>> > > >>>> Feature-wise, I believe we can definitely make good use of the time > to > > >>> wrap > > >>>> up some critical threads (like finishing the FLIP-27 source > efforts). > > >>>> > > >>>> So +1 to the end of October from my side. > > >>>> > > >>>> Best, > > >>>> Stephan > > >>>> > > >>>> > > >>>> On Tue, Aug 4, 2020 at 8:59 AM Robert Metzger <[hidden email]> > > >>> wrote: > > >>>>> Thanks a lot for commenting on the feature freeze date. > > >>>>> > > >>>>> You are raising a few good points on the timing. > > >>>>> If we have already (2 months before) concerns regarding the > deadline, > > >>>> then > > >>>>> I agree that we should move it till the end of October. > > >>>>> > > >>>>> We then just need to be careful not to run into the Christmas > season > > >> at > > >>>> the > > >>>>> end of December. > > >>>>> > > >>>>> If nobody objects within a few days, I'll update the feature freeze > > >>> date > > >>>> in > > >>>>> the Wiki. > > >>>>> > > >>>>> > > >>>>> On Tue, Aug 4, 2020 at 7:52 AM Kurt Young <[hidden email]> > wrote: > > >>>>> > > >>>>>> Regarding setting the feature freeze date to late September, I > have > > >>>> some > > >>>>>> concern that it might make > > >>>>>> the development time of 1.12 too short. > > >>>>>> > > >>>>>> One reason for this is we took too much time (about 1.5 month, > from > > >>> mid > > >>>>> of > > >>>>>> May to beginning of July) > > >>>>>> for testing 1.11. It's not ideal but further squeeze the > > >> development > > >>>> time > > >>>>>> of 1.12 won't make this better. > > >>>>>> Besides, AFAIK July & August is also a popular vacation season > for > > >>>>>> European. 
Given the fact most > > >>>>>> committers of Flink come from Europe, I think we should also > take > > >>> this > > >>>>>> into consideration. > > >>>>>> > > >>>>>> It's also true that the first week of October is the national > > >> holiday > > >>>> of > > >>>>>> China, so I'm wondering whether the > > >>>>>> end of October could be a candidate feature freeze date. > > >>>>>> > > >>>>>> Best, > > >>>>>> Kurt > > >>>>>> > > >>>>>> > > >>>>>> On Tue, Jul 28, 2020 at 2:41 AM Robert Metzger < > > >> [hidden email]> > > >>>>>> wrote: > > >>>>>> > > >>>>>>> Hi all, > > >>>>>>> > > >>>>>>> Thanks a lot for the responses so far. I've put them into this > > >> Wiki > > >>>>> page: > > >>>>>>> https://cwiki.apache.org/confluence/display/FLINK/1.12+Release > > >> to > > >>>> keep > > >>>>>>> track of them. Ideally, post JIRA tickets for your feature, then > > >>> the > > >>>>>> status > > >>>>>>> will update automatically in the wiki :) > > >>>>>>> > > >>>>>>> Please keep posting features here, or add them to the Wiki > > >> yourself > > >>>> 🙏 > > >>>>>>> @Prasanna kumar <[hidden email]>: Dynamic Auto > > >>>> Scaling > > >>>>>> is a > > >>>>>>> feature request the community is well-aware of. Till has posted > > >>>>>>> "Reactive-scaling mode" as a feature he's working on for the 1.12 > > >>>>>> release. > > >>>>>>> This work will introduce the basic building blocks and partial > > >>>> support > > >>>>>> for > > >>>>>>> the feature you are requesting. > > >>>>>>> Proper support for dynamic scaling, while maintaining Flink's > > >> high > > >>>>>>> performance (throughout, low latency) and correctness is a > > >>> difficult > > >>>>> task > > >>>>>>> that needs a lot of work. It will probably take a little bit of > > >>> time > > >>>>> till > > >>>>>>> this is fully available. 
> > >>>>>>> > > >>>>>>> Cheers, > > >>>>>>> Robert > > >>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>>> On Thu, Jul 23, 2020 at 2:27 PM Till Rohrmann < > > >>> [hidden email]> > > >>>>>>> wrote: > > >>>>>>> > > >>>>>>>> Thanks for being our release managers for the 1.12 release > > >> Dian & > > >>>>>> Robert! > > >>>>>>>> Here are some features I would like to work on for this > > >> release: > > >>>>>>>> # Features > > >>>>>>>> > > >>>>>>>> ## Finishing pipelined region scheduling ( > > >>>>>>>> https://issues.apache.org/jira/browse/FLINK-16430) > > >>>>>>>> With the pipelined region scheduler we want to implement a > > >>>> scheduler > > >>>>>>> which > > >>>>>>>> can serve streaming as well as batch workloads alike while > > >> being > > >>>> able > > >>>>>> to > > >>>>>>>> run jobs under constrained resources. The latter is > > >> particularly > > >>>>>>> important > > >>>>>>>> for bounded streaming jobs which, currently, are not well > > >>>> supported. > > >>>>>>>> ## Reactive-scaling mode > > >>>>>>>> Being able to react to newly available resources and rescaling > > >> a > > >>>>>> running > > >>>>>>>> job accordingly will make Flink's operation much easier because > > >>>>>> resources > > >>>>>>>> can then be controlled by an external tool (e.g. GCP > > >> autoscaling, > > >>>> K8s > > >>>>>>>> horizontal pod scaler, etc.). In this release we want to make a > > >>> big > > >>>>>> step > > >>>>>>>> towards this direction. As a first step we want to support the > > >>>>>> execution > > >>>>>>> of > > >>>>>>>> jobs with a parallelism which is lower than the specified > > >>>> parallelism > > >>>>>> in > > >>>>>>>> case that Flink lost a TaskManager or could not acquire enough > > >>>>>> resources. 
> > >>>>>>>> # Maintenance/Stability > > >>>>>>>> > > >>>>>>>> ## JM / TM finished task reconciliation ( > > >>>>>>>> https://issues.apache.org/jira/browse/FLINK-17075) > > >>>>>>>> This prevents the system from going out of sync if a task state > > >>>>> change > > >>>>>>> from > > >>>>>>>> the TM to the JM is lost. > > >>>>>>>> > > >>>>>>>> ## Make metrics services work with Kubernetes deployments ( > > >>>>>>>> https://issues.apache.org/jira/browse/FLINK-11127) > > >>>>>>>> Invert the direction in which the MetricFetcher connects to the > > >>>>>>>> MetricQueryFetchers. That way it will no longer be necessary to > > >>>>> expose > > >>>>>> on > > >>>>>>>> K8s for every TaskManager a port on which the > > >> MetricQueryFetcher > > >>>>> runs. > > >>>>>>> This > > >>>>>>>> will then make the deployment of Flink clusters on K8s easier. > > >>>>>>>> > > >>>>>>>> ## Handle long-blocking operations during job submission > > >>> (savepoint > > >>>>>>>> restore) (https://issues.apache.org/jira/browse/FLINK-16866) > > >>>>>>>> Submitting a Flink job can involve the interaction with > > >> external > > >>>>>> systems > > >>>>>>>> (blocking operations). Depending on the job the interactions > > >> can > > >>>> take > > >>>>>> so > > >>>>>>>> long that it exceeds the submission timeout which reports a > > >>> failure > > >>>>> on > > >>>>>>> the > > >>>>>>>> client side even though the actual submission succeeded. By > > >>>>> decoupling > > >>>>>>> the > > >>>>>>>> creation of the ExecutionGraph from the job submission, we can > > >>> make > > >>>>> the > > >>>>>>> job > > >>>>>>>> submission non-blocking which will solve this problem. 
> > >>>>>>>> > > >>>>>>>> ## Make IDs more intuitive to ease debugging (FLIP-118) ( > > >>>>>>>> https://issues.apache.org/jira/browse/FLINK-15679) > > >>>>>>>> By making the internal Flink IDs compositional or logging how > > >>> they > > >>>>>> belong > > >>>>>>>> together, we can make the debugging of Flink's operations much > > >>>>> easier. > > >>>>>>>> Cheers, > > >>>>>>>> Till > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> On Thu, Jul 23, 2020 at 7:48 AM Canbin Zheng < > > >>>> [hidden email] > > >>>>>>>> wrote: > > >>>>>>>> > > >>>>>>>>> Hi All, > > >>>>>>>>> > > >>>>>>>>> Thanks for bring-up this discussion, Robert! > > >>>>>>>>> Congratulations on becoming the release manager of 1.12, Dian > > >>> and > > >>>>>>> Robert > > >>>>>>>> ! > > >>>>>>>>> ---------- > > >>>>>>>>> Here are some of my thoughts of the features for native > > >>>> integration > > >>>>>>> with > > >>>>>>>>> Kubernetes in Flink 1.12: > > >>>>>>>>> > > >>>>>>>>> 1. Support user-specified pod templates > > >>>>>>>>> Description: > > >>>>>>>>> The current approach of introducing new configuration > > >>> options > > >>>>> for > > >>>>>>>> each > > >>>>>>>>> aspect of pod specification a user might wish is becoming > > >>>> unwieldy, > > >>>>>> we > > >>>>>>>> have > > >>>>>>>>> to maintain more and more Flink side Kubernetes configuration > > >>>>> options > > >>>>>>> and > > >>>>>>>>> users have to learn the gap between the declarative model > > >> used > > >>> by > > >>>>>>>>> Kubernetes and the configuration model used by Flink. It's a > > >>>> great > > >>>>>>>>> improvement to allow users to specify pod templates as > > >> central > > >>>>> places > > >>>>>>> for > > >>>>>>>>> all customization needs for the jobmanager and taskmanager > > >>> pods. 
> > >>>>>>>>> Benefits: > > >>>>>>>>> Users can leverage many of the advanced K8s features that > > >>> the > > >>>>>> Flink > > >>>>>>>>> community does not support explicitly, such as volume > > >> mounting, > > >>>> DNS > > >>>>>>>>> configuration, pod affinity/anti-affinity setting, etc. > > >>>>>>>>> > > >>>>>>>>> 2. Support running PyFlink on Kubernetes > > >>>>>>>>> Description: > > >>>>>>>>> Support running PyFlink on Kubernetes, including session > > >>>>> cluster > > >>>>>>> and > > >>>>>>>>> application cluster. > > >>>>>>>>> Benefits: > > >>>>>>>>> Running python application in a containerized > > >> environment. > > >>>>>>>>> 3. Support built-in init-Container > > >>>>>>>>> Description: > > >>>>>>>>> We need a built-in init-Container to help solve > > >> dependency > > >>>>>>> management > > >>>>>>>>> in a containerized environment, especially in the application > > >>>> mode. > > >>>>>>>>> Benefits: > > >>>>>>>>> Separate the base Flink image from dynamic dependencies. > > >>>>>>>>> > > >>>>>>>>> 4. Support accessing secured services via K8s secrets > > >>>>>>>>> Description: > > >>>>>>>>> Kubernetes Secrets > > >>>>>>>>> <https://kubernetes.io/docs/concepts/configuration/secret/> > > >>> can > > >>>> be > > >>>>>>> used > > >>>>>>>> to > > >>>>>>>>> provide credentials for a Flink application to access secured > > >>>>>> services. > > >>>>>>>> It > > >>>>>>>>> helps people who want to use a user-specified K8s Secret > > >>> through > > >>>> an > > >>>>>>>>> environment variable. > > >>>>>>>>> Benefits: > > >>>>>>>>> Improve user experience. > > >>>>>>>>> > > >>>>>>>>> 5. Support configuring replica of JobManager Deployment in > > >>>>> ZooKeeper > > >>>>>> HA > > >>>>>>>>> setups > > >>>>>>>>> Description: > > >>>>>>>>> Make the *replica* of Deployment configurable in the > > >>>> ZooKeeper > > >>>>> HA > > >>>>>>>>> setups. > > >>>>>>>>> Benefits: > > >>>>>>>>> Achieve faster failover. > > >>>>>>>>> > > >>>>>>>>> 6. 
Support to configure limit for CPU requirement > > >>>>>>>>> Description: > > >>>>>>>>> To leverage the Kubernetes feature of container > > >>> request/limit > > >>>>>> CPU. > > >>>>>>>>> Benefits: > > >>>>>>>>> Reduce cost. > > >>>>>>>>> > > >>>>>>>>> Regards, > > >>>>>>>>> Canbin Zheng > > >>>>>>>>> > > >>>>>>>>> On Thu, Jul 23, 2020 at 12:44 PM Harold.Miao <[hidden email]> wrote: > > >>>>>>>>> > > >>>>>>>>>> I'm excited to hear about this feature, very, very, very > > >>>> highly > > >>>>>>>>> encouraged > > >>>>>>>>>> > > >>>>>>>>>> On Thu, Jul 23, 2020 at 12:10 AM Prasanna kumar <[hidden email]> wrote: > > >>>>>>>>>>> Hi Flink Dev Team, > > >>>>>>>>>>> > > >>>>>>>>>>> Dynamic AutoScaling Based on the incoming data load would > > >>> be > > >>>> a > > >>>>>>> great > > >>>>>>>>>>> feature. > > >>>>>>>>>>> > > >>>>>>>>>>> We should be able have some rule say If the load > > >> increased > > >>> by > > >>>>>> 20% , > > >>>>>>>> add > > >>>>>>>>>>> extra resource should be added. > > >>>>>>>>>>> Or time based say during these peak hours the pipeline > > >>> should > > >>>>>> scale > > >>>>>>>>>>> automatically by 50%. > > >>>>>>>>>>> > > >>>>>>>>>>> This will help a lot in cost reduction. > > >>>>>>>>>>> > > >>>>>>>>>>> EMR cluster provides a similar feature for SPARK based > > >>>>>> application. > > >>>>>>>>>>> Thanks, > > >>>>>>>>>>> Prasanna. > > >>>>>>>>>>> > > >>>>>>>>>>> On Wed, Jul 22, 2020 at 5:40 PM Robert Metzger < > > >>>>>>> [hidden email]> > > >>>>>>>>>>> wrote: > > >>>>>>>>>>> > > >>>>>>>>>>>> Hi all, > > >>>>>>>>>>>> > > >>>>>>>>>>>> Now that the 1.11 release is out, it is time to plan > > >> for > > >>>> the > > >>>>>> next > > >>>>>>>>> major > > >>>>>>>>>>>> Flink release. > > >>>>>>>>>>>> > > >>>>>>>>>>>> Some items: > > >>>>>>>>>>>> > > >>>>>>>>>>>> 1. > > >>>>>>>>>>>> > > >>>>>>>>>>>> Dian Fu and me volunteer to be the release managers > > >>> for > > >>>>>> Flink > > >>>>>>>>> 1.12. > > >>>>>>>>>>>> > > >>>>>>>>>>>> > > >>>>>>>>>>>> 1.
> > >>>>>>>>>>>> > > >>>>>>>>>>>> Timeline: We propose to stick to our approximate 4 > > >>> month > > >>>>>>> release > > >>>>>>>>>>> cycle, > > >>>>>>>>>>>> thus the release should be done by late October. > > >> Given > > >>>>> that > > >>>>>>>>> there’s > > >>>>>>>>>> a > > >>>>>>>>>>>> holiday week in China at the beginning of October, I > > >>>>> propose > > >>>>>>> to > > >>>>>>>> do > > >>>>>>>>>> the > > >>>>>>>>>>>> feature freeze on master by late September. > > >>>>>>>>>>>> > > >>>>>>>>>>>> 2. > > >>>>>>>>>>>> > > >>>>>>>>>>>> Collecting features: It would be good to have a > > >> rough > > >>>>>> overview > > >>>>>>>> of > > >>>>>>>>>> the > > >>>>>>>>>>>> features that will likely be ready to be merged by > > >>> late > > >>>>>>>> September, > > >>>>>>>>>> and > > >>>>>>>>>>>> that > > >>>>>>>>>>>> we want in the release. > > >>>>>>>>>>>> Based on the discussion, we will update the Roadmap > > >> on > > >>>> the > > >>>>>>> Flink > > >>>>>>>>>>> website > > >>>>>>>>>>>> again! > > >>>>>>>>>>>> > > >>>>>>>>>>>> > > >>>>>>>>>>>> > > >>>>>>>>>>>> 1. > > >>>>>>>>>>>> > > >>>>>>>>>>>> Test instabilities and blockers: I would like to > > >>> avoid a > > >>>>>>>> situation > > >>>>>>>>>>> where > > >>>>>>>>>>>> we have many blocking issues or build instabilities > > >> at > > >>>> the > > >>>>>>> time > > >>>>>>>> of > > >>>>>>>>>> the > > >>>>>>>>>>>> feature freeze. To achieve that, we will try to > > >> check > > >>>>> every > > >>>>>>>> build > > >>>>>>>>>>>> instability within a week, to decide if it is a > > >>> blocker > > >>>>>> (make > > >>>>>>>> sure > > >>>>>>>>>> to > > >>>>>>>>>>>> use > > >>>>>>>>>>>> the “test-stability” label for those tickets!) 
> > >>>>>>>>>>>> Blocker issues will need to have somebody assigned > > >>>>>>> (responsible) > > >>>>>>>>>>> within > > >>>>>>>>>>>> a week, and we want to see progress on all blocker > > >>>> issues > > >>>>>>>>>> (downgrade, > > >>>>>>>>>>>> resolution, a good plan how to proceed if it is more > > >>>>>>>> complicated) > > >>>>>>>>>>>> 2. > > >>>>>>>>>>>> > > >>>>>>>>>>>> Quality and stability of new features: In order to > > >>> have > > >>>> a > > >>>>>>> short > > >>>>>>>>>>> feature > > >>>>>>>>>>>> freeze phase, we encourage developers to only merge > > >>>>>>> well-tested > > >>>>>>>>> and > > >>>>>>>>>>>> documented features. In our experience, the feature > > >>>> freeze > > >>>>>>> works > > >>>>>>>>>> best > > >>>>>>>>>>> if > > >>>>>>>>>>>> new features are complete, and the community can > > >> focus > > >>>>> fully > > >>>>>>> on > > >>>>>>>>>>>> addressing > > >>>>>>>>>>>> newly found bugs and voting the release. > > >>>>>>>>>>>> By having a smooth release process, the next > > >>>> merge-window > > >>>>>> for > > >>>>>>>> the > > >>>>>>>>>> next > > >>>>>>>>>>>> release will come sooner. > > >>>>>>>>>>>> > > >>>>>>>>>>>> > > >>>>>>>>>>>> Let me know what you think about our items, and share > > >>> which > > >>>>>>>> features > > >>>>>>>>>> you > > >>>>>>>>>>>> want in Flink 1.12. > > >>>>>>>>>>>> > > >>>>>>>>>>>> Best, > > >>>>>>>>>>>> > > >>>>>>>>>>>> Robert & Dian > > >>>>>>>>>>>> > > >>>>>>>>>> > > >>>>>>>>>> -- > > >>>>>>>>>> > > >>>>>>>>>> Best Regards, > > >>>>>>>>>> Harold Miao > > >>>>>>>>>> > > > > > |
+1
> +1 for extending the feature freeze date to the end of October.

On Thu, Aug 6, 2020 at 12:08 PM Yu Li <[hidden email]> wrote:
> +1 for extending feature freeze date to end of October.
>
> Feature development in the master branch could be unblocked through creating the release branch, but every coin has its two sides (smile)
>
> Best Regards,
> Yu |
+1 from my side for a feature freeze date at the end of October.
------------------------------------------------------------------
From: Yuan Mei <[hidden email]>
Send Time: Thursday, August 6, 2020, 14:54
To: dev <[hidden email]>
Subject: Re: [DISCUSS] Planning Flink 1.12

+1

> +1 for extending the feature freeze date to the end of October.
Support user-specified pod templates > > > >>>>>>>>> Description: > > > >>>>>>>>> The current approach of introducing new configuration > > > >>> options > > > >>>>> for > > > >>>>>>>> each > > > >>>>>>>>> aspect of pod specification a user might wish is becoming > > > >>>> unwieldy, > > > >>>>>> we > > > >>>>>>>> have > > > >>>>>>>>> to maintain more and more Flink side Kubernetes configuration > > > >>>>> options > > > >>>>>>> and > > > >>>>>>>>> users have to learn the gap between the declarative model > > > >> used > > > >>> by > > > >>>>>>>>> Kubernetes and the configuration model used by Flink. It's a > > > >>>> great > > > >>>>>>>>> improvement to allow users to specify pod templates as > > > >> central > > > >>>>> places > > > >>>>>>> for > > > >>>>>>>>> all customization needs for the jobmanager and taskmanager > > > >>> pods. > > > >>>>>>>>> Benefits: > > > >>>>>>>>> Users can leverage many of the advanced K8s features > that > > > >>> the > > > >>>>>> Flink > > > >>>>>>>>> community does not support explicitly, such as volume > > > >> mounting, > > > >>>> DNS > > > >>>>>>>>> configuration, pod affinity/anti-affinity setting, etc. > > > >>>>>>>>> > > > >>>>>>>>> 2. Support running PyFlink on Kubernetes > > > >>>>>>>>> Description: > > > >>>>>>>>> Support running PyFlink on Kubernetes, including session > > > >>>>> cluster > > > >>>>>>> and > > > >>>>>>>>> application cluster. > > > >>>>>>>>> Benefits: > > > >>>>>>>>> Running python application in a containerized > > > >> environment. > > > >>>>>>>>> 3. Support built-in init-Container > > > >>>>>>>>> Description: > > > >>>>>>>>> We need a built-in init-Container to help solve > > > >> dependency > > > >>>>>>> management > > > >>>>>>>>> in a containerized environment, especially in the application > > > >>>> mode. > > > >>>>>>>>> Benefits: > > > >>>>>>>>> Separate the base Flink image from dynamic dependencies. > > > >>>>>>>>> > > > >>>>>>>>> 4. 
Support accessing secured services via K8s secrets > > > >>>>>>>>> Description: > > > >>>>>>>>> Kubernetes Secrets > > > >>>>>>>>> <https://kubernetes.io/docs/concepts/configuration/secret/> > > > >>> can > > > >>>> be > > > >>>>>>> used > > > >>>>>>>> to > > > >>>>>>>>> provide credentials for a Flink application to access secured > > > >>>>>> services. > > > >>>>>>>> It > > > >>>>>>>>> helps people who want to use a user-specified K8s Secret > > > >>> through > > > >>>> an > > > >>>>>>>>> environment variable. > > > >>>>>>>>> Benefits: > > > >>>>>>>>> Improve user experience. > > > >>>>>>>>> > > > >>>>>>>>> 5. Support configuring replica of JobManager Deployment in > > > >>>>> ZooKeeper > > > >>>>>> HA > > > >>>>>>>>> setups > > > >>>>>>>>> Description: > > > >>>>>>>>> Make the *replica* of Deployment configurable in the > > > >>>> ZooKeeper > > > >>>>> HA > > > >>>>>>>>> setups. > > > >>>>>>>>> Benefits: > > > >>>>>>>>> Achieve faster failover. > > > >>>>>>>>> > > > >>>>>>>>> 6. Support to configure limit for CPU requirement > > > >>>>>>>>> Description: > > > >>>>>>>>> To leverage the Kubernetes feature of container > > > >>> request/limit > > > >>>>>> CPU. > > > >>>>>>>>> Benefits: > > > >>>>>>>>> Reduce cost. > > > >>>>>>>>> > > > >>>>>>>>> Regards, > > > >>>>>>>>> Canbin Zheng > > > >>>>>>>>> > > > >>>>>>>>> Harold.Miao <[hidden email]> wrote on Thu, Jul 23, 2020 at 12:44 PM: > > > >>>>>>>>> > > > >>>>>>>>>> I'm excited to hear about this feature, very, very, very > > > >>>> highly > > > >>>>>>>>> encouraged > > > >>>>>>>>>> > > > >>>>>>>>>> Prasanna kumar <[hidden email]> > > > >> wrote on Thu, Jul 23, 2020 > > > >>>>>>>> at 12:10 AM: > > > >>>>>>>>>>> Hi Flink Dev Team, > > > >>>>>>>>>>> > > > >>>>>>>>>>> Dynamic AutoScaling Based on the incoming data load would > > > >>> be > > > >>>> a > > > >>>>>>> great > > > >>>>>>>>>>> feature.
> > > >>>>>>>>>>> > > > >>>>>>>>>>> We should be able have some rule say If the load > > > >> increased > > > >>> by > > > >>>>>> 20% , > > > >>>>>>>> add > > > >>>>>>>>>>> extra resource should be added. > > > >>>>>>>>>>> Or time based say during these peak hours the pipeline > > > >>> should > > > >>>>>> scale > > > >>>>>>>>>>> automatically by 50%. > > > >>>>>>>>>>> > > > >>>>>>>>>>> This will help a lot in cost reduction. > > > >>>>>>>>>>> > > > >>>>>>>>>>> EMR cluster provides a similar feature for SPARK based > > > >>>>>> application. > > > >>>>>>>>>>> Thanks, > > > >>>>>>>>>>> Prasanna. > > > >>>>>>>>>>> > > > >>>>>>>>>>> On Wed, Jul 22, 2020 at 5:40 PM Robert Metzger < > > > >>>>>>> [hidden email]> > > > >>>>>>>>>>> wrote: > > > >>>>>>>>>>> > > > >>>>>>>>>>>> Hi all, > > > >>>>>>>>>>>> > > > >>>>>>>>>>>> Now that the 1.11 release is out, it is time to plan > > > >> for > > > >>>> the > > > >>>>>> next > > > >>>>>>>>> major > > > >>>>>>>>>>>> Flink release. > > > >>>>>>>>>>>> > > > >>>>>>>>>>>> Some items: > > > >>>>>>>>>>>> > > > >>>>>>>>>>>> 1. > > > >>>>>>>>>>>> > > > >>>>>>>>>>>> Dian Fu and me volunteer to be the release managers > > > >>> for > > > >>>>>> Flink > > > >>>>>>>>> 1.12. > > > >>>>>>>>>>>> > > > >>>>>>>>>>>> > > > >>>>>>>>>>>> 1. > > > >>>>>>>>>>>> > > > >>>>>>>>>>>> Timeline: We propose to stick to our approximate 4 > > > >>> month > > > >>>>>>> release > > > >>>>>>>>>>> cycle, > > > >>>>>>>>>>>> thus the release should be done by late October. > > > >> Given > > > >>>>> that > > > >>>>>>>>> there’s > > > >>>>>>>>>> a > > > >>>>>>>>>>>> holiday week in China at the beginning of October, I > > > >>>>> propose > > > >>>>>>> to > > > >>>>>>>> do > > > >>>>>>>>>> the > > > >>>>>>>>>>>> feature freeze on master by late September. > > > >>>>>>>>>>>> > > > >>>>>>>>>>>> 2. 
> > > >>>>>>>>>>>> > > > >>>>>>>>>>>> Collecting features: It would be good to have a > > > >> rough > > > >>>>>> overview > > > >>>>>>>> of > > > >>>>>>>>>> the > > > >>>>>>>>>>>> features that will likely be ready to be merged by > > > >>> late > > > >>>>>>>> September, > > > >>>>>>>>>> and > > > >>>>>>>>>>>> that > > > >>>>>>>>>>>> we want in the release. > > > >>>>>>>>>>>> Based on the discussion, we will update the Roadmap > > > >> on > > > >>>> the > > > >>>>>>> Flink > > > >>>>>>>>>>> website > > > >>>>>>>>>>>> again! > > > >>>>>>>>>>>> > > > >>>>>>>>>>>> > > > >>>>>>>>>>>> > > > >>>>>>>>>>>> 1. > > > >>>>>>>>>>>> > > > >>>>>>>>>>>> Test instabilities and blockers: I would like to > > > >>> avoid a > > > >>>>>>>> situation > > > >>>>>>>>>>> where > > > >>>>>>>>>>>> we have many blocking issues or build instabilities > > > >> at > > > >>>> the > > > >>>>>>> time > > > >>>>>>>> of > > > >>>>>>>>>> the > > > >>>>>>>>>>>> feature freeze. To achieve that, we will try to > > > >> check > > > >>>>> every > > > >>>>>>>> build > > > >>>>>>>>>>>> instability within a week, to decide if it is a > > > >>> blocker > > > >>>>>> (make > > > >>>>>>>> sure > > > >>>>>>>>>> to > > > >>>>>>>>>>>> use > > > >>>>>>>>>>>> the “test-stability” label for those tickets!) > > > >>>>>>>>>>>> Blocker issues will need to have somebody assigned > > > >>>>>>> (responsible) > > > >>>>>>>>>>> within > > > >>>>>>>>>>>> a week, and we want to see progress on all blocker > > > >>>> issues > > > >>>>>>>>>> (downgrade, > > > >>>>>>>>>>>> resolution, a good plan how to proceed if it is more > > > >>>>>>>> complicated) > > > >>>>>>>>>>>> 2. > > > >>>>>>>>>>>> > > > >>>>>>>>>>>> Quality and stability of new features: In order to > > > >>> have > > > >>>> a > > > >>>>>>> short > > > >>>>>>>>>>> feature > > > >>>>>>>>>>>> freeze phase, we encourage developers to only merge > > > >>>>>>> well-tested > > > >>>>>>>>> and > > > >>>>>>>>>>>> documented features. 
In our experience, the feature > > > >>>> freeze > > > >>>>>>> works > > > >>>>>>>>>> best > > > >>>>>>>>>>> if > > > >>>>>>>>>>>> new features are complete, and the community can > > > >> focus > > > >>>>> fully > > > >>>>>>> on > > > >>>>>>>>>>>> addressing > > > >>>>>>>>>>>> newly found bugs and voting the release. > > > >>>>>>>>>>>> By having a smooth release process, the next > > > >>>> merge-window > > > >>>>>> for > > > >>>>>>>> the > > > >>>>>>>>>> next > > > >>>>>>>>>>>> release will come sooner. > > > >>>>>>>>>>>> > > > >>>>>>>>>>>> > > > >>>>>>>>>>>> Let me know what you think about our items, and share > > > >>> which > > > >>>>>>>> features > > > >>>>>>>>>> you > > > >>>>>>>>>>>> want in Flink 1.12. > > > >>>>>>>>>>>> > > > >>>>>>>>>>>> Best, > > > >>>>>>>>>>>> > > > >>>>>>>>>>>> Robert & Dian > > > >>>>>>>>>>>> > > > >>>>>>>>>> > > > >>>>>>>>>> -- > > > >>>>>>>>>> > > > >>>>>>>>>> Best Regards, > > > >>>>>>>>>> Harold Miao > > > >>>>>>>>>> > > > > > > > > > |
+1 for extending the feature freeze due date.
________________________________ From: Zhijiang <[hidden email]> Sent: Thursday, August 6, 2020 17:05 To: dev <[hidden email]> Subject: Re: [DISCUSS] Planning Flink 1.12 +1 on my side for feature freeze date by the end of Oct. ------------------------------------------------------------------ From:Yuan Mei <[hidden email]> Send Time: August 6, 2020 (Thursday) 14:54 To:dev <[hidden email]> Subject:Re: [DISCUSS] Planning Flink 1.12 +1 > +1 for extending the feature freeze date to the end of October. On Thu, Aug 6, 2020 at 12:08 PM Yu Li <[hidden email]> wrote: > +1 for extending feature freeze date to end of October. > > Feature development in the master branch could be unblocked through > creating the release branch, but every coin has its two sides (smile) > > Best Regards, > Yu > > > On Wed, 5 Aug 2020 at 20:12, Robert Metzger <[hidden email]> wrote: > > > Thanks all for your opinion. > > > > @Chesnay: That is a risk, but I hope the people responsible for > individual > > FLIPs plan accordingly. Extending the time till the feature freeze should > > not mean that we are extending the scope of the release. > > Ideally, features are done before FF, and they use the time till the > freeze > > for additional testing and documentation polishing. > > This FF will be virtual, there should be less disruption than a physical > > conference with all the travelling. > > Do you have a different proposal for the timing? > > > > > > I'm currently considering splitting the feature freeze and the release > > branch creation. Similar to the Linux kernel development, we could have a > > "merge window" and a stabilization phase.
At the end of the stabilization > > phase, we cut the release branch and open the next merge window (I'll > start > > a separate thread regarding this towards the end of this release cycle, > if > > I still like the idea then) > > > > > > On Wed, Aug 5, 2020 at 12:04 PM Chesnay Schepler <[hidden email]> > > wrote: > > > > > I'm a bit concerned about end of October, because it means we have > Flink > > > forward, which usually means at least 1 week of little-to-no activity, > > > and then 1 week until feature-freeze. > > > > > > On 05/08/2020 11:56, jincheng sun wrote: > > > > +1 for end of October from me as well. > > > > > > > > Best, > > > > Jincheng > > > > > > > > > > > > Kostas Kloudas <[hidden email]> wrote on Wed, Aug 5, 2020 at 4:59 PM: > > > > > > > >> +1 for end of October from me as well. > > > >> > > > >> Cheers, > > > >> Kostas > > > >> > > > >> On Wed, Aug 5, 2020 at 9:59 AM Till Rohrmann <[hidden email]> > > > wrote: > > > >> > > > >>> +1 for end of October from my side as well. > > > >>> > > > >>> Cheers, > > > >>> Till > > > >>> > > > >>> On Tue, Aug 4, 2020 at 9:46 PM Stephan Ewen <[hidden email]> > > wrote: > > > >>> > > > >>>> The end of October sounds good from my side, unless it collides > with > > > >> some > > > >>>> holidays that affect many committers. > > > >>>> > > > >>>> Feature-wise, I believe we can definitely make good use of the > time > > to > > > >>> wrap > > > >>>> up some critical threads (like finishing the FLIP-27 source > > efforts). > > > >>>> > > > >>>> So +1 to the end of October from my side. > > > >>>> > > > >>>> Best, > > > >>>> Stephan > > > >>>> > > > >>>> > > > >>>> On Tue, Aug 4, 2020 at 8:59 AM Robert Metzger < > [hidden email]> > > > >>> wrote: > > > >>>>> Thanks a lot for commenting on the feature freeze date. > > > >>>>> > > > >>>>> You are raising a few good points on the timing.
> > > >>>>> If we have already (2 months before) concerns regarding the > > deadline, > > > >>>> then > > > >>>>> I agree that we should move it till the end of October. > > > >>>>> > > > >>>>> We then just need to be careful not to run into the Christmas > > season > > > >> at > > > >>>> the > > > >>>>> end of December. > > > >>>>> > > > >>>>> If nobody objects within a few days, I'll update the feature > freeze > > > >>> date > > > >>>> in > > > >>>>> the Wiki. |