Dear Flink community!
As a follow-up to the thread announcing Alibaba's offer to contribute the Blink code [1], here are some thoughts on how this contribution could be merged.

As described in the announcement thread, it is a big contribution, and we need to plan carefully how to handle it. We would like to get the improvements into Flink while making the process as non-disruptive as possible for the community. I hope that this plan gives the community a better understanding of what the proposed contribution would mean.

Here is an initial rough proposal, with thoughts from Timo, Piotr, Dawid, Kurt, Shaoxuan, Jincheng, Jark, Aljoscha, Fabian, Xiaowei:

- It is obviously very hard to merge all changes in one quick move, because we are talking about multiple 100k lines of code.

- As much as possible, we want to maintain compatibility with the current Table API, so that this becomes a transparent change for most users.

- The two areas with the most changes we identified are:
  (1) The SQL/Table query processor
  (2) The batch scheduling/failover/shuffle

- For the query processor part, this is what we found and propose:

  -> The Blink and Flink code have the same semantics (ANSI SQL) except for minor aspects (under discussion). Blink also covers more SQL operations.

  -> The Blink code is quite different from the current Flink SQL runtime. Merging it as a series of changes seems hardly feasible. From the current evaluation, the Blink query processor uses the more advanced architecture, so it would make sense to converge to that design.

  -> We propose to gradually build up the Blink-based query processor as a second query processor under the SQL/Table API. Think of it as two different runners for the Table API. As the new query processor becomes fully merged and stable, we can deprecate and eventually remove the existing query processor. That should give the least disruption to Flink users and allow for gradual merging and development.

  -> Some refactoring of the Table API is necessary to support the above strategy. Most of the prerequisite refactoring is around splitting the project into different modules, following a similar idea as FLIP-28 [2].

  -> A more detailed proposal is being worked on.

  -> As with FLIP-28, this approach would probably require suspending Table API contributions for a short while. We hope that this can be a very short period, so as not to impact the very active development on the Table API/SQL too much.

- For the batch scheduling and failover enhancements, we should be able to build on the currently ongoing refactoring of the scheduling logic [3]. That should make it easy to plug in a new scheduler and failover logic. We can port the Blink enhancements as a new scheduler / failover handler, and later make it the default for bounded stream programs once the merge is completed and tested.

- For the catalog and source/sink design and interfaces, we would like to continue with the already started design discussion threads. Once these have converged, we might use some of the Blink code for the implementation, if it is close to the outcome of the design discussions.

Best,
Stephan

[1] https://lists.apache.org/thread.html/2f7330e85d702a53b4a2b361149930b50f2e89d8e8a572f8ee2a0e6d@%3Cdev.flink.apache.org%3E
[2] https://cwiki.apache.org/confluence/display/FLINK/FLIP-28%3A+Long-term+goal+of+making+flink-table+Scala-free
[3] https://issues.apache.org/jira/browse/FLINK-10429
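To make the "two runners" idea above more concrete, here is a minimal sketch of how two query processors could coexist behind a single Table API entry point during the migration. All of the interface and class names here (QueryProcessor, LegacyProcessor, BlinkBasedProcessor, ProcessorFactory) are hypothetical illustrations, not actual Flink APIs:

```java
// Hypothetical sketch only: these types are illustrative, not real Flink APIs.
// The Table API would program against one interface, with the processor
// implementation selected per table environment.
interface QueryProcessor {
    // Translate a Table API/SQL query into an execution plan (simplified
    // here to a string for illustration).
    String translate(String query);
}

// The existing Flink SQL runtime, kept as the default during the merge.
class LegacyProcessor implements QueryProcessor {
    public String translate(String query) {
        return "legacy-plan: " + query;
    }
}

// The Blink-based processor, built up gradually as a second runner.
class BlinkBasedProcessor implements QueryProcessor {
    public String translate(String query) {
        return "blink-plan: " + query;
    }
}

class ProcessorFactory {
    // Users opt in to the new processor explicitly; the default stays on
    // the existing one until the new processor is fully merged and stable.
    static QueryProcessor create(boolean useBlinkProcessor) {
        return useBlinkProcessor ? new BlinkBasedProcessor() : new LegacyProcessor();
    }
}
```

With such a seam in place, the existing processor can later be deprecated and removed simply by changing the default, without breaking programs written against the shared interface.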
Hi Stephan,
Thanks for bringing up the discussion. I'm +1 on the merging plan. One question though: since the merge will not be completed for some time and there might be users trying the blink branch, what's the plan for development in the branch? Personally, I think we should discourage big contributions to the branch, as they would further complicate the merge, while we shouldn't stop critical fixes either.

What's your take on this?

Thanks,
Xuefu

------------------------------------------------------------------
From: Stephan Ewen <[hidden email]>
Sent At: 2019 Jan. 22 (Tue.) 06:16
To: dev <[hidden email]>
Subject: [DISCUSS] A strategy for merging the Blink enhancements
I think that is a reasonable proposal. Bugs that are identified could be fixed in the blink branch, so that we merge working code.

New feature contributions to that branch would complicate the merge. I would rather focus on merging and let new contributions go to the master branch.

On Tue, Jan 22, 2019 at 11:12 PM Zhang, Xuefu <[hidden email]> wrote:

> Hi Stephan,
>
> Thanks for bringing up the discussions. I'm +1 on the merging plan. One
> question though: since the merge will not be completed for some time and
> there might be users trying the blink branch, what's the plan for the
> development in the branch? Personally I think we may discourage big
> contributions to the branch, which would further complicate the merge,
> while we shouldn't stop critical fixes as well.
>
> What's your take on this?
>
> Thanks,
> Xuefu
+1 for Stephan's merge proposal. I think it makes sense to pause the development of the Table API for a short time in order to be able to quickly converge on a common API.

From my experience with the FLIP-6 refactoring, it can be challenging to catch up with a branch which is actively developed. The biggest danger is missing changes which are only ported to a single branch, and developing features which are not compatible with the other branch. Limiting the changes to critical fixes, and paying attention to applying them to the other branch as well, should help with this problem.

Cheers,
Till

On Wed, Jan 23, 2019 at 3:28 PM Stephan Ewen <[hidden email]> wrote:

> I think that is a reasonable proposal. Bugs that are identified could be
> fixed in the blink branch, so that we merge the working code.
>
> New feature contributions to that branch would complicate the merge. I
> would try and rather focus on merging and let new contributions go to the
> master branch.