Hi everyone,
We would like to propose FLIP-83, which adds an end-to-end performance testing framework for Flink. An internal version of such a framework helped us discover several potential problems before the 1.9.0 release [1], so we'd like to contribute it to the Flink community as a supplement to the existing daily-run micro performance benchmarks [2] and the nightly-run end-to-end stability tests [3].

The FLIP document can be found here:
https://cwiki.apache.org/confluence/display/FLINK/FLIP-83%3A+Flink+End-to-end+Performance+Testing+Framework

Please kindly review the FLIP document and let us know if you have any comments/suggestions, thanks!

[1] https://s.apache.org/m8kcq
[2] https://github.com/dataArtisans/flink-benchmarks
[3] https://github.com/apache/flink/tree/master/flink-end-to-end-tests

Best Regards,
Yu
Hi Yu,
Thanks for bringing this up.

+1 for the idea and the proposal from my side.

I think that the proposed Test Job List might be a bit redundant/excessive, but:
- we can always adjust it later, once we have the infrastructure in place
- as long as we have the computing resources and the ability to quickly interpret the results and catch regressions, it doesn't hurt to have more benchmarks/tests than strictly necessary.

Which brings me to a question: how are you planning to execute the end-to-end benchmarks and integrate them with our build process?

Another, smaller question:

> In this initial stage we will only monitor and display job throughput and latency.

Are you planning to monitor the throughput and latency at the same time? That might be a bit problematic: when measuring throughput you want to saturate the system and hit some bottleneck, which will cause back-pressure, and measuring latency while the system is back-pressured doesn't make much sense.

Piotrek
Hi Piotr,
Thanks for the comments!

bq. How are you planning to execute the end-to-end benchmarks and integrate them with our build process?
Great question! We plan to execute the end-to-end benchmarks on a small cluster (e.g., 3 VM nodes) to better reflect network cost, trigger them through the same Jenkins service we use for the micro-benchmarks, and show the results on the codespeed center. Will add these details to the FLIP document if there are no objections.

bq. Are you planning to monitor the throughput and latency at the same time?
Good question, and you're right: we will stress the cluster into back-pressure and watch the throughput; latency doesn't mean much in the first test suites. Let me refine the document.

Thanks.

Best Regards,
Yu
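For illustration, the Jenkins-to-codespeed upload step could look roughly like the sketch below. The script name, helper, and codespeed URL are hypothetical, and the result fields follow the upstream codespeed sample client, so they would need adjusting to the actual deployment:

    # upload_results.py -- hypothetical helper invoked from the Jenkins job
    import sys
    import requests

    CODESPEED_URL = "http://codespeed.example.org"  # placeholder deployment URL

    def upload_result(benchmark, qps, commit_id):
        # Field names follow the codespeed sample client; adjust them to
        # the actual deployment.
        data = {
            "commitid": commit_id,
            "branch": "master",
            "project": "Flink",
            "executable": "flink-e2e-perf",
            "benchmark": benchmark,
            "environment": "3-vm-cluster",
            "result_value": qps,
        }
        requests.post(CODESPEED_URL + "/result/add/", data=data).raise_for_status()

    if __name__ == "__main__":
        # usage: python upload_results.py <benchmark> <qps> <commit_id>
        upload_result(sys.argv[1], float(sys.argv[2]), sys.argv[3])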
Hi Yu,
Thanks for the answers, it makes sense to me :)

Piotrek
+1, I like the idea of this improvement, which acts as a watchdog for developers' code changes.

By the way, do you think it's worthwhile to add a mode that simply disables checkpointing when running the end-to-end jobs? And when will stage 2 and stage 3 be discussed in more detail?

Best
Yun Tang
Thanks for starting this discussion. I agree that performance tests will help us to prevent introducing regressions. +1 for this proposal.

Cheers,
Till
In reply to this post by Yun Tang
In stage 1, checkpointing isn't disabled, and heap is used as the statebackend. I think there should be some special scenarios to test checkpointing and statebackends, which will be discussed and added in release 1.11.
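For illustration, a stage-1 scenario matrix could be generated along these lines; this is a minimal sketch, and all dimension values here are assumptions rather than the final FLIP list:

    # hypothetical sketch of generating stage-1 test scenarios
    import itertools

    STATE_BACKENDS = ["heap"]            # stage 1: heap backend only
    CHECKPOINT_MODES = ["exactly_once"]  # stage 1: checkpointing stays enabled
    SCHEDULING_MODES = ["eager", "lazy_from_sources"]
    RECORD_SIZES = [10, 100, 1000]       # record size in bytes (assumed)

    def scenarios():
        # Cross product of all dimensions; more checkpoint/statebackend
        # combinations would be added in later stages (e.g. release 1.11).
        for backend, ckpt, sched, size in itertools.product(
                STATE_BACKENDS, CHECKPOINT_MODES, SCHEDULING_MODES, RECORD_SIZES):
            yield {"statebackend": backend, "checkpoint_mode": ckpt,
                   "scheduling": sched, "record_size": size}

    if __name__ == "__main__":
        for s in scenarios():
            print(s)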
> The test cases are written in java and scripts in python. We propose a
> separate directory/module in parallel with flink-end-to-end-tests, with the
> name of flink-end-to-end-perf-tests.

Glad to see that the newly introduced e2e tests will be written in Java, because I'm reworking the existing e2e test suites from Bash scripts to Java test cases so that we can support more external systems, such as running the test jobs on YARN+Flink, Docker+Flink, standalone+Flink, a distributed Kafka cluster, etc.

BTW, I think the perf e2e test suites will also need to be designed to support running in both standalone and distributed environments, which will be helpful for developing & evaluating the perf.

Thanks.
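A minimal sketch of such an environment abstraction, shown here in Python with entirely hypothetical names (under the proposal the actual test cases would be Java, so this only illustrates the shape of the interface):

    # hypothetical abstraction so the same perf job can run on different envs
    from abc import ABC, abstractmethod

    class FlinkEnvironment(ABC):
        @abstractmethod
        def start(self):
            """Bring up Flink (and any external systems, e.g. Kafka)."""

        @abstractmethod
        def submit_job(self, jar_path, args):
            """Submit the benchmark job; return a job id/handle."""

        @abstractmethod
        def stop(self):
            """Tear everything down."""

    class StandaloneEnvironment(FlinkEnvironment):
        # Single-machine setup, handy while developing a benchmark.
        def start(self):
            ...  # e.g. invoke bin/start-cluster.sh

        def submit_job(self, jar_path, args):
            ...  # e.g. invoke bin/flink run <jar_path> <args>

        def stop(self):
            ...  # e.g. invoke bin/stop-cluster.sh

    # A YarnEnvironment / DockerEnvironment would implement the same interface.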
+1 for the idea. Thanks Yu for driving this.
Just curious: can we collect metrics about job scheduling and task launch? The speed of this part is also important, so we could add tests to watch it too.

Looking forward to more batch test support.

Best,
Jingsong Lee
+1 for this idea.
Currently we have the micro-benchmarks for Flink, which can help us find regressions, and I think the e2e job performance testing can also help us cover more scenarios.

Best,
Congxian
Thanks Yu for bringing this topic.
+1 for this proposal. Glad to have e2e performance testing.

It seems this proposal is separated into several stages. Is there a more detailed plan?

Thanks,
Biao /'bɪ.aʊ/
Thanks Yu for starting this discussion.
I'm in favor of adding an e2e performance testing framework. Currently the e2e tests are mainly focused on functionality and written in shell scripts; we need a better e2e framework for both performance and functionality tests.

Best,
Yang
In reply to this post by OpenInx
Thanks for the comments.
bq. I think the perf e2e test suites will also need to be designed to support running in both standalone and distributed environments, which will be helpful for developing & evaluating the perf.
Agreed and noted: the benchmark will be executable in standalone mode. On the other hand, for the daily run we plan to check the results in distributed mode to better reflect network cost.

Best Regards,
Yu
In reply to this post by Jingsong Li
Thanks for the suggestion, Jingsong!

I've added a stage for more metrics to the FLIP document; please check it and let me know if there are any further concerns. Thanks.

Best Regards,
Yu
In reply to this post by Biao Liu
Thanks for the comments, Biao!

bq. It seems this proposal is separated into several stages. Is there a more detailed plan?
Good point! For stage one we'd like to introduce the benchmark first, so we can guard the release (hopefully starting from 1.10). For the other stages we don't have a detailed plan yet, but we will add child FLIPs as we move on and open new discussions/votes separately. I have updated the FLIP document to better reflect this; please check it and let me know what you think. Thanks.

Best Regards,
Yu
Since one week has passed with no more comments, I assume the latest FLIP doc looks good to all, and I will soon open a VOTE thread for the FLIP. Thanks for all the comments and discussion!

Best Regards,
Yu
Thanks Yu for bringing up this discussion.
The e2e perf tests can be really helpful, and the overall design looks good to me.

Sorry it's late, but I have two questions about the result check.
1. How do we measure the job throughput? By measuring the job execution time on a finite input data set, or by measuring the QPS once the job has reached a stable state? I ask because, with the LazyFromSources schedule mode, tasks are launched gradually as processing progresses. So if we measure throughput in the latter way, LazyFromSources scheduling would make no difference compared to Eager scheduling, and we could drop this dimension. If we measure the total execution time, however, the dimension can be kept, since scheduling effectiveness can make a difference, especially with small input data sets.
2. In our prior experience, performance results are usually not that stable, which may make perf degradations harder to detect. Shall we define how many rounds to run a job and how to aggregate the results, so that we can get a more reliable final performance result?

Thanks,
Zhu Zhu
Thanks for the comments, Zhu Zhu!

> 1. How do we measure the job throughput? By measuring the job execution
> time on a finite input data set, or by measuring the QPS once the job has
> reached a stable state?
We plan to measure the job throughput by measuring the QPS once the job has reached a stable state. If, as you said, there is no difference between LazyFromSources and Eager scheduling under this kind of measurement, we can adjust the test scenarios after running for a while and remove the duplicated dimension.

> 2. Shall we define how many rounds to run a job and how to aggregate the
> results, so that we can get a more reliable final performance result?
Good advice. We plan to run multiple rounds (5 is the default value) per scenario, then take the average value as the result.
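A minimal sketch of that measurement loop follows; all numbers and helper names are assumptions, and a real implementation would sample an actual throughput metric (e.g. the job's records-per-second via Flink's REST API) instead of the simulated reading:

    # hypothetical measurement: N rounds per scenario, stable-state QPS, averaged
    import random
    import statistics
    import time

    ROUNDS = 5           # default rounds per scenario
    WARMUP_SECONDS = 60  # assumed time for the job to reach a stable state
    SAMPLES_PER_ROUND = 10

    def sample_qps():
        # Placeholder: a real implementation would query the job's
        # records-per-second metric; here we just simulate a noisy reading.
        return 1_000_000 * (1 + random.uniform(-0.05, 0.05))

    def run_round():
        # (re)start the job via the framework here, then wait for warm-up
        time.sleep(WARMUP_SECONDS)
        return statistics.mean(sample_qps() for _ in range(SAMPLES_PER_ROUND))

    def run_scenario():
        # The value averaged across rounds is what gets reported and compared.
        return statistics.mean(run_round() for _ in range(ROUNDS))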
Thanks Aihua for the explanation.
The proposal looks good to me then.

Thanks,
Zhu Zhu