Thanks for the reminder, Patrick! According to the release process [1], we
will publish the Dockerfiles *after* the RC vote has passed, to finalize the
release.

I have created FLINK-15978 [2] and prepared a PR [3] for it, and will follow
up after we conclude our RC vote. Thanks.

Best Regards,
Yu

[1] https://cwiki.apache.org/confluence/display/FLINK/Creating+a+Flink+Release
[2] https://issues.apache.org/jira/browse/FLINK-15978
[3] https://github.com/apache/flink-docker/pull/6

On Mon, 10 Feb 2020 at 20:57, Patrick Lucas <[hidden email]> wrote:

> Now that [FLINK-15828] Integrate docker-flink/docker-flink into Flink
> release process <https://issues.apache.org/jira/browse/FLINK-15828> is
> complete, the Dockerfiles for 1.10.0 can be published as part of the
> release process.
>
> @Gary/@Yu: please let me know if you have any questions regarding the
> workflow or its documentation.
>
> --
> Patrick
>
> On Mon, Feb 10, 2020 at 1:29 PM Benchao Li <[hidden email]> wrote:
>
> > +1 (non-binding)
> >
> > - build from source
> > - start standalone cluster, and run some examples
> > - played with sql-client with some simple sql
> > - run tests in IDE
> > - run some sqls running in 1.9 internal version with 1.10.0-rc3, seems 1.10
> > behaves well.
> >
> > On Mon, Feb 10, 2020 at 8:13 PM Xintong Song <[hidden email]> wrote:
> >
> > > +1 (non-binding)
> > >
> > > - build from source (with tests)
> > > - run nightly e2e tests
> > > - run example jobs in local/standalone/yarn setups
> > > - play around with memory configurations on local/standalone/yarn setups
> > >
> > > Thank you~
> > >
> > > Xintong Song
> > >
> > > On Mon, Feb 10, 2020 at 7:55 PM Jark Wu <[hidden email]> wrote:
> > >
> > > > +1 (binding)
> > > >
> > > > - built the source release with Scala 2.12 and Scala 2.11 successfully
> > > > - checked/verified signatures and hashes
> > > > - started cluster for both Scala 2.11 and 2.12, ran examples, verified web
> > > > ui and log output, nothing unexpected
> > > > - started cluster and ran some e2e sql queries, all of them work well and
> > > > the results are as expected:
> > > >   - read from kafka source, aggregate, write into mysql
> > > >   - read from kafka source with watermark defined in ddl, window aggregate,
> > > >     write into mysql
> > > >   - read from kafka with computed column defined in ddl, temporal join with
> > > >     a mysql table, write into kafka
> > > >
> > > > Cheers,
> > > > Jark
> > > >
> > > > On Mon, 10 Feb 2020 at 19:23, Kurt Young <[hidden email]> wrote:
> > > >
> > > > > +1 (binding)
> > > > >
> > > > > - verified signatures and checksums
> > > > > - start local cluster, run some examples, randomly play some sql with sql
> > > > > client, no suspicious error/warn log found in log files
> > > > > - repeat above operation with both scala 2.11 and 2.12 binary
> > > > >
> > > > > Best,
> > > > > Kurt
> > > > >
> > > > > On Mon, Feb 10, 2020 at 6:38 PM Yang Wang <[hidden email]> wrote:
> > > > >
> > > > > > +1 non-binding
> > > > > >
> > > > > > - Building from source with all tests skipped
> > > > > > - Build a custom image with 1.10-rc3
> > > > > > - K8s tests
> > > > > >   * Deploy a standalone session cluster on K8s and submit multiple jobs
> > > > > >   * Deploy a standalone per-job cluster
> > > > > >   * Deploy a native session cluster on K8s with/without HA configured,
> > > > > >     kill TM and jobs could recover successfully
> > > > > >
> > > > > > Best,
> > > > > > Yang
> > > > > >
> > > > > > On Mon, Feb 10, 2020 at 4:29 PM Jingsong Li <[hidden email]> wrote:
> > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > > +1 (non-binding) Thanks for driving this, Gary & Yu.
> > > > > > >
> > > > > > > There is an unfriendly error here: "OutOfMemoryError: Direct buffer memory"
> > > > > > > in FileChannelBoundedData$FileBufferReader.
> > > > > > >
> > > > > > > It forces our batch users to configure
> > > > > > > "taskmanager.memory.task.off-heap.size" in production jobs. And it is hard
> > > > > > > for users to know how much memory they need to configure.
> > > > > > >
> > > > > > > Even for us developers, it is hard to say how much memory, it depends on
> > > > > > > tasks left over from the previous stage and the parallelism.
> > > > > > >
> > > > > > > It is not a blocker, but hope to resolve it in 1.11.
> > > > > > >
> > > > > > > - Verified signatures and checksums
> > > > > > > - Maven build from source skip tests
> > > > > > > - Verified pom files point to the 1.10.0 version
> > > > > > > - Test Hive integration and SQL client: work well
> > > > > > >
> > > > > > > Best,
> > > > > > > Jingsong Lee
> > > > > > >
> > > > > > > On Mon, Feb 10, 2020 at 12:28 PM Zhu Zhu <[hidden email]> wrote:
> > > > > > >
> > > > > > > > My bad. The missing commit info is caused by building from the src code zip
> > > > > > > > which does not contain the git info.
> > > > > > > > So this is not a problem.
> > > > > > > >
> > > > > > > > +1 (binding) for rc3
> > > > > > > > Here's what was verified:
> > > > > > > > * built successfully from the source code
> > > > > > > > * ran a sample streaming and a batch job with parallelism=1000 on yarn
> > > > > > > > cluster, with the new scheduler and legacy scheduler, the job runs well
> > > > > > > > (tuned some resource configs to enable the jobs to work well)
> > > > > > > > * killed TMs to trigger failures, the jobs can finally recover from the
> > > > > > > > failures
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Zhu Zhu
> > > > > > > >
> > > > > > > > On Mon, Feb 10, 2020 at 12:31 AM Zhu Zhu <[hidden email]> wrote:
> > > > > > > >
> > > > > > > > > The commit info is shown as <unknown> on the web UI and in logs.
> > > > > > > > > Not sure if it's a common issue or just happens to my build only.
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > > Zhu Zhu
> > > > > > > > >
> > > > > > > > > On Sun, Feb 9, 2020 at 7:42 PM aihua li <[hidden email]> wrote:
> > > > > > > > >
> > > > > > > > > >> Yes, but the results you see in the Performance Code Speed Center [3]
> > > > > > > > > >> skip FLIP-49.
> > > > > > > > > >> The results of the default configurations are overwritten by the latest
> > > > > > > > > >> results.
> > > > > > > > > >>
> > > > > > > > > >> > On Feb 9, 2020, at 5:29 PM, Yu Li <[hidden email]> wrote:
> > > > > > > > > >> >
> > > > > > > > > >> > Thanks for the efforts Aihua! These could definitely improve our RC
> > > > > > > > > >> > test coverage!
> > > > > > > > > >> >
> > > > > > > > > >> > Just to confirm, that the stability tests were executed with the same
> > > > > > > > > >> > test suite for Alibaba production usage, and the e2e performance one was
> > > > > > > > > >> > executed with the test suite proposed in FLIP-83 [1] and FLINK-14917 [2],
> > > > > > > > > >> > and the result could also be observed from our performance code-speed
> > > > > > > > > >> > center [3], right?
> > > > > > > > > >> >
> > > > > > > > > >> > Thanks.
> > > > > > > > > >> >
> > > > > > > > > >> > Best Regards,
> > > > > > > > > >> > Yu
> > > > > > > > > >> >
> > > > > > > > > >> > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-83%3A+Flink+End-to-end+Performance+Testing+Framework
> > > > > > > > > >> > [2] https://issues.apache.org/jira/browse/FLINK-14917
> > > > > > > > > >> > [3] https://s.apache.org/nglhm
> > > > > > > > > >> >
> > > > > > > > > >> > On Sun, 9 Feb 2020 at 11:20, aihua li <[hidden email]> wrote:
> > > > > > > > > >> > +1 (non-binding)
> > > > > > > > > >> >
> > > > > > > > > >> > I ran stability tests and end-to-end performance tests in branch
> > > > > > > > > >> > release-1.10.0-rc3, both of them passed.
> > > > > > > > > >> >
> > > > > > > > > >> > Stability test: It mainly checks that the Flink job can recover from
> > > > > > > > > >> > various abnormal situations, including disk full,
> > > > > > > > > >> > network interruption, zk unable to connect, rpc message timeout, etc.
> > > > > > > > > >> > If the job can't be recovered, the test fails.
> > > > > > > > > >> > The test passed after running for 5 hours.
> > > > > > > > > >> >
> > > > > > > > > >> > End-to-end performance test: It contains 32 test scenarios which were
> > > > > > > > > >> > designed in FLIP-83.
> > > > > > > > > >> > Test results: The performance regresses about 3% from 1.9.1 if default
> > > > > > > > > >> > parameters are used;
> > > > > > > > > >> > The result:
> > > > > > > > > >> >
> > > > > > > > > >> > if FLIP-49 is skipped (adding parameters taskmanager.memory.managed.fraction:
> > > > > > > > > >> > 0, taskmanager.memory.flink.size: 1568m in flink-conf.yaml),
> > > > > > > > > >> > the performance improves about 5% from 1.9.1. The result:
> > > > > > > > > >> >
> > > > > > > > > >> > I confirmed with @Xintong Song
> > > > > > > > > >> > <https://cwiki.apache.org/confluence/display/~xintongsong> that the
> > > > > > > > > >> > result makes sense.
> > > > > > > > > >> >
> > > > > > > > > >> >> On Feb 8, 2020, at 5:54 AM, Gary Yao <[hidden email]> wrote:
> > > > > > > > > >> >>
> > > > > > > > > >> >> Hi everyone,
> > > > > > > > > >> >> Please review and vote on the release candidate #3 for the version 1.10.0,
> > > > > > > > > >> >> as follows:
> > > > > > > > > >> >> [ ] +1, Approve the release
> > > > > > > > > >> >> [ ] -1, Do not approve the release (please provide specific comments)
> > > > > > > > > >> >>
> > > > > > > > > >> >> The complete staging area is available for your review, which includes:
> > > > > > > > > >> >> * JIRA release notes [1],
> > > > > > > > > >> >> * the official Apache source release and binary convenience releases to be
> > > > > > > > > >> >> deployed to dist.apache.org [2], which are signed with the key with
> > > > > > > > > >> >> fingerprint BB137807CEFBE7DD2616556710B12A1F89C115E8 [3],
> > > > > > > > > >> >> * all artifacts to be deployed to the Maven Central Repository [4],
> > > > > > > > > >> >> * source code tag "release-1.10.0-rc3" [5],
> > > > > > > > > >> >> * website pull request listing the new release and adding announcement blog
> > > > > > > > > >> >> post [6][7].
> > > > > > > > > >> >>
> > > > > > > > > >> >> The vote will be open for at least 72 hours. It is adopted by majority
> > > > > > > > > >> >> approval, with at least 3 PMC affirmative votes.
> > > > > > > > > >> >>
> > > > > > > > > >> >> Thanks,
> > > > > > > > > >> >> Yu & Gary
> > > > > > > > > >> >>
> > > > > > > > > >> >> [1] https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12345845
> > > > > > > > > >> >> [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.10.0-rc3/
> > > > > > > > > >> >> [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> > > > > > > > > >> >> [4] https://repository.apache.org/content/repositories/orgapacheflink-1333
> > > > > > > > > >> >> [5] https://github.com/apache/flink/releases/tag/release-1.10.0-rc3
> > > > > > > > > >> >> [6] https://github.com/apache/flink-web/pull/302
> > > > > > > > > >> >> [7] https://github.com/apache/flink-web/pull/301
> > > > > > >
> > > > > > > --
> > > > > > > Best, Jingsong Lee
> >
> > --
> > Benchao Li
> > School of Electronics Engineering and Computer Science, Peking University
> > Tel:+86-15650713730
> > Email: [hidden email]; [hidden email]

+1 (non-binding)

- Build from source
- Run Mesos e2e tests (including the unmerged heap state backend and RocksDB
  state backend cases)

Best,
Yangze Guo

On Tue, Feb 11, 2020 at 10:08 AM Yu Li <[hidden email]> wrote:
>
> Thanks for the reminder Patrick! According to the release process [1] we
> will publish the Dockerfiles *after* the RC voting passed, to finalize the
> release.
>
> I have created FLINK-15978 [2] and prepared a PR [3] for it, will follow up
> after we conclude our RC vote. Thanks.
>
> Best Regards,
> Yu
>
> [1] https://cwiki.apache.org/confluence/display/FLINK/Creating+a+Flink+Release
> [2] https://issues.apache.org/jira/browse/FLINK-15978
> [3] https://github.com/apache/flink-docker/pull/6

Hi,

@Jingsong Lee
Regarding "OutOfMemoryError: Direct buffer memory" in
FileChannelBoundedData$FileBufferReader:
I saw you created a specific issue:
https://issues.apache.org/jira/browse/FLINK-15981

In general, I think we could rewrap this error in
MemorySegmentFactory#allocateUnpooledOffHeapMemory, e.g. suggesting to
increase the off-heap memory option:
https://issues.apache.org/jira/browse/FLINK-15989
It can always happen independently of Flink if user code over-allocates
direct memory somewhere else.

Thanks,
Andrey

On Tue, Feb 11, 2020 at 4:12 AM Yangze Guo <[hidden email]> wrote:

> +1 (non-binding)
>
> - Build from source
> - Run mesos e2e tests(including unmerged heap state backend and rocks
> state backend case)
>
> Best,
> Yangze Guo

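To illustrate the rewrapping idea from Andrey's reply, here is a minimal sketch in plain Java. It is not the actual code of MemorySegmentFactory#allocateUnpooledOffHeapMemory: the class name DirectMemoryAllocation, the helper rewrap, and the wording of the hint are hypothetical. Only the option name "taskmanager.memory.task.off-heap.size" is taken from this thread.

```java
import java.nio.ByteBuffer;

/** Hypothetical helper for illustration only; not part of Flink. */
final class DirectMemoryAllocation {

    // Option name as quoted in this thread.
    static final String OFF_HEAP_OPTION = "taskmanager.memory.task.off-heap.size";

    /** Allocates a direct buffer; an OutOfMemoryError is rethrown with a hint. */
    static ByteBuffer allocateDirect(int sizeInBytes) {
        try {
            return ByteBuffer.allocateDirect(sizeInBytes);
        } catch (OutOfMemoryError e) {
            throw rewrap(e, sizeInBytes);
        }
    }

    /** Builds a new error that carries a configuration hint and keeps the original as cause. */
    static OutOfMemoryError rewrap(OutOfMemoryError original, int sizeInBytes) {
        OutOfMemoryError wrapped = new OutOfMemoryError(
                "Could not allocate " + sizeInBytes + " bytes of direct memory ("
                        + original.getMessage() + "). Consider increasing '"
                        + OFF_HEAP_OPTION + "'. User code may also have"
                        + " over-allocated direct memory elsewhere.");
        // OutOfMemoryError has no (String, Throwable) constructor, so attach the cause explicitly.
        wrapped.initCause(original);
        return wrapped;
    }

    public static void main(String[] args) {
        ByteBuffer buffer = allocateDirect(1024);
        System.out.println("direct=" + buffer.isDirect() + ", capacity=" + buffer.capacity());
    }
}
```

The sketch keeps the original error as the cause, so the JVM's own message stays visible in the stack trace, and the hint mentions user code because the error can also originate outside of Flink, as noted in the reply.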