Thanks for reporting this problem and opening a JIRA issue. I've created a
fix for the problem [1].

[1] https://github.com/apache/flink/pull/8096

Cheers,
Till

On Mon, Apr 1, 2019 at 12:30 AM Richard Deurwaarder <[hidden email]> wrote:

> Hello @Aljoscha and @Rong,
>
> I've described the problem in the mailing list[1] and on stackoverflow[2]
> before. But the gist is: If there's a firewall between the yarn cluster and
> the machine submitting the job, we need to be able to set a fixed port (or
> range of ports) for REST communication with the jobmanager.
>
> It is a regression in the sense that on 1.5 (and 1.6 I believe?) it was
> possible to work around this by using the legacy mode (non flip-6), but on
> 1.7 and now 1.8 this is not possible.
>
> I've created FLINK-12075 <https://issues.apache.org/jira/browse/FLINK-12075>
> for it, I have not made it blocking yet as it is not strictly a regression
> with regards to 1.7. Perhaps you guys can better determine if you want this
> added in RC5.
>
> Regards,
>
> Richard
>
> [1] http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Submitting-job-to-Flink-on-yarn-timesout-on-flip-6-1-5-x-td26199.html#a26383
> [2] https://stackoverflow.com/q/54771637/988324
>
> On Sat, Mar 30, 2019 at 7:24 PM Rong Rong <[hidden email]> wrote:
>
> > Hi @Aljoscha,
> >
> > Based on the previous commit [1] that adds the random port selection code,
> > it seems like the important part is to unset whatever 'rest.port' setting
> > previously done. I don't think the current way of setting the BIND_PORT
> > actually overrides any existing PORT setting. However, I wasn't able to
> > find any test that is related, maybe @Till can provide more insight here?
> >
> > Maybe @Richard can provide more detail on the YARN run command used to
> > reproduce the problem?
> >
> > Thanks,
> > Rong
> >
> > [1] https://github.com/apache/flink/commit/dbe0e8286d76a5facdb49589b638b87dbde80178#diff-487838863ab693af7008f04cb3359be3R117
> >
> > On Sat, Mar 30, 2019 at 5:51 AM Aljoscha Krettek <[hidden email]> wrote:
> >
> > > @Richard Did this work for you previously? From the change, it seems that
> > > the port was always set to 0 on YARN even before.
> > >
> > > > On 28. Mar 2019, at 16:13, Richard Deurwaarder <[hidden email]> wrote:
> > > >
> > > > -1 (non-binding)
> > > >
> > > > - Ran integration tests locally (1000+) of our flink job, all succeeded.
> > > > - Attempted to run job on hadoop, failed. It failed because we have a
> > > > firewall in place and we cannot set the rest port to a specific port/port
> > > > range.
> > > > Unless I am mistaken, it seems like FLINK-11081 broke the possibility of
> > > > setting a REST port when running on yarn (
> > > > https://github.com/apache/flink/commit/730eed71ef3f718d61f85d5e94b1060844ca56db#diff-487838863ab693af7008f04cb3359be3R102
> > > > )
> > > > Code-wise it seems rather straightforward to fix but I am unsure about the
> > > > reason why this is hard-coded to 0 and what the impact would be.
> > > >
> > > > It would benefit us greatly if a fix for this could make it to 1.8.0.
> > > >
> > > > Regards,
> > > >
> > > > Richard
> > > >
> > > > On Thu, Mar 28, 2019 at 9:54 AM Tzu-Li (Gordon) Tai <[hidden email]> wrote:
> > > >
> > > >> +1 (binding)
> > > >>
> > > >> Functional checks:
> > > >>
> > > >> - Built Flink from source (`mvn clean verify`) locally, with success
> > > >> - Ran end-to-end tests locally for 5 times in a loop, no attempts failed
> > > >> (Hadoop 2.8.4, Scala 2.12)
> > > >> - Manually tested state schema evolution for POJO. Besides the tests that
> > > >> @Congxian already did, additionally tested evolution cases with POJO
> > > >> subclasses + non-registered POJOs.
> > > >> - Manually tested migration of Scala stateful jobs that use case classes /
> > > >> Scala collections as state types, performing the migration across Scala
> > > >> 2.11 to Scala 2.12.
> > > >> - Reviewed release announcement PR
> > > >>
> > > >> Misc / legal checks:
> > > >>
> > > >> - checked checksums and signatures
> > > >> - No binaries in source distribution
> > > >> - Staging area does not seem to have any missing artifacts
> > > >>
> > > >> Cheers,
> > > >> Gordon
> > > >>
> > > >> On Thu, Mar 28, 2019 at 4:52 PM Tzu-Li (Gordon) Tai <[hidden email]> wrote:
> > > >>
> > > >>> @Shaoxuan
> > > >>>
> > > >>> The drop in the serializerAvro benchmark, as explained earlier in previous
> > > >>> voting threads of earlier RCs, was due to a slower job initialization phase
> > > >>> caused by slower deserialization of the AvroSerializer.
> > > >>> Piotr also pointed out that after the number of records was increased in
> > > >>> the serializer benchmarks, this drop was no longer observable before /
> > > >>> after the changes in mid February.
> > > >>> IMO, this is not critical as it does not affect the per-record performance
> > > >>> / throughput, and therefore should not block this release.
> > > >>>
> > > >>> On Thu, Mar 28, 2019 at 1:08 AM Aljoscha Krettek <[hidden email]> wrote:
> > > >>>
> > > >>>> By now, I'm reasonably sure that the test instabilities on the end-to-end
> > > >>>> test are only instabilities. I pushed changes to increase timeouts to make
> > > >>>> the tests more stable. As in any project, there will always be bugs but I
> > > >>>> think we could release this RC4 and be reasonably sure that it works well.
> > > >>>>
> > > >>>> Now, we only need to have the required number of PMC votes.
> > > >>>>
> > > >>>> On Wed, Mar 27, 2019, at 07:22, Congxian Qiu wrote:
> > > >>>>> +1 (non-binding)
> > > >>>>>
> > > >>>>> • checked signature and checksum ok
> > > >>>>> • mvn clean package -DskipTests ok
> > > >>>>> • Run job on yarn ok
> > > >>>>> • Test state migration with POJO type (both heap and rocksdb) ok
> > > >>>>>    - 1.6 -> 1.8
> > > >>>>>    - 1.7 -> 1.8
> > > >>>>>    - 1.8 -> 1.8
> > > >>>>>
> > > >>>>>
> > > >>>>> Best, Congxian
> > > >>>>> On Mar 27, 2019, 10:26 +0800, vino yang <[hidden email]>, wrote:
> > > >>>>>> +1 (non-binding)
> > > >>>>>>
> > > >>>>>> - checked JIRA release note
> > > >>>>>> - ran "mvn package -DskipTests"
> > > >>>>>> - checked signature and checksum
> > > >>>>>> - started a cluster locally and ran some examples in binary
> > > >>>>>> - checked web site announcement's PR
> > > >>>>>>
> > > >>>>>> Best,
> > > >>>>>> Vino
> > > >>>>>>
> > > >>>>>>
> > > >>>>>> On Tue, Mar 26, 2019 at 8:20 PM Xiaowei Jiang <[hidden email]> wrote:
> > > >>>>>>
> > > >>>>>>> +1 (non-binding)
> > > >>>>>>>
> > > >>>>>>> - checked checksums and GPG files
> > > >>>>>>> - build from source successfully
> > > >>>>>>> - run end-to-end precommit tests successfully
> > > >>>>>>> - run end-to-end nightly tests successfully
> > > >>>>>>> Xiaowei
> > > >>>>>>> On Tuesday, March 26, 2019, 8:09:19 PM GMT+8, Yu Li <[hidden email]> wrote:
> > > >>>>>>>
> > > >>>>>>> +1 (non-binding)
> > > >>>>>>>
> > > >>>>>>> - Checked release notes: OK
> > > >>>>>>> - Checked sums and signatures: OK
> > > >>>>>>> - Source release
> > > >>>>>>>   - contains no binaries: OK
> > > >>>>>>>   - contains no 1.8-SNAPSHOT references: OK
> > > >>>>>>>   - build from source: OK (8u101)
> > > >>>>>>>   - mvn clean verify: OK (8u101)
> > > >>>>>>> - Binary release
> > > >>>>>>>   - no examples appear to be missing
> > > >>>>>>>   - started a cluster; WebUI reachable, example ran successfully
> > > >>>>>>>   - end-to-end test (all but K8S and docker ones): OK (8u101)
> > > >>>>>>> - Repository appears to contain all expected artifacts
> > > >>>>>>>
> > > >>>>>>> Best Regards,
> > > >>>>>>> Yu
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>> On Tue, 26 Mar 2019 at 14:28, Kurt Young <[hidden email]> wrote:
> > > >>>>>>>
> > > >>>>>>>> +1 (non-binding)
> > > >>>>>>>>
> > > >>>>>>>> Checked items:
> > > >>>>>>>> - checked checksums and GPG files
> > > >>>>>>>> - verified that the source archives do not contain any binaries
> > > >>>>>>>> - checked that all POM files point to the same version
> > > >>>>>>>> - build from source successfully
> > > >>>>>>>>
> > > >>>>>>>> Best,
> > > >>>>>>>> Kurt
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> On Tue, Mar 26, 2019 at 10:57 AM Shaoxuan Wang <[hidden email]> wrote:
> > > >>>>>>>>
> > > >>>>>>>>> +1 (non-binding)
> > > >>>>>>>>>
> > > >>>>>>>>> I tested RC4 with the following items:
> > > >>>>>>>>> - Maven Central Repository contains all artifacts
> > > >>>>>>>>> - Built the source with Maven (ensured all source files have Apache
> > > >>>>>>>>> headers), and executed built-in tests via "mvn clean verify"
> > > >>>>>>>>> - Manually executed the tests in IntelliJ IDE
> > > >>>>>>>>> - Verified that the quickstarts for Scala and Java are working with the
> > > >>>>>>>>> staging repository in IntelliJ
> > > >>>>>>>>> - Checked the benchmark results. The perf regressions of
> > > >>>>>>>>> tuple-key-by/statebackend/tumblingWindow are gone, but the regression on
> > > >>>>>>>>> serializer still exists.
> > > >>>>>>>>>
> > > >>>>>>>>> Regards,
> > > >>>>>>>>> Shaoxuan
> > > >>>>>>>>>
> > > >>>>>>>>> On Tue, Mar 26, 2019 at 8:06 AM jincheng sun <[hidden email]> wrote:
> > > >>>>>>>>>
> > > >>>>>>>>>> Hi Aljoscha, I think you are right, increasing the timeout config will fix
> > > >>>>>>>>>> this issue. This depends on the resources of Travis.
> > > >>>>>>>>>> I would like to share some observations from my testing (not a Flink
> > > >>>>>>>>>> problem) as follows: :-)
> > > >>>>>>>>>>
> > > >>>>>>>>>> During my testing, `mvn clean verify` and the nightly end-to-end tests both
> > > >>>>>>>>>> consumed a lot of machine resources (especially memory/network), and the
> > > >>>>>>>>>> network bandwidth requirements of the nightly end-to-end tests are also very
> > > >>>>>>>>>> high. In China, I need to use VPN acceleration (100~200Kb before
> > > >>>>>>>>>> acceleration, 3~4Mb after acceleration). I encountered: ['Avro
> > > >>>>>>>>>> Confluent Schema Registry nightly end-to-end test' failed after 18 minutes
> > > >>>>>>>>>> and 15 seconds! Test exited with exit code 1]. It took more than 18 minutes
> > > >>>>>>>>>> and the download failed because the network bandwidth was not enough; it
> > > >>>>>>>>>> runs smoothly when using VPN acceleration. The overall end-to-end run
> > > >>>>>>>>>> passed twice. The Docker resource configuration was: CPUs 7, Mem: 28.7G,
> > > >>>>>>>>>> Swap: 3.5G. See the detailed log here
> > > >>>>>>>>>> <https://docs.google.com/document/d/1CcyTCyZmMmP57pkKv4drjSuxW61_u78HR3q1fJJODMw/edit?usp=sharing>.
> > > >>>>>>>>>>
> > > >>>>>>>>>> Just now, I checked Travis for your last commit (Increase startup
> > > >>>>>>>>>> timeout in end-to-end tests), in addition to the Cleanup phase, other
> > > >>>>>>>>>> phases are successful.
> > > >>>>>>>>>> here <https://travis-ci.org/apache/flink/builds/511071777>
> > > >>>>>>>>>>
> > > >>>>>>>>>> In order to verify that our speculation is accurate, I am running builds on
> > > >>>>>>>>>> my repo with 10 and 20 second timeout configs to see whether the timeout
> > > >>>>>>>>>> problem recurs 100% of the time. They are already running; we are waiting
> > > >>>>>>>>>> for the results.
> > > >>>>>>>>>> 10 seconds: <https://travis-ci.org/sunjincheng121/flink/builds/511235749>
> > > >>>>>>>>>> 20 seconds: <https://travis-ci.org/sunjincheng121/flink/builds/511235598>
> > > >>>>>>>>>>
> > > >>>>>>>>>> Best,
> > > >>>>>>>>>> Jincheng
> > > >>>>>>>>>>
> > > >>>>>>>>>> On Tue, Mar 26, 2019 at 1:04 AM Aljoscha Krettek <[hidden email]> wrote:
> > > >>>>>>>>>>
> > > >>>>>>>>>>> Thanks for the testing done so far!
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> There has been quite some flakiness on Travis lately, see here:
> > > >>>>>>>>>>> https://travis-ci.org/apache/flink/branches. I’m a bit hesitant to
> > > >>>>>>>>>>> release in this state. Looking at the tests you can see that all of the
> > > >>>>>>>>>>> end-to-end tests fail because waiting for the dispatcher to come up times
> > > >>>>>>>>>>> out. I also noticed that this usually takes about 5-8 seconds on Travis, so
> > > >>>>>>>>>>> a 10 second timeout might be a bit low. I pushed commits to increase that
> > > >>>>>>>>>>> to 20 secs. Let’s see what will happen.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> I’ll keep you posted!
> > > >>>>>>>>>>> Aljoscha
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> On 25. Mar 2019, at 13:13, jincheng sun <[hidden email]> wrote:
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Great thanks for preparing the RC4 of Flink 1.8.0, Aljoscha!
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> +1 (non-binding)
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> I checked the functional things as follows (without performance
> > > >>>>>>>>>>>> verification):
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> 1. Checking Artifacts:
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> 1). Download the release source code - SUCCESS
> > > >>>>>>>>>>>> 2). Check Source release flink-1.8.0-src.tgz.sha512 - SUCCESS
> > > >>>>>>>>>>>> 3). Download the released JAR - SUCCESS
> > > >>>>>>>>>>>> 4). Check if checksums and GPG files match the corresponding release
> > > >>>>>>>>>>>> files - SUCCESS
> > > >>>>>>>>>>>> 5). Verify that the source archives do not contain any binaries - SUCCESS
> > > >>>>>>>>>>>> 6). Build the source with `mvn clean verify -DskipTests` to ensure all
> > > >>>>>>>>>>>> source files have Apache headers - SUCCESS
> > > >>>>>>>>>>>> 7). Check that all POM files point to the same version - SUCCESS
> > > >>>>>>>>>>>> 8). Read the `README.md` file to ensure there is nothing unexpected -
> > > >>>>>>>>>>>> SUCCESS
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> 2. Testing Larger Setups
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Cluster Environment: 7 nodes, jm 1024m, tm 4096m
> > > >>>>>>>>>>>> Testing Jobs: WordCount(Batch&Streaming), DataStreamAllroundTestProgram
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> 1). Use local&hdfs file systems for checkpoints - SUCCESS
> > > >>>>>>>>>>>> 2). Use hdfs file systems for input/output - SUCCESS
> > > >>>>>>>>>>>> 3). Run examples on YARN (with or without session) - SUCCESS
> > > >>>>>>>>>>>> 4). Test failover and recovery - SUCCESS
> > > >>>>>>>>>>>> 5). Test incremental&non-incremental checkpoint - SUCCESS
> > > >>>>>>>>>>>> 6). Test connector - kafka - SUCCESS
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> 3. Testing Functionality
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> 1). Built-in tests (linux & mac os)
> > > >>>>>>>>>>>> - `mvn clean verify` (some test timeout errors and a test case bug, see
> > > >>>>>>>>>>>> FLINK-12001 <https://issues.apache.org/jira/browse/FLINK-12001>; none of
> > > >>>>>>>>>>>> them is a blocker)
> > > >>>>>>>>>>>> - build for scala 2.11 (mvn clean install -P scala-2.11 -DskipTests)
> > > >>>>>>>>>>>> - SUCCESS
> > > >>>>>>>>>>>> - Run the scripted nightly end-to-end test - SUCCESS
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> 2). Quickstarts
> > > >>>>>>>>>>>> - Verify that the quickstarts for Scala work with the staging repository
> > > >>>>>>>>>>>> in IntelliJ - SUCCESS
> > > >>>>>>>>>>>> - Verify that the quickstarts for Java work with the staging repository
> > > >>>>>>>>>>>> in IntelliJ - SUCCESS
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> 3). Simple Starter Experience and Use Cases
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> - run all examples from IntelliJ IDE - SUCCESS
> > > >>>>>>>>>>>> - Start a local cluster and verify that the processes come up - SUCCESS
> > > >>>>>>>>>>>> a. Examine the *.out files (should be empty) and the log files
> > > >>>>>>>>>>>> (should contain no exceptions)
> > > >>>>>>>>>>>> b. Test for Linux, MacOS
> > > >>>>>>>>>>>> c. Shutdown and verify there are no exceptions in the log output
> > > >>>>>>>>>>>> (after shutdown)
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> - Verify that the examples are running from both ./bin/flink and from
> > > >>>>>>>>>>>> the web-based job submission tool (following items) - SUCCESS
> > > >>>>>>>>>>>> a. Start multiple task managers in the local cluster
> > > >>>>>>>>>>>> b. Change the flink-conf.yaml to define more than one task slot (2)
> > > >>>>>>>>>>>> c. Run the examples with a parallelism > 1
> > > >>>>>>>>>>>> d. Examine the log output - no error messages should be encountered
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> 4. Review the PR
> > > >>>>>>>>>>>> - [Add 1.8 Release Blog Post] - Just a reminder, update the release
> > > >>>>>>>>>>>> date to the correct date before merging.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Cheers,
> > > >>>>>>>>>>>> Jincheng
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> On Mon, Mar 25, 2019 at 4:11 PM Piotr Nowojski <[hidden email]> wrote:
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>> +1 from my side. Previously spotted performance regression seems to be
> > > >>>>>>>>>>>>> gone, or mostly gone.
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> Piotrek
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> On 21 Mar 2019, at 17:52, Aljoscha Krettek <[hidden email]> wrote:
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> Hi everyone,
> > > >>>>>>>>>>>>>> Please review and vote on the release candidate 4 for Flink 1.8.0, as
> > > >>>>>>>>>>>>>> follows:
> > > >>>>>>>>>>>>>> [ ] +1, Approve the release
> > > >>>>>>>>>>>>>> [ ] -1, Do not approve the release (please provide specific comments)
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> The complete staging area is available for your review, which includes:
> > > >>>>>>>>>>>>>> * JIRA release notes [1],
> > > >>>>>>>>>>>>>> * the official Apache source release and binary convenience releases to
> > > >>>>>>>>>>>>>> be deployed to dist.apache.org [2], which are signed with the key with
> > > >>>>>>>>>>>>>> fingerprint F2A67A8047499BBB3908D17AA8F4FD97121D7293 [3],
> > > >>>>>>>>>>>>>> * all artifacts to be deployed to the Maven Central Repository [4],
> > > >>>>>>>>>>>>>> * source code tag "release-1.8.0-rc4" [5],
> > > >>>>>>>>>>>>>> * website pull request listing the new release [6]
> > > >>>>>>>>>>>>>> * website pull request adding announcement blog post [7].
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> The vote will be open for at least 72 hours. It is adopted by majority
> > > >>>>>>>>>>>>>> approval, with at least 3 PMC affirmative votes.
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> Thanks,
> > > >>>>>>>>>>>>>> Aljoscha
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> [1] https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12344274
> > > >>>>>>>>>>>>>> [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.8.0-rc4/
> > > >>>>>>>>>>>>>> [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> > > >>>>>>>>>>>>>> [4] https://repository.apache.org/content/repositories/orgapacheflink-1215
> > > >>>>>>>>>>>>>> [5] https://gitbox.apache.org/repos/asf?p=flink.git;a=tag;h=c650befc10c8bb6cc4b007ae250b7b2173046145
> > > >>>>>>>>>>>>>> [6] https://github.com/apache/flink-web/pull/180
> > > >>>>>>>>>>>>>> [7] https://github.com/apache/flink-web/pull/179
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> P.S. The difference to the previous RCs is small, you can fetch the
> > > >>>>>>>>>>>>>> tags and do a "git log release-1.8.0-rc1..release-1.8.0-rc4" to see the
> > > >>>>>>>>>>>>>> difference in commits. It's fixes for the issues that led to the
> > > >>>>>>>>>>>>>> cancellation of the previous RCs plus smaller fixes. Most
> > > >>>>>>>>>>>>>> verification/testing that was carried out should apply as is to this RC.
> > > >>>>>>>>>>>>>> Any functional verification that you did on previous RCs should therefore
> > > >>>>>>>>>>>>>> easily carry over to this one.
I’m hereby canceling the vote for Flink 1.8.0 RC4 in favour of a new RC that I will create shortly now that the blockers are resolved.
> On 1. Apr 2019, at 13:21, Till Rohrmann <[hidden email]> wrote:
>
> Thanks for reporting this problem and opening a JIRA issue. I've created a
> fix for the problem [1].
>
> [1] https://github.com/apache/flink/pull/8096
>
> Cheers,
> Till
Test incremental&non-incremental checkpoint - >>>>>> SUCCESS >>>>>>>>>>>>>>>> 6). Test connector - kafka -SUCCESS >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> 3. Testing Functionality >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> 1). Built-in tests(linux&mac os) >>>>>>>>>>>>>>>> - `mvn cealn verify` (some test timeout error and test >>>>>>>> case >>>>>>>>>>>> bug >>>>>>>>>>>>>> see >>>>>>>>>>>>>>>> FLINK-12001 < >>>>>>>> https://issues.apache.org/jira/browse/FLINK-12001>, >>>>>>>>>>>> all >>>>>>>>>>>>>> of >>>>>>>>>>>>>>>> them are not the blocker) >>>>>>>>>>>>>>>> - build for scala 2.11(mvn clean install -P scala-2.11 >>>>>>>>>>>>>> -DskipTests) >>>>>>>>>>>>>>>> - SUCCESS >>>>>>>>>>>>>>>> - Run the scripted nightly end-to-end test - SUCCESS >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> 2). Quickstarts >>>>>>>>>>>>>>>> - Verify that the quickstarts for Scala with the staging >>>>>>>>>>>>>> repository >>>>>>>>>>>>>>>> in IntelliJ - SUCCESS >>>>>>>>>>>>>>>> - Verify that the quickstarts for Java with the staging >>>>>>>>>>>>> repository >>>>>>>>>>>>>>> in >>>>>>>>>>>>>>>> IntelliJ - SUCCESS >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> 3). Simple Starter Experience and Use Cases >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> - run all examples from IntelliJ IDE - SUCCESS >>>>>>>>>>>>>>>> - Start a local cluster and verify that the processes - >>>>>>>>>>>> SUCCESS >>>>>>>>>>>>>>>> a. Examine the *.out files (should be empty) and the log >>>>>>>>>>>> files >>>>>>>>>>>>>>>> (should contain no exceptions) >>>>>>>>>>>>>>>> b. Test for Linux, MacOS >>>>>>>>>>>>>>>> c. Shutdown and verify there are no exceptions in the >>>>>> log >>>>>>>>>>>>> output >>>>>>>>>>>>>>>> (after shutdown) >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> - Verify that the examples are running from both >>>>>>>> ./bin/flink >>>>>>>>>>>> and >>>>>>>>>>>>>>> from >>>>>>>>>>>>>>>> the web-based job submission tool(following items) - >>>>>>>> SUCCESS >>>>>>>>>>>>>>>> a. Start multiple task managers in the local cluster >>>>>>>>>>>>>>>> b. 
Change the flink-conf.yml to define more than one >>>>>> task >>>>>>>>>>>> slot >>>>>>>>>>>>>> (2) >>>>>>>>>>>>>>>> c. Run the examples with a parallelism > 1 >>>>>>>>>>>>>>>> d. Examine the log output - no error messages should be >>>>>>>>>>>>>>> encountered >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> 4. Review the PR >>>>>>>>>>>>>>>> - [Add 1.8 Release Blog Post] - Just a reminder, updated >>>>>>>> the >>>>>>>>>>>>>> release >>>>>>>>>>>>>>>> date to correct date before merging. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Cheers, >>>>>>>>>>>>>>>> Jincheng >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Piotr Nowojski <[hidden email]> 于2019年3月25日周一 >>>>>>>> 下午4:11写道: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> +1 from my side. Previously spotted performance >>>>>>>> regression seems >>>>>>>>>>>> to >>>>>>>>>>>>> be >>>>>>>>>>>>>>>>> gone, or mostly gone. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Piotrek >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> On 21 Mar 2019, at 17:52, Aljoscha Krettek < >>>>>>>>>>> [hidden email]> >>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Hi everyone, >>>>>>>>>>>>>>>>>> Please review and vote on the release candidate 4 >>>>>> for >>>>>>>> Flink >>>>>>>>>>>> 1.8.0, >>>>>>>>>>>>> as >>>>>>>>>>>>>>>>> follows: >>>>>>>>>>>>>>>>>> [ ] +1, Approve the release >>>>>>>>>>>>>>>>>> [ ] -1, Do not approve the release (please provide >>>>>>>> specific >>>>>>>>>>>>> comments) >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> The complete staging area is available for your >>>>>>>> review, which >>>>>>>>>>>>>> includes: >>>>>>>>>>>>>>>>>> * JIRA release notes [1], >>>>>>>>>>>>>>>>>> * the official Apache source release and binary >>>>>>>> convenience >>>>>>>>>>>>> releases >>>>>>>>>>>>>> to >>>>>>>>>>>>>>>>> be deployed to dist.apache.org [2], which are signed >>>>>>>> with the >>>>>>>>>>> key >>>>>>>>>>>>>> with >>>>>>>>>>>>>>>>> fingerprint F2A67A8047499BBB3908D17AA8F4FD97121D7293 >>>>>>>> [3], >>>>>>>>>>>>>>>>>> * all artifacts to be deployed to the Maven Central >>>>>>>> Repository 
>>>>>>>>>>>> [4], >>>>>>>>>>>>>>>>>> * source code tag "release-1.8.0-rc4" [5], >>>>>>>>>>>>>>>>>> * website pull request listing the new release [6] >>>>>>>>>>>>>>>>>> * website pull request adding announcement blog post >>>>>>>> [7]. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> The vote will be open for at least 72 hours. It is >>>>>>>> adopted by >>>>>>>>>>>>>> majority >>>>>>>>>>>>>>>>> approval, with at least 3 PMC affirmative votes. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>> Aljoscha >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> [1] >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>> >>>>>> >>>> >>> >> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12344274 >>>>>>>>>>>>>>>>>> [2] >>>>>>>>>>>> https://dist.apache.org/repos/dist/dev/flink/flink-1.8.0-rc4/ >>>>>>>>>>>>>>>>>> [3] >>>>>>>> https://dist.apache.org/repos/dist/release/flink/KEYS >>>>>>>>>>>>>>>>>> [4] >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>> >>>>>>>> >>>> https://repository.apache.org/content/repositories/orgapacheflink-1215 >>>>>>>>>>>>>>>>>> [5] >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>> >>>>>> >>>> >>> >> https://gitbox.apache.org/repos/asf?p=flink.git;a=tag;h=c650befc10c8bb6cc4b007ae250b7b2173046145 >>>>>>>>>>>>>>>>>> [6] https://github.com/apache/flink-web/pull/180 < >>>>>>>>>>>>>>>>> https://github.com/apache/flink-web/pull/180> >>>>>>>>>>>>>>>>>> [7] https://github.com/apache/flink-web/pull/179 < >>>>>>>>>>>>>>>>> https://github.com/apache/flink-web/pull/179> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> P.S. The difference to the previous RCs is small, >>>>>> you >>>>>>>> can fetch >>>>>>>>>>>> the >>>>>>>>>>>>>>> tags >>>>>>>>>>>>>>>>> and do a "git log >>>>>> release-1.8.0-rc1..release-1.8.0-rc4” >>>>>>>> to see >>>>>>>>>>> the >>>>>>>>>>>>>>>>> difference in commits. 
Its fixes for the issues that >>>>>>>> led to the >>>>>>>>>>>>>>>>> cancellation of the previous RCs plus smaller fixes. >>>>>>>> Most >>>>>>>>>>>>>>>>> verification/testing that was carried out should apply >>>>>>>> as is to >>>>>>>>>>>> this >>>>>>>>>>>>>> RC. >>>>>>>>>>>>>>>>> Any functional verification that you did on previous >>>>>>>> RCs should >>>>>>>>>>>>>>> therefore >>>>>>>>>>>>>>>>> easily carry over to this one. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>> >>>> >>> >> |
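
Several voters above report checking the SHA-512 checksum of the source release (for example flink-1.8.0-src.tgz.sha512). For anyone repeating that step, here is a minimal sketch in Python. It is not taken from the thread: the function names are hypothetical, and it assumes the checksum file begins with the hex digest, optionally followed by the archive name.

```python
import hashlib
from pathlib import Path


def sha512_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-512 digest of a file, reading it in chunks."""
    digest = hashlib.sha512()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large archives do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum_file(archive: Path, checksum_file: Path) -> bool:
    """Compare an archive's digest with the first token of its checksum file."""
    # Assumption: the file starts with the hex digest; anything after it
    # (such as the archive name) is ignored.
    expected = checksum_file.read_text().split()[0].lower()
    return sha512_of(archive) == expected
```

A call such as `matches_checksum_file(Path("flink-1.8.0-src.tgz"), Path("flink-1.8.0-src.tgz.sha512"))` would then return True only if the digests agree. This covers the checksum only; verifying the GPG signature against the KEYS file is a separate step.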