[VOTE] Release 1.8.0, release candidate #4

Re: [VOTE] Release 1.8.0, release candidate #4

till.rohrmann
Thanks for reporting this problem and opening a JIRA issue. I've created a
fix for the problem [1].

[1] https://github.com/apache/flink/pull/8096

Cheers,
Till
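
For context, the firewall problem in the quoted report below comes down to the REST port being hard-coded to 0, which lets the operating system pick an arbitrary free port that a firewall rule cannot anticipate. A minimal Python sketch of the requested behaviour (take the first free port from a configured range) follows. It is not Flink's implementation: `parse_port_spec`, `choose_port`, the `is_free` callback and the "50100-50200" spec syntax are hypothetical names chosen for illustration.

```python
from typing import Callable, Iterator


def parse_port_spec(spec: str) -> Iterator[int]:
    """Yield candidate ports from a spec such as "8081" or "50100-50200,50300"."""
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            low, high = (int(p) for p in part.split("-", 1))
        else:
            low = high = int(part)
        if not (0 <= low <= high <= 65535):
            raise ValueError(f"invalid port range: {part!r}")
        yield from range(low, high + 1)


def choose_port(spec: str, is_free: Callable[[int], bool]) -> int:
    """Return the first candidate port that the is_free callback accepts."""
    for port in parse_port_spec(spec):
        if is_free(port):
            return port
    raise RuntimeError(f"no free port in {spec!r}")
```

With a range like this, a firewall only has to allow the configured ports instead of every port the operating system might hand out.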

On Mon, Apr 1, 2019 at 12:30 AM Richard Deurwaarder <[hidden email]> wrote:

> Hello @Aljoscha and @Rong,
>
> I've described the problem in the mailing list[1] and on stackoverflow[2]
> before. But the gist is: If there's a firewall between the yarn cluster and
> the machine submitting the job, we need to be able to set a fixed port (or
> range of ports) for REST communication with the jobmanager.
>
> It is a regression in the sense that on 1.5 (and 1.6 I believe?) it was
> possible to work around this by using the legacy mode (non flip-6), but on
> 1.7 and now 1.8 this is not possible.
>
> I've created FLINK-12075 <
> https://issues.apache.org/jira/browse/FLINK-12075>
> for it, I have not made it blocking yet as it is not strictly a regression
> with regards to 1.7. Perhaps you guys can better determine if you want this
> added in RC5.
>
> Regards,
>
> Richard
>
> [1]
>
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Submitting-job-to-Flink-on-yarn-timesout-on-flip-6-1-5-x-td26199.html#a26383
> [2] https://stackoverflow.com/q/54771637/988324
>
> On Sat, Mar 30, 2019 at 7:24 PM Rong Rong <[hidden email]> wrote:
>
> > Hi @Aljoscha,
> >
> > Based on the previous commit [1] that adds the random port selection code,
> > it seems like the important part is to unset whatever 'rest.port' setting
> > was previously made. I don't think the current way of setting the BIND_PORT
> > actually overrides any existing PORT setting. However, I wasn't able to
> > find any test that is related, maybe @Till can provide more insight here?
> >
> > Maybe @Richard can provide more detail on the YARN run command used to
> > reproduce the problem?
> >
> > Thanks,
> > Rong
> >
> > [1]
> >
> >
> https://github.com/apache/flink/commit/dbe0e8286d76a5facdb49589b638b87dbde80178#diff-487838863ab693af7008f04cb3359be3R117
> >
> > On Sat, Mar 30, 2019 at 5:51 AM Aljoscha Krettek <[hidden email]>
> > wrote:
> >
> > > @Richard Did this work for you previously? From the change, it seems
> that
> > > the port was always set to 0 on YARN even before.
> > >
> > > > On 28. Mar 2019, at 16:13, Richard Deurwaarder <[hidden email]>
> > wrote:
> > > >
> > > > -1 (non-binding)
> > > >
> > > > - Ran integration tests locally (1000+) of our flink job, all
> > succeeded.
> > > > - Attempted to run job on hadoop, failed. It failed because we have a
> > > > firewall in place and we cannot set the rest port to a specific
> > port/port
> > > > range.
> > > > Unless I am mistaken, it seems like FLINK-11081 broke the possibility
> > of
> > > > setting a REST port when running on yarn (
> > > >
> > >
> >
> https://github.com/apache/flink/commit/730eed71ef3f718d61f85d5e94b1060844ca56db#diff-487838863ab693af7008f04cb3359be3R102
> > > > )
> > > > Code-wise it seems rather straightforward to fix but I am unsure
> about
> > > the
> > > > reason why this is hard-coded to 0 and what the impact would be.
> > > >
> > > > It would benefit us greatly if a fix for this could make it to 1.8.0.
> > > >
> > > > Regards,
> > > >
> > > > Richard
> > > >
> > > > On Thu, Mar 28, 2019 at 9:54 AM Tzu-Li (Gordon) Tai <
> > [hidden email]
> > > >
> > > > wrote:
> > > >
> > > >> +1 (binding)
> > > >>
> > > >> Functional checks:
> > > >>
> > > >> - Built Flink from source (`mvn clean verify`) locally, with success
> > > >> - Ran end-to-end tests locally for 5 times in a loop, no attempts
> > failed
> > > >> (Hadoop 2.8.4, Scala 2.12)
> > > >> - Manually tested state schema evolution for POJO. Besides the tests
> > > that
> > > >> @Congxian already did, additionally tested evolution cases with POJO
> > > >> subclasses + non-registered POJOs.
> > > >> - Manually tested migration of Scala stateful jobs that use case
> > > classes /
> > > >> Scala collections as state types, performing the migration across
> > Scala
> > > >> 2.11 to Scala 2.12.
> > > >> - Reviewed release announcement PR
> > > >>
> > > >> Misc / legal checks:
> > > >>
> > > >> - checked checksums and signatures
> > > >> - No binaries in source distribution
> > > >> - Staging area does not seem to have any missing artifacts
> > > >>
> > > >> Cheers,
> > > >> Gordon
> > > >>
> > > >> On Thu, Mar 28, 2019 at 4:52 PM Tzu-Li (Gordon) Tai <
> > > [hidden email]>
> > > >> wrote:
> > > >>
> > > >>> @Shaoxuan
> > > >>>
> > > >>> The drop in the serializerAvro benchmark, as explained earlier in
> > > >> previous
> > > >>> voting threads of earlier RCs, was due to a slower job
> initialization
> > > >> phase
> > > >>> caused by slower deserialization of the AvroSerializer.
> > > >>> Piotr also pointed out that after the number of records was
> increased
> > > in
> > > >>> the serializer benchmarks, this drop was no longer observable
> before
> > /
> > > >>> after the changes in mid February.
> > > >>> IMO, this is not critical as it does not affect the per-record
> > > >> performance
> > > >>> / throughput, and therefore should not block this release.
> > > >>>
> > > >>> On Thu, Mar 28, 2019 at 1:08 AM Aljoscha Krettek <
> > > [hidden email]>
> > > >>> wrote:
> > > >>>
> > > >>>> By now, I'm reasonably sure that the test instabilities on the
> > > >> end-to-end
> > > >>>> test are only instabilities. I pushed changes to increase timeouts
> > to
> > > >> make
> > > >>>> the tests more stable. As in any project, there will always be
> bugs
> > > but
> > > >> I
> > > >>>> think we could release this RC4 and be reasonably sure that it
> works
> > > >> well.
> > > >>>>
> > > >>>> Now, we only need to have the required number of PMC votes.
> > > >>>>
> > > >>>> On Wed, Mar 27, 2019, at 07:22, Congxian Qiu wrote:
> > > >>>>> +1 (non-binding)
> > > >>>>>
> > > >>>>> • checked signature and checksum  ok
> > > >>>>> • mvn clean package -DskipTests ok
> > > >>>>> • Run job on yarn ok
> > > >>>>> • Test state migration with POJO type (both heap and rocksdb) ok
> > > >>>>> • - 1.6 -> 1.8
> > > >>>>> • - 1.7 -> 1.8
> > > >>>>> • - 1.8 -> 1.8
> > > >>>>>
> > > >>>>>
> > > >>>>> Best, Congxian
> > > >>>>> On Mar 27, 2019, 10:26 +0800, vino yang <[hidden email]>,
> > > >> wrote:
> > > >>>>>> +1 (non-binding)
> > > >>>>>>
> > > >>>>>> - checked JIRA release note
> > > >>>>>> - ran "mvn package -DskipTests"
> > > >>>>>> - checked signature and checksum
> > > >>>>>> - started a cluster locally and ran some examples in binary
> > > >>>>>> - checked web site announcement's PR
> > > >>>>>>
> > > >>>>>> Best,
> > > >>>>>> Vino
> > > >>>>>>
> > > >>>>>>
> > > > >>>>>> On Tue, Mar 26, 2019 at 8:20 PM, Xiaowei Jiang <[hidden email]> wrote:
> > > >>>>>>
> > > >>>>>>> +1 (non-binding)
> > > >>>>>>>
> > > >>>>>>> - checked checksums and GPG files
> > > > >>>>>>> - build from source successfully
> > > > >>>>>>> - run end-to-end precommit tests successfully
> > > > >>>>>>> - run end-to-end nightly tests successfully
> > > >>>>>>> Xiaowei
> > > >>>>>>> On Tuesday, March 26, 2019, 8:09:19 PM GMT+8, Yu Li <
> > > >>>> [hidden email]>
> > > >>>>>>> wrote:
> > > >>>>>>>
> > > >>>>>>> +1 (non-binding)
> > > >>>>>>>
> > > >>>>>>> - Checked release notes: OK
> > > >>>>>>> - Checked sums and signatures: OK
> > > >>>>>>> - Source release
> > > >>>>>>> - contains no binaries: OK
> > > >>>>>>> - contains no 1.8-SNAPSHOT references: OK
> > > >>>>>>> - build from source: OK (8u101)
> > > >>>>>>> - mvn clean verify: OK (8u101)
> > > >>>>>>> - Binary release
> > > >>>>>>> - no examples appear to be missing
> > > >>>>>>> - started a cluster; WebUI reachable, example ran successfully
> > > >>>>>>> - end-to-end test (all but K8S and docker ones): OK (8u101)
> > > >>>>>>> - Repository appears to contain all expected artifacts
> > > >>>>>>>
> > > >>>>>>> Best Regards,
> > > >>>>>>> Yu
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>> On Tue, 26 Mar 2019 at 14:28, Kurt Young <[hidden email]>
> > > >> wrote:
> > > >>>>>>>
> > > >>>>>>>> +1 (non-binding)
> > > >>>>>>>>
> > > >>>>>>>> Checked items:
> > > >>>>>>>> - checked checksums and GPG files
> > > > >>>>>>>> - verified that the source archives do not contain any binaries
> > > >>>>>>>> - checked that all POM files point to the same version
> > > >>>>>>>> - build from source successfully
> > > >>>>>>>>
> > > >>>>>>>> Best,
> > > >>>>>>>> Kurt
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> On Tue, Mar 26, 2019 at 10:57 AM Shaoxuan Wang <
> > > >>>> [hidden email]>
> > > >>>>>>>> wrote:
> > > >>>>>>>>
> > > >>>>>>>>> +1 (non-binding)
> > > >>>>>>>>>
> > > >>>>>>>>> I tested RC4 with the following items:
> > > >>>>>>>>> - Maven Central Repository contains all artifacts
> > > >>>>>>>>> - Built the source with Maven (ensured all source files have
> > > >>>> Apache
> > > >>>>>>>>> headers), and executed built-in tests via "mvn clean verify"
> > > >>>>>>>>> - Manually executed the tests in IntelliJ IDE
> > > >>>>>>>>> - Verify that the quickstarts for Scala and Java are working
> > > >>>> with the
> > > >>>>>>>>> staging repository in IntelliJ
> > > > >>>>>>>>> - Checked the benchmark results. The perf regressions of
> > > > >>>>>>>>> tuple-key-by/statebackend/tumblingWindow are gone, but the
> > > > >>>>>>>>> regression on serializer still exists.
> > > >>>>>>>>>
> > > >>>>>>>>> Regards,
> > > >>>>>>>>> Shaoxuan
> > > >>>>>>>>>
> > > >>>>>>>>> On Tue, Mar 26, 2019 at 8:06 AM jincheng sun <
> > > >>>> [hidden email]
> > > >>>>>>>>
> > > >>>>>>>>> wrote:
> > > >>>>>>>>>
> > > > >>>>>>>>>> Hi Aljoscha, I think you are right: increasing the timeout config
> > > > >>>>>>>>>> should fix this issue, since it depends on the resources available
> > > > >>>>>>>>>> on Travis. I would like to share some observations from my own
> > > > >>>>>>>>>> testing (not a Flink problem) :-)
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> During my testing, `mvn clean verify` and the nightly end-to-end
> > > > >>>>>>>>>> tests both consumed a lot of machine resources (especially memory
> > > > >>>>>>>>>> and network), and the network bandwidth requirements of the nightly
> > > > >>>>>>>>>> end-to-end tests are also very high. In China I need to use VPN
> > > > >>>>>>>>>> acceleration (100~200Kb before acceleration, 3~4Mb after
> > > > >>>>>>>>>> acceleration). Without it I encountered: [Avro Confluent Schema
> > > > >>>>>>>>>> Registry nightly end-to-end test' failed after 18 minutes and 15
> > > > >>>>>>>>>> seconds! Test exited with exit Code 1]. The test took more than 18
> > > > >>>>>>>>>> minutes and the download failed because the network bandwidth was
> > > > >>>>>>>>>> not sufficient; it runs smoothly when using VPN acceleration. The
> > > > >>>>>>>>>> overall end-to-end run passed twice. The Docker resource
> > > > >>>>>>>>>> configuration was CPUs: 7, Mem: 28.7G, Swap: 3.5G. See the detailed
> > > > >>>>>>>>>> log here
> > > > >>>>>>>>>> <https://docs.google.com/document/d/1CcyTCyZmMmP57pkKv4drjSuxW61_u78HR3q1fJJODMw/edit?usp=sharing>.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Just now, I checked Travis for your last commit (Increase startup
> > > > >>>>>>>>>> timeout in end-to-end tests): apart from the Cleanup phase, the
> > > > >>>>>>>>>> other phases were successful. See here
> > > > >>>>>>>>>> <https://travis-ci.org/apache/flink/builds/511071777>
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> To verify that our speculation is accurate, I am running builds on
> > > > >>>>>>>>>> my repo with 10-second and 20-second timeout configs to see whether
> > > > >>>>>>>>>> the timeout problem recurs 100% of the time. They are already
> > > > >>>>>>>>>> running; we are waiting for the results.
> > > > >>>>>>>>>> 10seconds <https://travis-ci.org/sunjincheng121/flink/builds/511235749>
> > > > >>>>>>>>>> 20seconds <https://travis-ci.org/sunjincheng121/flink/builds/511235598>
> > > >>>>>>>>>>
> > > >>>>>>>>>> Best,
> > > >>>>>>>>>> Jincheng
> > > >>>>>>>>>>
> > > > >>>>>>>>>> On Tue, Mar 26, 2019 at 1:04 AM, Aljoscha Krettek <[hidden email]> wrote:
> > > >>>>>>>>>>
> > > >>>>>>>>>>> Thanks for the testing done so far!
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> There has been quite some flakiness on Travis lately, see
> > > >>>> here:
> > > >>>>>>>>>>> https://travis-ci.org/apache/flink/branches <
> > > >>>>>>>>>>> https://travis-ci.org/apache/flink/branches>. I’m a bit
> > > >>>> hesitant
> > > >>>>>>> to
> > > >>>>>>>>>>> release in this state. Looking at the tests you can see
> > > >>>> that all of
> > > >>>>>>>> the
> > > >>>>>>>>>>> end-to-end tests fail because waiting for the dispatcher
> > > >> to
> > > >>>> come up
> > > >>>>>>>>> times
> > > >>>>>>>>>>> out. I also noticed that this usually takes about 5-8
> > > >>>> seconds on
> > > >>>>>>>>> Travis,
> > > >>>>>>>>>> so
> > > >>>>>>>>>>> a 10 second timeout might be a bit low. I pushed commits
> > > >> to
> > > >>>>>>> increase
> > > >>>>>>>>> that
> > > >>>>>>>>>>> to 20 secs. Let’s see what will happen.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> I’ll keep you posted!
> > > >>>>>>>>>>> Aljoscha
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> On 25. Mar 2019, at 13:13, jincheng sun <
> > > >>>>>>> [hidden email]>
> > > >>>>>>>>>>> wrote:
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Great thanks for preparing the RC4 of Flink 1.8.0,
> > > >>>> Aljoscha!
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> +1 (non-binding)
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> I checked the functional things as follows(Without
> > > >>>> performance
> > > >>>>>>>>>>>> verification):
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> 1. Checking Artifacts:
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> 1). Download the release source code - SUCCESS
> > > >>>>>>>>>>>> 2). Check Source release flink-1.8.0-src.tgz.sha512 -
> > > >>>> SUCCESS
> > > >>>>>>>>>>>> 3). Download the released JAR - SUCCESS
> > > >>>>>>>>>>>> 4). Check if checksums and GPG files match the
> > > >>>> corresponding
> > > >>>>>>>>> release
> > > >>>>>>>>>>>> files - SUCCESS.
> > > >>>>>>>>>>>> 5). Verify that the source archives do not contain any
> > > >>>>>>> binaries
> > > >>>>>>>> -
> > > >>>>>>>>>>>> SUCCESS.
> > > >>>>>>>>>>>> 6). Build the source with `mvn clean verify -DskipTests`
> > > >>>> to
> > > >>>>>>>> ensure
> > > >>>>>>>>>> all
> > > >>>>>>>>>>>> source files have Apache headers - SUCCESS
> > > >>>>>>>>>>>> 7). Check that all POM files point to the same version -
> > > >>>>>>> SUCCESS
> > > >>>>>>>>>>>> 8). Read the `README.md` file to ensure there is nothing
> > > >>>>>>>>> unexpected
> > > >>>>>>>>>> -
> > > >>>>>>>>>>>> SUCCESS
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> 2. Testing Larger Setups
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Cluster Environment:7 nodes, jm 1024m, tm 4096m
> > > >>>>>>>>>>>> Testing Jobs: WordCount(Batch&Streaming),
> > > >>>>>>>>>> DataStreamAllroundTestProgram
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> 1). Use local&hdfs file systems for checkpoints -
> > > >> SUCCESS
> > > >>>>>>>>>>>> 2). Use hdfs file systems for input/output -SUCCESS
> > > >>>>>>>>>>>> 3). Run examples on YARN(with or without session) -
> > > >>>> SUCCESS
> > > >>>>>>>>>>>> 4). Test failover and recovery. - SUCCESS
> > > >>>>>>>>>>>> 5). Test incremental&non-incremental checkpoint -
> > > >> SUCCESS
> > > >>>>>>>>>>>> 6). Test connector - kafka -SUCCESS
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> 3. Testing Functionality
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> 1). Built-in tests(linux&mac os)
> > > > >>>>>>>>>>>> - `mvn clean verify` (some test timeout errors and a test case
> > > > >>>>>>>>>>>> bug, see FLINK-12001
> > > > >>>>>>>>>>>> <https://issues.apache.org/jira/browse/FLINK-12001>; none of them
> > > > >>>>>>>>>>>> is a blocker)
> > > >>>>>>>>>>>> - build for scala 2.11(mvn clean install -P scala-2.11
> > > >>>>>>>>>> -DskipTests)
> > > >>>>>>>>>>>> - SUCCESS
> > > >>>>>>>>>>>> - Run the scripted nightly end-to-end test - SUCCESS
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> 2). Quickstarts
> > > > >>>>>>>>>>>> - Verify that the quickstarts for Scala work with the staging
> > > > >>>>>>>>>>>> repository in IntelliJ - SUCCESS
> > > > >>>>>>>>>>>> - Verify that the quickstarts for Java work with the staging
> > > > >>>>>>>>>>>> repository in IntelliJ - SUCCESS
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> 3). Simple Starter Experience and Use Cases
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> - run all examples from IntelliJ IDE - SUCCESS
> > > >>>>>>>>>>>> - Start a local cluster and verify that the processes -
> > > >>>>>>>> SUCCESS
> > > >>>>>>>>>>>> a. Examine the *.out files (should be empty) and the log
> > > >>>>>>>> files
> > > >>>>>>>>>>>> (should contain no exceptions)
> > > >>>>>>>>>>>> b. Test for Linux, MacOS
> > > >>>>>>>>>>>> c. Shutdown and verify there are no exceptions in the
> > > >> log
> > > >>>>>>>>> output
> > > >>>>>>>>>>>> (after shutdown)
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> - Verify that the examples are running from both
> > > >>>> ./bin/flink
> > > >>>>>>>> and
> > > >>>>>>>>>>> from
> > > >>>>>>>>>>>> the web-based job submission tool(following items) -
> > > >>>> SUCCESS
> > > >>>>>>>>>>>> a. Start multiple task managers in the local cluster
> > > > >>>>>>>>>>>> b. Change the flink-conf.yaml to define more than one task slot (2)
> > > >>>>>>>>>>>> c. Run the examples with a parallelism > 1
> > > >>>>>>>>>>>> d. Examine the log output - no error messages should be
> > > >>>>>>>>>>> encountered
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> 4. Review the PR
> > > > >>>>>>>>>>>> - [Add 1.8 Release Blog Post] - Just a reminder: update the
> > > > >>>>>>>>>>>> release date to the correct date before merging.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Cheers,
> > > >>>>>>>>>>>> Jincheng
> > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> On Mon, Mar 25, 2019 at 4:11 PM, Piotr Nowojski <[hidden email]> wrote:
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>> +1 from my side. Previously spotted performance
> > > >>>> regression seems
> > > >>>>>>>> to
> > > >>>>>>>>> be
> > > >>>>>>>>>>>>> gone, or mostly gone.
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> Piotrek
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> On 21 Mar 2019, at 17:52, Aljoscha Krettek <
> > > >>>>>>> [hidden email]>
> > > >>>>>>>>>>> wrote:
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> Hi everyone,
> > > >>>>>>>>>>>>>> Please review and vote on the release candidate 4
> > > >> for
> > > >>>> Flink
> > > >>>>>>>> 1.8.0,
> > > >>>>>>>>> as
> > > >>>>>>>>>>>>> follows:
> > > >>>>>>>>>>>>>> [ ] +1, Approve the release
> > > >>>>>>>>>>>>>> [ ] -1, Do not approve the release (please provide
> > > >>>> specific
> > > >>>>>>>>> comments)
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> The complete staging area is available for your
> > > >>>> review, which
> > > >>>>>>>>>> includes:
> > > >>>>>>>>>>>>>> * JIRA release notes [1],
> > > >>>>>>>>>>>>>> * the official Apache source release and binary
> > > >>>> convenience
> > > >>>>>>>>> releases
> > > >>>>>>>>>> to
> > > >>>>>>>>>>>>> be deployed to dist.apache.org [2], which are signed
> > > >>>> with the
> > > >>>>>>> key
> > > >>>>>>>>>> with
> > > >>>>>>>>>>>>> fingerprint F2A67A8047499BBB3908D17AA8F4FD97121D7293
> > > >>>> [3],
> > > >>>>>>>>>>>>>> * all artifacts to be deployed to the Maven Central
> > > >>>> Repository
> > > >>>>>>>> [4],
> > > >>>>>>>>>>>>>> * source code tag "release-1.8.0-rc4" [5],
> > > >>>>>>>>>>>>>> * website pull request listing the new release [6]
> > > >>>>>>>>>>>>>> * website pull request adding announcement blog post
> > > >>>> [7].
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> The vote will be open for at least 72 hours. It is
> > > >>>> adopted by
> > > >>>>>>>>>> majority
> > > >>>>>>>>>>>>> approval, with at least 3 PMC affirmative votes.
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> Thanks,
> > > >>>>>>>>>>>>>> Aljoscha
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> [1]
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>
> > > >>>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>
> > > >>>>
> > > >>
> > >
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12344274
> > > >>>>>>>>>>>>>> [2]
> > > >>>>>>>> https://dist.apache.org/repos/dist/dev/flink/flink-1.8.0-rc4/
> > > >>>>>>>>>>>>>> [3]
> > > >>>> https://dist.apache.org/repos/dist/release/flink/KEYS
> > > >>>>>>>>>>>>>> [4]
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>
> > > >>>>>>>
> > > >>>>
> > > https://repository.apache.org/content/repositories/orgapacheflink-1215
> > > >>>>>>>>>>>>>> [5]
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>
> > > >>>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>
> > > >>>>
> > > >>
> > >
> >
> https://gitbox.apache.org/repos/asf?p=flink.git;a=tag;h=c650befc10c8bb6cc4b007ae250b7b2173046145
> > > >>>>>>>>>>>>>> [6] https://github.com/apache/flink-web/pull/180 <
> > > >>>>>>>>>>>>> https://github.com/apache/flink-web/pull/180>
> > > >>>>>>>>>>>>>> [7] https://github.com/apache/flink-web/pull/179 <
> > > >>>>>>>>>>>>> https://github.com/apache/flink-web/pull/179>
> > > >>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>> P.S. The difference to the previous RCs is small, you can fetch
> > > > >>>>>>>>>>>>>> the tags and do a "git log release-1.8.0-rc1..release-1.8.0-rc4"
> > > > >>>>>>>>>>>>>> to see the difference in commits. It's fixes for the issues that
> > > > >>>>>>>>>>>>>> led to the cancellation of the previous RCs plus smaller fixes.
> > > > >>>>>>>>>>>>>> Most verification/testing that was carried out should apply as is
> > > > >>>>>>>>>>>>>> to this RC. Any functional verification that you did on previous
> > > > >>>>>>>>>>>>>> RCs should therefore easily carry over to this one.
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>
> > > >>>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>
> > > >>>>>
> > > >>>>
> > > >>>
> > > >>
> > >
> > >
> >
>
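
Several of the votes quoted above report checking the `.sha512` checksums of the release artifacts. A minimal Python sketch of that check is shown here. It assumes the checksum file starts with the hex digest, as `sha512sum` output does; the function names are illustrative, and verifying the GPG signature is a separate step that this sketch does not cover.

```python
import hashlib
from pathlib import Path


def sha512_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-512 hex digest of a file without loading it whole."""
    digest = hashlib.sha512()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum_file(artifact: Path, checksum_file: Path) -> bool:
    """Compare an artifact against the digest stored in its checksum file.

    Assumption: the first whitespace-separated token of the checksum file
    is the expected hex digest.
    """
    expected = checksum_file.read_text().split()[0].lower()
    return sha512_of(artifact) == expected
```

For example, `matches_checksum_file(Path("flink-1.8.0-src.tgz"), Path("flink-1.8.0-src.tgz.sha512"))` would return `True` only if the downloaded archive is byte-identical to the one the digest was computed from.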

[CANCEL][VOTE] Release 1.8.0, release candidate #4

Aljoscha Krettek-2
I’m hereby canceling the vote for Flink 1.8.0 RC4 in favour of a new RC that I will create shortly now that the blockers are resolved.
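
The vote cancelled here was running under the rule stated in the original vote mail: majority approval with at least 3 affirmative PMC votes. A rough Python sketch of that rule follows. It assumes that only binding (PMC) votes count toward the majority, which the thread itself does not spell out; the `Vote` type and `vote_passes` function are hypothetical names for illustration.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Vote:
    voter: str
    value: int     # +1 to approve, -1 to reject
    binding: bool  # True if cast by a PMC member


def vote_passes(votes: Iterable[Vote], min_binding_plus_ones: int = 3) -> bool:
    """Apply the rule: enough binding +1 votes and more binding +1 than -1."""
    # Assumption: non-binding votes are advisory and do not affect the outcome.
    binding = [v for v in votes if v.binding]
    plus = sum(1 for v in binding if v.value > 0)
    minus = sum(1 for v in binding if v.value < 0)
    return plus >= min_binding_plus_ones and plus > minus
```

Under this reading, a non-binding -1 does not block a release on its own, although the release manager can still cancel the vote, as happened here.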

> On 1. Apr 2019, at 13:21, Till Rohrmann <[hidden email]> wrote:
>
> Thanks for reporting this problem and opening a JIRA issue. I've created a
> fix for the problem [1].
>
> [1] https://github.com/apache/flink/pull/8096
>
> Cheers,
> Till
>
> On Mon, Apr 1, 2019 at 12:30 AM Richard Deurwaarder <[hidden email]> wrote:
>
>> Hello @Aljoscha and @Rong,
>>
>> I've described the problem in the mailing list[1] and on stackoverflow[2]
>> before. But the gist is: If there's a firewall between the yarn cluster and
>> the machine submitting the job, we need to be able to set a fixed port (or
>> range of ports) for REST communication with the jobmanager.
>>
>> It is a regression in the sense that on 1.5 (and 1.6 I believe?) it was
>> possible to work around this by using the legacy mode (non flip-6), but on
>> 1.7 and now 1.8 this is not possible.
>>
>> I've created FLINK-12075 <
>> https://issues.apache.org/jira/browse/FLINK-12075>
>> for it, I have not made it blocking yet as it is not strictly a regression
>> with regards to 1.7. Perhaps you guys can better determine if you want this
>> added in RC5.
>>
>> Regards,
>>
>> Richard
>>
>> [1]
>>
>> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Submitting-job-to-Flink-on-yarn-timesout-on-flip-6-1-5-x-td26199.html#a26383
>> [2] https://stackoverflow.com/q/54771637/988324
>>
>> On Sat, Mar 30, 2019 at 7:24 PM Rong Rong <[hidden email]> wrote:
>>
>>> Hi @Aljoscha,
>>>
>>> Based on the previous commit [1] that adds the random port selection
>> code,
>>> it seems like the important part is to unset whatever 'rest.port' setting
>>> previously done. I don't think the current way of setting the BIND_PORT
>>> actually overrides any existing PORT setting. However, I wasn't able to
>>> find any test that is related, maybe @Till can provide more insight here?
>>>
>>> Maybe @Richard can provide more detail on the YARN run command used to
>>> reproduce the problem?
>>>
>>> Thanks,
>>> Rong
>>>
>>> [1]
>>>
>>>
>> https://github.com/apache/flink/commit/dbe0e8286d76a5facdb49589b638b87dbde80178#diff-487838863ab693af7008f04cb3359be3R117
>>>
>>> On Sat, Mar 30, 2019 at 5:51 AM Aljoscha Krettek <[hidden email]>
>>> wrote:
>>>
>>>> @Richard Did this work for you previously? From the change, it seems
>> that
>>>> the port was always set to 0 on YARN even before.
>>>>
>>>>> On 28. Mar 2019, at 16:13, Richard Deurwaarder <[hidden email]>
>>> wrote:
>>>>>
>>>>> -1 (non-binding)
>>>>>
>>>>> - Ran integration tests locally (1000+) of our flink job, all
>>> succeeded.
>>>>> - Attempted to run job on hadoop, failed. It failed because we have a
>>>>> firewall in place and we cannot set the rest port to a specific
>>> port/port
>>>>> range.
>>>>> Unless I am mistaken, it seems like FLINK-11081 broke the possibility
>>> of
>>>>> setting a REST port when running on yarn (
>>>>>
>>>>
>>>
>> https://github.com/apache/flink/commit/730eed71ef3f718d61f85d5e94b1060844ca56db#diff-487838863ab693af7008f04cb3359be3R102
>>>>> )
>>>>> Code-wise it seems rather straightforward to fix but I am unsure
>> about
>>>> the
>>>>> reason why this is hard-coded to 0 and what the impact would be.
>>>>>
>>>>> It would benefit us greatly if a fix for this could make it to 1.8.0.
>>>>>
>>>>> Regards,
>>>>>
>>>>> Richard
>>>>>
>>>>> On Thu, Mar 28, 2019 at 9:54 AM Tzu-Li (Gordon) Tai <
>>> [hidden email]
>>>>>
>>>>> wrote:
>>>>>
>>>>>> +1 (binding)
>>>>>>
>>>>>> Functional checks:
>>>>>>
>>>>>> - Built Flink from source (`mvn clean verify`) locally, with success
>>>>>> - Ran end-to-end tests locally for 5 times in a loop, no attempts
>>> failed
>>>>>> (Hadoop 2.8.4, Scala 2.12)
>>>>>> - Manually tested state schema evolution for POJO. Besides the tests
>>>> that
>>>>>> @Congxian already did, additionally tested evolution cases with POJO
>>>>>> subclasses + non-registered POJOs.
>>>>>> - Manually tested migration of Scala stateful jobs that use case
>>>> classes /
>>>>>> Scala collections as state types, performing the migration across
>>> Scala
>>>>>> 2.11 to Scala 2.12.
>>>>>> - Reviewed release announcement PR
>>>>>>
>>>>>> Misc / legal checks:
>>>>>>
>>>>>> - checked checksums and signatures
>>>>>> - No binaries in source distribution
>>>>>> - Staging area does not seem to have any missing artifacts
>>>>>>
>>>>>> Cheers,
>>>>>> Gordon
>>>>>>
>>>>>> On Thu, Mar 28, 2019 at 4:52 PM Tzu-Li (Gordon) Tai <
>>>> [hidden email]>
>>>>>> wrote:
>>>>>>
>>>>>>> @Shaoxuan
>>>>>>>
>>>>>>> The drop in the serializerAvro benchmark, as explained earlier in
>>>>>> previous
>>>>>>> voting threads of earlier RCs, was due to a slower job
>> initialization
>>>>>> phase
>>>>>>> caused by slower deserialization of the AvroSerializer.
>>>>>>> Piotr also pointed out that after the number of records was
>> increased
>>>> in
>>>>>>> the serializer benchmarks, this drop was no longer observable
>> before
>>> /
>>>>>>> after the changes in mid February.
>>>>>>> IMO, this is not critical as it does not affect the per-record
>>>>>> performance
>>>>>>> / throughput, and therefore should not block this release.
>>>>>>>
>>>>>>> On Thu, Mar 28, 2019 at 1:08 AM Aljoscha Krettek <
>>>> [hidden email]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> By now, I'm reasonably sure that the test instabilities on the
>>>>>> end-to-end
>>>>>>>> test are only instabilities. I pushed changes to increase timeouts
>>> to
>>>>>> make
>>>>>>>> the tests more stable. As in any project, there will always be
>> bugs
>>>> but
>>>>>> I
>>>>>>>> think we could release this RC4 and be reasonably sure that it
>> works
>>>>>> well.
>>>>>>>>
>>>>>>>> Now, we only need to have the required number of PMC votes.
>>>>>>>>
>>>>>>>> On Wed, Mar 27, 2019, at 07:22, Congxian Qiu wrote:
>>>>>>>>> +1 (non-binding)
>>>>>>>>>
>>>>>>>>> • checked signature and checksum  ok
>>>>>>>>> • mvn clean package -DskipTests ok
>>>>>>>>> • Run job on yarn ok
>>>>>>>>> • Test state migration with POJO type (both heap and rocksdb) ok
>>>>>>>>> • - 1.6 -> 1.8
>>>>>>>>> • - 1.7 -> 1.8
>>>>>>>>> • - 1.8 -> 1.8
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Best, Congxian
>>>>>>>>> On Mar 27, 2019, 10:26 +0800, vino yang <[hidden email]>,
>>>>>> wrote:
>>>>>>>>>> +1 (non-binding)
>>>>>>>>>>
>>>>>>>>>> - checked JIRA release note
>>>>>>>>>> - ran "mvn package -DskipTests"
>>>>>>>>>> - checked signature and checksum
>>>>>>>>>> - started a cluster locally and ran some examples in binary
>>>>>>>>>> - checked web site announcement's PR
>>>>>>>>>>
>>>>>>>>>> Best,
>>>>>>>>>> Vino
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Xiaowei Jiang <[hidden email]> 于2019年3月26日周二 下午8:20写道:
>>>>>>>>>>
>>>>>>>>>>> +1 (non-binding)
>>>>>>>>>>>
>>>>>>>>>>> - checked checksums and GPG files
>>>>>>>>>>> - build from source successfully- run end-to-end precommit
>> tests
>>>>>>>>>>> successfully- run end-to-end nightly tests successfully
>>>>>>>>>>> Xiaowei
>>>>>>>>>>> On Tuesday, March 26, 2019, 8:09:19 PM GMT+8, Yu Li <
>>>>>>>> [hidden email]>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>> +1 (non-binding)
>>>>>>>>>>>
>>>>>>>>>>> - Checked release notes: OK
>>>>>>>>>>> - Checked sums and signatures: OK
>>>>>>>>>>> - Source release
>>>>>>>>>>> - contains no binaries: OK
>>>>>>>>>>> - contains no 1.8-SNAPSHOT references: OK
>>>>>>>>>>> - build from source: OK (8u101)
>>>>>>>>>>> - mvn clean verify: OK (8u101)
>>>>>>>>>>> - Binary release
>>>>>>>>>>> - no examples appear to be missing
>>>>>>>>>>> - started a cluster; WebUI reachable, example ran successfully
>>>>>>>>>>> - end-to-end test (all but K8S and docker ones): OK (8u101)
>>>>>>>>>>> - Repository appears to contain all expected artifacts
>>>>>>>>>>>
>>>>>>>>>>> Best Regards,
>>>>>>>>>>> Yu
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Tue, 26 Mar 2019 at 14:28, Kurt Young <[hidden email]>
>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> +1 (non-binding)
>>>>>>>>>>>>
>>>>>>>>>>>> Checked items:
>>>>>>>>>>>> - checked checksums and GPG files
>>>>>>>>>>>> - verified that the source archives do not contains any
>> binaries
>>>>>>>>>>>> - checked that all POM files point to the same version
>>>>>>>>>>>> - build from source successfully
>>>>>>>>>>>>
>>>>>>>>>>>> Best,
>>>>>>>>>>>> Kurt
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Mar 26, 2019 at 10:57 AM Shaoxuan Wang <
>>>>>>>> [hidden email]>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> +1 (non-binding)
>>>>>>>>>>>>>
>>>>>>>>>>>>> I tested RC4 with the following items:
>>>>>>>>>>>>> - Maven Central Repository contains all artifacts
>>>>>>>>>>>>> - Built the source with Maven (ensured all source files have
>>>>>>>> Apache
>>>>>>>>>>>>> headers), and executed built-in tests via "mvn clean verify"
>>>>>>>>>>>>> - Manually executed the tests in IntelliJ IDE
>>>>>>>>>>>>> - Verify that the quickstarts for Scala and Java are working
>>>>>>>> with the
>>>>>>>>>>>>> staging repository in IntelliJ
>>>>>>>>>>>>> - Checked the benchmark results. The perf regression of
>>>>>>>>>>>>> tuple-key-by/statebackend/tumblingWindow are gone, but the
>>>>>>>> regression
>>>>>>>>>>> on
>>>>>>>>>>>>> serializer still exists.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>> Shaoxuan
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Tue, Mar 26, 2019 at 8:06 AM jincheng sun <
>>>>>>>> [hidden email]
>>>>>>>>>>>>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Aljoscha, I think you are right: increasing the timeout config
>>>>>>>>>>>>>> will fix this issue. This depends on the resources available on
>>>>>>>>>>>>>> Travis. I would like to share some observations from my own testing
>>>>>>>>>>>>>> (not a Flink problem) as follows :-)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> During my testing, `mvn clean verify` and `nightly end-to-end test`
>>>>>>>>>>>>>> both consumed a lot of machine resources (especially
>>>>>>>>>>>>>> memory/network), and the network bandwidth requirements of
>>>>>>>>>>>>>> `nightly end-to-end test` are also very high. In China, I need to
>>>>>>>>>>>>>> use VPN acceleration (100~200Kb before acceleration, 3~4Mb after
>>>>>>>>>>>>>> acceleration). Without it I encountered: [Avro Confluent Schema
>>>>>>>>>>>>>> Registry nightly end-to-end test' failed after 18 minutes and 15
>>>>>>>>>>>>>> seconds! Test exited with exit Code 1]. The test took more than 18
>>>>>>>>>>>>>> minutes and the download failed because the network bandwidth was
>>>>>>>>>>>>>> not enough; it runs smoothly when using VPN acceleration. The
>>>>>>>>>>>>>> overall end-to-end run passed twice. The Docker resource
>>>>>>>>>>>>>> configuration was CPUs: 7, Mem: 28.7G, Swap: 3.5G. See the detailed
>>>>>>>>>>>>>> log here
>>>>>>>>>>>>>> <https://docs.google.com/document/d/1CcyTCyZmMmP57pkKv4drjSuxW61_u78HR3q1fJJODMw/edit?usp=sharing>.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I just checked the Travis build for your last commit (Increase
>>>>>>>>>>>>>> startup timeout in end-to-end tests): apart from the Cleanup phase,
>>>>>>>>>>>>>> the other phases are successful. See here
>>>>>>>>>>>>>> <https://travis-ci.org/apache/flink/builds/511071777>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> To verify that our speculation is accurate, I can help by running
>>>>>>>>>>>>>> the verification on my repo with both the 10 and 20 second timeout
>>>>>>>>>>>>>> configs, to see whether the timeout problem recurs 100% of the
>>>>>>>>>>>>>> time. The builds are already running; we are waiting for the results.
>>>>>>>>>>>>>> 10 seconds <https://travis-ci.org/sunjincheng121/flink/builds/511235749>
>>>>>>>>>>>>>> 20 seconds <https://travis-ci.org/sunjincheng121/flink/builds/511235598>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>> Jincheng
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Aljoscha Krettek <[hidden email]> wrote on Tue, Mar 26, 2019 at 1:04 AM:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks for the testing done so far!
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> There has been quite some flakiness on Travis lately, see
>>>>>>>> here:
>>>>>>>>>>>>>>> https://travis-ci.org/apache/flink/branches. I’m a bit
>>>>>>>> hesitant
>>>>>>>>>>> to
>>>>>>>>>>>>>>> release in this state. Looking at the tests you can see
>>>>>>>> that all of
>>>>>>>>>>>> the
>>>>>>>>>>>>>>> end-to-end tests fail because waiting for the dispatcher
>>>>>> to
>>>>>>>> come up
>>>>>>>>>>>>> times
>>>>>>>>>>>>>>> out. I also noticed that this usually takes about 5-8
>>>>>>>> seconds on
>>>>>>>>>>>>> Travis,
>>>>>>>>>>>>>> so
>>>>>>>>>>>>>>> a 10 second timeout might be a bit low. I pushed commits
>>>>>> to
>>>>>>>>>>> increase
>>>>>>>>>>>>> that
>>>>>>>>>>>>>>> to 20 secs. Let’s see what will happen.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I’ll keep you posted!
>>>>>>>>>>>>>>> Aljoscha
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 25. Mar 2019, at 13:13, jincheng sun <
>>>>>>>>>>> [hidden email]>
>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Great thanks for preparing the RC4 of Flink 1.8.0,
>>>>>>>> Aljoscha!
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> +1 (non-binding)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I checked the following functional items (without performance
>>>>>>>>>>>>>>>> verification):
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 1. Checking Artifacts:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 1). Download the release source code - SUCCESS
>>>>>>>>>>>>>>>> 2). Check Source release flink-1.8.0-src.tgz.sha512 -
>>>>>>>> SUCCESS
>>>>>>>>>>>>>>>> 3). Download the released JAR - SUCCESS
>>>>>>>>>>>>>>>> 4). Check if checksums and GPG files match the
>>>>>>>> corresponding
>>>>>>>>>>>>> release
>>>>>>>>>>>>>>>> files - SUCCESS.
>>>>>>>>>>>>>>>> 5). Verify that the source archives do not contain any
>>>>>>>>>>> binaries
>>>>>>>>>>>> -
>>>>>>>>>>>>>>>> SUCCESS.
>>>>>>>>>>>>>>>> 6). Build the source with `mvn clean verify -DskipTests`
>>>>>>>> to
>>>>>>>>>>>> ensure
>>>>>>>>>>>>>> all
>>>>>>>>>>>>>>>> source files have Apache headers - SUCCESS
>>>>>>>>>>>>>>>> 7). Check that all POM files point to the same version -
>>>>>>>>>>> SUCCESS
>>>>>>>>>>>>>>>> 8). Read the `README.md` file to ensure there is nothing
>>>>>>>>>>>>> unexpected
>>>>>>>>>>>>>> -
>>>>>>>>>>>>>>>> SUCCESS
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 2. Testing Larger Setups
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Cluster Environment:7 nodes, jm 1024m, tm 4096m
>>>>>>>>>>>>>>>> Testing Jobs: WordCount(Batch&Streaming),
>>>>>>>>>>>>>> DataStreamAllroundTestProgram
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 1). Use local&hdfs file systems for checkpoints -
>>>>>> SUCCESS
>>>>>>>>>>>>>>>> 2). Use hdfs file systems for input/output -SUCCESS
>>>>>>>>>>>>>>>> 3). Run examples on YARN(with or without session) -
>>>>>>>> SUCCESS
>>>>>>>>>>>>>>>> 4). Test failover and recovery. - SUCCESS
>>>>>>>>>>>>>>>> 5). Test incremental&non-incremental checkpoint -
>>>>>> SUCCESS
>>>>>>>>>>>>>>>> 6). Test connector - kafka -SUCCESS
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 3. Testing Functionality
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 1). Built-in tests(linux&mac os)
>>>>>>>>>>>>>>>> - `mvn clean verify` (some test timeout errors and a test case
>>>>>>>>>>>>>>>> bug, see FLINK-12001
>>>>>>>>>>>>>>>> <https://issues.apache.org/jira/browse/FLINK-12001>; none of
>>>>>>>>>>>>>>>> them are blockers)
>>>>>>>>>>>>>>>> - build for scala 2.11(mvn clean install -P scala-2.11
>>>>>>>>>>>>>> -DskipTests)
>>>>>>>>>>>>>>>> - SUCCESS
>>>>>>>>>>>>>>>> - Run the scripted nightly end-to-end test - SUCCESS
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 2). Quickstarts
>>>>>>>>>>>>>>>> - Verify that the quickstarts for Scala work with the staging
>>>>>>>>>>>>>>>> repository in IntelliJ - SUCCESS
>>>>>>>>>>>>>>>> - Verify that the quickstarts for Java work with the staging
>>>>>>>>>>>>>>>> repository in IntelliJ - SUCCESS
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 3). Simple Starter Experience and Use Cases
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> - run all examples from IntelliJ IDE - SUCCESS
>>>>>>>>>>>>>>>> - Start a local cluster and verify that the processes come up -
>>>>>>>>>>>>>>>> SUCCESS
>>>>>>>>>>>>>>>> a. Examine the *.out files (should be empty) and the log
>>>>>>>>>>>> files
>>>>>>>>>>>>>>>> (should contain no exceptions)
>>>>>>>>>>>>>>>> b. Test for Linux, MacOS
>>>>>>>>>>>>>>>> c. Shutdown and verify there are no exceptions in the
>>>>>> log
>>>>>>>>>>>>> output
>>>>>>>>>>>>>>>> (after shutdown)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> - Verify that the examples are running from both
>>>>>>>> ./bin/flink
>>>>>>>>>>>> and
>>>>>>>>>>>>>>> from
>>>>>>>>>>>>>>>> the web-based job submission tool(following items) -
>>>>>>>> SUCCESS
>>>>>>>>>>>>>>>> a. Start multiple task managers in the local cluster
>>>>>>>>>>>>>>>> b. Change the flink-conf.yaml to define more than one task slot
>>>>>>>>>>>>>>>> (2)
>>>>>>>>>>>>>>>> c. Run the examples with a parallelism > 1
>>>>>>>>>>>>>>>> d. Examine the log output - no error messages should be
>>>>>>>>>>>>>>> encountered
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 4. Review the PR
>>>>>>>>>>>>>>>> - [Add 1.8 Release Blog Post] - Just a reminder: update the
>>>>>>>>>>>>>>>> release date to the correct date before merging.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>> Jincheng
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Piotr Nowojski <[hidden email]> wrote on Mon, Mar 25, 2019 at 4:11 PM:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> +1 from my side. Previously spotted performance
>>>>>>>> regression seems
>>>>>>>>>>>> to
>>>>>>>>>>>>> be
>>>>>>>>>>>>>>>>> gone, or mostly gone.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Piotrek
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On 21 Mar 2019, at 17:52, Aljoscha Krettek <
>>>>>>>>>>> [hidden email]>
>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Hi everyone,
>>>>>>>>>>>>>>>>>> Please review and vote on the release candidate 4
>>>>>> for
>>>>>>>> Flink
>>>>>>>>>>>> 1.8.0,
>>>>>>>>>>>>> as
>>>>>>>>>>>>>>>>> follows:
>>>>>>>>>>>>>>>>>> [ ] +1, Approve the release
>>>>>>>>>>>>>>>>>> [ ] -1, Do not approve the release (please provide
>>>>>>>> specific
>>>>>>>>>>>>> comments)
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> The complete staging area is available for your
>>>>>>>> review, which
>>>>>>>>>>>>>> includes:
>>>>>>>>>>>>>>>>>> * JIRA release notes [1],
>>>>>>>>>>>>>>>>>> * the official Apache source release and binary
>>>>>>>> convenience
>>>>>>>>>>>>> releases
>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>> be deployed to dist.apache.org [2], which are signed
>>>>>>>> with the
>>>>>>>>>>> key
>>>>>>>>>>>>>> with
>>>>>>>>>>>>>>>>> fingerprint F2A67A8047499BBB3908D17AA8F4FD97121D7293
>>>>>>>> [3],
>>>>>>>>>>>>>>>>>> * all artifacts to be deployed to the Maven Central
>>>>>>>> Repository
>>>>>>>>>>>> [4],
>>>>>>>>>>>>>>>>>> * source code tag "release-1.8.0-rc4" [5],
>>>>>>>>>>>>>>>>>> * website pull request listing the new release [6]
>>>>>>>>>>>>>>>>>> * website pull request adding announcement blog post
>>>>>>>> [7].
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> The vote will be open for at least 72 hours. It is
>>>>>>>> adopted by
>>>>>>>>>>>>>> majority
>>>>>>>>>>>>>>>>> approval, with at least 3 PMC affirmative votes.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>> Aljoscha
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> [1] https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12344274
>>>>>>>>>>>>>>>>>> [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.8.0-rc4/
>>>>>>>>>>>>>>>>>> [3] https://dist.apache.org/repos/dist/release/flink/KEYS
>>>>>>>>>>>>>>>>>> [4] https://repository.apache.org/content/repositories/orgapacheflink-1215
>>>>>>>>>>>>>>>>>> [5] https://gitbox.apache.org/repos/asf?p=flink.git;a=tag;h=c650befc10c8bb6cc4b007ae250b7b2173046145
>>>>>>>>>>>>>>>>>> [6] https://github.com/apache/flink-web/pull/180
>>>>>>>>>>>>>>>>>> [7] https://github.com/apache/flink-web/pull/179
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> P.S. The difference to the previous RCs is small,
>>>>>> you
>>>>>>>> can fetch
>>>>>>>>>>>> the
>>>>>>>>>>>>>>> tags
>>>>>>>>>>>>>>>>> and do a "git log release-1.8.0-rc1..release-1.8.0-rc4" to see the
>>>>>>>>>>>>>>>>> difference in commits. It's fixes for the issues that led to the
>>>>>>>>>>>>>>>>> cancellation of the previous RCs plus smaller fixes.
>>>>>>>> Most
>>>>>>>>>>>>>>>>> verification/testing that was carried out should apply
>>>>>>>> as is to
>>>>>>>>>>>> this
>>>>>>>>>>>>>> RC.
>>>>>>>>>>>>>>>>> Any functional verification that you did on previous
>>>>>>>> RCs should
>>>>>>>>>>>>>>> therefore
>>>>>>>>>>>>>>>>> easily carry over to this one.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>
>>>>
>>>
>>
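
Several of the votes quoted above mention checking the release checksums (for example against flink-1.8.0-src.tgz.sha512). As a rough sketch only, and not the procedure any voter in this thread actually used, such a check could look like the following in Python. The function names are hypothetical, and the sketch assumes the .sha512 file begins with the hex digest followed by whitespace. GPG signature verification is a separate step and is not shown.

```python
import hashlib
from pathlib import Path


def sha512_of(path, chunk_size=1 << 20):
    """Return the hex SHA-512 digest of a file, read in chunks to bound memory."""
    digest = hashlib.sha512()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum_file(artifact, checksum_file):
    """Compare an artifact against a checksum file.

    Assumption: the first whitespace-separated token of the checksum
    file is the expected hex digest (the rest, e.g. a file name, is ignored).
    """
    expected = Path(checksum_file).read_text().split()[0].lower()
    return sha512_of(artifact) == expected


# Hypothetical usage, assuming both files were downloaded beforehand:
# matches_checksum_file("flink-1.8.0-src.tgz", "flink-1.8.0-src.tgz.sha512")
```

Reading the artifact in chunks keeps memory use flat even for large binary releases.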

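
The end-to-end test flakiness discussed above came down to a startup wait: the dispatcher usually took about 5-8 seconds to come up on Travis, so a 10 second timeout left little headroom and was raised to 20 seconds. The following is a minimal sketch of that kind of poll-with-deadline loop, not Flink's actual test script. The `wait_until` helper, its parameters, and the `dispatcher_is_up` name in the usage comment are all hypothetical.

```python
import time


def wait_until(check, timeout_s=20.0, interval_s=0.5,
               clock=time.monotonic, sleep=time.sleep):
    """Poll check() until it returns True or timeout_s elapses.

    Returns True if the check succeeded, False if the deadline passed.
    clock and sleep are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout_s
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


# Hypothetical usage: dispatcher_is_up would be a caller-supplied function
# that returns True once the cluster endpoint answers.
# if not wait_until(dispatcher_is_up, timeout_s=20):
#     raise RuntimeError("Dispatcher did not come up in time")
```

With a typical startup of 5-8 seconds, a 20 second deadline leaves a margin of more than 2x, which is the reasoning behind the increase described in the thread.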