[VOTE] Release 1.13.0, release candidate #2

classic Classic list List threaded Threaded
23 messages Options
12
Reply | Threaded
Open this post in threaded view
|

[VOTE] Release 1.13.0, release candidate #2

dwysakowicz
Hi everyone,
Please review and vote on the release candidate #2 for the version 1.13.0, as follows:
[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)
 
 
The complete staging area is available for your review, which includes:
* JIRA release notes [1],
* the official Apache source release and binary convenience releases to be deployed to dist.apache.org [2], which are signed with the key with fingerprint 31D2DD10BFC15A2D [3],
* all artifacts to be deployed to the Maven Central Repository [4],
* source code tag "release-1.13.0-rc2" [5],
* website pull request listing the new release and adding announcement blog post [6].
 
The vote will be open for at least 72 hours. It is adopted by majority approval, with at least 3 PMC affirmative votes.
 
Thanks,
Dawid Wysakowicz
 

OpenPGP_signature (855 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [VOTE] Release 1.13.0, release candidate #2

Rui Li
Hi Dawid,

I have just merged the doc change for hive dialect [1], which talks about
how to use hive dialect to run hive queries. Do you think we can have
another RC to include it? Sorry for the trouble.

[1] https://issues.apache.org/jira/browse/FLINK-22119

On Sat, Apr 24, 2021 at 1:36 AM Dawid Wysakowicz <[hidden email]>
wrote:

> Hi everyone,
> Please review and vote on the release candidate #2 for the version 1.13.0,
> as follows:
> [ ] +1, Approve the release
> [ ] -1, Do not approve the release (please provide specific comments)
>
>
> The complete staging area is available for your review, which includes:
> * JIRA release notes [1],
> * the official Apache source release and binary convenience releases to be
> deployed to dist.apache.org [2], which are signed with the key with
> fingerprint 31D2DD10BFC15A2D [3],
> * all artifacts to be deployed to the Maven Central Repository [4],
> * source code tag "release-1.13.0-rc2" [5],
> * website pull request listing the new release and adding announcement
> blog post [6].
>
> The vote will be open for at least 72 hours. It is adopted by majority
> approval, with at least 3 PMC affirmative votes.
>
> Thanks,
> Dawid Wysakowicz
>
> [1]
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12349287
> [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.13.0-rc2/
> [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> [4]
> https://repository.apache.org/content/repositories/orgapacheflink-1420/
> [5] https://github.com/apache/flink/tree/release-1.13.0-rc2
> [6] https://github.com/apache/flink-web/pull/436
>


--
Best regards!
Rui Li
Reply | Threaded
Open this post in threaded view
|

Re: [VOTE] Release 1.13.0, release candidate #2

Jark Wu-2
Hi Rui,

Documentation is not a part of the release package.
The changes to the documentation will take effect in 24 hours
 and can be accessed on the web [1] then.
Docs can even be updated after 1.13 is released.
So we don't need to cancel the RC for this.

Best,
Jark

[1]:
https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/connectors/table/hive/overview/

On Sun, 25 Apr 2021 at 13:47, Rui Li <[hidden email]> wrote:

> Hi Dawid,
>
> I have just merged the doc change for hive dialect [1], which talks about
> how to use hive dialect to run hive queries. Do you think we can have
> another RC to include it? Sorry for the trouble.
>
> [1] https://issues.apache.org/jira/browse/FLINK-22119
>
> On Sat, Apr 24, 2021 at 1:36 AM Dawid Wysakowicz <[hidden email]>
> wrote:
>
> > Hi everyone,
> > Please review and vote on the release candidate #2 for the version
> 1.13.0,
> > as follows:
> > [ ] +1, Approve the release
> > [ ] -1, Do not approve the release (please provide specific comments)
> >
> >
> > The complete staging area is available for your review, which includes:
> > * JIRA release notes [1],
> > * the official Apache source release and binary convenience releases to
> be
> > deployed to dist.apache.org [2], which are signed with the key with
> > fingerprint 31D2DD10BFC15A2D [3],
> > * all artifacts to be deployed to the Maven Central Repository [4],
> > * source code tag "release-1.13.0-rc2" [5],
> > * website pull request listing the new release and adding announcement
> > blog post [6].
> >
> > The vote will be open for at least 72 hours. It is adopted by majority
> > approval, with at least 3 PMC affirmative votes.
> >
> > Thanks,
> > Dawid Wysakowicz
> >
> > [1]
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12349287
> > [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.13.0-rc2/
> > [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> > [4]
> > https://repository.apache.org/content/repositories/orgapacheflink-1420/
> > [5] https://github.com/apache/flink/tree/release-1.13.0-rc2
> > [6] https://github.com/apache/flink-web/pull/436
> >
>
>
> --
> Best regards!
> Rui Li
>
Reply | Threaded
Open this post in threaded view
|

Re: [VOTE] Release 1.13.0, release candidate #2

Rui Li
I see. Thanks Jark for the clarification.

On Sun, Apr 25, 2021 at 10:36 PM Jark Wu <[hidden email]> wrote:

> Hi Rui,
>
> Documentation is not a part of the release package.
> The changes to the documentation will take effect in 24 hours
>  and can be accessed on the web [1] then.
> Docs can even be updated after 1.13 is released.
> So we don't need to cancel the RC for this.
>
> Best,
> Jark
>
> [1]:
>
> https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/connectors/table/hive/overview/
>
> On Sun, 25 Apr 2021 at 13:47, Rui Li <[hidden email]> wrote:
>
> > Hi Dawid,
> >
> > I have just merged the doc change for hive dialect [1], which talks about
> > how to use hive dialect to run hive queries. Do you think we can have
> > another RC to include it? Sorry for the trouble.
> >
> > [1] https://issues.apache.org/jira/browse/FLINK-22119
> >
> > On Sat, Apr 24, 2021 at 1:36 AM Dawid Wysakowicz <[hidden email]
> >
> > wrote:
> >
> > > Hi everyone,
> > > Please review and vote on the release candidate #2 for the version
> > 1.13.0,
> > > as follows:
> > > [ ] +1, Approve the release
> > > [ ] -1, Do not approve the release (please provide specific comments)
> > >
> > >
> > > The complete staging area is available for your review, which includes:
> > > * JIRA release notes [1],
> > > * the official Apache source release and binary convenience releases to
> > be
> > > deployed to dist.apache.org [2], which are signed with the key with
> > > fingerprint 31D2DD10BFC15A2D [3],
> > > * all artifacts to be deployed to the Maven Central Repository [4],
> > > * source code tag "release-1.13.0-rc2" [5],
> > > * website pull request listing the new release and adding announcement
> > > blog post [6].
> > >
> > > The vote will be open for at least 72 hours. It is adopted by majority
> > > approval, with at least 3 PMC affirmative votes.
> > >
> > > Thanks,
> > > Dawid Wysakowicz
> > >
> > > [1]
> > >
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12349287
> > > [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.13.0-rc2/
> > > [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> > > [4]
> > >
> https://repository.apache.org/content/repositories/orgapacheflink-1420/
> > > [5] https://github.com/apache/flink/tree/release-1.13.0-rc2
> > > [6] https://github.com/apache/flink-web/pull/436
> > >
> >
> >
> > --
> > Best regards!
> > Rui Li
> >
>


--
Best regards!
Rui Li
Reply | Threaded
Open this post in threaded view
|

Re: [VOTE] Release 1.13.0, release candidate #2

Piotr Nowojski-5
+1 (binding)

I'm not aware of any release blockers. Me and my colleagues have checked
quite extensively this release for correctness of the unaligned
checkpoints, finally nailing down the two remaining known bugs.
I have manually checked the WebUI with it's new back-pressure
monitoring tool. One bug was discovered in this area recently, which has
already pending PR, but as it's not critical and very old, I wouldn't
cancel the RC vote because of that.

Piotrek

pon., 26 kwi 2021 o 07:53 Rui Li <[hidden email]> napisał(a):

> I see. Thanks Jark for the clarification.
>
> On Sun, Apr 25, 2021 at 10:36 PM Jark Wu <[hidden email]> wrote:
>
> > Hi Rui,
> >
> > Documentation is not a part of the release package.
> > The changes to the documentation will take effect in 24 hours
> >  and can be accessed on the web [1] then.
> > Docs can even be updated after 1.13 is released.
> > So we don't need to cancel the RC for this.
> >
> > Best,
> > Jark
> >
> > [1]:
> >
> >
> https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/connectors/table/hive/overview/
> >
> > On Sun, 25 Apr 2021 at 13:47, Rui Li <[hidden email]> wrote:
> >
> > > Hi Dawid,
> > >
> > > I have just merged the doc change for hive dialect [1], which talks
> about
> > > how to use hive dialect to run hive queries. Do you think we can have
> > > another RC to include it? Sorry for the trouble.
> > >
> > > [1] https://issues.apache.org/jira/browse/FLINK-22119
> > >
> > > On Sat, Apr 24, 2021 at 1:36 AM Dawid Wysakowicz <
> [hidden email]
> > >
> > > wrote:
> > >
> > > > Hi everyone,
> > > > Please review and vote on the release candidate #2 for the version
> > > 1.13.0,
> > > > as follows:
> > > > [ ] +1, Approve the release
> > > > [ ] -1, Do not approve the release (please provide specific comments)
> > > >
> > > >
> > > > The complete staging area is available for your review, which
> includes:
> > > > * JIRA release notes [1],
> > > > * the official Apache source release and binary convenience releases
> to
> > > be
> > > > deployed to dist.apache.org [2], which are signed with the key with
> > > > fingerprint 31D2DD10BFC15A2D [3],
> > > > * all artifacts to be deployed to the Maven Central Repository [4],
> > > > * source code tag "release-1.13.0-rc2" [5],
> > > > * website pull request listing the new release and adding
> announcement
> > > > blog post [6].
> > > >
> > > > The vote will be open for at least 72 hours. It is adopted by
> majority
> > > > approval, with at least 3 PMC affirmative votes.
> > > >
> > > > Thanks,
> > > > Dawid Wysakowicz
> > > >
> > > > [1]
> > > >
> > >
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12349287
> > > > [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.13.0-rc2/
> > > > [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> > > > [4]
> > > >
> > https://repository.apache.org/content/repositories/orgapacheflink-1420/
> > > > [5] https://github.com/apache/flink/tree/release-1.13.0-rc2
> > > > [6] https://github.com/apache/flink-web/pull/436
> > > >
> > >
> > >
> > > --
> > > Best regards!
> > > Rui Li
> > >
> >
>
>
> --
> Best regards!
> Rui Li
>
Reply | Threaded
Open this post in threaded view
|

Re: [VOTE] Release 1.13.0, release candidate #2

Xintong Song
+1 (non-binding)

- verified checksum and signature
- built from source
- executed example jobs with standalone / native kubernetes deployments,
nothing unexpected
- reviewed release announcement pull request

Thank you~

Xintong Song



On Tue, Apr 27, 2021 at 10:50 PM Piotr Nowojski <[hidden email]>
wrote:

> +1 (binding)
>
> I'm not aware of any release blockers. Me and my colleagues have checked
> quite extensively this release for correctness of the unaligned
> checkpoints, finally nailing down the two remaining known bugs.
> I have manually checked the WebUI with it's new back-pressure
> monitoring tool. One bug was discovered in this area recently, which has
> already pending PR, but as it's not critical and very old, I wouldn't
> cancel the RC vote because of that.
>
> Piotrek
>
> pon., 26 kwi 2021 o 07:53 Rui Li <[hidden email]> napisał(a):
>
> > I see. Thanks Jark for the clarification.
> >
> > On Sun, Apr 25, 2021 at 10:36 PM Jark Wu <[hidden email]> wrote:
> >
> > > Hi Rui,
> > >
> > > Documentation is not a part of the release package.
> > > The changes to the documentation will take effect in 24 hours
> > >  and can be accessed on the web [1] then.
> > > Docs can even be updated after 1.13 is released.
> > > So we don't need to cancel the RC for this.
> > >
> > > Best,
> > > Jark
> > >
> > > [1]:
> > >
> > >
> >
> https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/connectors/table/hive/overview/
> > >
> > > On Sun, 25 Apr 2021 at 13:47, Rui Li <[hidden email]> wrote:
> > >
> > > > Hi Dawid,
> > > >
> > > > I have just merged the doc change for hive dialect [1], which talks
> > about
> > > > how to use hive dialect to run hive queries. Do you think we can have
> > > > another RC to include it? Sorry for the trouble.
> > > >
> > > > [1] https://issues.apache.org/jira/browse/FLINK-22119
> > > >
> > > > On Sat, Apr 24, 2021 at 1:36 AM Dawid Wysakowicz <
> > [hidden email]
> > > >
> > > > wrote:
> > > >
> > > > > Hi everyone,
> > > > > Please review and vote on the release candidate #2 for the version
> > > > 1.13.0,
> > > > > as follows:
> > > > > [ ] +1, Approve the release
> > > > > [ ] -1, Do not approve the release (please provide specific
> comments)
> > > > >
> > > > >
> > > > > The complete staging area is available for your review, which
> > includes:
> > > > > * JIRA release notes [1],
> > > > > * the official Apache source release and binary convenience
> releases
> > to
> > > > be
> > > > > deployed to dist.apache.org [2], which are signed with the key
> with
> > > > > fingerprint 31D2DD10BFC15A2D [3],
> > > > > * all artifacts to be deployed to the Maven Central Repository [4],
> > > > > * source code tag "release-1.13.0-rc2" [5],
> > > > > * website pull request listing the new release and adding
> > announcement
> > > > > blog post [6].
> > > > >
> > > > > The vote will be open for at least 72 hours. It is adopted by
> > majority
> > > > > approval, with at least 3 PMC affirmative votes.
> > > > >
> > > > > Thanks,
> > > > > Dawid Wysakowicz
> > > > >
> > > > > [1]
> > > > >
> > > >
> > >
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12349287
> > > > > [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.13.0-rc2/
> > > > > [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> > > > > [4]
> > > > >
> > >
> https://repository.apache.org/content/repositories/orgapacheflink-1420/
> > > > > [5] https://github.com/apache/flink/tree/release-1.13.0-rc2
> > > > > [6] https://github.com/apache/flink-web/pull/436
> > > > >
> > > >
> > > >
> > > > --
> > > > Best regards!
> > > > Rui Li
> > > >
> > >
> >
> >
> > --
> > Best regards!
> > Rui Li
> >
>
Reply | Threaded
Open this post in threaded view
|

Re: [VOTE] Release 1.13.0, release candidate #2

Caizhi Weng
-1

We're testing this version on batch jobs with large (600~1000) parallelisms
and the following exception messages appear with high frequency:

2021-04-27 21:27:26
org.apache.flink.util.FlinkException: An OperatorEvent from an
OperatorCoordinator to a task was lost. Triggering task failover to ensure
consistency. Event: '[NoMoreSplitEvent]', targetTask: <task name> -
execution #0
at
org.apache.flink.runtime.operators.coordination.SubtaskGatewayImpl.lambda$sendEvent$0(SubtaskGatewayImpl.java:81)
at
java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:822)
at
java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:797)
at
java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442)
at
org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRunAsync(AkkaRpcActor.java:440)
at
org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:208)
at
org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:77)
at
org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:158)
at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26)
at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21)
at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)
at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21)
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170)
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
at akka.actor.Actor$class.aroundReceive(Actor.scala:517)
at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592)
at akka.actor.ActorCell.invoke(ActorCell.scala:561)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258)
at akka.dispatch.Mailbox.run(Mailbox.scala:225)
at akka.dispatch.Mailbox.exec(Mailbox.scala:235)
at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at
akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at
akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

Becket Qin is investigating this issue.
Reply | Threaded
Open this post in threaded view
|

Re: [VOTE] Release 1.13.0, release candidate #2

Stephan Ewen
@Caizhi and @Becket - let me reach out to you to jointly debug this issue.

I am wondering if there is some incorrect reporting of failed events?

On Wed, Apr 28, 2021 at 8:53 AM Caizhi Weng <[hidden email]> wrote:

> -1
>
> We're testing this version on batch jobs with large (600~1000) parallelisms
> and the following exception messages appear with high frequency:
>
> 2021-04-27 21:27:26
> org.apache.flink.util.FlinkException: An OperatorEvent from an
> OperatorCoordinator to a task was lost. Triggering task failover to ensure
> consistency. Event: '[NoMoreSplitEvent]', targetTask: <task name> -
> execution #0
> at
>
> org.apache.flink.runtime.operators.coordination.SubtaskGatewayImpl.lambda$sendEvent$0(SubtaskGatewayImpl.java:81)
> at
>
> java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:822)
> at
>
> java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:797)
> at
>
> java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442)
> at
>
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRunAsync(AkkaRpcActor.java:440)
> at
>
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:208)
> at
>
> org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:77)
> at
>
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:158)
> at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26)
> at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21)
> at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)
> at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21)
> at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170)
> at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
> at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
> at akka.actor.Actor$class.aroundReceive(Actor.scala:517)
> at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225)
> at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592)
> at akka.actor.ActorCell.invoke(ActorCell.scala:561)
> at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258)
> at akka.dispatch.Mailbox.run(Mailbox.scala:225)
> at akka.dispatch.Mailbox.exec(Mailbox.scala:235)
> at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> at
>
> akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> at
>
> akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
>
> Becket Qin is investigating this issue.
>
Reply | Threaded
Open this post in threaded view
|

Re: [VOTE] Release 1.13.0, release candidate #2

Caizhi Weng
After the investigation we found that this issue is caused by the
implementation of connector, not by the Flink framework.

Sorry for the false alarm.

Stephan Ewen <[hidden email]> 于2021年4月28日周三 下午3:23写道:

> @Caizhi and @Becket - let me reach out to you to jointly debug this issue.
>
> I am wondering if there is some incorrect reporting of failed events?
>
> On Wed, Apr 28, 2021 at 8:53 AM Caizhi Weng <[hidden email]> wrote:
>
> > -1
> >
> > We're testing this version on batch jobs with large (600~1000)
> parallelisms
> > and the following exception messages appear with high frequency:
> >
> > 2021-04-27 21:27:26
> > org.apache.flink.util.FlinkException: An OperatorEvent from an
> > OperatorCoordinator to a task was lost. Triggering task failover to
> ensure
> > consistency. Event: '[NoMoreSplitEvent]', targetTask: <task name> -
> > execution #0
> > at
> >
> >
> org.apache.flink.runtime.operators.coordination.SubtaskGatewayImpl.lambda$sendEvent$0(SubtaskGatewayImpl.java:81)
> > at
> >
> >
> java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:822)
> > at
> >
> >
> java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:797)
> > at
> >
> >
> java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442)
> > at
> >
> >
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRunAsync(AkkaRpcActor.java:440)
> > at
> >
> >
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:208)
> > at
> >
> >
> org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:77)
> > at
> >
> >
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:158)
> > at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26)
> > at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21)
> > at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)
> > at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21)
> > at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170)
> > at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
> > at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
> > at akka.actor.Actor$class.aroundReceive(Actor.scala:517)
> > at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225)
> > at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592)
> > at akka.actor.ActorCell.invoke(ActorCell.scala:561)
> > at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258)
> > at akka.dispatch.Mailbox.run(Mailbox.scala:225)
> > at akka.dispatch.Mailbox.exec(Mailbox.scala:235)
> > at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> > at
> >
> >
> akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> > at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> > at
> >
> >
> akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> >
> > Becket Qin is investigating this issue.
> >
>
Reply | Threaded
Open this post in threaded view
|

Re: [VOTE] Release 1.13.0, release candidate #2

Stephan Ewen
Glad to hear that outcome. And no worries about the false alarm.
Thank you for doing thorough testing, this is very helpful!

On Wed, Apr 28, 2021 at 1:04 PM Caizhi Weng <[hidden email]> wrote:

> After the investigation we found that this issue is caused by the
> implementation of connector, not by the Flink framework.
>
> Sorry for the false alarm.
>
> Stephan Ewen <[hidden email]> 于2021年4月28日周三 下午3:23写道:
>
> > @Caizhi and @Becket - let me reach out to you to jointly debug this
> issue.
> >
> > I am wondering if there is some incorrect reporting of failed events?
> >
> > On Wed, Apr 28, 2021 at 8:53 AM Caizhi Weng <[hidden email]>
> wrote:
> >
> > > -1
> > >
> > > We're testing this version on batch jobs with large (600~1000)
> > parallelisms
> > > and the following exception messages appear with high frequency:
> > >
> > > 2021-04-27 21:27:26
> > > org.apache.flink.util.FlinkException: An OperatorEvent from an
> > > OperatorCoordinator to a task was lost. Triggering task failover to
> > ensure
> > > consistency. Event: '[NoMoreSplitEvent]', targetTask: <task name> -
> > > execution #0
> > > at
> > >
> > >
> >
> org.apache.flink.runtime.operators.coordination.SubtaskGatewayImpl.lambda$sendEvent$0(SubtaskGatewayImpl.java:81)
> > > at
> > >
> > >
> >
> java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:822)
> > > at
> > >
> > >
> >
> java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:797)
> > > at
> > >
> > >
> >
> java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442)
> > > at
> > >
> > >
> >
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRunAsync(AkkaRpcActor.java:440)
> > > at
> > >
> > >
> >
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:208)
> > > at
> > >
> > >
> >
> org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:77)
> > > at
> > >
> > >
> >
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:158)
> > > at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26)
> > > at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21)
> > > at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)
> > > at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21)
> > > at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170)
> > > at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
> > > at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
> > > at akka.actor.Actor$class.aroundReceive(Actor.scala:517)
> > > at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225)
> > > at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592)
> > > at akka.actor.ActorCell.invoke(ActorCell.scala:561)
> > > at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258)
> > > at akka.dispatch.Mailbox.run(Mailbox.scala:225)
> > > at akka.dispatch.Mailbox.exec(Mailbox.scala:235)
> > > at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> > > at
> > >
> > >
> >
> akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> > > at
> akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> > > at
> > >
> > >
> >
> akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> > >
> > > Becket Qin is investigating this issue.
> > >
> >
>
Reply | Threaded
Open this post in threaded view
|

Re: [VOTE] Release 1.13.0, release candidate #2

Matthias
Thanks Dawid and Guowei for managing this release.

- downloaded the sources and binaries and checked the checksums
- built Flink from the downloaded sources
- executed example jobs with standalone deployments - I didn't find
anything suspicious in the logs
- reviewed release announcement pull request

- I did a pass over dependency updates: git diff release-1.12.2
release-1.13.0-rc2 */*.xml
There's one thing someone should double-check whether that's suppose to be
like that: We added org.conscrypt:conscrypt-openjdk-uber:2.5.1 as a
dependency but I don't see it being reflected in the NOTICE file of the
flink-python module. Or is this automatically added later on?

+1 (non-binding; please see remark on dependency above)

Matthias

On Wed, Apr 28, 2021 at 1:52 PM Stephan Ewen <[hidden email]> wrote:

> Glad to hear that outcome. And no worries about the false alarm.
> Thank you for doing thorough testing, this is very helpful!
>
> On Wed, Apr 28, 2021 at 1:04 PM Caizhi Weng <[hidden email]> wrote:
>
> > After the investigation we found that this issue is caused by the
> > implementation of connector, not by the Flink framework.
> >
> > Sorry for the false alarm.
> >
> > Stephan Ewen <[hidden email]> 于2021年4月28日周三 下午3:23写道:
> >
> > > @Caizhi and @Becket - let me reach out to you to jointly debug this
> > issue.
> > >
> > > I am wondering if there is some incorrect reporting of failed events?
> > >
> > > On Wed, Apr 28, 2021 at 8:53 AM Caizhi Weng <[hidden email]>
> > wrote:
> > >
> > > > -1
> > > >
> > > > We're testing this version on batch jobs with large (600~1000)
> > > parallelisms
> > > > and the following exception messages appear with high frequency:
> > > >
> > > > 2021-04-27 21:27:26
> > > > org.apache.flink.util.FlinkException: An OperatorEvent from an
> > > > OperatorCoordinator to a task was lost. Triggering task failover to
> > > ensure
> > > > consistency. Event: '[NoMoreSplitEvent]', targetTask: <task name> -
> > > > execution #0
> > > > at
> > > >
> > > >
> > >
> >
> org.apache.flink.runtime.operators.coordination.SubtaskGatewayImpl.lambda$sendEvent$0(SubtaskGatewayImpl.java:81)
> > > > at
> > > >
> > > >
> > >
> >
> java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:822)
> > > > at
> > > >
> > > >
> > >
> >
> java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:797)
> > > > at
> > > >
> > > >
> > >
> >
> java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442)
> > > > at
> > > >
> > > >
> > >
> >
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRunAsync(AkkaRpcActor.java:440)
> > > > at
> > > >
> > > >
> > >
> >
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:208)
> > > > at
> > > >
> > > >
> > >
> >
> org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:77)
> > > > at
> > > >
> > > >
> > >
> >
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:158)
> > > > at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26)
> > > > at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21)
> > > > at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)
> > > > at akka.japi.pf
> .UnitCaseStatement.applyOrElse(CaseStatements.scala:21)
> > > > at
> scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170)
> > > > at
> scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
> > > > at
> scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
> > > > at akka.actor.Actor$class.aroundReceive(Actor.scala:517)
> > > > at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225)
> > > > at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592)
> > > > at akka.actor.ActorCell.invoke(ActorCell.scala:561)
> > > > at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258)
> > > > at akka.dispatch.Mailbox.run(Mailbox.scala:225)
> > > > at akka.dispatch.Mailbox.exec(Mailbox.scala:235)
> > > > at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> > > > at
> > > >
> > > >
> > >
> >
> akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> > > > at
> > akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> > > > at
> > > >
> > > >
> > >
> >
> akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> > > >
> > > > Becket Qin is investigating this issue.
> > > >
> > >
> >
Reply | Threaded
Open this post in threaded view
|

Re: [VOTE] Release 1.13.0, release candidate #2

Guowei Ma
Hi, Matthias

Thank you very much for your careful inspection.
I check the flink-python_2.11-1.13.0.jar and we do not bundle
org.conscrypt:conscrypt-openjdk-uber:2.5.1 to it.
So I think we may not need to add this to the NOTICE file. (BTW The jar's
scope is runtime)

Best,
Guowei


On Thu, Apr 29, 2021 at 2:33 AM Matthias Pohl <[hidden email]>
wrote:

> Thanks Dawid and Guowei for managing this release.
>
> - downloaded the sources and binaries and checked the checksums
> - built Flink from the downloaded sources
> - executed example jobs with standalone deployments - I didn't find
> anything suspicious in the logs
> - reviewed release announcement pull request
>
> - I did a pass over dependency updates: git diff release-1.12.2
> release-1.13.0-rc2 */*.xml
> There's one thing someone should double-check whether that's suppose to be
> like that: We added org.conscrypt:conscrypt-openjdk-uber:2.5.1 as a
> dependency but I don't see it being reflected in the NOTICE file of the
> flink-python module. Or is this automatically added later on?
>
> +1 (non-binding; please see remark on dependency above)
>
> Matthias
>
> On Wed, Apr 28, 2021 at 1:52 PM Stephan Ewen <[hidden email]> wrote:
>
> > Glad to hear that outcome. And no worries about the false alarm.
> > Thank you for doing thorough testing, this is very helpful!
> >
> > On Wed, Apr 28, 2021 at 1:04 PM Caizhi Weng <[hidden email]>
> wrote:
> >
> > > After the investigation we found that this issue is caused by the
> > > implementation of connector, not by the Flink framework.
> > >
> > > Sorry for the false alarm.
> > >
> > > Stephan Ewen <[hidden email]> 于2021年4月28日周三 下午3:23写道:
> > >
> > > > @Caizhi and @Becket - let me reach out to you to jointly debug this
> > > issue.
> > > >
> > > > I am wondering if there is some incorrect reporting of failed events?
> > > >
> > > > On Wed, Apr 28, 2021 at 8:53 AM Caizhi Weng <[hidden email]>
> > > wrote:
> > > >
> > > > > -1
> > > > >
> > > > > We're testing this version on batch jobs with large (600~1000)
> > > > parallelisms
> > > > > and the following exception messages appear with high frequency:
> > > > >
> > > > > 2021-04-27 21:27:26
> > > > > org.apache.flink.util.FlinkException: An OperatorEvent from an
> > > > > OperatorCoordinator to a task was lost. Triggering task failover to
> > > > ensure
> > > > > consistency. Event: '[NoMoreSplitEvent]', targetTask: <task name> -
> > > > > execution #0
> > > > > at
> > > > >
> > > > >
> > > >
> > >
> >
> org.apache.flink.runtime.operators.coordination.SubtaskGatewayImpl.lambda$sendEvent$0(SubtaskGatewayImpl.java:81)
> > > > > at
> > > > >
> > > > >
> > > >
> > >
> >
> java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:822)
> > > > > at
> > > > >
> > > > >
> > > >
> > >
> >
> java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:797)
> > > > > at
> > > > >
> > > > >
> > > >
> > >
> >
> java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442)
> > > > > at
> > > > >
> > > > >
> > > >
> > >
> >
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRunAsync(AkkaRpcActor.java:440)
> > > > > at
> > > > >
> > > > >
> > > >
> > >
> >
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:208)
> > > > > at
> > > > >
> > > > >
> > > >
> > >
> >
> org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:77)
> > > > > at
> > > > >
> > > > >
> > > >
> > >
> >
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:158)
> > > > > at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26)
> > > > > at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21)
> > > > > at
> scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)
> > > > > at akka.japi.pf
> > .UnitCaseStatement.applyOrElse(CaseStatements.scala:21)
> > > > > at
> > scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170)
> > > > > at
> > scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
> > > > > at
> > scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
> > > > > at akka.actor.Actor$class.aroundReceive(Actor.scala:517)
> > > > > at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225)
> > > > > at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592)
> > > > > at akka.actor.ActorCell.invoke(ActorCell.scala:561)
> > > > > at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258)
> > > > > at akka.dispatch.Mailbox.run(Mailbox.scala:225)
> > > > > at akka.dispatch.Mailbox.exec(Mailbox.scala:235)
> > > > > at
> akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> > > > > at
> > > > >
> > > > >
> > > >
> > >
> >
> akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> > > > > at
> > > akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> > > > > at
> > > > >
> > > > >
> > > >
> > >
> >
> akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> > > > >
> > > > > Becket Qin is investigating this issue.
> > > > >
> > > >
> > >
>
Reply | Threaded
Open this post in threaded view
|

Re: [VOTE] Release 1.13.0, release candidate #2

Zhu Zhu
In reply to this post by Matthias
+1 (binding)
 - checked signatures and checksums
 - built from source
 - ran example jobs with parallelism=8000 on a YARN cluster. checked output
and logs.
 - the website PR looks good

Thanks,
Zhu

Matthias Pohl <[hidden email]> 于2021年4月29日周四 上午2:34写道:

> Thanks Dawid and Guowei for managing this release.
>
> - downloaded the sources and binaries and checked the checksums
> - built Flink from the downloaded sources
> - executed example jobs with standalone deployments - I didn't find
> anything suspicious in the logs
> - reviewed release announcement pull request
>
> - I did a pass over dependency updates: git diff release-1.12.2
> release-1.13.0-rc2 */*.xml
> There's one thing someone should double-check whether that's suppose to be
> like that: We added org.conscrypt:conscrypt-openjdk-uber:2.5.1 as a
> dependency but I don't see it being reflected in the NOTICE file of the
> flink-python module. Or is this automatically added later on?
>
> +1 (non-binding; please see remark on dependency above)
>
> Matthias
>
> On Wed, Apr 28, 2021 at 1:52 PM Stephan Ewen <[hidden email]> wrote:
>
> > Glad to hear that outcome. And no worries about the false alarm.
> > Thank you for doing thorough testing, this is very helpful!
> >
> > On Wed, Apr 28, 2021 at 1:04 PM Caizhi Weng <[hidden email]>
> wrote:
> >
> > > After the investigation we found that this issue is caused by the
> > > implementation of connector, not by the Flink framework.
> > >
> > > Sorry for the false alarm.
> > >
> > > Stephan Ewen <[hidden email]> 于2021年4月28日周三 下午3:23写道:
> > >
> > > > @Caizhi and @Becket - let me reach out to you to jointly debug this
> > > issue.
> > > >
> > > > I am wondering if there is some incorrect reporting of failed events?
> > > >
> > > > On Wed, Apr 28, 2021 at 8:53 AM Caizhi Weng <[hidden email]>
> > > wrote:
> > > >
> > > > > -1
> > > > >
> > > > > We're testing this version on batch jobs with large (600~1000)
> > > > parallelisms
> > > > > and the following exception messages appear with high frequency:
> > > > >
> > > > > 2021-04-27 21:27:26
> > > > > org.apache.flink.util.FlinkException: An OperatorEvent from an
> > > > > OperatorCoordinator to a task was lost. Triggering task failover to
> > > > ensure
> > > > > consistency. Event: '[NoMoreSplitEvent]', targetTask: <task name> -
> > > > > execution #0
> > > > > at
> > > > >
> > > > >
> > > >
> > >
> >
> org.apache.flink.runtime.operators.coordination.SubtaskGatewayImpl.lambda$sendEvent$0(SubtaskGatewayImpl.java:81)
> > > > > at
> > > > >
> > > > >
> > > >
> > >
> >
> java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:822)
> > > > > at
> > > > >
> > > > >
> > > >
> > >
> >
> java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:797)
> > > > > at
> > > > >
> > > > >
> > > >
> > >
> >
> java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442)
> > > > > at
> > > > >
> > > > >
> > > >
> > >
> >
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRunAsync(AkkaRpcActor.java:440)
> > > > > at
> > > > >
> > > > >
> > > >
> > >
> >
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:208)
> > > > > at
> > > > >
> > > > >
> > > >
> > >
> >
> org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:77)
> > > > > at
> > > > >
> > > > >
> > > >
> > >
> >
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:158)
> > > > > at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26)
> > > > > at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21)
> > > > > at
> scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)
> > > > > at akka.japi.pf
> > .UnitCaseStatement.applyOrElse(CaseStatements.scala:21)
> > > > > at
> > scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170)
> > > > > at
> > scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
> > > > > at
> > scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
> > > > > at akka.actor.Actor$class.aroundReceive(Actor.scala:517)
> > > > > at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225)
> > > > > at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592)
> > > > > at akka.actor.ActorCell.invoke(ActorCell.scala:561)
> > > > > at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258)
> > > > > at akka.dispatch.Mailbox.run(Mailbox.scala:225)
> > > > > at akka.dispatch.Mailbox.exec(Mailbox.scala:235)
> > > > > at
> akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> > > > > at
> > > > >
> > > > >
> > > >
> > >
> >
> akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> > > > > at
> > > akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> > > > > at
> > > > >
> > > > >
> > > >
> > >
> >
> akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> > > > >
> > > > > Becket Qin is investigating this issue.
> > > > >
> > > >
> > >
>
Reply | Threaded
Open this post in threaded view
|

Re: [VOTE] Release 1.13.0, release candidate #2

dwysakowicz
In reply to this post by Guowei Ma
Hey Matthias,

I'd like to double confirm what Guowei said. The dependency is Apache 2
licensed and we do not bundle it in our jar (as it is in the runtime
scope) thus we do not need to mention it in the NOTICE file (btw, the
best way to check what is bundled is to check the output of maven shade
plugin). Thanks for checking it!

Best,

Dawid

On 29/04/2021 05:25, Guowei Ma wrote:

> Hi, Matthias
>
> Thank you very much for your careful inspection.
> I check the flink-python_2.11-1.13.0.jar and we do not bundle
> org.conscrypt:conscrypt-openjdk-uber:2.5.1 to it.
> So I think we may not need to add this to the NOTICE file. (BTW The jar's
> scope is runtime)
>
> Best,
> Guowei
>
>
> On Thu, Apr 29, 2021 at 2:33 AM Matthias Pohl <[hidden email]>
> wrote:
>
>> Thanks Dawid and Guowei for managing this release.
>>
>> - downloaded the sources and binaries and checked the checksums
>> - built Flink from the downloaded sources
>> - executed example jobs with standalone deployments - I didn't find
>> anything suspicious in the logs
>> - reviewed release announcement pull request
>>
>> - I did a pass over dependency updates: git diff release-1.12.2
>> release-1.13.0-rc2 */*.xml
>> There's one thing someone should double-check whether that's suppose to be
>> like that: We added org.conscrypt:conscrypt-openjdk-uber:2.5.1 as a
>> dependency but I don't see it being reflected in the NOTICE file of the
>> flink-python module. Or is this automatically added later on?
>>
>> +1 (non-binding; please see remark on dependency above)
>>
>> Matthias
>>
>> On Wed, Apr 28, 2021 at 1:52 PM Stephan Ewen <[hidden email]> wrote:
>>
>>> Glad to hear that outcome. And no worries about the false alarm.
>>> Thank you for doing thorough testing, this is very helpful!
>>>
>>> On Wed, Apr 28, 2021 at 1:04 PM Caizhi Weng <[hidden email]>
>> wrote:
>>>> After the investigation we found that this issue is caused by the
>>>> implementation of connector, not by the Flink framework.
>>>>
>>>> Sorry for the false alarm.
>>>>
>>>> Stephan Ewen <[hidden email]> 于2021年4月28日周三 下午3:23写道:
>>>>
>>>>> @Caizhi and @Becket - let me reach out to you to jointly debug this
>>>> issue.
>>>>> I am wondering if there is some incorrect reporting of failed events?
>>>>>
>>>>> On Wed, Apr 28, 2021 at 8:53 AM Caizhi Weng <[hidden email]>
>>>> wrote:
>>>>>> -1
>>>>>>
>>>>>> We're testing this version on batch jobs with large (600~1000)
>>>>> parallelisms
>>>>>> and the following exception messages appear with high frequency:
>>>>>>
>>>>>> 2021-04-27 21:27:26
>>>>>> org.apache.flink.util.FlinkException: An OperatorEvent from an
>>>>>> OperatorCoordinator to a task was lost. Triggering task failover to
>>>>> ensure
>>>>>> consistency. Event: '[NoMoreSplitEvent]', targetTask: <task name> -
>>>>>> execution #0
>>>>>> at
>>>>>>
>>>>>>
>> org.apache.flink.runtime.operators.coordination.SubtaskGatewayImpl.lambda$sendEvent$0(SubtaskGatewayImpl.java:81)
>>>>>> at
>>>>>>
>>>>>>
>> java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:822)
>>>>>> at
>>>>>>
>>>>>>
>> java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:797)
>>>>>> at
>>>>>>
>>>>>>
>> java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442)
>>>>>> at
>>>>>>
>>>>>>
>> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRunAsync(AkkaRpcActor.java:440)
>>>>>> at
>>>>>>
>>>>>>
>> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:208)
>>>>>> at
>>>>>>
>>>>>>
>> org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:77)
>>>>>> at
>>>>>>
>>>>>>
>> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:158)
>>>>>> at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26)
>>>>>> at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21)
>>>>>> at
>> scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)
>>>>>> at akka.japi.pf
>>> .UnitCaseStatement.applyOrElse(CaseStatements.scala:21)
>>>>>> at
>>> scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170)
>>>>>> at
>>> scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
>>>>>> at
>>> scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
>>>>>> at akka.actor.Actor$class.aroundReceive(Actor.scala:517)
>>>>>> at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225)
>>>>>> at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592)
>>>>>> at akka.actor.ActorCell.invoke(ActorCell.scala:561)
>>>>>> at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258)
>>>>>> at akka.dispatch.Mailbox.run(Mailbox.scala:225)
>>>>>> at akka.dispatch.Mailbox.exec(Mailbox.scala:235)
>>>>>> at
>> akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>>>>>> at
>>>>>>
>>>>>>
>> akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>>>>>> at
>>>> akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>>>>>> at
>>>>>>
>>>>>>
>> akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
>>>>>> Becket Qin is investigating this issue.
>>>>>>


OpenPGP_signature (855 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [VOTE] Release 1.13.0, release candidate #2

Jark Wu-2
+1 (binding)

- checked/verified signatures and hashes
- started cluster and run some e2e sql queries using SQL Client, results
are as expect:
 * read from kafka source, window aggregate, lookup mysql database, write
into elasticsearch
 * window aggregate using legacy window syntax and new window TVF
 * verified web ui and log output
- reviewed the release PR

I found the log contains some verbose information when using window
aggregate,
but I think this doesn't block the release, I created FLINK-22522 to fix
it.

Best,
Jark


On Thu, 29 Apr 2021 at 14:46, Dawid Wysakowicz <[hidden email]>
wrote:

> Hey Matthias,
>
> I'd like to double confirm what Guowei said. The dependency is Apache 2
> licensed and we do not bundle it in our jar (as it is in the runtime
> scope) thus we do not need to mention it in the NOTICE file (btw, the
> best way to check what is bundled is to check the output of maven shade
> plugin). Thanks for checking it!
>
> Best,
>
> Dawid
>
> On 29/04/2021 05:25, Guowei Ma wrote:
> > Hi, Matthias
> >
> > Thank you very much for your careful inspection.
> > I check the flink-python_2.11-1.13.0.jar and we do not bundle
> > org.conscrypt:conscrypt-openjdk-uber:2.5.1 to it.
> > So I think we may not need to add this to the NOTICE file. (BTW The jar's
> > scope is runtime)
> >
> > Best,
> > Guowei
> >
> >
> > On Thu, Apr 29, 2021 at 2:33 AM Matthias Pohl <[hidden email]>
> > wrote:
> >
> >> Thanks Dawid and Guowei for managing this release.
> >>
> >> - downloaded the sources and binaries and checked the checksums
> >> - built Flink from the downloaded sources
> >> - executed example jobs with standalone deployments - I didn't find
> >> anything suspicious in the logs
> >> - reviewed release announcement pull request
> >>
> >> - I did a pass over dependency updates: git diff release-1.12.2
> >> release-1.13.0-rc2 */*.xml
> >> There's one thing someone should double-check whether that's suppose to
> be
> >> like that: We added org.conscrypt:conscrypt-openjdk-uber:2.5.1 as a
> >> dependency but I don't see it being reflected in the NOTICE file of the
> >> flink-python module. Or is this automatically added later on?
> >>
> >> +1 (non-binding; please see remark on dependency above)
> >>
> >> Matthias
> >>
> >> On Wed, Apr 28, 2021 at 1:52 PM Stephan Ewen <[hidden email]> wrote:
> >>
> >>> Glad to hear that outcome. And no worries about the false alarm.
> >>> Thank you for doing thorough testing, this is very helpful!
> >>>
> >>> On Wed, Apr 28, 2021 at 1:04 PM Caizhi Weng <[hidden email]>
> >> wrote:
> >>>> After the investigation we found that this issue is caused by the
> >>>> implementation of connector, not by the Flink framework.
> >>>>
> >>>> Sorry for the false alarm.
> >>>>
> >>>> Stephan Ewen <[hidden email]> 于2021年4月28日周三 下午3:23写道:
> >>>>
> >>>>> @Caizhi and @Becket - let me reach out to you to jointly debug this
> >>>> issue.
> >>>>> I am wondering if there is some incorrect reporting of failed events?
> >>>>>
> >>>>> On Wed, Apr 28, 2021 at 8:53 AM Caizhi Weng <[hidden email]>
> >>>> wrote:
> >>>>>> -1
> >>>>>>
> >>>>>> We're testing this version on batch jobs with large (600~1000)
> >>>>> parallelisms
> >>>>>> and the following exception messages appear with high frequency:
> >>>>>>
> >>>>>> 2021-04-27 21:27:26
> >>>>>> org.apache.flink.util.FlinkException: An OperatorEvent from an
> >>>>>> OperatorCoordinator to a task was lost. Triggering task failover to
> >>>>> ensure
> >>>>>> consistency. Event: '[NoMoreSplitEvent]', targetTask: <task name> -
> >>>>>> execution #0
> >>>>>> at
> >>>>>>
> >>>>>>
> >>
> org.apache.flink.runtime.operators.coordination.SubtaskGatewayImpl.lambda$sendEvent$0(SubtaskGatewayImpl.java:81)
> >>>>>> at
> >>>>>>
> >>>>>>
> >>
> java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:822)
> >>>>>> at
> >>>>>>
> >>>>>>
> >>
> java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:797)
> >>>>>> at
> >>>>>>
> >>>>>>
> >>
> java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442)
> >>>>>> at
> >>>>>>
> >>>>>>
> >>
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRunAsync(AkkaRpcActor.java:440)
> >>>>>> at
> >>>>>>
> >>>>>>
> >>
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:208)
> >>>>>> at
> >>>>>>
> >>>>>>
> >>
> org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:77)
> >>>>>> at
> >>>>>>
> >>>>>>
> >>
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:158)
> >>>>>> at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26)
> >>>>>> at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21)
> >>>>>> at
> >> scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)
> >>>>>> at akka.japi.pf
> >>> .UnitCaseStatement.applyOrElse(CaseStatements.scala:21)
> >>>>>> at
> >>> scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170)
> >>>>>> at
> >>> scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
> >>>>>> at
> >>> scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
> >>>>>> at akka.actor.Actor$class.aroundReceive(Actor.scala:517)
> >>>>>> at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225)
> >>>>>> at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592)
> >>>>>> at akka.actor.ActorCell.invoke(ActorCell.scala:561)
> >>>>>> at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258)
> >>>>>> at akka.dispatch.Mailbox.run(Mailbox.scala:225)
> >>>>>> at akka.dispatch.Mailbox.exec(Mailbox.scala:235)
> >>>>>> at
> >> akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> >>>>>> at
> >>>>>>
> >>>>>>
> >>
> akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> >>>>>> at
> >>>> akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> >>>>>> at
> >>>>>>
> >>>>>>
> >>
> akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> >>>>>> Becket Qin is investigating this issue.
> >>>>>>
>
>
Reply | Threaded
Open this post in threaded view
|

Re: [VOTE] Release 1.13.0, release candidate #2

Dian Fu-2
+1 (binding)

- Verified the signature and checksum
- Installed PyFlink successfully using the source package
- Run a few PyFlink examples: Python UDF, Pandas UDF, Python DataStream API with state access, Python DataStream API with batch execution mode
- Reviewed the website PR

Regards,
Dian

> 2021年4月29日 下午3:11,Jark Wu <[hidden email]> 写道:
>
> +1 (binding)
>
> - checked/verified signatures and hashes
> - started cluster and run some e2e sql queries using SQL Client, results
> are as expect:
> * read from kafka source, window aggregate, lookup mysql database, write
> into elasticsearch
> * window aggregate using legacy window syntax and new window TVF
> * verified web ui and log output
> - reviewed the release PR
>
> I found the log contains some verbose information when using window
> aggregate,
> but I think this doesn't block the release, I created FLINK-22522 to fix
> it.
>
> Best,
> Jark
>
>
> On Thu, 29 Apr 2021 at 14:46, Dawid Wysakowicz <[hidden email]>
> wrote:
>
>> Hey Matthias,
>>
>> I'd like to double confirm what Guowei said. The dependency is Apache 2
>> licensed and we do not bundle it in our jar (as it is in the runtime
>> scope) thus we do not need to mention it in the NOTICE file (btw, the
>> best way to check what is bundled is to check the output of maven shade
>> plugin). Thanks for checking it!
>>
>> Best,
>>
>> Dawid
>>
>> On 29/04/2021 05:25, Guowei Ma wrote:
>>> Hi, Matthias
>>>
>>> Thank you very much for your careful inspection.
>>> I check the flink-python_2.11-1.13.0.jar and we do not bundle
>>> org.conscrypt:conscrypt-openjdk-uber:2.5.1 to it.
>>> So I think we may not need to add this to the NOTICE file. (BTW The jar's
>>> scope is runtime)
>>>
>>> Best,
>>> Guowei
>>>
>>>
>>> On Thu, Apr 29, 2021 at 2:33 AM Matthias Pohl <[hidden email]>
>>> wrote:
>>>
>>>> Thanks Dawid and Guowei for managing this release.
>>>>
>>>> - downloaded the sources and binaries and checked the checksums
>>>> - built Flink from the downloaded sources
>>>> - executed example jobs with standalone deployments - I didn't find
>>>> anything suspicious in the logs
>>>> - reviewed release announcement pull request
>>>>
>>>> - I did a pass over dependency updates: git diff release-1.12.2
>>>> release-1.13.0-rc2 */*.xml
>>>> There's one thing someone should double-check whether that's suppose to
>> be
>>>> like that: We added org.conscrypt:conscrypt-openjdk-uber:2.5.1 as a
>>>> dependency but I don't see it being reflected in the NOTICE file of the
>>>> flink-python module. Or is this automatically added later on?
>>>>
>>>> +1 (non-binding; please see remark on dependency above)
>>>>
>>>> Matthias
>>>>
>>>> On Wed, Apr 28, 2021 at 1:52 PM Stephan Ewen <[hidden email]> wrote:
>>>>
>>>>> Glad to hear that outcome. And no worries about the false alarm.
>>>>> Thank you for doing thorough testing, this is very helpful!
>>>>>
>>>>> On Wed, Apr 28, 2021 at 1:04 PM Caizhi Weng <[hidden email]>
>>>> wrote:
>>>>>> After the investigation we found that this issue is caused by the
>>>>>> implementation of connector, not by the Flink framework.
>>>>>>
>>>>>> Sorry for the false alarm.
>>>>>>
>>>>>> Stephan Ewen <[hidden email]> 于2021年4月28日周三 下午3:23写道:
>>>>>>
>>>>>>> @Caizhi and @Becket - let me reach out to you to jointly debug this
>>>>>> issue.
>>>>>>> I am wondering if there is some incorrect reporting of failed events?
>>>>>>>
>>>>>>> On Wed, Apr 28, 2021 at 8:53 AM Caizhi Weng <[hidden email]>
>>>>>> wrote:
>>>>>>>> -1
>>>>>>>>
>>>>>>>> We're testing this version on batch jobs with large (600~1000)
>>>>>>> parallelisms
>>>>>>>> and the following exception messages appear with high frequency:
>>>>>>>>
>>>>>>>> 2021-04-27 21:27:26
>>>>>>>> org.apache.flink.util.FlinkException: An OperatorEvent from an
>>>>>>>> OperatorCoordinator to a task was lost. Triggering task failover to
>>>>>>> ensure
>>>>>>>> consistency. Event: '[NoMoreSplitEvent]', targetTask: <task name> -
>>>>>>>> execution #0
>>>>>>>> at
>>>>>>>>
>>>>>>>>
>>>>
>> org.apache.flink.runtime.operators.coordination.SubtaskGatewayImpl.lambda$sendEvent$0(SubtaskGatewayImpl.java:81)
>>>>>>>> at
>>>>>>>>
>>>>>>>>
>>>>
>> java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:822)
>>>>>>>> at
>>>>>>>>
>>>>>>>>
>>>>
>> java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:797)
>>>>>>>> at
>>>>>>>>
>>>>>>>>
>>>>
>> java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442)
>>>>>>>> at
>>>>>>>>
>>>>>>>>
>>>>
>> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRunAsync(AkkaRpcActor.java:440)
>>>>>>>> at
>>>>>>>>
>>>>>>>>
>>>>
>> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:208)
>>>>>>>> at
>>>>>>>>
>>>>>>>>
>>>>
>> org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:77)
>>>>>>>> at
>>>>>>>>
>>>>>>>>
>>>>
>> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:158)
>>>>>>>> at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26)
>>>>>>>> at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21)
>>>>>>>> at
>>>> scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)
>>>>>>>> at akka.japi.pf
>>>>> .UnitCaseStatement.applyOrElse(CaseStatements.scala:21)
>>>>>>>> at
>>>>> scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170)
>>>>>>>> at
>>>>> scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
>>>>>>>> at
>>>>> scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
>>>>>>>> at akka.actor.Actor$class.aroundReceive(Actor.scala:517)
>>>>>>>> at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225)
>>>>>>>> at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592)
>>>>>>>> at akka.actor.ActorCell.invoke(ActorCell.scala:561)
>>>>>>>> at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258)
>>>>>>>> at akka.dispatch.Mailbox.run(Mailbox.scala:225)
>>>>>>>> at akka.dispatch.Mailbox.exec(Mailbox.scala:235)
>>>>>>>> at
>>>> akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>>>>>>>> at
>>>>>>>>
>>>>>>>>
>>>>
>> akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>>>>>>>> at
>>>>>> akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>>>>>>>> at
>>>>>>>>
>>>>>>>>
>>>>
>> akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
>>>>>>>> Becket Qin is investigating this issue.
>>>>>>>>
>>
>>

Reply | Threaded
Open this post in threaded view
|

Re: [VOTE] Release 1.13.0, release candidate #2

Xingbo Huang
+1 (non-binding)

- verified checksum and signature
- test upload `apache-flink` and `apache-flink-libraries` to test.pypi
- pip install `apache-flink-libraries` and `apache-flink` in mac os
- started cluster and run row-based operation test
- started cluster and test python general group window agg

Best,
Xingbo

Dian Fu <[hidden email]> 于2021年4月29日周四 下午4:05写道:

> +1 (binding)
>
> - Verified the signature and checksum
> - Installed PyFlink successfully using the source package
> - Run a few PyFlink examples: Python UDF, Pandas UDF, Python DataStream
> API with state access, Python DataStream API with batch execution mode
> - Reviewed the website PR
>
> Regards,
> Dian
>
> > 2021年4月29日 下午3:11,Jark Wu <[hidden email]> 写道:
> >
> > +1 (binding)
> >
> > - checked/verified signatures and hashes
> > - started cluster and run some e2e sql queries using SQL Client, results
> > are as expect:
> > * read from kafka source, window aggregate, lookup mysql database, write
> > into elasticsearch
> > * window aggregate using legacy window syntax and new window TVF
> > * verified web ui and log output
> > - reviewed the release PR
> >
> > I found the log contains some verbose information when using window
> > aggregate,
> > but I think this doesn't block the release, I created FLINK-22522 to fix
> > it.
> >
> > Best,
> > Jark
> >
> >
> > On Thu, 29 Apr 2021 at 14:46, Dawid Wysakowicz <[hidden email]>
> > wrote:
> >
> >> Hey Matthias,
> >>
> >> I'd like to double confirm what Guowei said. The dependency is Apache 2
> >> licensed and we do not bundle it in our jar (as it is in the runtime
> >> scope) thus we do not need to mention it in the NOTICE file (btw, the
> >> best way to check what is bundled is to check the output of maven shade
> >> plugin). Thanks for checking it!
> >>
> >> Best,
> >>
> >> Dawid
> >>
> >> On 29/04/2021 05:25, Guowei Ma wrote:
> >>> Hi, Matthias
> >>>
> >>> Thank you very much for your careful inspection.
> >>> I check the flink-python_2.11-1.13.0.jar and we do not bundle
> >>> org.conscrypt:conscrypt-openjdk-uber:2.5.1 to it.
> >>> So I think we may not need to add this to the NOTICE file. (BTW The
> jar's
> >>> scope is runtime)
> >>>
> >>> Best,
> >>> Guowei
> >>>
> >>>
> >>> On Thu, Apr 29, 2021 at 2:33 AM Matthias Pohl <[hidden email]>
> >>> wrote:
> >>>
> >>>> Thanks Dawid and Guowei for managing this release.
> >>>>
> >>>> - downloaded the sources and binaries and checked the checksums
> >>>> - built Flink from the downloaded sources
> >>>> - executed example jobs with standalone deployments - I didn't find
> >>>> anything suspicious in the logs
> >>>> - reviewed release announcement pull request
> >>>>
> >>>> - I did a pass over dependency updates: git diff release-1.12.2
> >>>> release-1.13.0-rc2 */*.xml
> >>>> There's one thing someone should double-check whether that's suppose
> to
> >> be
> >>>> like that: We added org.conscrypt:conscrypt-openjdk-uber:2.5.1 as a
> >>>> dependency but I don't see it being reflected in the NOTICE file of
> the
> >>>> flink-python module. Or is this automatically added later on?
> >>>>
> >>>> +1 (non-binding; please see remark on dependency above)
> >>>>
> >>>> Matthias
> >>>>
> >>>> On Wed, Apr 28, 2021 at 1:52 PM Stephan Ewen <[hidden email]>
> wrote:
> >>>>
> >>>>> Glad to hear that outcome. And no worries about the false alarm.
> >>>>> Thank you for doing thorough testing, this is very helpful!
> >>>>>
> >>>>> On Wed, Apr 28, 2021 at 1:04 PM Caizhi Weng <[hidden email]>
> >>>> wrote:
> >>>>>> After the investigation we found that this issue is caused by the
> >>>>>> implementation of connector, not by the Flink framework.
> >>>>>>
> >>>>>> Sorry for the false alarm.
> >>>>>>
> >>>>>> Stephan Ewen <[hidden email]> 于2021年4月28日周三 下午3:23写道:
> >>>>>>
> >>>>>>> @Caizhi and @Becket - let me reach out to you to jointly debug this
> >>>>>> issue.
> >>>>>>> I am wondering if there is some incorrect reporting of failed
> events?
> >>>>>>>
> >>>>>>> On Wed, Apr 28, 2021 at 8:53 AM Caizhi Weng <[hidden email]>
> >>>>>> wrote:
> >>>>>>>> -1
> >>>>>>>>
> >>>>>>>> We're testing this version on batch jobs with large (600~1000)
> >>>>>>> parallelisms
> >>>>>>>> and the following exception messages appear with high frequency:
> >>>>>>>>
> >>>>>>>> 2021-04-27 21:27:26
> >>>>>>>> org.apache.flink.util.FlinkException: An OperatorEvent from an
> >>>>>>>> OperatorCoordinator to a task was lost. Triggering task failover
> to
> >>>>>>> ensure
> >>>>>>>> consistency. Event: '[NoMoreSplitEvent]', targetTask: <task name>
> -
> >>>>>>>> execution #0
> >>>>>>>> at
> >>>>>>>>
> >>>>>>>>
> >>>>
> >>
> org.apache.flink.runtime.operators.coordination.SubtaskGatewayImpl.lambda$sendEvent$0(SubtaskGatewayImpl.java:81)
> >>>>>>>> at
> >>>>>>>>
> >>>>>>>>
> >>>>
> >>
> java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:822)
> >>>>>>>> at
> >>>>>>>>
> >>>>>>>>
> >>>>
> >>
> java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:797)
> >>>>>>>> at
> >>>>>>>>
> >>>>>>>>
> >>>>
> >>
> java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442)
> >>>>>>>> at
> >>>>>>>>
> >>>>>>>>
> >>>>
> >>
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRunAsync(AkkaRpcActor.java:440)
> >>>>>>>> at
> >>>>>>>>
> >>>>>>>>
> >>>>
> >>
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:208)
> >>>>>>>> at
> >>>>>>>>
> >>>>>>>>
> >>>>
> >>
> org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:77)
> >>>>>>>> at
> >>>>>>>>
> >>>>>>>>
> >>>>
> >>
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:158)
> >>>>>>>> at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26)
> >>>>>>>> at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21)
> >>>>>>>> at
> >>>> scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)
> >>>>>>>> at akka.japi.pf
> >>>>> .UnitCaseStatement.applyOrElse(CaseStatements.scala:21)
> >>>>>>>> at
> >>>>> scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170)
> >>>>>>>> at
> >>>>> scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
> >>>>>>>> at
> >>>>> scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
> >>>>>>>> at akka.actor.Actor$class.aroundReceive(Actor.scala:517)
> >>>>>>>> at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225)
> >>>>>>>> at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592)
> >>>>>>>> at akka.actor.ActorCell.invoke(ActorCell.scala:561)
> >>>>>>>> at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258)
> >>>>>>>> at akka.dispatch.Mailbox.run(Mailbox.scala:225)
> >>>>>>>> at akka.dispatch.Mailbox.exec(Mailbox.scala:235)
> >>>>>>>> at
> >>>> akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> >>>>>>>> at
> >>>>>>>>
> >>>>>>>>
> >>>>
> >>
> akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> >>>>>>>> at
> >>>>>>
> akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> >>>>>>>> at
> >>>>>>>>
> >>>>>>>>
> >>>>
> >>
> akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> >>>>>>>> Becket Qin is investigating this issue.
> >>>>>>>>
> >>
> >>
>
>
Reply | Threaded
Open this post in threaded view
|

Re: [VOTE] Release 1.13.0, release candidate #2

Leonard Xu
+1 (non-binding)

- verified signatures and hashes
- built from source code with scala 2.11 succeeded
- started a cluster, WebUI was accessible, ran some simple SQL jobs, no suspicious log output
- tested time functions and time zone usage in SQL Client, the query result is as expected
- the web PR looks good
- found one minor exception message typo, will improve it later

Best,
Leonard Xu

> 在 2021年4月29日,16:11,Xingbo Huang <[hidden email]> 写道:
>
> +1 (non-binding)
>
> - verified checksum and signature
> - test upload `apache-flink` and `apache-flink-libraries` to test.pypi
> - pip install `apache-flink-libraries` and `apache-flink` in mac os
> - started cluster and run row-based operation test
> - started cluster and test python general group window agg
>
> Best,
> Xingbo
>
> Dian Fu <[hidden email]> 于2021年4月29日周四 下午4:05写道:
>
>> +1 (binding)
>>
>> - Verified the signature and checksum
>> - Installed PyFlink successfully using the source package
>> - Run a few PyFlink examples: Python UDF, Pandas UDF, Python DataStream
>> API with state access, Python DataStream API with batch execution mode
>> - Reviewed the website PR
>>
>> Regards,
>> Dian
>>
>>> 2021年4月29日 下午3:11,Jark Wu <[hidden email]> 写道:
>>>
>>> +1 (binding)
>>>
>>> - checked/verified signatures and hashes
>>> - started cluster and run some e2e sql queries using SQL Client, results
>>> are as expect:
>>> * read from kafka source, window aggregate, lookup mysql database, write
>>> into elasticsearch
>>> * window aggregate using legacy window syntax and new window TVF
>>> * verified web ui and log output
>>> - reviewed the release PR
>>>
>>> I found the log contains some verbose information when using window
>>> aggregate,
>>> but I think this doesn't block the release, I created FLINK-22522 to fix
>>> it.
>>>
>>> Best,
>>> Jark
>>>
>>>
>>> On Thu, 29 Apr 2021 at 14:46, Dawid Wysakowicz <[hidden email]>
>>> wrote:
>>>
>>>> Hey Matthias,
>>>>
>>>> I'd like to double confirm what Guowei said. The dependency is Apache 2
>>>> licensed and we do not bundle it in our jar (as it is in the runtime
>>>> scope) thus we do not need to mention it in the NOTICE file (btw, the
>>>> best way to check what is bundled is to check the output of maven shade
>>>> plugin). Thanks for checking it!
>>>>
>>>> Best,
>>>>
>>>> Dawid
>>>>
>>>> On 29/04/2021 05:25, Guowei Ma wrote:
>>>>> Hi, Matthias
>>>>>
>>>>> Thank you very much for your careful inspection.
>>>>> I check the flink-python_2.11-1.13.0.jar and we do not bundle
>>>>> org.conscrypt:conscrypt-openjdk-uber:2.5.1 to it.
>>>>> So I think we may not need to add this to the NOTICE file. (BTW The
>> jar's
>>>>> scope is runtime)
>>>>>
>>>>> Best,
>>>>> Guowei
>>>>>
>>>>>
>>>>> On Thu, Apr 29, 2021 at 2:33 AM Matthias Pohl <[hidden email]>
>>>>> wrote:
>>>>>
>>>>>> Thanks Dawid and Guowei for managing this release.
>>>>>>
>>>>>> - downloaded the sources and binaries and checked the checksums
>>>>>> - built Flink from the downloaded sources
>>>>>> - executed example jobs with standalone deployments - I didn't find
>>>>>> anything suspicious in the logs
>>>>>> - reviewed release announcement pull request
>>>>>>
>>>>>> - I did a pass over dependency updates: git diff release-1.12.2
>>>>>> release-1.13.0-rc2 */*.xml
>>>>>> There's one thing someone should double-check whether that's suppose
>> to
>>>> be
>>>>>> like that: We added org.conscrypt:conscrypt-openjdk-uber:2.5.1 as a
>>>>>> dependency but I don't see it being reflected in the NOTICE file of
>> the
>>>>>> flink-python module. Or is this automatically added later on?
>>>>>>
>>>>>> +1 (non-binding; please see remark on dependency above)
>>>>>>
>>>>>> Matthias
>>>>>>
>>>>>> On Wed, Apr 28, 2021 at 1:52 PM Stephan Ewen <[hidden email]>
>> wrote:
>>>>>>
>>>>>>> Glad to hear that outcome. And no worries about the false alarm.
>>>>>>> Thank you for doing thorough testing, this is very helpful!
>>>>>>>
>>>>>>> On Wed, Apr 28, 2021 at 1:04 PM Caizhi Weng <[hidden email]>
>>>>>> wrote:
>>>>>>>> After the investigation we found that this issue is caused by the
>>>>>>>> implementation of connector, not by the Flink framework.
>>>>>>>>
>>>>>>>> Sorry for the false alarm.
>>>>>>>>
>>>>>>>> Stephan Ewen <[hidden email]> 于2021年4月28日周三 下午3:23写道:
>>>>>>>>
>>>>>>>>> @Caizhi and @Becket - let me reach out to you to jointly debug this
>>>>>>>> issue.
>>>>>>>>> I am wondering if there is some incorrect reporting of failed
>> events?
>>>>>>>>>
>>>>>>>>> On Wed, Apr 28, 2021 at 8:53 AM Caizhi Weng <[hidden email]>
>>>>>>>> wrote:
>>>>>>>>>> -1
>>>>>>>>>>
>>>>>>>>>> We're testing this version on batch jobs with large (600~1000)
>>>>>>>>> parallelisms
>>>>>>>>>> and the following exception messages appear with high frequency:
>>>>>>>>>>
>>>>>>>>>> 2021-04-27 21:27:26
>>>>>>>>>> org.apache.flink.util.FlinkException: An OperatorEvent from an
>>>>>>>>>> OperatorCoordinator to a task was lost. Triggering task failover
>> to
>>>>>>>>> ensure
>>>>>>>>>> consistency. Event: '[NoMoreSplitEvent]', targetTask: <task name>
>> -
>>>>>>>>>> execution #0
>>>>>>>>>> at
>>>>>>>>>>
>>>>>>>>>>
>>>>>>
>>>>
>> org.apache.flink.runtime.operators.coordination.SubtaskGatewayImpl.lambda$sendEvent$0(SubtaskGatewayImpl.java:81)
>>>>>>>>>> at
>>>>>>>>>>
>>>>>>>>>>
>>>>>>
>>>>
>> java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:822)
>>>>>>>>>> at
>>>>>>>>>>
>>>>>>>>>>
>>>>>>
>>>>
>> java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:797)
>>>>>>>>>> at
>>>>>>>>>>
>>>>>>>>>>
>>>>>>
>>>>
>> java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442)
>>>>>>>>>> at
>>>>>>>>>>
>>>>>>>>>>
>>>>>>
>>>>
>> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRunAsync(AkkaRpcActor.java:440)
>>>>>>>>>> at
>>>>>>>>>>
>>>>>>>>>>
>>>>>>
>>>>
>> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:208)
>>>>>>>>>> at
>>>>>>>>>>
>>>>>>>>>>
>>>>>>
>>>>
>> org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:77)
>>>>>>>>>> at
>>>>>>>>>>
>>>>>>>>>>
>>>>>>
>>>>
>> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:158)
>>>>>>>>>> at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26)
>>>>>>>>>> at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21)
>>>>>>>>>> at
>>>>>> scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)
>>>>>>>>>> at akka.japi.pf
>>>>>>> .UnitCaseStatement.applyOrElse(CaseStatements.scala:21)
>>>>>>>>>> at
>>>>>>> scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170)
>>>>>>>>>> at
>>>>>>> scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
>>>>>>>>>> at
>>>>>>> scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
>>>>>>>>>> at akka.actor.Actor$class.aroundReceive(Actor.scala:517)
>>>>>>>>>> at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225)
>>>>>>>>>> at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592)
>>>>>>>>>> at akka.actor.ActorCell.invoke(ActorCell.scala:561)
>>>>>>>>>> at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258)
>>>>>>>>>> at akka.dispatch.Mailbox.run(Mailbox.scala:225)
>>>>>>>>>> at akka.dispatch.Mailbox.exec(Mailbox.scala:235)
>>>>>>>>>> at
>>>>>> akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>>>>>>>>>> at
>>>>>>>>>>
>>>>>>>>>>
>>>>>>
>>>>
>> akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>>>>>>>>>> at
>>>>>>>>
>> akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>>>>>>>>>> at
>>>>>>>>>>
>>>>>>>>>>
>>>>>>
>>>>
>> akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
>>>>>>>>>> Becket Qin is investigating this issue.
>>>>>>>>>>
>>>>
>>>>
>>
>>

Reply | Threaded
Open this post in threaded view
|

RE: [VOTE] Release 1.13.0, release candidate #2

Zhou, Brian
+1 (non-binding)

- Build the Pravega Flink connector with RC artifacts and all tests pass
- Start a cluster, Run Pravega reader and writer application on it successfully

Thanks,
Brian

-----Original Message-----
From: Leonard Xu <[hidden email]>
Sent: Thursday, April 29, 2021 16:53
To: dev
Subject: Re: [VOTE] Release 1.13.0, release candidate #2


[EXTERNAL EMAIL]

+1 (non-binding)

- verified signatures and hashes
- built from source code with scala 2.11 succeeded
- started a cluster, WebUI was accessible, ran some simple SQL jobs, no suspicious log output
- tested time functions and time zone usage in SQL Client, the query result is as expected
- the web PR looks good
- found one minor exception message typo, will improve it later

Best,
Leonard Xu

> 在 2021年4月29日,16:11,Xingbo Huang <[hidden email]> 写道:
>
> +1 (non-binding)
>
> - verified checksum and signature
> - test upload `apache-flink` and `apache-flink-libraries` to test.pypi
> - pip install `apache-flink-libraries` and `apache-flink` in mac os
> - started cluster and run row-based operation test
> - started cluster and test python general group window agg
>
> Best,
> Xingbo
>
> Dian Fu <[hidden email]> 于2021年4月29日周四 下午4:05写道:
>
>> +1 (binding)
>>
>> - Verified the signature and checksum
>> - Installed PyFlink successfully using the source package
>> - Run a few PyFlink examples: Python UDF, Pandas UDF, Python
>> DataStream API with state access, Python DataStream API with batch
>> execution mode
>> - Reviewed the website PR
>>
>> Regards,
>> Dian
>>
>>> 2021年4月29日 下午3:11,Jark Wu <[hidden email]> 写道:
>>>
>>> +1 (binding)
>>>
>>> - checked/verified signatures and hashes
>>> - started cluster and run some e2e sql queries using SQL Client,
>>> results are as expect:
>>> * read from kafka source, window aggregate, lookup mysql database,
>>> write into elasticsearch
>>> * window aggregate using legacy window syntax and new window TVF
>>> * verified web ui and log output
>>> - reviewed the release PR
>>>
>>> I found the log contains some verbose information when using window
>>> aggregate, but I think this doesn't block the release, I created
>>> FLINK-22522 to fix it.
>>>
>>> Best,
>>> Jark
>>>
>>>
>>> On Thu, 29 Apr 2021 at 14:46, Dawid Wysakowicz
>>> <[hidden email]>
>>> wrote:
>>>
>>>> Hey Matthias,
>>>>
>>>> I'd like to double confirm what Guowei said. The dependency is
>>>> Apache 2 licensed and we do not bundle it in our jar (as it is in
>>>> the runtime
>>>> scope) thus we do not need to mention it in the NOTICE file (btw,
>>>> the best way to check what is bundled is to check the output of
>>>> maven shade plugin). Thanks for checking it!
>>>>
>>>> Best,
>>>>
>>>> Dawid
>>>>
>>>> On 29/04/2021 05:25, Guowei Ma wrote:
>>>>> Hi, Matthias
>>>>>
>>>>> Thank you very much for your careful inspection.
>>>>> I check the flink-python_2.11-1.13.0.jar and we do not bundle
>>>>> org.conscrypt:conscrypt-openjdk-uber:2.5.1 to it.
>>>>> So I think we may not need to add this to the NOTICE file. (BTW
>>>>> The
>> jar's
>>>>> scope is runtime)
>>>>>
>>>>> Best,
>>>>> Guowei
>>>>>
>>>>>
>>>>> On Thu, Apr 29, 2021 at 2:33 AM Matthias Pohl
>>>>> <[hidden email]>
>>>>> wrote:
>>>>>
>>>>>> Thanks Dawid and Guowei for managing this release.
>>>>>>
>>>>>> - downloaded the sources and binaries and checked the checksums
>>>>>> - built Flink from the downloaded sources
>>>>>> - executed example jobs with standalone deployments - I didn't
>>>>>> find anything suspicious in the logs
>>>>>> - reviewed release announcement pull request
>>>>>>
>>>>>> - I did a pass over dependency updates: git diff release-1.12.2
>>>>>> release-1.13.0-rc2 */*.xml
>>>>>> There's one thing someone should double-check whether that's
>>>>>> suppose
>> to
>>>> be
>>>>>> like that: We added org.conscrypt:conscrypt-openjdk-uber:2.5.1 as
>>>>>> a dependency but I don't see it being reflected in the NOTICE
>>>>>> file of
>> the
>>>>>> flink-python module. Or is this automatically added later on?
>>>>>>
>>>>>> +1 (non-binding; please see remark on dependency above)
>>>>>>
>>>>>> Matthias
>>>>>>
>>>>>> On Wed, Apr 28, 2021 at 1:52 PM Stephan Ewen <[hidden email]>
>> wrote:
>>>>>>
>>>>>>> Glad to hear that outcome. And no worries about the false alarm.
>>>>>>> Thank you for doing thorough testing, this is very helpful!
>>>>>>>
>>>>>>> On Wed, Apr 28, 2021 at 1:04 PM Caizhi Weng
>>>>>>> <[hidden email]>
>>>>>> wrote:
>>>>>>>> After the investigation we found that this issue is caused by
>>>>>>>> the implementation of connector, not by the Flink framework.
>>>>>>>>
>>>>>>>> Sorry for the false alarm.
>>>>>>>>
>>>>>>>> Stephan Ewen <[hidden email]> 于2021年4月28日周三 下午3:23写道:
>>>>>>>>
>>>>>>>>> @Caizhi and @Becket - let me reach out to you to jointly debug
>>>>>>>>> this
>>>>>>>> issue.
>>>>>>>>> I am wondering if there is some incorrect reporting of failed
>> events?
>>>>>>>>>
>>>>>>>>> On Wed, Apr 28, 2021 at 8:53 AM Caizhi Weng
>>>>>>>>> <[hidden email]>
>>>>>>>> wrote:
>>>>>>>>>> -1
>>>>>>>>>>
>>>>>>>>>> We're testing this version on batch jobs with large
>>>>>>>>>> (600~1000)
>>>>>>>>> parallelisms
>>>>>>>>>> and the following exception messages appear with high frequency:
>>>>>>>>>>
>>>>>>>>>> 2021-04-27 21:27:26
>>>>>>>>>> org.apache.flink.util.FlinkException: An OperatorEvent from
>>>>>>>>>> an OperatorCoordinator to a task was lost. Triggering task
>>>>>>>>>> failover
>> to
>>>>>>>>> ensure
>>>>>>>>>> consistency. Event: '[NoMoreSplitEvent]', targetTask: <task
>>>>>>>>>> name>
>> -
>>>>>>>>>> execution #0
>>>>>>>>>> at
>>>>>>>>>>
>>>>>>>>>>
>>>>>>
>>>>
>> org.apache.flink.runtime.operators.coordination.SubtaskGatewayImpl.la
>> mbda$sendEvent$0(SubtaskGatewayImpl.java:81)
>>>>>>>>>> at
>>>>>>>>>>
>>>>>>>>>>
>>>>>>
>>>>
>> java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.ja
>> va:822)
>>>>>>>>>> at
>>>>>>>>>>
>>>>>>>>>>
>>>>>>
>>>>
>> java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableF
>> uture.java:797)
>>>>>>>>>> at
>>>>>>>>>>
>>>>>>>>>>
>>>>>>
>>>>
>> java.util.concurrent.CompletableFuture$Completion.run(CompletableFutu
>> re.java:442)
>>>>>>>>>> at
>>>>>>>>>>
>>>>>>>>>>
>>>>>>
>>>>
>> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRunAsync(AkkaRpc
>> Actor.java:440)
>>>>>>>>>> at
>>>>>>>>>>
>>>>>>>>>>
>>>>>>
>>>>
>> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaR
>> pcActor.java:208)
>>>>>>>>>> at
>>>>>>>>>>
>>>>>>>>>>
>>>>>>
>>>>
>> org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage
>> (FencedAkkaRpcActor.java:77)
>>>>>>>>>> at
>>>>>>>>>>
>>>>>>>>>>
>>>>>>
>>>>
>> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcA
>> ctor.java:158)
>>>>>>>>>> at
>>>>>>>>>> akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26)
>>>>>>>>>> at
>>>>>>>>>> akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21)
>>>>>>>>>> at
>>>>>> scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123
>>>>>> )
>>>>>>>>>> at akka.japi.pf
>>>>>>> .UnitCaseStatement.applyOrElse(CaseStatements.scala:21)
>>>>>>>>>> at
>>>>>>> scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:1
>>>>>>> 70)
>>>>>>>>>> at
>>>>>>> scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:1
>>>>>>> 71)
>>>>>>>>>> at
>>>>>>> scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:1
>>>>>>> 71)
>>>>>>>>>> at akka.actor.Actor$class.aroundReceive(Actor.scala:517)
>>>>>>>>>> at
>>>>>>>>>> akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:22
>>>>>>>>>> 5) at
>>>>>>>>>> akka.actor.ActorCell.receiveMessage(ActorCell.scala:592)
>>>>>>>>>> at akka.actor.ActorCell.invoke(ActorCell.scala:561)
>>>>>>>>>> at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258)
>>>>>>>>>> at akka.dispatch.Mailbox.run(Mailbox.scala:225)
>>>>>>>>>> at akka.dispatch.Mailbox.exec(Mailbox.scala:235)
>>>>>>>>>> at
>>>>>> akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>>>>>>>>>> at
>>>>>>>>>>
>>>>>>>>>>
>>>>>>
>>>>
>> akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.ja
>> va:1339)
>>>>>>>>>> at
>>>>>>>>
>> akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>>>>>>>>>> at
>>>>>>>>>>
>>>>>>>>>>
>>>>>>
>>>>
>> akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.
>> java:107)
>>>>>>>>>> Becket Qin is investigating this issue.
>>>>>>>>>>
>>>>
>>>>
>>
>>

Reply | Threaded
Open this post in threaded view
|

Re: [VOTE] Release 1.13.0, release candidate #2

Yun Tang
In reply to this post by Leonard Xu
+1 (non-binding)

- built from source code with scala 2.11 succeeded
- submit state machine example and it runed well with expected commit id shown in UI.
- enable state latency tracking with slf4j metrics reporter and all behaves as expected.
- Click 'FlameGraph' but found the we UI did not give friendly hint to tell me enable it via setting rest.flamegraph.enabled: true, will create issue later.

Best
Yun Tang
________________________________
From: Leonard Xu <[hidden email]>
Sent: Thursday, April 29, 2021 16:52
To: dev <[hidden email]>
Subject: Re: [VOTE] Release 1.13.0, release candidate #2

+1 (non-binding)

- verified signatures and hashes
- built from source code with scala 2.11 succeeded
- started a cluster, WebUI was accessible, ran some simple SQL jobs, no suspicious log output
- tested time functions and time zone usage in SQL Client, the query result is as expected
- the web PR looks good
- found one minor exception message typo, will improve it later

Best,
Leonard Xu

> 在 2021年4月29日,16:11,Xingbo Huang <[hidden email]> 写道:
>
> +1 (non-binding)
>
> - verified checksum and signature
> - test upload `apache-flink` and `apache-flink-libraries` to test.pypi
> - pip install `apache-flink-libraries` and `apache-flink` in mac os
> - started cluster and run row-based operation test
> - started cluster and test python general group window agg
>
> Best,
> Xingbo
>
> Dian Fu <[hidden email]> 于2021年4月29日周四 下午4:05写道:
>
>> +1 (binding)
>>
>> - Verified the signature and checksum
>> - Installed PyFlink successfully using the source package
>> - Run a few PyFlink examples: Python UDF, Pandas UDF, Python DataStream
>> API with state access, Python DataStream API with batch execution mode
>> - Reviewed the website PR
>>
>> Regards,
>> Dian
>>
>>> 2021年4月29日 下午3:11,Jark Wu <[hidden email]> 写道:
>>>
>>> +1 (binding)
>>>
>>> - checked/verified signatures and hashes
>>> - started cluster and run some e2e sql queries using SQL Client, results
>>> are as expect:
>>> * read from kafka source, window aggregate, lookup mysql database, write
>>> into elasticsearch
>>> * window aggregate using legacy window syntax and new window TVF
>>> * verified web ui and log output
>>> - reviewed the release PR
>>>
>>> I found the log contains some verbose information when using window
>>> aggregate,
>>> but I think this doesn't block the release, I created FLINK-22522 to fix
>>> it.
>>>
>>> Best,
>>> Jark
>>>
>>>
>>> On Thu, 29 Apr 2021 at 14:46, Dawid Wysakowicz <[hidden email]>
>>> wrote:
>>>
>>>> Hey Matthias,
>>>>
>>>> I'd like to double confirm what Guowei said. The dependency is Apache 2
>>>> licensed and we do not bundle it in our jar (as it is in the runtime
>>>> scope) thus we do not need to mention it in the NOTICE file (btw, the
>>>> best way to check what is bundled is to check the output of maven shade
>>>> plugin). Thanks for checking it!
>>>>
>>>> Best,
>>>>
>>>> Dawid
>>>>
>>>> On 29/04/2021 05:25, Guowei Ma wrote:
>>>>> Hi, Matthias
>>>>>
>>>>> Thank you very much for your careful inspection.
>>>>> I check the flink-python_2.11-1.13.0.jar and we do not bundle
>>>>> org.conscrypt:conscrypt-openjdk-uber:2.5.1 to it.
>>>>> So I think we may not need to add this to the NOTICE file. (BTW The
>> jar's
>>>>> scope is runtime)
>>>>>
>>>>> Best,
>>>>> Guowei
>>>>>
>>>>>
>>>>> On Thu, Apr 29, 2021 at 2:33 AM Matthias Pohl <[hidden email]>
>>>>> wrote:
>>>>>
>>>>>> Thanks Dawid and Guowei for managing this release.
>>>>>>
>>>>>> - downloaded the sources and binaries and checked the checksums
>>>>>> - built Flink from the downloaded sources
>>>>>> - executed example jobs with standalone deployments - I didn't find
>>>>>> anything suspicious in the logs
>>>>>> - reviewed release announcement pull request
>>>>>>
>>>>>> - I did a pass over dependency updates: git diff release-1.12.2
>>>>>> release-1.13.0-rc2 */*.xml
>>>>>> There's one thing someone should double-check whether that's suppose
>> to
>>>> be
>>>>>> like that: We added org.conscrypt:conscrypt-openjdk-uber:2.5.1 as a
>>>>>> dependency but I don't see it being reflected in the NOTICE file of
>> the
>>>>>> flink-python module. Or is this automatically added later on?
>>>>>>
>>>>>> +1 (non-binding; please see remark on dependency above)
>>>>>>
>>>>>> Matthias
>>>>>>
>>>>>> On Wed, Apr 28, 2021 at 1:52 PM Stephan Ewen <[hidden email]>
>> wrote:
>>>>>>
>>>>>>> Glad to hear that outcome. And no worries about the false alarm.
>>>>>>> Thank you for doing thorough testing, this is very helpful!
>>>>>>>
>>>>>>> On Wed, Apr 28, 2021 at 1:04 PM Caizhi Weng <[hidden email]>
>>>>>> wrote:
>>>>>>>> After the investigation we found that this issue is caused by the
>>>>>>>> implementation of connector, not by the Flink framework.
>>>>>>>>
>>>>>>>> Sorry for the false alarm.
>>>>>>>>
>>>>>>>> Stephan Ewen <[hidden email]> 于2021年4月28日周三 下午3:23写道:
>>>>>>>>
>>>>>>>>> @Caizhi and @Becket - let me reach out to you to jointly debug this
>>>>>>>> issue.
>>>>>>>>> I am wondering if there is some incorrect reporting of failed
>> events?
>>>>>>>>>
>>>>>>>>> On Wed, Apr 28, 2021 at 8:53 AM Caizhi Weng <[hidden email]>
>>>>>>>> wrote:
>>>>>>>>>> -1
>>>>>>>>>>
>>>>>>>>>> We're testing this version on batch jobs with large (600~1000)
>>>>>>>>> parallelisms
>>>>>>>>>> and the following exception messages appear with high frequency:
>>>>>>>>>>
>>>>>>>>>> 2021-04-27 21:27:26
>>>>>>>>>> org.apache.flink.util.FlinkException: An OperatorEvent from an
>>>>>>>>>> OperatorCoordinator to a task was lost. Triggering task failover
>> to
>>>>>>>>> ensure
>>>>>>>>>> consistency. Event: '[NoMoreSplitEvent]', targetTask: <task name>
>> -
>>>>>>>>>> execution #0
>>>>>>>>>> at
>>>>>>>>>>
>>>>>>>>>>
>>>>>>
>>>>
>> org.apache.flink.runtime.operators.coordination.SubtaskGatewayImpl.lambda$sendEvent$0(SubtaskGatewayImpl.java:81)
>>>>>>>>>> at
>>>>>>>>>>
>>>>>>>>>>
>>>>>>
>>>>
>> java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:822)
>>>>>>>>>> at
>>>>>>>>>>
>>>>>>>>>>
>>>>>>
>>>>
>> java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:797)
>>>>>>>>>> at
>>>>>>>>>>
>>>>>>>>>>
>>>>>>
>>>>
>> java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442)
>>>>>>>>>> at
>>>>>>>>>>
>>>>>>>>>>
>>>>>>
>>>>
>> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRunAsync(AkkaRpcActor.java:440)
>>>>>>>>>> at
>>>>>>>>>>
>>>>>>>>>>
>>>>>>
>>>>
>> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:208)
>>>>>>>>>> at
>>>>>>>>>>
>>>>>>>>>>
>>>>>>
>>>>
>> org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:77)
>>>>>>>>>> at
>>>>>>>>>>
>>>>>>>>>>
>>>>>>
>>>>
>> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:158)
>>>>>>>>>> at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26)
>>>>>>>>>> at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21)
>>>>>>>>>> at
>>>>>> scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)
>>>>>>>>>> at akka.japi.pf
>>>>>>> .UnitCaseStatement.applyOrElse(CaseStatements.scala:21)
>>>>>>>>>> at
>>>>>>> scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170)
>>>>>>>>>> at
>>>>>>> scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
>>>>>>>>>> at
>>>>>>> scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
>>>>>>>>>> at akka.actor.Actor$class.aroundReceive(Actor.scala:517)
>>>>>>>>>> at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225)
>>>>>>>>>> at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592)
>>>>>>>>>> at akka.actor.ActorCell.invoke(ActorCell.scala:561)
>>>>>>>>>> at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258)
>>>>>>>>>> at akka.dispatch.Mailbox.run(Mailbox.scala:225)
>>>>>>>>>> at akka.dispatch.Mailbox.exec(Mailbox.scala:235)
>>>>>>>>>> at
>>>>>> akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>>>>>>>>>> at
>>>>>>>>>>
>>>>>>>>>>
>>>>>>
>>>>
>> akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>>>>>>>>>> at
>>>>>>>>
>> akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>>>>>>>>>> at
>>>>>>>>>>
>>>>>>>>>>
>>>>>>
>>>>
>> akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
>>>>>>>>>> Becket Qin is investigating this issue.
>>>>>>>>>>
>>>>
>>>>
>>
>>

12