Dear Flink community,
Please vote on releasing the following candidate as Apache Flink version 1.1.0.

I've CC'd [hidden email] as users are encouraged to help testing Flink 1.1.0 for their specific use cases. Please feel free to report issues and successful tests on [hidden email].

The commit to be voted on:
3a18463 (http://git-wip-us.apache.org/repos/asf/flink/commit/3a18463)

Branch:
release-1.1.0-rc1
(https://git1-us-west.apache.org/repos/asf/flink/repo?p=flink.git;a=shortlog;h=refs/heads/release-1.1.0-rc1)

The release artifacts to be voted on can be found at:
http://people.apache.org/~uce/flink-1.1.0-rc1/

The release artifacts are signed with the key with fingerprint 9D403309:
http://www.apache.org/dist/flink/KEYS

The staging repository for this release can be found at:
https://repository.apache.org/content/repositories/orgapacheflink-1098

There is also a Google doc to coordinate the testing efforts. This is a copy of the release document found in our Wiki:
https://docs.google.com/document/d/1cDZGtnGJKLU1fLw8AE_FzkoDLOR8amYT2oc3mD0_lw4/edit?usp=sharing

-------------------------------------------------------------

Thanks to everyone who contributed to this release candidate.

The vote is open for the next 3 days (not counting the weekend) and passes if a majority of at least three +1 PMC votes are cast.

The vote ends on Monday, August 1st, 2016.

[ ] +1 Release this package as Apache Flink 1.1.0
[ ] -1 Do not release this package, because ...
When running "mvn clean verify" with Hadoop version 2.6.1 the
Zookeeper/Leader Election tests fail with this: java.lang.NoSuchMethodError: org.apache.curator.utils.PathUtils.validatePath(Ljava/lang/String;)Ljava/lang/String; at org.apache.curator.framework.imps.NamespaceImpl.<init>(NamespaceImpl.java:37) at org.apache.curator.framework.imps.CuratorFrameworkImpl.<init>(CuratorFrameworkImpl.java:113) at org.apache.curator.framework.CuratorFrameworkFactory$Builder.build(CuratorFrameworkFactory.java:124) at org.apache.flink.runtime.util.ZooKeeperUtils.startCuratorFramework(ZooKeeperUtils.java:101) at org.apache.flink.runtime.util.ZooKeeperUtils.createLeaderRetrievalService(ZooKeeperUtils.java:143) at org.apache.flink.runtime.util.LeaderRetrievalUtils.createLeaderRetrievalService(LeaderRetrievalUtils.java:70) at org.apache.flink.runtime.leaderelection.ZooKeeperLeaderRetrievalTest.testTimeoutOfFindConnectingAddress(ZooKeeperLeaderRetrievalTest.java:187) I'll continue testing other parts and other Hadoop versions. On Wed, 27 Jul 2016 at 11:51 Ufuk Celebi <[hidden email]> wrote: > Dear Flink community, > > Please vote on releasing the following candidate as Apache Flink version > 1.1.0. > > I've CC'd [hidden email] as users are encouraged to help > testing Flink 1.1.0 for their specific use cases. Please feel free to > report issues and successful tests on [hidden email]. > > The commit to be voted on: > 3a18463 (http://git-wip-us.apache.org/repos/asf/flink/commit/3a18463) > > Branch: > release-1.1.0-rc1 > ( > https://git1-us-west.apache.org/repos/asf/flink/repo?p=flink.git;a=shortlog;h=refs/heads/release-1.1.0-rc1 > ) > > The release artifacts to be voted on can be found at: > http://people.apache.org/~uce/flink-1.1.0-rc1/ > > The release artifacts are signed with the key with fingerprint 9D403309: > http://www.apache.org/dist/flink/KEYS > > The staging repository for this release can be found at: > https://repository.apache.org/content/repositories/orgapacheflink-1098 > > There is also a Google doc to coordinate the testing efforts. This is > a copy of the release document found in our Wiki: > > https://docs.google.com/document/d/1cDZGtnGJKLU1fLw8AE_FzkoDLOR8amYT2oc3mD0_lw4/edit?usp=sharing > > ------------------------------------------------------------- > > Thanks to everyone who contributed to this release candidate. > > The vote is open for the next 3 days (not counting the weekend) and > passes if a majority of at least three +1 PMC votes are cast. > > The vote ends on Monday August 1st, 2016. > > [ ] +1 Release this package as Apache Flink 1.1.0 > [ ] -1 Do not release this package, because ... > |
Probably related to shading :( What's strange is that Travis builds for Hadoop 2.6.3 with the release-1.1 branch do succeed (sometimes... Travis is super flakey at the moment because of some corrupted cached dependencies): https://travis-ci.org/apache/flink/jobs/148348699

On Fri, Jul 29, 2016 at 4:19 PM, Aljoscha Krettek <[hidden email]> wrote:
> ...
Just tried to reproduce the error reported by Aljoscha, but could not. I used a clean checkout of the RC1 code and cleaned all local Maven caches before testing.

@Aljoscha: Can you reproduce this on your machine? Can you try and clean the Maven caches?

On Sun, Jul 31, 2016 at 7:31 PM, Ufuk Celebi <[hidden email]> wrote:
> ...
Thanks for the new release candidate, Ufuk!

Found two issues during testing:

1) Scheduling: The Flink scheduler accepts jobs with a parallelism greater than the total number of task slots (it shouldn't), schedules tasks in all available task slots, and leaves the remaining tasks lingering forever. Haven't had time to investigate much, but there are some more details here:
=> JIRA: https://issues.apache.org/jira/browse/FLINK-4296

2) YARN encoding issues with special characters in the automatically determined location of the fat jar
=> JIRA: https://issues.apache.org/jira/browse/FLINK-4297
=> Fix: https://github.com/apache/flink/pull/2320

Otherwise, looks pretty good so far :)

On Mon, Aug 1, 2016 at 10:27 AM, Stephan Ewen <[hidden email]> wrote:
> ...
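For anyone trying to reproduce the scheduling issue in 1) above, a minimal sketch along these lines should trigger it (the class name, job name, and parallelism value are made up for illustration; it assumes a standalone cluster that exposes fewer than 64 task slots):

import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class SlotOvercommitRepro {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Ask for more parallelism than the cluster has task slots.
        // Expected: the job is rejected up front. Observed (FLINK-4296):
        // tasks fill all available slots and the rest linger forever.
        env.setParallelism(64);

        env.fromElements(1L, 2L, 3L)
                .map(new MapFunction<Long, Long>() {
                    @Override
                    public Long map(Long value) {
                        return value + 1;
                    }
                })
                .print();

        env.execute("slot-overcommit-repro");
    }
}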
This is also a major issue for batch jobs with off-heap memory and memory preallocation turned off: https://issues.apache.org/jira/browse/FLINK-4094

Not hard to fix, though, as we simply need to reliably clear the direct memory instead of relying on garbage collection. Another possible fix is to maintain memory pools independently of the preallocation mode. I think this is fine because preallocation:false suggests that no memory will be preallocated, not that memory will be freed once acquired.
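A sketch of what "reliably clearing the direct memory" could look like on the JDK 6/7/8 JVMs of that era: the native memory behind a direct ByteBuffer can be released eagerly by invoking the buffer's internal cleaner via reflection. This only illustrates the technique, not the actual FLINK-4094 patch; the class name is made up:

import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.nio.ByteBuffer;

public final class DirectBufferReleaser {

    /**
     * Eagerly frees the native memory backing a direct ByteBuffer instead
     * of waiting for the garbage collector to collect the buffer object.
     */
    public static void free(ByteBuffer buffer) {
        if (buffer == null || !buffer.isDirect()) {
            return;
        }
        try {
            // java.nio.DirectByteBuffer holds a private 'cleaner' field
            // (sun.misc.Cleaner) on JDK 6-8.
            Field cleanerField = buffer.getClass().getDeclaredField("cleaner");
            cleanerField.setAccessible(true);
            Object cleaner = cleanerField.get(buffer);
            if (cleaner != null) {
                Method clean = cleaner.getClass().getMethod("clean");
                clean.setAccessible(true);
                clean.invoke(cleaner);
            }
        } catch (Exception e) {
            throw new RuntimeException("Could not free direct buffer", e);
        }
    }
}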
I tried it again now. I did:
rm -r .m2/repository
mvn clean verify -Dhadoop.version=2.6.0

It failed again. Also with versions 2.6.1 and 2.6.3.

On Mon, 1 Aug 2016 at 08:23 Maximilian Michels <[hidden email]> wrote:
> ...
I think that FLINK-4094 is nice to fix but not a release blocker, since we know how to prevent this situation (setting preallocation to true, i.e. taskmanager.memory.preallocate: true in flink-conf.yaml).

On Mon, Aug 1, 2016 at 11:56 PM, Aljoscha Krettek <[hidden email]> wrote:
> ...
Which Maven version are you using?
On Mon, Aug 1, 2016 at 5:56 PM, Aljoscha Krettek <[hidden email]> wrote:
> ...
@Aljoscha: Have you made sure you have a clean Maven cache (remove the .m2/repository/org/apache/flink folder)?

On Mon, Aug 1, 2016 at 5:56 PM, Aljoscha Krettek <[hidden email]> wrote:
> ...
@Ufuk: 3.3.9. That's probably it, because that messes with the shading, right?

@Stephan: Yes, I even did a "rm -r .m2/repository". But the Maven version is most likely the reason.

On Mon, 1 Aug 2016 at 10:59 Stephan Ewen <[hidden email]> wrote:
> ...
I can confirm Aljoscha's findings concerning building Flink with Hadoop version 2.6.0 using Maven 3.3.9. Aljoscha is right that it is indeed a Maven 3.3 issue: if you build flink-runtime twice, everything goes through, because the shaded Curator Flink dependency is installed during the first run.

On Tue, Aug 2, 2016 at 5:09 AM, Aljoscha Krettek <[hidden email]> wrote:
> ...
Dear community,
I would like to vote +1, but during testing I've noticed that we should not have reverted FLINK-4154 (the correction of the murmur hash) for this release.

We had a wrong murmur hash implementation in 1.0, which was fixed for 1.1. We reverted that fix because we thought it broke savepoint compatibility between 1.0 and 1.1; that revert is part of RC1. It turns out, though, that there are other problems with savepoint compatibility which are independent of the hash function. Therefore I would like to restore the fix (reverting the revert), create a new RC with only this extra commit, and extend the vote for one day.

Would you be OK with this? Most testing results should be applicable to RC2, too.

I ran the following tests:

+ Checked checksums and signatures
+ Verified no binaries in source release
+ Built (clean verify) with default Hadoop version
+ Built (clean verify) with Hadoop 2.6.1
+ Checked build for Scala 2.11
+ Checked all POMs
+ Read README.md
+ Examined OUT and LOG files
+ Checked paths with spaces (found a non-blocking issue with the YARN CLI)
+ Checked local mode, cluster mode, and a multi-node cluster
+ Tested HDFS split assignment
+ Tested the bin/flink command line
+ Tested recovery (master and worker failure) in standalone mode with RocksDB and HDFS
+ Tested the Scala/SBT giter8 template
+ Tested metrics (user-defined metrics, multiple JMX reporters, JM metrics, user-defined reporter)

– Ufuk

On Tue, Aug 2, 2016 at 10:13 AM, Till Rohrmann <[hidden email]> wrote:
> ...
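For intuition on why the hash function matters for savepoints at all: keyed state is routed to parallel subtasks based on a hash of the key, roughly along these lines (a simplified, made-up sketch, not the actual Flink routing code):

import java.util.function.IntUnaryOperator;

public final class KeyRoutingSketch {

    /** Which parallel subtask owns the state for a given key. */
    static int targetSubtask(Object key, int parallelism, IntUnaryOperator hash) {
        // Mask the sign bit so the modulo result is non-negative.
        return (hash.applyAsInt(key.hashCode()) & Integer.MAX_VALUE) % parallelism;
    }
}

A 1.0 savepoint effectively stores each key's state under targetSubtask(key, p, oldHash); if the hash implementation changes, the same key can map to a different subtask on restore, which is why the corrected hash was reverted for RC1 in the first place. Since savepoint compatibility turns out to be broken for other reasons anyway, the revert buys nothing.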
+1 from my side
Create a new RC that differs only in the hash function commit. I would support carrying the vote thread forward (extending it for one additional day), because virtually all test results should apply to the new RC as well.

We certainly need to redo:
- signature validation
- build & integration tests (these should catch any potential error caused by a change of hash function)

That is pretty lightweight; it should be doable within a day.

On Tue, Aug 2, 2016 at 10:43 AM, Ufuk Celebi <[hidden email]> wrote:
> ...
I agree with Ufuk and Stephan that we could carry forward most of the testing if we only included the hash function fix in the new RC. There are some other minor issues we could merge as well, but they are involved enough that they would set us back to redoing the testing. So +1 for a new RC with the hash function fix.

On Tue, Aug 2, 2016 at 12:35 PM, Stephan Ewen <[hidden email]> wrote:
> ...
I just saw that we changed the behaviour of ListState and FoldingState. They used to return the default value given to the state descriptor, but have been changed to return null now (in [1]). Furthermore, ValueState still returns the default value instead of null. Gyula noticed another inconsistency for GenericListState and GenericFoldingState in [2].

The state interfaces are annotated with @PublicEvolving, so technically it should be OK to change this, but I wanted to double check that everyone is aware of this. Do we want to keep it like it is, or should we revert this?

– Ufuk

[1] https://github.com/apache/flink/commit/12bf7c1a0b81d199085fe874c64763c51a93b3bf#diff-2c622001cff86abb3e36e6621d6f73ad
[2] https://issues.apache.org/jira/browse/FLINK-4275

On Tue, Aug 2, 2016 at 1:37 PM, Maximilian Michels <[hidden email]> wrote:
> ...
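To make the inconsistency concrete, here is a hedged sketch of what user code now sees for a key that has no state yet; only the ValueState.value() and ListState.get() calls are from the real API, while the surrounding class, method, and variable names are made up for illustration:

import org.apache.flink.api.common.state.ListState;
import org.apache.flink.api.common.state.ValueState;

public final class EmptyStateBehaviorSketch {

    /** Illustrates the asymmetry for a key that has no state yet. */
    static void illustrate(ValueState<Long> counter, ListState<Long> history) throws Exception {
        // ValueState: still returns the default value from its descriptor.
        Long c = counter.value();

        // ListState: used to return the descriptor's default in 1.0.x,
        // but in this RC returns null when the state is empty.
        Iterable<Long> h = history.get();
        if (h == null) {
            // Callers now have to null-check where they previously did not.
        }
    }
}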
@Ufuk - I agree, this looks quite dubious.
Need to resolve that before proceeding with the release...

On Tue, Aug 2, 2016 at 1:45 PM, Ufuk Celebi <[hidden email]> wrote:
> ...