Testing Apache Flink 0.9.0-rc1


mxm
Hi everyone!

As previously discussed, the Flink developer community is very eager to get
out a new major release. Apache Flink 0.9.0 will contain lots of new
features and many bugfixes. This time, I'll try to coordinate the release
process. Feel free to correct me if I'm doing something wrong, because I
don't know any better :)

To release a great version of Flink to the public, I'd like to ask everyone
to test the release candidate. Recently, Flink has received a lot of
attention. The expectations are quite high. Only through thorough testing
will we be able to satisfy all the Flink users out there.

Below is a list from the Wiki that we use to ensure the legal and
functional aspects of a release [1]. What I would like you to do is pick at
least one of the tasks, put your name as assignee in the link below, and
report back once you verified it. That way, I hope we can quickly and
thoroughly test the release candidate.

https://docs.google.com/document/d/1BhyMPTpAUYA8dG8-vJ3gSAmBUAa0PBSRkxIBPsZxkLs/edit

Best,
Max

Git branch: release-0.9-rc1
Release binaries: http://people.apache.org/~mxm/flink-0.9.0-rc1/
Maven artifacts:
https://repository.apache.org/content/repositories/orgapacheflink-1037/
PGP public key for verifying the signatures:
http://pgp.mit.edu/pks/lookup?op=vindex&search=0xDE976D18C2909CBF


Legal
====

L.1 Check if checksums and GPG files match the corresponding release files
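
In shell terms, this could look like the sketch below (assuming gpg and
md5sum are available and the artifacts have been downloaded into the
current directory; the loop simply skips when nothing is there, and the
.md5 files are assumed to be in md5sum's "hash  filename" format):

```shell
# First import the release manager's public key, e.g.:
#   gpg --recv-keys DE976D18C2909CBF
for f in flink-*.tgz; do
  [ -e "$f" ] || continue        # no artifacts downloaded yet
  gpg --verify "$f.asc" "$f"     # detached signature must match
  md5sum -c "$f.md5"             # checksum must match as well
done
```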

L.2 Verify that the source archives do NOT contain any binaries
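
A quick way to spot stray binaries, run from inside the unpacked source
release (the listing should come back empty; the extension list is just a
first pass, not exhaustive):

```shell
# Anything that looks like a compiled artifact is suspicious in a
# source-only release.
find . -type f \( -name "*.jar" -o -name "*.class" -o -name "*.so" \)
```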

L.3 Check that the source release builds properly with Maven, including the
license header check (enabled by default) and checkstyle. The tests should
also be executed (mvn clean verify)

L.4 Verify that the LICENSE and NOTICE files are correct for both the
binary and the source release.

L.5 All dependencies must be checked for their license and the license must
be ASL 2.0 compatible (http://www.apache.org/legal/resolved.html#category-x)
* The LICENSE and NOTICE files in the root directory refer to dependencies
in the source release, i.e., files in the git repository (such as fonts,
css, JavaScript, images)
* The LICENSE and NOTICE files in flink-dist/src/main/flink-bin refer to
the binary distribution and mention all of Flink's Maven dependencies as
well

L.6 Check that all POM files point to the same version (mostly relevant to
examine quickstart artifact files)
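
A rough, unofficial sketch for spotting version drift: print each module's
first <version> tag and count the distinct values (this assumes the tag
sits on its own line with identical indentation across POMs, which holds
for the usual Maven layout but is not guaranteed):

```shell
# From the source root: a consistent release collapses to a single line.
find . -name pom.xml -exec grep -h -m1 "<version>" {} \; | sort -u
```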

L.7 Read the README.md file


Functional
========

F.1 Run the start-local.sh/start-local-streaming.sh,
start-cluster.sh/start-cluster-streaming.sh, start-webclient.sh scripts and
verify that the processes come up

F.2 Examine the *.out files (should be empty) and the log files (should
contain no exceptions)
* Test for Linux, OS X, Windows (for Windows as far as possible, not all
scripts exist)
* Shutdown and verify there are no exceptions in the log output (after
shutdown)
* Check all start+submission scripts for paths with and without spaces
(./bin/* scripts are quite fragile for paths with spaces)
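
The .out and log checks can be scripted; this sketch assumes the default
layout of the binary distribution (a log/ directory next to bin/) and is
run after starting and stopping the daemons:

```shell
# .out files should be empty...
for f in log/*.out; do
  if [ -s "$f" ]; then echo "NON-EMPTY: $f"; fi
done
# ...and the logs should be free of exceptions.
grep -i -l "exception" log/*.log 2>/dev/null || echo "logs look clean"
```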

F.3 local mode (start-local.sh, see criteria below)
F.4 cluster mode (start-cluster.sh, see criteria below)
F.5 multi-node cluster (can simulate locally by starting two taskmanagers,
see criteria below)

Criteria for F.3 F.4 F.5
----------------------------
* Verify that the examples are running from both ./bin/flink and from the
web-based job submission tool
* flink-conf.yaml should define more than one task slot
* Results of job are produced and correct
** Check also that the examples run with both the built-in data and
external sources.
* Examine the log output - no error messages should be encountered
** Web interface shows progress and finished job in history
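
The task-slot criterion maps to a single entry in conf/flink-conf.yaml;
the value here is only illustrative:

```yaml
# More than one slot per TaskManager, so slot sharing is exercised.
taskmanager.numberOfTaskSlots: 4
```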


F.6 Test on a cluster with HDFS.
* Check that a good share of the input splits is read locally (the
JobManager log reveals local assignments)

F.7 Test against a Kafka installation

F.8 Test the ./bin/flink command line client
* Test "info" option, paste the JSON into the plan visualizer HTML file,
check that plan is rendered
* Test the parallelism flag (-p) to override the configured default
parallelism
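
As a concrete sketch (the example jar name is a placeholder; any of the
bundled examples works):

```shell
# Dump the optimizer plan as JSON for the plan visualizer ("info" option).
./bin/flink info ./examples/flink-java-examples-WordCount.jar > plan.json

# Override the configured default parallelism with -p.
./bin/flink run -p 4 ./examples/flink-java-examples-WordCount.jar
```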

F.9 Verify the plan visualizer with different browsers/operating systems

F.10 Verify that the quickstarts for Scala and Java work with the
staging repository, for both IntelliJ and Eclipse.
* In particular the dependencies of the quickstart project need to be set
correctly and the QS project needs to build from the staging repository
(replace the snapshot repo URL with the staging repo URL)
* The dependency tree of the QuickStart project must not contain any
dependencies we shade away upstream (guava, netty, ...)
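
The dependency-tree check can be mechanized from within the generated
quickstart project; the group IDs below are the usual Maven coordinates
for Guava and Netty:

```shell
# Any hit means a dependency we shade away upstream leaked through.
mvn dependency:tree | grep -E "com\.google\.guava|io\.netty" \
  && echo "PROBLEM: shaded dependency present" \
  || echo "dependency tree clean"
```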

F.11 Run examples on a YARN cluster

F.12 Run all examples from the IDE (Eclipse & IntelliJ)

F.13 Run an example with the RemoteEnvironment against a cluster started
from the shell script

F.14 Run the manual tests in the "flink-tests" module.
* These are marked with the @Ignore annotation.


[1] https://cwiki.apache.org/confluence/display/FLINK/Releasing

Re: Testing Apache Flink 0.9.0-rc1

Chiwan Park
Hi. I’m very excited about preparing a new major release. :)
I just picked two tests. I will report status as soon as possible.

Regards,
Chiwan Park


Re: Testing Apache Flink 0.9.0-rc1

Márton Balassi
Added F.7 (running against a Kafka cluster) for myself in the doc. Doing
it tomorrow.


Re: Testing Apache Flink 0.9.0-rc1

Chiwan Park
Hi. I have a problem running the `mvn clean verify` command.
TaskManagerFailsWithSlotSharingITCase hangs on Oracle JDK 7 (1.7.0_80), but on Oracle JDK 8 the test case doesn’t hang.

I’ve investigated this problem but couldn’t find the bug.

Regards,
Chiwan Park


Testing Apache Flink 0.9.0-rc1

Ufuk Celebi-2
Hey Chiwan!

Is the problem reproducible? Does it always deadlock? Can you please wait
for it to deadlock and then post a stacktrace (jps and jstack) of the
process? Please post it to this issue: FLINK-2183.

Thanks :)

– Ufuk


Re: Testing Apache Flink 0.9.0-rc1

Ufuk Celebi-2
In reply to this post by Chiwan Park
Hey all,

1. It would be nice to find more people to also test the streaming API. I think it's especially good to have people on it who did not use it before.

2. Just to make sure: the "assignee" field of each task is a list, i.e. we can and should have more people testing per task. ;-)

– Ufuk

>> * Results of job are produced and correct
>> ** Check also that the examples are running with the built-in data and
>> external sources.
>> * Examine the log output - no error messages should be encountered
>> ** Web interface shows progress and finished job in history
>>
>>
>> F.6 Test on a cluster with HDFS.
>> * Check that a good amount of input splits is read locally (JobManager log
>> reveals local assignments)
>>
>> F.7 Test against a Kafka installation
>>
>> F.8 Test the ./bin/flink command line client
>> * Test "info" option, paste the JSON into the plan visualizer HTML file,
>> check that plan is rendered
>> * Test the parallelism flag (-p) to override the configured default
>> parallelism
>>
>> F.9 Verify the plan visualizer with different browsers/operating systems
>>
>> F.10 Verify that the quickstarts for scala and java are working with the
>> staging repository for both IntelliJ and Eclipse.
>> * In particular the dependencies of the quickstart project need to be set
>> correctly and the QS project needs to build from the staging repository
>> (replace the snapshot repo URL with the staging repo URL)
>> * The dependency tree of the QuickStart project must not contain any
>> dependencies we shade away upstream (guava, netty, ...)
>>
>> F.11 Run examples on a YARN cluster
>>
>> F.12 Run all examples from the IDE (Eclipse & IntelliJ)
>>
>> F.13 Run an example with the RemoteEnvironment against a cluster started
>> from the shell script
>>
>> F.14 Run manual Tests in "flink-tests" module.
>> * Marked with the @Ignore annotation.
>>
>>
>> [1] https://cwiki.apache.org/confluence/display/FLINK/Releasing
>
>
>
>
>
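Task L.1 above is easy to script; a minimal sketch assuming GNU coreutils and the usual `<digest>  <filename>` checksum-file format (the artifact name here is a stand-in, not the real release file):

```shell
# L.1 sketch: verify a checksum file with `md5sum -c`, demonstrated on a
# stand-in artifact created locally. For the real check, download the
# files from the release binaries URL above instead.
tmpdir=$(mktemp -d) && cd "$tmpdir"
printf 'release payload' > flink-0.9.0-rc1.tgz          # stand-in artifact
md5sum flink-0.9.0-rc1.tgz > flink-0.9.0-rc1.tgz.md5    # "<digest>  <name>"
md5sum -c flink-0.9.0-rc1.tgz.md5 && echo "checksum OK"
# For the real candidate, additionally verify the detached GPG signature:
#   gpg --keyserver pgp.mit.edu --recv-keys DE976D18C2909CBF
#   gpg --verify flink-0.9.0-rc1.tgz.asc flink-0.9.0-rc1.tgz
```

The same `-c` pattern works for `sha512sum` if the release ships SHA checksum files.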


Re: Testing Apache Flink 0.9.0-rc1

Aljoscha Krettek-2
I would suggest we use this format to notify others that we did a task:

Assignees:
 - Aljoscha: done
 - Ufuk: found bug in such and such...
 - Chiwan Park: done, ...

The simple "status" doesn't work with multiple people on one task.

On Tue, Jun 9, 2015 at 9:40 AM, Ufuk Celebi <[hidden email]> wrote:

> Hey all,
>
> 1. it would be nice if we find more people to also do testing of the streaming API. I think it's especially good to have people on it, which did not use it before.
>
> 2. Just to make sure: the "assignee" field of each task is a list, i.e. we can and should have more people testing per task. ;-)
>
> – Ufuk
>

Re: Testing Apache Flink 0.9.0-rc1

mxm
+1 makes sense.

On Tue, Jun 9, 2015 at 10:48 AM, Aljoscha Krettek <[hidden email]>
wrote:

> I would suggest we use this format to notify others that we did a task:
>
> Assignees:
>  - Aljoscha: done
>  - Ufuk: found bug in such and such...
>  - Chiwan Park: done, ...
>
> The simple "status" doesn't work with multiple people on one task.
>

Re: Testing Apache Flink 0.9.0-rc1

mxm
The name of the Git branch was not correct. Thank you, Aljoscha, for
noticing. I've changed it from "release-0.9-rc1" to "release-0.9.0-rc1".
This has no effect on the validity of the release candidate.
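The rename itself is a one-liner; a sketch run in a scratch repository (on the shared repo the remote branch would also need to be pushed under the new name and the old one deleted):

```shell
# Reproduce the branch rename locally in a throwaway repo.
tmp=$(mktemp -d) && cd "$tmp"
git init -q demo && cd demo
git -c user.email=dev@example.com -c user.name=dev \
    commit -q --allow-empty -m init
git branch release-0.9-rc1                          # the mis-named branch
git branch -m release-0.9-rc1 release-0.9.0-rc1     # rename it in place
git branch --list 'release-*'                       # only the new name remains
```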

Re: Testing Apache Flink 0.9.0-rc1

Till Rohrmann
I also encountered a failing TaskManagerFailsWithSlotSharingITCase using
Java8. I could, however, not reproduce the error a second time. The stack
trace is:

The JobManager should handle hard failing task manager with slot
sharing(org.apache.flink.runtime.jobmanager.TaskManagerFailsWithSlotSharingITCase)
 Time elapsed: 1,400.148 sec  <<< ERROR!
java.util.concurrent.TimeoutException: Futures timed out after [200000
milliseconds]
    at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
    at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:153)
    at scala.concurrent.Await$anonfun$ready$1.apply(package.scala:86)
    at scala.concurrent.Await$anonfun$ready$1.apply(package.scala:86)
    at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
    at scala.concurrent.Await$.ready(package.scala:86)
    at org.apache.flink.runtime.minicluster.FlinkMiniCluster.shutdown(FlinkMiniCluster.scala:162)
    at org.apache.flink.runtime.minicluster.FlinkMiniCluster.stop(FlinkMiniCluster.scala:149)
    at org.apache.flink.runtime.jobmanager.TaskManagerFailsWithSlotSharingITCase$anonfun$1$anonfun$apply$mcV$sp$3.apply$mcV$sp(TaskManagerFailsWithSlotSharingITCase.scala:140)
    at org.apache.flink.runtime.jobmanager.TaskManagerFailsWithSlotSharingITCase$anonfun$1$anonfun$apply$mcV$sp$3.apply(TaskManagerFailsWithSlotSharingITCase.scala:95)
    at org.apache.flink.runtime.jobmanager.TaskManagerFailsWithSlotSharingITCase$anonfun$1$anonfun$apply$mcV$sp$3.apply(TaskManagerFailsWithSlotSharingITCase.scala:95)
    at org.scalatest.Transformer$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22)
    at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85)
    at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
    at org.scalatest.Transformer.apply(Transformer.scala:22)
    at org.scalatest.Transformer.apply(Transformer.scala:20)
    at org.scalatest.WordSpecLike$anon$1.apply(WordSpecLike.scala:953)
    at org.scalatest.Suite$class.withFixture(Suite.scala:1122)
    at org.apache.flink.runtime.jobmanager.TaskManagerFailsWithSlotSharingITCase.withFixture(TaskManagerFailsWithSlotSharingITCase.scala:36)

Results :

Tests in error:
  TaskManagerFailsWithSlotSharingITCase.run:36->org$scalatest$BeforeAndAfterAll$super$run:36->org$scalatest$WordSpecLike$super$run:36->runTests:36->runTest:36->withFixture:36
» Timeout

On Tue, Jun 9, 2015 at 11:26 AM Maximilian Michels <[hidden email]> wrote:

> The name of the Git branch was not correct. Thank you, Aljoscha, for
> noticing. I've changed it from "release-0.9-rc1" to "release-0.9.0-rc1".
> This has no effect on the validity of the release candidate.

Reply | Threaded
Open this post in threaded view
|

Re: Testing Apache Flink 0.9.0-rc1

Aljoscha Krettek-2
I also saw the same error on my third "mvn clean verify" run. Before it
always failed in the YARN tests.

On Tue, Jun 9, 2015 at 12:23 PM, Till Rohrmann <[hidden email]> wrote:


Re: Testing Apache Flink 0.9.0-rc1

Sachin Goel
On my local machine, several flink runtime tests are failing on "mvn clean
verify". Here is the log output: http://pastebin.com/raw.php?i=VWbx2ppf

--
​ Sachin​

On Tue, Jun 9, 2015 at 4:04 PM, Aljoscha Krettek <[hidden email]>
wrote:

> I also saw the same error on my third "mvn clean verify" run. Before it
> always failed in the YARN tests.
>
> On Tue, Jun 9, 2015 at 12:23 PM, Till Rohrmann <[hidden email]>
> wrote:
>
> > I also encountered a failing TaskManagerFailsWithSlotSharingITCase using
> > Java8. I could, however, not reproduce the error a second time. The stack
> > trace is:
> >
> > The JobManager should handle hard failing task manager with slot
> >
> >
> sharing(org.apache.flink.runtime.jobmanager.TaskManagerFailsWithSlotSharingITCase)
> >  Time elapsed: 1,400.148 sec  <<< ERROR!
> > java.util.concurrent.TimeoutException: Futures timed out after [200000
> > milliseconds]
> >     at
> > scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
> >     at
> > scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:153)
> >     at scala.concurrent.Await$anonfun$ready$1.apply(package.scala:86)
> >     at scala.concurrent.Await$anonfun$ready$1.apply(package.scala:86)
> >     at
> >
> scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
> >     at scala.concurrent.Await$.ready(package.scala:86)
> >     at
> >
> org.apache.flink.runtime.minicluster.FlinkMiniCluster.shutdown(FlinkMiniCluster.scala:162)
> >     at
> >
> org.apache.flink.runtime.minicluster.FlinkMiniCluster.stop(FlinkMiniCluster.scala:149)
> >     at
> >
> org.apache.flink.runtime.jobmanager.TaskManagerFailsWithSlotSharingITCase$anonfun$1$anonfun$apply$mcV$sp$3.apply$mcV$sp(TaskManagerFailsWithSlotSharingITCase.scala:140)
> >     at
> >
> org.apache.flink.runtime.jobmanager.TaskManagerFailsWithSlotSharingITCase$anonfun$1$anonfun$apply$mcV$sp$3.apply(TaskManagerFailsWithSlotSharingITCase.scala:95)
> >     at
> >
> org.apache.flink.runtime.jobmanager.TaskManagerFailsWithSlotSharingITCase$anonfun$1$anonfun$apply$mcV$sp$3.apply(TaskManagerFailsWithSlotSharingITCase.scala:95)
> >     at
> >
> org.scalatest.Transformer$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22)
> >     at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85)
> >     at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
> >     at org.scalatest.Transformer.apply(Transformer.scala:22)
> >     at org.scalatest.Transformer.apply(Transformer.scala:20)
> >     at org.scalatest.WordSpecLike$anon$1.apply(WordSpecLike.scala:953)
> >     at org.scalatest.Suite$class.withFixture(Suite.scala:1122)
> >     at
> >
> org.apache.flink.runtime.jobmanager.TaskManagerFailsWithSlotSharingITCase.withFixture(TaskManagerFailsWithSlotSharingITCase.scala:36)
> >
> > Results :
> >
> > Tests in error:
> >
> >
> TaskManagerFailsWithSlotSharingITCase.run:36->org$scalatest$BeforeAndAfterAll$super$run:36->org$scalatest$WordSpecLike$super$run:36->runTests:36->runTest:36->withFixture:36
> > » Timeout
> >
> > On Tue, Jun 9, 2015 at 11:26 AM Maximilian Michels [hidden email]
> > <http://mailto:mxm@...> wrote:
> >
> > The name of the Git branch was not correct. Thank you, Aljoscha, for
> > > noticing. I've changed it from "release-0.9-rc1" to
> "release-0.9.0-rc1".
> > > This has no affect on the validity of the release candidate.
> > >
> > ​
> >
>
Reply | Threaded
Open this post in threaded view
|

Re: Testing Apache Flink 0.9.0-rc1

Aljoscha Krettek-2
In reply to this post by Aljoscha Krettek-2
I did five "mvn clean verify" runs by now. All of them failed: one
with the TaskManagerFailsWithSlotSharingITCase and the other ones with
YARNSessionFIFOITCase.
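Repeated runs like this amount to a flakiness probe; a generic sketch (the `run_n` helper is made up for illustration, and `false` stands in for the real `mvn clean verify` invocation):

```shell
# Re-run a command N times and tally failures, e.g. to gauge how often
# "mvn clean verify" trips over a flaky test.
run_n() {
  n=$1; shift
  fails=0
  for i in $(seq 1 "$n"); do
    "$@" > /dev/null 2>&1 || fails=$((fails + 1))
  done
  echo "$fails/$n runs failed"
}
run_n 3 false   # stub command; replace with: mvn clean verify
```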

On Tue, Jun 9, 2015 at 12:34 PM, Aljoscha Krettek <[hidden email]> wrote:

> I also saw the same error on my third "mvn clean verify" run. Before it
> always failed in the YARN tests.
>

Re: Testing Apache Flink 0.9.0-rc1

Ufuk Celebi-2
In reply to this post by Sachin Goel

On 09 Jun 2015, at 13:58, Sachin Goel <[hidden email]> wrote:

> On my local machine, several flink runtime tests are failing on "mvn clean
> verify". Here is the log output: http://pastebin.com/raw.php?i=VWbx2ppf

Thanks for reporting this. Have you tried it multiple times? Is it failing reproducibly with the same tests? What's your setup?

– Ufuk

Re: Testing Apache Flink 0.9.0-rc1

Sachin Goel
A re-run led to the same 11 failures again.
TaskManagerTest.testSubmitAndExecuteTask was failing with a time-out but
managed to succeed in a re-run. Here is the log output again:
http://pastebin.com/raw.php?i=N4cm1J18

Setup: JDK 1.8.0_40 on windows 8.1
System memory: 8GB, quad-core with maximum 8 threads.

Regards
Sachin Goel
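When triaging runs like this, grepping the written logs for exceptions (in the spirit of F.2 in the checklist) narrows things down quickly; a sketch with a fabricated log directory and file, since the real paths depend on the build:

```shell
# F.2-style scan: flag ERROR lines and stack traces in a log directory.
# The directory and log content below are fabricated for demonstration.
mkdir -p log
printf 'INFO  starting JobManager\nERROR something failed\njava.lang.RuntimeException: boom\n' \
  > log/flink-jobmanager.log
grep -nE 'ERROR|Exception' log/*.log || echo "logs clean"
```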

On Tue, Jun 9, 2015 at 5:34 PM, Ufuk Celebi <[hidden email]> wrote:

>
> On 09 Jun 2015, at 13:58, Sachin Goel <[hidden email]> wrote:
>
> > On my local machine, several flink runtime tests are failing on "mvn
> clean
> > verify". Here is the log output: http://pastebin.com/raw.php?i=VWbx2ppf
>
> Thanks for reporting this. Have you tried it multiple times? Is it failing
> reproducibly with the same tests? What's your setup?
>
> – Ufuk

Re: Testing Apache Flink 0.9.0-rc1

Aljoscha Krettek-2
I found the bug in the failing YARNSessionFIFOITCase: It was comparing
the hostname to a hostname in some yarn config. In one case it was
capitalised, in the other case it wasn't.

Pushing fix to master and release-0.9 branch.
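The underlying pitfall is an exact string comparison of hostnames that differ only in case; a shell sketch of the problem and the usual remedy (hostnames are made up, and the actual Java fix may differ in detail):

```shell
# Hostname comparison: DNS names are case-insensitive, naive string
# equality is not.
a="Host-1.example.com"
b="host-1.example.com"
[ "$a" = "$b" ] || echo "naive compare: not equal"
# Normalise case on both sides before comparing.
la=$(printf '%s' "$a" | tr '[:upper:]' '[:lower:]')
lb=$(printf '%s' "$b" | tr '[:upper:]' '[:lower:]')
[ "$la" = "$lb" ] && echo "case-insensitive compare: equal"
```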

On Tue, Jun 9, 2015 at 2:18 PM, Sachin Goel <[hidden email]> wrote:

> [...]

Re: Testing Apache Flink 0.9.0-rc1

Aljoscha Krettek-2
I discovered another problem:
https://issues.apache.org/jira/browse/FLINK-2191 The closure cleaner
cannot be disabled in part of the Streaming Java API and all of the
Streaming Scala API. I think this is a release blocker (in addition
to the other bugs found so far).
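
To illustrate what's at stake (a plain-JDK sketch, not Flink code): a lambda
or anonymous class that reads an instance field implicitly captures `this`,
so serializing it for shipping to the cluster drags the whole enclosing
object along. The closure cleaner exists to null out such references, and
not being able to disable it means users can't opt out of that rewriting:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Plain-JDK sketch (not Flink code) of why closure cleaning matters: a
// lambda that reads an instance field captures `this`, so serializing it
// drags the whole (possibly non-serializable) enclosing object along.
public class ClosureCaptureDemo {

    interface SerFn extends Serializable { int apply(int x); }

    static class Outer { // deliberately NOT Serializable
        int offset = 1;

        SerFn capturesLocal() {
            int local = offset;     // copy into a local first
            return x -> x + local;  // captures only the int, serializes fine
        }

        SerFn capturesThis() {
            return x -> x + offset; // reads a field -> captures `this`
        }
    }

    static void serialize(Object o) throws IOException {
        new ObjectOutputStream(new ByteArrayOutputStream()).writeObject(o);
    }

    public static void main(String[] args) throws IOException {
        Outer outer = new Outer();
        serialize(outer.capturesLocal()); // ok
        try {
            serialize(outer.capturesThis());
            System.out.println("unexpectedly serialized");
        } catch (NotSerializableException e) {
            System.out.println("NotSerializableException: captured enclosing instance");
        }
    }
}
```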

On Tue, Jun 9, 2015 at 2:35 PM, Aljoscha Krettek <[hidden email]> wrote:

> [...]

Re: Testing Apache Flink 0.9.0-rc1

Aljoscha Krettek-2
I discovered something that might be a feature, rather than a bug. When you
submit an example using the web client without giving parameters, the
program fails with this:

org.apache.flink.client.program.ProgramInvocationException: The main method caused an error.
    at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:452)
    at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:353)
    at org.apache.flink.client.program.Client.run(Client.java:315)
    at org.apache.flink.client.web.JobSubmissionServlet.doGet(JobSubmissionServlet.java:302)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:668)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:770)
    at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:532)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:453)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:227)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:965)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:388)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:187)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:901)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117)
    at org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:47)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:113)
    at org.eclipse.jetty.server.Server.handle(Server.java:352)
    at org.eclipse.jetty.server.HttpConnection.handleRequest(HttpConnection.java:596)
    at org.eclipse.jetty.server.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:1048)
    at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:549)
    at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:211)
    at org.eclipse.jetty.server.HttpConnection.handle(HttpConnection.java:425)
    at org.eclipse.jetty.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:489)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:436)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
    at org.apache.flink.api.common.JobExecutionResult.getAccumulatorResult(JobExecutionResult.java:78)
    at org.apache.flink.api.java.DataSet.collect(DataSet.java:409)
    at org.apache.flink.api.java.DataSet.print(DataSet.java:1345)
    at org.apache.flink.examples.java.wordcount.WordCount.main(WordCount.java:80)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:437)
    ... 24 more


This also only occurs when you uncheck the "suspend execution while showing
plan" checkbox.

I think this arises because the new print() uses collect(), which tries to
get the job execution result. I guess the result is null since the job is
submitted asynchronously when the checkbox is unchecked.
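
If that reading is right, the fix would be either to hand collect() a real
result for asynchronous submissions or to fail with a clearer message. A
minimal sketch of the defensive variant (hypothetical names, not the actual
JobExecutionResult code):

```java
import java.util.Map;

// Hypothetical sketch, not the actual Flink code: when a job is submitted
// detached/asynchronously there is no result to read accumulators from,
// so guard against that instead of dereferencing null and throwing an NPE.
public class ResultSketch {

    private final Map<String, Object> accumulators; // null for detached jobs

    public ResultSketch(Map<String, Object> accumulators) {
        this.accumulators = accumulators;
    }

    public Object getAccumulatorResult(String name) {
        if (accumulators == null) {
            throw new IllegalStateException(
                "No execution result available: the job was submitted asynchronously.");
        }
        return accumulators.get(name);
    }
}
```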


Other than that, the new print() is pretty sweet when you run the built-in
examples from the CLI. You get all the state changes and also the result,
even when running in cluster mode on several task managers. :D


On Tue, Jun 9, 2015 at 3:41 PM, Aljoscha Krettek <[hidden email]>
wrote:

> [...]

Re: Testing Apache Flink 0.9.0-rc1

Chiwan Park
I attached jps and jstack logs for the hanging TaskManagerFailsWithSlotSharingITCase to JIRA FLINK-2183.

Regards,
Chiwan Park

> On Jun 10, 2015, at 12:28 AM, Aljoscha Krettek <[hidden email]> wrote:
> [...]




Re: Testing Apache Flink 0.9.0-rc1

Ufuk Celebi-2
While looking into FLINK-2188 (HBase input), I've discovered that Hadoop input formats implementing Configurable (like mapreduce.TableInputFormat) don't have the Hadoop configuration set via setConf(Configuration).

I have a small fix for this, which I have to clean up. First, I wanted to check what you think about this issue wrt the release. Personally, I think this is a release blocker, because it essentially means that no Hadoop input format that relies on the Configuration instance being set this way will work (this is to some extent a bug of the respective input formats) – most notably the HBase TableInputFormat.
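
For anyone not familiar with the Hadoop side, the pattern in question looks
roughly like this (stand-in types below so the sketch is self-contained; the
real interfaces are org.apache.hadoop.conf.Configurable/Configuration, and
the wrapper method name is hypothetical): whoever instantiates an input
format is expected to check for Configurable and inject the Configuration.

```java
// Stand-ins for org.apache.hadoop.conf.{Configuration, Configurable} so
// the sketch compiles on its own; only the injection pattern matters here.
class Configuration {}

interface Configurable {
    void setConf(Configuration conf);
    Configuration getConf();
}

// An input format in the style of mapreduce.TableInputFormat, which only
// learns its settings through setConf(...).
class TableInputFormatLike implements Configurable {
    private Configuration conf;
    @Override public void setConf(Configuration conf) { this.conf = conf; }
    @Override public Configuration getConf() { return conf; }
}

class HadoopFormatWrapper {
    // The gist of the fix as described (hypothetical method, not the actual
    // Flink wrapper): after instantiating the wrapped format, pass the
    // Configuration along if the format implements Configurable.
    static void configure(Object format, Configuration conf) {
        if (format instanceof Configurable) {
            ((Configurable) format).setConf(conf);
        }
    }
}
```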

– Ufuk

On 09 Jun 2015, at 18:07, Chiwan Park <[hidden email]> wrote:

> [...]
