This doesn't look good, yes.
On Wed, Jun 10, 2015 at 1:32 AM, Ufuk Celebi <[hidden email]> wrote:

> While looking into FLINK-2188 (HBase input) I've discovered that Hadoop
> input formats implementing Configurable (like mapreduce.TableInputFormat)
> don't have the Hadoop configuration set via setConf(Configuration).
>
> I have a small fix for this, which I have to clean up. First, I wanted to
> check what you think about this issue wrt the release. Personally, I think
> this is a release blocker, because it essentially means that no Hadoop
> input format which relies on the Configuration instance being set this way
> will work (this is to some extent a bug of the respective input formats) –
> most notably the HBase TableInputFormat.
>
> – Ufuk
>
> On 09 Jun 2015, at 18:07, Chiwan Park <[hidden email]> wrote:
>
> > I attached jps and jstack logs for the hanging
> > TaskManagerFailsWithSlotSharingITCase to JIRA FLINK-2183.
> >
> > Regards,
> > Chiwan Park
> >
> >> On Jun 10, 2015, at 12:28 AM, Aljoscha Krettek <[hidden email]> wrote:
> >>
> >> I discovered something that might be a feature, rather than a bug. When you
> >> submit an example using the web client without giving parameters, the
> >> program fails with this:
> >>
> >> org.apache.flink.client.program.ProgramInvocationException: The main method caused an error.
> >>     at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:452)
> >>     at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:353)
> >>     at org.apache.flink.client.program.Client.run(Client.java:315)
> >>     at org.apache.flink.client.web.JobSubmissionServlet.doGet(JobSubmissionServlet.java:302)
> >>     at javax.servlet.http.HttpServlet.service(HttpServlet.java:668)
> >>     at javax.servlet.http.HttpServlet.service(HttpServlet.java:770)
> >>     at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:532)
> >>     at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:453)
> >>     at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:227)
> >>     at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:965)
> >>     at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:388)
> >>     at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:187)
> >>     at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:901)
> >>     at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117)
> >>     at org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:47)
> >>     at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:113)
> >>     at org.eclipse.jetty.server.Server.handle(Server.java:352)
> >>     at org.eclipse.jetty.server.HttpConnection.handleRequest(HttpConnection.java:596)
> >>     at org.eclipse.jetty.server.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:1048)
> >>     at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:549)
> >>     at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:211)
> >>     at org.eclipse.jetty.server.HttpConnection.handle(HttpConnection.java:425)
> >>     at org.eclipse.jetty.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:489)
> >>     at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:436)
> >>     at java.lang.Thread.run(Thread.java:745)
> >> Caused by: java.lang.NullPointerException
> >>     at org.apache.flink.api.common.JobExecutionResult.getAccumulatorResult(JobExecutionResult.java:78)
> >>     at org.apache.flink.api.java.DataSet.collect(DataSet.java:409)
> >>     at org.apache.flink.api.java.DataSet.print(DataSet.java:1345)
> >>     at org.apache.flink.examples.java.wordcount.WordCount.main(WordCount.java:80)
> >>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> >>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >>     at java.lang.reflect.Method.invoke(Method.java:497)
> >>     at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:437)
> >>     ... 24 more
> >>
> >> This also only occurs when you uncheck the "suspend execution while
> >> showing plan" option.
> >>
> >> I think this arises because the new print() uses collect(), which tries to
> >> get the job execution result. I guess the result is null since the job is
> >> submitted asynchronously when the checkbox is unchecked.
> >>
> >> Other than that, the new print() is pretty sweet when you run the builtin
> >> examples from the CLI. You get all the state changes and also the result,
> >> even when running in cluster mode on several task managers. :D
> >>
> >> On Tue, Jun 9, 2015 at 3:41 PM, Aljoscha Krettek <[hidden email]> wrote:
> >>
> >>> I discovered another problem:
> >>> https://issues.apache.org/jira/browse/FLINK-2191 The closure cleaner
> >>> cannot be disabled in part of the Streaming Java API and all of the
> >>> Streaming Scala API. I think this is a release blocker (in addition
> >>> to the other bugs found so far).
> >>>
> >>> On Tue, Jun 9, 2015 at 2:35 PM, Aljoscha Krettek <[hidden email]> wrote:
> >>>> I found the bug in the failing YARNSessionFIFOITCase: It was comparing
> >>>> the hostname to a hostname in some yarn config. In one case it was
> >>>> capitalised, in the other case it wasn't.
> >>>>
> >>>> Pushing fix to master and release-0.9 branch.
> >>>>
> >>>> On Tue, Jun 9, 2015 at 2:18 PM, Sachin Goel <[hidden email]> wrote:
> >>>>> A re-run reproduced the 11 failures again.
> >>>>> TaskManagerTest.testSubmitAndExecuteTask was failing with a time-out but
> >>>>> managed to succeed in a re-run. Here is the log output again:
> >>>>> http://pastebin.com/raw.php?i=N4cm1J18
> >>>>>
> >>>>> Setup: JDK 1.8.0_40 on Windows 8.1
> >>>>> System memory: 8GB, quad-core with maximum 8 threads.
> >>>>>
> >>>>> Regards
> >>>>> Sachin Goel
> >>>>>
> >>>>> On Tue, Jun 9, 2015 at 5:34 PM, Ufuk Celebi <[hidden email]> wrote:
> >>>>>
> >>>>>> On 09 Jun 2015, at 13:58, Sachin Goel <[hidden email]> wrote:
> >>>>>>
> >>>>>>> On my local machine, several flink runtime tests are failing on
> >>>>>>> "mvn clean verify". Here is the log output:
> >>>>>>> http://pastebin.com/raw.php?i=VWbx2ppf
> >>>>>>
> >>>>>> Thanks for reporting this. Have you tried it multiple times? Is it
> >>>>>> failing reproducibly with the same tests? What's your setup?
> >>>>>>
> >>>>>> – Ufuk
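
To make Ufuk's report about Configurable input formats concrete, here is a minimal sketch of the pattern he describes: a wrapper that checks whether a format wants a configuration and, if so, passes it in via setConf. This is not Ufuk's fix and not Flink or Hadoop code. Conf and ConfigurableFormat are hypothetical stand-ins for Hadoop's Configuration and Configurable so that the sketch compiles without Hadoop on the classpath, and TableLikeInputFormat and the key "input.table" are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for org.apache.hadoop.conf.Configuration.
final class Conf {
    private final Map<String, String> values = new HashMap<>();

    void set(String key, String value) { values.put(key, value); }

    String get(String key) { return values.get(key); }
}

// Hypothetical stand-in for org.apache.hadoop.conf.Configurable.
interface ConfigurableFormat {
    void setConf(Conf conf);

    Conf getConf();
}

// Made-up input format that only works after setConf(...) has been called.
final class TableLikeInputFormat implements ConfigurableFormat {
    private Conf conf;

    @Override
    public void setConf(Conf conf) { this.conf = conf; }

    @Override
    public Conf getConf() { return conf; }

    String tableName() {
        if (conf == null) {
            throw new IllegalStateException("setConf(...) was never called");
        }
        return conf.get("input.table"); // hypothetical key
    }
}

final class ConfigurableSketch {
    // Wrapper-side step: hand the configuration to formats that ask for it,
    // and leave every other object untouched.
    static <T> T configure(T format, Conf conf) {
        if (format instanceof ConfigurableFormat) {
            ((ConfigurableFormat) format).setConf(conf);
        }
        return format;
    }

    public static void main(String[] args) {
        Conf conf = new Conf();
        conf.set("input.table", "example_table");
        TableLikeInputFormat format = configure(new TableLikeInputFormat(), conf);
        System.out.println(format.tableName()); // prints example_table
    }
}
```

Without the configure(...) call, tableName() throws, which mirrors the failure mode described above for formats that rely on the configuration being set this way.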
With all the issues discovered, it looks like we'll have another release
candidate. Right now, we have discovered the following problems:

1 YARN ITCase fails [fixed via 2eb5cfe]
2 No Jar for SessionWindowing example [fixed in #809]
3 Wrong description of the input format for the graph examples (e.g. ConnectedComponents) [fixed in #809]
4 TaskManagerFailsWithSlotSharingITCase fails
5 ComplexIntegrationTest.complexIntegrationTest1() (FLINK-2192) fails
6 Submitting KMeans example to Web Submission Client does not work on Firefox.
7 Zooming is buggy in Web Submission Client (Firefox)

Do we have someone familiar with the web interface who could take a look at
the Firefox issues?

One more important thing: The release-0.9 branch should only be used for
bug fixes or previously discussed feature changes. Adding new features defeats
the purpose of carefully testing in advance and can have unforeseeable
consequences. In particular, I'm referring to pull request #810:
https://github.com/apache/flink/pull/810

IMHO, this one shouldn't have been cherry-picked onto the release-0.9
branch. I would like to remove it from there if no objections are raised.

https://github.com/apache/flink/commit/e0e6f59f309170e5217bdfbf5d30db87c947f8ce

On Wed, Jun 10, 2015 at 8:52 AM, Aljoscha Krettek <[hidden email]> wrote:

> This doesn't look good, yes.
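
Regarding the YARN ITCase failure listed as item 1 in Maximilian's summary: Aljoscha described the cause as two hostnames that differed only in capitalisation. The following is a minimal sketch of that class of bug, assuming nothing about the actual change in 2eb5cfe. The class name, the method name and the hostnames are hypothetical.

```java
// Hypothetical helper; not taken from the Flink code base.
final class HostnameCompare {
    // Hostnames are case-insensitive, so compare them without regard to case.
    static boolean sameHost(String a, String b) {
        return a != null && b != null && a.trim().equalsIgnoreCase(b.trim());
    }

    public static void main(String[] args) {
        String fromTest = "Worker-01.Example.org";   // hypothetical value
        String fromConfig = "worker-01.example.org"; // hypothetical value

        // A plain equals() is case-sensitive and reports a mismatch.
        System.out.println(fromTest.equals(fromConfig)); // prints false

        // The case-insensitive comparison treats both as the same host.
        System.out.println(sameHost(fromTest, fromConfig)); // prints true
    }
}
```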
This feature needs to be included in the release; it has been tested and
used extensively, and many applications depend on it.

Maximilian Michels <[hidden email]> wrote (Wed, Jun 10, 2015, 10:47):

> In particular, I'm referring to #810 pull request:
> https://github.com/apache/flink/pull/810
>
> IMHO, this one shouldn't have been cherry-picked onto the release-0.9
> branch. I would like to remove it from there if no objections are raised.
I agree with Gyula regarding the iteration partitioning.
I have also been using this feature for developing machine learning
algorithms, and I think SAMOA also needs it.

Faye

2015-06-10 10:54 GMT+02:00 Gyula Fóra <[hidden email]>:

> This feature needs to be included in the release, it has been tested and
> used extensively. And many applications depend on it.
I'm not against including the feature, but I'd like to discuss it first. I believe that only very carefully selected commits should be added to release-0.9. If the feature has been tested extensively and is very important for user satisfaction, then we might include it.
|
Adding one more thing to the list:
The code contains a misplaced class (mea culpa) in flink-java, org.apache.flink.api.java.SortPartitionOperator, which is API-facing and should be moved to the operators package. If we do that after the release, it will break binary compatibility. I created FLINK-2196 and will open a PR soon. If nobody objects, I'll merge it into the 0.9 release branch as well.
|
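The binary-compatibility concern about moving SortPartitionOperator can be illustrated with a small sketch. Compiled user code refers to a class by its fully qualified name, so a jar built against the old package can no longer find the class once it has moved. The example below is only an illustration under stated assumptions: it uses JDK classes as stand-ins rather than Flink classes, the helper `resolves` is hypothetical, and `java.util.lists.ArrayList` is a made-up name that plays the role of a moved class.

```java
final class BinaryCompatSketch {

    // Hypothetical helper: looks a class up by its fully qualified name,
    // which is also how already-compiled bytecode refers to classes.
    static boolean resolves(String fullyQualifiedName) {
        try {
            Class.forName(fullyQualifiedName);
            return true;
        } catch (ClassNotFoundException e) {
            // The name no longer matches any class on the classpath.
            return false;
        }
    }

    public static void main(String[] args) {
        // Stand-in for the name that user code was compiled against.
        System.out.println(resolves("java.util.ArrayList"));       // prints "true"
        // Stand-in for a lookup after the class changed packages (made-up name).
        System.out.println(resolves("java.util.lists.ArrayList")); // prints "false"
    }
}
```

In real user code the failed lookup would typically surface as a NoClassDefFoundError at run time rather than a boolean, which is why doing such a move before the release avoids breaking existing jars.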
In reply to this post by Gyula Fóra
Hey Gyula, Max,
On 10 Jun 2015, at 10:54, Gyula Fóra <[hidden email]> wrote:

> This feature needs to be included in the release, it has been tested and
> used extensively. And many applications depend on it.

It would be nice to announce/discuss this before just cherry-picking it into the release branch. The issue is that no one (except you) knows that this is important. Let's just make sure to do this for future fixes.

Having said that... it seems to be an important fix. Does someone have time (looking at Aljoscha ;)) to review the changes?

> Maximilian Michels <[hidden email]> wrote (on Wed, Jun 10, 2015, 10:47):
>
>> With all the issues discovered, it looks like we'll have another release
>> candidate. Right now, we have discovered the following problems:
>>
>> 1 YARN ITCase fails [fixed via 2eb5cfe]
>> 2 No Jar for SessionWindowing example [fixed in #809]
>> 3 Wrong description of the input format for the graph examples (eg.
>> ConnectedComponents) [fixed in #809]
>> 4 TaskManagerFailsWithSlotSharingITCase fails
>> 5 ComplexIntegrationTest.complexIntegrationTest1() (FLINK-2192) fails

Can we verify that the tests are defective and not the tested component? ;) Otherwise, I would not block the release on flaky tests.

>> 6 Submitting KMeans example to Web Submission Client does not work on
>> Firefox.
>> 7 Zooming is buggy in Web Submission Client (Firefox)
>> Do we have someone familiar with the web interface who could take a look at
>> the Firefox issues?

If not, I would not block the release on this.
|
As for the streaming commit cherry-picked to the release branch:
This is an unfortunate communication issue; let us make sure that we clearly communicate similar issues in the future.

As for FLINK-2192: This is essentially a duplicate of the issue about the testability of the streaming iteration. It is not a blocker; I will comment on the JIRA ticket. Gabor Hermann is already working on the root cause.
|
The KMeans quickstart example does not work with the current state of the KMeansDataGenerator. I created a PR that brings the two in sync. This should probably go into the release since it affects initial user "satisfaction".
|
I have run "mvn clean verify" five times now and every time I'm getting
these failed tests:

BlobUtilsTest.before:45 null
BlobUtilsTest.before:45 null
BlobServerDeleteTest.testDeleteFails:291 null
BlobLibraryCacheManagerTest.testRegisterAndDownload:196 Could not remove write permissions from cache directory
BlobServerPutTest.testPutBufferFails:224 null
BlobServerPutTest.testPutNamedBufferFails:286 null
JobManagerStartupTest.before:55 null
JobManagerStartupTest.before:55 null
DataSinkTaskTest.testFailingDataSinkTask:317 Temp output file has not been removed
DataSinkTaskTest.testFailingSortingDataSinkTask:358 Temp output file has not been removed
TaskManagerTest.testSubmitAndExecuteTask**:123 assertion failed: timeout (19998080696 nanoseconds) during expectMsgClass waiting for class org.apache.flink.runtime.messages.RegistrationMessages$RegisterTaskManager
TaskManagerProcessReapingTest.testReapProcessOnFailure:133 TaskManager process did not launch the TaskManager properly. Failed to look up akka.tcp://flink@127.0.0.1:50673/user/taskmanager

** fails randomly.

Is someone able to reproduce these while building on a Windows machine? I
would try to debug these myself but I'm not yet familiar with the core
architecture and API.

-- Sachin
Regarding the iteration partitioning feature: since I use it, of course I
find it very useful, but it is true that it needs to be tested more
extensively and also be discussed by the community before it is added to a
release. Moreover, given that I can still use it for research purposes (I
had already cherry-picked it before it was merged into the master branch),
there is no actual reason to put it in the next release; leaving it out
gives the community more time to discuss and decide on the feature. Lastly,
I cross-checked the SAMOA application, and so far there is no algorithm
implemented in the SAMOA API that needs the new feature.

Faye.
@Sachin: This looks like a file permission issue. We should have someone
else verify that on a Windows system.
@Sachin: I reproduced the build error on my Windows machine.
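For anyone narrowing down the "Could not remove write permissions from cache directory" failures reported above, here is a minimal sketch of a probe that checks whether the platform actually enforces a read-only directory. The class `ReadOnlyDirProbe`, its method, and the file name `probe.tmp` are hypothetical and not part of Flink. The sketch also assumes that the failing tests depend on `File.setWritable(false, ...)` taking effect on a directory, which this thread does not confirm.

```java
import java.io.File;
import java.io.IOException;

/** Hypothetical helper (not part of Flink): probes whether a directory can really be made read-only. */
final class ReadOnlyDirProbe {

    private ReadOnlyDirProbe() {}

    /**
     * Tries to remove write permission from {@code dir} and then create a file in it.
     * Returns true only if the permission change was accepted AND file creation failed.
     * The write permission is restored before returning.
     */
    static boolean canMakeReadOnly(File dir) {
        try {
            // setWritable reports false if the permission could not be changed.
            if (!dir.setWritable(false, false)) {
                return false;
            }
            File probe = new File(dir, "probe.tmp");
            // If creation succeeds (or the file already exists), the directory
            // is not effectively read-only for this process.
            if (probe.createNewFile()) {
                probe.delete();
            }
            return false;
        } catch (IOException e) {
            // Creation was rejected, so the read-only state is enforced.
            return true;
        } finally {
            // Always restore write access so callers can clean up the directory.
            dir.setWritable(true, false);
        }
    }
}
```

A test that needs a read-only directory could call such a probe first and skip itself (for example with JUnit's `Assume.assumeTrue`) when it returns false, instead of failing on platforms or accounts where the permission change has no effect.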
I added a section at the top of the release testing document to keep
track of commits that we might want to cherry-pick to the release. I
included the YARNSessionFIFOITCase fix and the optional stream
iteration partitioning (both already on release branch).
I'm debugging the TaskManagerFailsWithSlotSharingITCase. I've located its
cause but still need to find out how to fix it.
On 10 Jun 2015, at 16:18, Maximilian Michels <[hidden email]> wrote:

> I'm debugging the TaskManagerFailsWithSlotSharingITCase. I've located its
> cause but still need to find out how to fix it.

Very good find, Max!

Max, Till, and I have looked into this and it is a reproducible deadlock
in the scheduler during concurrent slot release (in failure cases). Max
will attach the relevant stack trace to the issue.

I think this is a release blocker. Any opinions?

– Ufuk
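The thread does not describe the root cause beyond "deadlock in the scheduler during concurrent slot release". Purely as a generic illustration, the sketch below assumes a classic lock-ordering problem: two release paths that each need two monitors. All names (`SlotReleaseSketch`, `schedulerLock`, `groupLock`, both release methods) are hypothetical and do not reflect Flink's scheduler code or the fix that was actually applied.

```java
/** Hypothetical sketch (not Flink code): two release paths that share one global lock order. */
final class SlotReleaseSketch {

    private final Object schedulerLock = new Object();
    private final Object groupLock = new Object();
    private int freeSlots;

    // A deadlock-prone variant would take groupLock first here and
    // schedulerLock first in the other path. Taking both locks in the
    // same order on every path rules that interleaving out.
    void releaseAfterTaskFailure() {
        synchronized (schedulerLock) {
            synchronized (groupLock) {
                freeSlots++;
            }
        }
    }

    void releaseAfterInstanceFailure() {
        synchronized (schedulerLock) {
            synchronized (groupLock) {
                freeSlots++;
            }
        }
    }

    int freeSlots() {
        synchronized (schedulerLock) {
            synchronized (groupLock) {
                return freeSlots;
            }
        }
    }

    /** Runs both release paths concurrently and returns the number of released slots. */
    static int runConcurrently(final int iterations) {
        final SlotReleaseSketch sketch = new SlotReleaseSketch();
        Thread a = new Thread(() -> {
            for (int i = 0; i < iterations; i++) {
                sketch.releaseAfterTaskFailure();
            }
        });
        Thread b = new Thread(() -> {
            for (int i = 0; i < iterations; i++) {
                sketch.releaseAfterInstanceFailure();
            }
        });
        a.start();
        b.start();
        try {
            // Bounded joins so a regression shows up as an error, not a hang.
            a.join(10_000);
            b.join(10_000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        if (a.isAlive() || b.isAlive()) {
            throw new IllegalStateException("release threads did not finish");
        }
        return sketch.freeSlots();
    }
}
```

Whether the real issue was a lock-order inversion or something else would have to be read from the stack trace attached to FLINK-2183. The bounded `join` is a design choice for tests: a hang becomes a visible failure rather than a stuck build.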
Yes, since it is clearly a deadlock in the scheduler, the current version
shouldn't be released.
Yes, that needs to be fixed IMO
The deadlock in the scheduler is now fixed. Based on the changes that have
been pushed to the release-0.9 branch, I'd like to create a new release
candidate later on. I think we have gotten the most critical issues out of
the way. Would that be ok for you?
Aren't there still some commits at the top of the release document that
need to be cherry-picked to the release branch?