Testing Apache Flink 0.9.0-rc1


Re: Testing Apache Flink 0.9.0-rc1

mxm
Yes, we would include those in the new release candidate.
On Jun 11, 2015 5:22 PM, "Aljoscha Krettek" <[hidden email]> wrote:

> Aren't there still some commits at the top of the release document that
> need to be cherry-picked to the release branch?
>
> On Thu, 11 Jun 2015 at 17:13 Maximilian Michels <[hidden email]> wrote:
>
> > The deadlock in the scheduler is now fixed. Based on the changes that have
> > been pushed to the release-0.9 branch, I'd like to create a new release
> > candidate later on. I think we have gotten the most critical issues out of
> > the way. Would that be ok for you?
> >
> > On Wed, Jun 10, 2015 at 5:56 PM, Fabian Hueske <[hidden email]> wrote:
> >
> > > Yes, that needs to be fixed IMO
> > >
> > > 2015-06-10 17:51 GMT+02:00 Till Rohrmann <[hidden email]>:
> > >
> > > > Yes since it is clearly a deadlock in the scheduler, the current version
> > > > shouldn't be released.
> > > >
> > > > On Wed, Jun 10, 2015 at 5:48 PM Ufuk Celebi <[hidden email]> wrote:
> > > >
> > > > >
> > > > > On 10 Jun 2015, at 16:18, Maximilian Michels <[hidden email]> wrote:
> > > > >
> > > > > > I'm debugging the TaskManagerFailsWithSlotSharingITCase. I've located
> > > > > > its cause but still need to find out how to fix it.
> > > > >
> > > > > Very good find, Max!
> > > > >
> > > > > Max, Till, and I have looked into this and it is a reproducible deadlock
> > > > > in the scheduler during concurrent slot release (in failure cases). Max
> > > > > will attach the relevant stack trace to the issue.
> > > > >
> > > > > I think this is a release blocker. Any opinions?
> > > > >
> > > > > – Ufuk
> > > >
> > >
> >
>

Re: Testing Apache Flink 0.9.0-rc1

Fabian Hueske-2
In reply to this post by mxm
How about the following issues?

1. The HBase Hadoop compat issue that Ufuk is working on
2. The incorrect web interface counts

@Ufuk were you able to reproduce the bug?
The deadlock in the scheduler is now fixed. Based on the changes that have
been pushed to the release-0.9 branch, I'd like to create a new release
candidate later on. I think we have gotten the most critical issues out of
the way. Would that be ok for you?

On Wed, Jun 10, 2015 at 5:56 PM, Fabian Hueske <[hidden email]> wrote:

> Yes, that needs to be fixed IMO
>
> 2015-06-10 17:51 GMT+02:00 Till Rohrmann <[hidden email]>:
>
> > Yes since it is clearly a deadlock in the scheduler, the current version
> > shouldn't be released.
> >
> > On Wed, Jun 10, 2015 at 5:48 PM Ufuk Celebi <[hidden email]> wrote:
> >
> > >
> > > On 10 Jun 2015, at 16:18, Maximilian Michels <[hidden email]> wrote:
> > >
> > > > I'm debugging the TaskManagerFailsWithSlotSharingITCase. I've located
> > > > its cause but still need to find out how to fix it.
> > >
> > > Very good find, Max!
> > >
> > > Max, Till, and I have looked into this and it is a reproducible deadlock
> > > in the scheduler during concurrent slot release (in failure cases). Max
> > > will attach the relevant stack trace to the issue.
> > >
> > > I think this is a release blocker. Any opinions?
> > >
> > > – Ufuk
> >
>

Re: Testing Apache Flink 0.9.0-rc1

Ufuk Celebi-2

On 11 Jun 2015, at 20:04, Fabian Hueske <[hidden email]> wrote:

> How about the following issues?
>
> 1. The Hbase Hadoop Compat issue, Ufuk is working on

I was not able to reproduce this :( I ran HadoopInputFormats against various sources and confirmed the results and everything was fine so far.

I think I will open a PR for the small HadoopInputFormat fix and then we can start a new RC.

If we manage to reproduce the problem over the next days (I'm in contact with the user who reported the original issue [thanks Himi!]), we can still include a fix.

> 2. The incorrect webinterface counts

Personally, I don't think 2) is a blocker if it takes much more time. I still think it would be good to have it in.

Re: Testing Apache Flink 0.9.0-rc1

Fabian Hueske-2
2. is basically done. I have a patch which updates the counters on page
reload but that shouldn't be hard to extend to dynamic updates.

2015-06-12 0:40 GMT+02:00 Ufuk Celebi <[hidden email]>:

>
> On 11 Jun 2015, at 20:04, Fabian Hueske <[hidden email]> wrote:
>
> > How about the following issues?
> >
> > 1. The Hbase Hadoop Compat issue, Ufuk is working on
>
> I was not able to reproduce this :( I ran HadoopInputFormats against
> various sources and confirmed the results and everything was fine so far.
>
> I think I will open a PR for the small HadoopInputFormat fix and then we
> can start a new RC.
>
> If we manage to reproduce the problem over the next days (I'm in contact
> with the user who reported the original issue [thanks Himi!]), we can still
> include a fix.
>
> > 2. The incorrect webinterface counts
>
> Personally, I don't think 2) is a blocker if it takes much more time. I
> still think it would be good to have it in.

Re: Testing Apache Flink 0.9.0-rc1

Ufuk Celebi-2

On 12 Jun 2015, at 00:49, Fabian Hueske <[hidden email]> wrote:

> 2. is basically done. I have a patch which updates the counters on page
> reload but that shouldn't be hard to extend to dynamic updates.

Very nice! :-) Thanks!

Re: Testing Apache Flink 0.9.0-rc1

Till Rohrmann
I'm currently going through the license file and I discovered some
skeletons in our closet. This has to be merged as well. But I'm still
working on it (we have a lot of dependencies).

Cheers,
Till

On Fri, Jun 12, 2015 at 12:51 AM Ufuk Celebi <[hidden email]> wrote:

>
> On 12 Jun 2015, at 00:49, Fabian Hueske <[hidden email]> wrote:
>
> > 2. is basically done. I have a patch which updates the counters on page
> > reload but that shouldn't be hard to extend to dynamic updates.
>
> Very nice! :-) Thanks!
>

Re: Testing Apache Flink 0.9.0-rc1

Till Rohrmann
Hi guys,

I just noticed while testing the TableAPI on the cluster that it is not
part of the dist module. Therefore, programs using the TableAPI will only
run when you put the TableAPI jar directly on the cluster or if you build a
fat jar including the TableAPI jar. This is not documented anywhere.
Furthermore, this also applies to Gelly and FlinkML.

Cheers,
Till

On Fri, Jun 12, 2015 at 9:16 AM Till Rohrmann <[hidden email]> wrote:

> I'm currently going through the license file and I discovered some
> skeletons in our closet. This has to be merged as well. But I'm still
> working on it (we have a lot of dependencies).
>
> Cheers,
> Till
>
>
> On Fri, Jun 12, 2015 at 12:51 AM Ufuk Celebi <[hidden email]> wrote:
>
>>
>> On 12 Jun 2015, at 00:49, Fabian Hueske <[hidden email]> wrote:
>>
>> > 2. is basically done. I have a patch which updates the counters on page
>> > reload but that shouldn't be hard to extend to dynamic updates.
>>
>> Very nice! :-) Thanks!
>>
>

Re: Testing Apache Flink 0.9.0-rc1

Márton Balassi
@Till: This also applies to the streaming connectors.

On Fri, Jun 12, 2015 at 9:45 AM, Till Rohrmann <[hidden email]> wrote:

> Hi guys,
>
> I just noticed while testing the TableAPI on the cluster that it is not
> part of the dist module. Therefore, programs using the TableAPI will only
> run when you put the TableAPI jar directly on the cluster or if you build a
> fat jar including the TableAPI jar. This is nowhere documented.
> Furthermore, this also applies to Gelly and FlinkML.
>
> Cheers,
> Till
>
> On Fri, Jun 12, 2015 at 9:16 AM Till Rohrmann <[hidden email]>
> wrote:
>
> > I'm currently going through the license file and I discovered some
> > skeletons in our closet. This has to be merged as well. But I'm still
> > working on it (we have a lot of dependencies).
> >
> > Cheers,
> > Till
> >
> >
> > On Fri, Jun 12, 2015 at 12:51 AM Ufuk Celebi <[hidden email]> wrote:
> >
> >>
> >> On 12 Jun 2015, at 00:49, Fabian Hueske <[hidden email]> wrote:
> >>
> >> > 2. is basically done. I have a patch which updates the counters on page
> >> > reload but that shouldn't be hard to extend to dynamic updates.
> >>
> >> Very nice! :-) Thanks!
> >>
> >
>

Re: Testing Apache Flink 0.9.0-rc1

Ufuk Celebi-2
In reply to this post by Till Rohrmann

On 12 Jun 2015, at 09:45, Till Rohrmann <[hidden email]> wrote:

> Hi guys,
>
> I just noticed while testing the TableAPI on the cluster that it is not
> part of the dist module. Therefore, programs using the TableAPI will only
> run when you put the TableAPI jar directly on the cluster or if you build a
> fat jar including the TableAPI jar. This is nowhere documented.
> Furthermore, this also applies to Gelly and FlinkML.

I think all of these should be included in the fat jar. They are all highly advertised components.

Very good catch, Till! I didn't get around to testing the Table API on a cluster yet.

Re: Testing Apache Flink 0.9.0-rc1

mxm
We should have a nightly cluster test for every library. Let's keep that in
mind for the future. Very nice find, Till!

Since there were no objections, I cherry-picked the proposed commits from
the document to the release-0.9 branch. If I understand correctly, we can
create the new release candidate once Till has checked the licenses, Ufuk's
TableInput fix has been merged, and Fabian's web interface improvements are
in. Plus, we need to include all Flink libraries in flink-dist. Are you
going to fix that as well, Till?

On Fri, Jun 12, 2015 at 9:53 AM, Ufuk Celebi <[hidden email]> wrote:

>
> On 12 Jun 2015, at 09:45, Till Rohrmann <[hidden email]> wrote:
>
> > Hi guys,
> >
> > I just noticed while testing the TableAPI on the cluster that it is not
> > part of the dist module. Therefore, programs using the TableAPI will only
> > run when you put the TableAPI jar directly on the cluster or if you build a
> > fat jar including the TableAPI jar. This is nowhere documented.
> > Furthermore, this also applies to Gelly and FlinkML.
>
> I think all of these should be included in the fat jar. They are all
> highly advertized components.
>
> Very good catch, Till! I didn't get around to testing Table API on a
> cluster, yet.

Re: Testing Apache Flink 0.9.0-rc1

Fabian Hueske-2
I have another fix, but this is just a documentation update (FLINK-2207)
and will be done soon.

2015-06-12 10:02 GMT+02:00 Maximilian Michels <[hidden email]>:

> We should have a nightly cluster test for every library. Let's keep that in
> mind for the future. Very nice find, Till!
>
> Since there were not objections, I cherry-picked the proposed commits from
> the document to the release-0.9 branch. If I understand correctly, we can
> create the new release candidate once Till has checked the licenses, Ufuk's
> TableInput fix has been merged, and Fabian's web interface improvement are
> in. Plus, we need to include all Flink libraries in flink-dist. Are you
> going to fix that as well, Till?
>
> On Fri, Jun 12, 2015 at 9:53 AM, Ufuk Celebi <[hidden email]> wrote:
>
> >
> > On 12 Jun 2015, at 09:45, Till Rohrmann <[hidden email]> wrote:
> >
> > > Hi guys,
> > >
> > > I just noticed while testing the TableAPI on the cluster that it is not
> > > part of the dist module. Therefore, programs using the TableAPI will only
> > > run when you put the TableAPI jar directly on the cluster or if you build a
> > > fat jar including the TableAPI jar. This is nowhere documented.
> > > Furthermore, this also applies to Gelly and FlinkML.
> >
> > I think all of these should be included in the fat jar. They are all
> > highly advertized components.
> >
> > Very good catch, Till! I didn't get around to testing Table API on a
> > cluster, yet.
>

Re: Testing Apache Flink 0.9.0-rc1

Márton Balassi
In reply to this post by mxm
As for outstanding issues, streaming is good to go as far as I know.
I am personally against including all libraries, at least speaking for the
streaming connectors. Robert, Stephan, and I had a detailed discussion
on that some time ago, and the disadvantage of having all the libraries in
the distribution is the dependency mess that they pull in. In this case I
would rather add documentation on putting them in the user jar. As for
the other libraries, they do not depend on so much external code, so +1 for
putting them in.

On Fri, Jun 12, 2015 at 10:02 AM, Maximilian Michels <[hidden email]> wrote:

> We should have a nightly cluster test for every library. Let's keep that in
> mind for the future. Very nice find, Till!
>
> Since there were not objections, I cherry-picked the proposed commits from
> the document to the release-0.9 branch. If I understand correctly, we can
> create the new release candidate once Till has checked the licenses, Ufuk's
> TableInput fix has been merged, and Fabian's web interface improvement are
> in. Plus, we need to include all Flink libraries in flink-dist. Are you
> going to fix that as well, Till?
>
> On Fri, Jun 12, 2015 at 9:53 AM, Ufuk Celebi <[hidden email]> wrote:
>
> >
> > On 12 Jun 2015, at 09:45, Till Rohrmann <[hidden email]> wrote:
> >
> > > Hi guys,
> > >
> > > I just noticed while testing the TableAPI on the cluster that it is not
> > > part of the dist module. Therefore, programs using the TableAPI will only
> > > run when you put the TableAPI jar directly on the cluster or if you build a
> > > fat jar including the TableAPI jar. This is nowhere documented.
> > > Furthermore, this also applies to Gelly and FlinkML.
> >
> > I think all of these should be included in the fat jar. They are all
> > highly advertized components.
> >
> > Very good catch, Till! I didn't get around to testing Table API on a
> > cluster, yet.
>

Re: Testing Apache Flink 0.9.0-rc1

Till Rohrmann
Well, I think the initial idea was to keep the dist jar as small as possible
and therefore we did not include the libraries. I'm not sure whether we can
decide this here ad-hoc. If the community says that we shall include these
libraries then I can add them. But bear in mind that all of them have some
transitive dependencies which will be added as well.

On Fri, Jun 12, 2015 at 10:15 AM Márton Balassi <[hidden email]>
wrote:

> As for outstanding issues I think streaming is good to go as far as I know.
> I am personally against including all libraries - at least speaking for the
> streaming connectors. Robert, Stephan and myself had a detailed discussion
> on that some time ago and the disadvantage of having all the libraries in
> the distribution is the dependency mess that they pull. In this case I
> would rather add documentation on putting them in the user jar then. As for
> the other libraries they do not depend on so much external code, so +1 for
> putting them in.
>
> On Fri, Jun 12, 2015 at 10:02 AM, Maximilian Michels <[hidden email]>
> wrote:
>
> > We should have a nightly cluster test for every library. Let's keep that in
> > mind for the future. Very nice find, Till!
> >
> > Since there were not objections, I cherry-picked the proposed commits from
> > the document to the release-0.9 branch. If I understand correctly, we can
> > create the new release candidate once Till has checked the licenses, Ufuk's
> > TableInput fix has been merged, and Fabian's web interface improvement are
> > in. Plus, we need to include all Flink libraries in flink-dist. Are you
> > going to fix that as well, Till?
> >
> > On Fri, Jun 12, 2015 at 9:53 AM, Ufuk Celebi <[hidden email]> wrote:
> >
> > >
> > > On 12 Jun 2015, at 09:45, Till Rohrmann <[hidden email]> wrote:
> > >
> > > > Hi guys,
> > > >
> > > > I just noticed while testing the TableAPI on the cluster that it is not
> > > > part of the dist module. Therefore, programs using the TableAPI will only
> > > > run when you put the TableAPI jar directly on the cluster or if you build a
> > > > fat jar including the TableAPI jar. This is nowhere documented.
> > > > Furthermore, this also applies to Gelly and FlinkML.
> > >
> > > I think all of these should be included in the fat jar. They are all
> > > highly advertized components.
> > >
> > > Very good catch, Till! I didn't get around to testing Table API on a
> > > cluster, yet.
> >
>

Re: Testing Apache Flink 0.9.0-rc1

Ufuk Celebi-2

On 12 Jun 2015, at 10:29, Till Rohrmann <[hidden email]> wrote:

> Well I think the initial idea was to keep the dist jar as small a possible
> and therefore we did not include the libraries. I'm not sure whether we can
> decide this here ad-hoc. If the community says that we shall include these
> libraries then I can add them. But bear in mind that all of them have some
> transitive dependencies which will be added as well.

I'm against including the connectors as well, but not having Table API, Flink ML, and Gelly in seems odd to me.

Or maybe I'm missing something. Someone who wants to try this out has to place the dependencies manually into the lib folder of the Flink installation, right?

Re: Testing Apache Flink 0.9.0-rc1

Till Rohrmann
In reply to this post by Till Rohrmann
I think I found a real release blocker. Currently we don't add license
files to our shaded jars. For example,
the flink-shaded-include-yarn-0.9.0-milestone-1.jar shades Hadoop code.
This code also includes the `org.apache.hadoop.util.bloom.*` classes. These
classes carry the license of the European Commission project OneLab. We have
a notice in the LICENSE file of our binary distribution, but I think we also
have to add it to the shaded jar. There might even be more code bundled
as part of some shaded jars which I have not spotted yet.

Furthermore, I noticed that we list all Apache License dependencies in our
LICENSE file of our binary distribution (which we don't have to do).
However, we don't do it in our jars which contain for example guava and asm
as shaded dependencies. Maybe we should be consistent here.

But maybe I'm overlooking something here and we don't have to do it.
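
A check like this can be scripted. The following is a minimal sketch, not part of the Flink build: the function name `inspect_shaded_jar` is hypothetical, and the default package prefix `org/apache/hadoop/util/bloom/` is an assumption about where Hadoop keeps the bloom filter classes inside the jar. It lists license-related entries under META-INF and any classes found under the given prefix.

```python
import zipfile


def inspect_shaded_jar(jar_path, package_prefix="org/apache/hadoop/util/bloom/"):
    """Return (license_entries, bundled_classes) for a jar file.

    license_entries: entries under META-INF/ whose name contains LICENSE or NOTICE.
    bundled_classes: .class entries located under package_prefix (an assumed path).
    """
    # A jar is a zip archive, so the standard zipfile module can list its entries.
    with zipfile.ZipFile(jar_path) as jar:
        names = jar.namelist()

    license_entries = [
        name for name in names
        if name.upper().startswith("META-INF/")
        and ("LICENSE" in name.upper() or "NOTICE" in name.upper())
    ]
    bundled_classes = [
        name for name in names
        if name.startswith(package_prefix) and name.endswith(".class")
    ]
    return license_entries, bundled_classes
```

If `bundled_classes` is non-empty while `license_entries` is empty, the jar bundles that code without shipping any license or notice file of its own, which is the situation described above.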

On Fri, Jun 12, 2015 at 10:29 AM Till Rohrmann <[hidden email]> wrote:

> Well I think the initial idea was to keep the dist jar as small a possible
> and therefore we did not include the libraries. I'm not sure whether we can
> decide this here ad-hoc. If the community says that we shall include these
> libraries then I can add them. But bear in mind that all of them have some
> transitive dependencies which will be added as well.
>
>
> On Fri, Jun 12, 2015 at 10:15 AM Márton Balassi <[hidden email]>
> wrote:
>
>> As for outstanding issues I think streaming is good to go as far as I
>> know.
>> I am personally against including all libraries - at least speaking for
>> the
>> streaming connectors. Robert, Stephan and myself had a detailed discussion
>> on that some time ago and the disadvantage of having all the libraries in
>> the distribution is the dependency mess that they pull. In this case I
>> would rather add documentation on putting them in the user jar then. As
>> for
>> the other libraries they do not depend on so much external code, so +1 for
>> putting them in.
>>
>> On Fri, Jun 12, 2015 at 10:02 AM, Maximilian Michels <[hidden email]>
>> wrote:
>>
>> > We should have a nightly cluster test for every library. Let's keep
>> that in
>> > mind for the future. Very nice find, Till!
>> >
>> > Since there were not objections, I cherry-picked the proposed commits
>> from
>> > the document to the release-0.9 branch. If I understand correctly, we
>> can
>> > create the new release candidate once Till has checked the licenses,
>> Ufuk's
>> > TableInput fix has been merged, and Fabian's web interface improvement
>> are
>> > in. Plus, we need to include all Flink libraries in flink-dist. Are you
>> > going to fix that as well, Till?
>> >
>> > On Fri, Jun 12, 2015 at 9:53 AM, Ufuk Celebi <[hidden email]> wrote:
>> >
>> > >
>> > > On 12 Jun 2015, at 09:45, Till Rohrmann <[hidden email]> wrote:
>> > >
>> > > > Hi guys,
>> > > >
>> > > > I just noticed while testing the TableAPI on the cluster that it is
>> not
>> > > > part of the dist module. Therefore, programs using the TableAPI will
>> > only
>> > > > run when you put the TableAPI jar directly on the cluster or if you
>> > > build a
>> > > > fat jar including the TableAPI jar. This is nowhere documented.
>> > > > Furthermore, this also applies to Gelly and FlinkML.
>> > >
>> > > I think all of these should be included in the fat jar. They are all
>> > > highly advertized components.
>> > >
>> > > Very good catch, Till! I didn't get around to testing Table API on a
>> > > cluster, yet.
>> >
>>
>

Re: Testing Apache Flink 0.9.0-rc1

Till Rohrmann
Yes, you're right, Ufuk. At the moment the user has to place the jars in the
lib folder of Flink. If this folder is not shared, then this has to be done on
every node on which Flink runs.

On Fri, Jun 12, 2015 at 10:42 AM Till Rohrmann <[hidden email]> wrote:

> I think I found a real release blocker. Currently we don't add license
> files to our shaded jars. For example
> the flink-shaded-include-yarn-0.9.0-milestone-1.jar shades hadoop code.
> This code also includes the `org.apache.util.bloom.*` classes. These
> classes are licensed under  The European Commission project OneLab. We have
> a notice in the LICENSE file of our binary distribution but I think we also
> have to add them in the shaded jar. There might even be more code bundled
> as part of some shaded jars which I have not spotted yet.
>
> Furthermore, I noticed that we list all Apache License dependencies in our
> LICENSE file of our binary distribution (which we don't have to do).
> However, we don't do it in our jars which contain for example guava and asm
> as shaded dependencies. Maybe we should be consistent here.
>
> But maybe I overlook something here and we don't have to do it.
>
> On Fri, Jun 12, 2015 at 10:29 AM Till Rohrmann <[hidden email]>
> wrote:
>
>> Well I think the initial idea was to keep the dist jar as small a
>> possible and therefore we did not include the libraries. I'm not sure
>> whether we can decide this here ad-hoc. If the community says that we shall
>> include these libraries then I can add them. But bear in mind that all of
>> them have some transitive dependencies which will be added as well.
>>
>>
>> On Fri, Jun 12, 2015 at 10:15 AM Márton Balassi <[hidden email]>
>> wrote:
>>
>>> As for outstanding issues I think streaming is good to go as far as I
>>> know.
>>> I am personally against including all libraries - at least speaking for
>>> the
>>> streaming connectors. Robert, Stephan and myself had a detailed
>>> discussion
>>> on that some time ago and the disadvantage of having all the libraries in
>>> the distribution is the dependency mess that they pull. In this case I
>>> would rather add documentation on putting them in the user jar then. As
>>> for
>>> the other libraries they do not depend on so much external code, so +1
>>> for
>>> putting them in.
>>>
>>> On Fri, Jun 12, 2015 at 10:02 AM, Maximilian Michels <[hidden email]>
>>> wrote:
>>>
>>> > We should have a nightly cluster test for every library. Let's keep
>>> that in
>>> > mind for the future. Very nice find, Till!
>>> >
>>> > Since there were not objections, I cherry-picked the proposed commits
>>> from
>>> > the document to the release-0.9 branch. If I understand correctly, we
>>> can
>>> > create the new release candidate once Till has checked the licenses,
>>> Ufuk's
>>> > TableInput fix has been merged, and Fabian's web interface improvement
>>> are
>>> > in. Plus, we need to include all Flink libraries in flink-dist. Are you
>>> > going to fix that as well, Till?
>>> >
>>> > On Fri, Jun 12, 2015 at 9:53 AM, Ufuk Celebi <[hidden email]> wrote:
>>> >
>>> > >
>>> > > On 12 Jun 2015, at 09:45, Till Rohrmann <[hidden email]>
>>> wrote:
>>> > >
>>> > > > Hi guys,
>>> > > >
>>> > > > I just noticed while testing the TableAPI on the cluster that it
>>> is not
>>> > > > part of the dist module. Therefore, programs using the TableAPI
>>> will
>>> > only
>>> > > > run when you put the TableAPI jar directly on the cluster or if you
>>> > > build a
>>> > > > fat jar including the TableAPI jar. This is nowhere documented.
>>> > > > Furthermore, this also applies to Gelly and FlinkML.
>>> > >
>>> > > I think all of these should be included in the fat jar. They are all
>>> > > highly advertized components.
>>> > >
>>> > > Very good catch, Till! I didn't get around to testing Table API on a
>>> > > cluster, yet.
>>> >
>>>
>>

Re: Testing Apache Flink 0.9.0-rc1

Ufuk Celebi-2

On 12 Jun 2015, at 10:44, Till Rohrmann <[hidden email]> wrote:

> Yes you're right Ufuk. At the moment the user has to place the jars in the
> lib folder of Flink. If this folder is not shared then he has to do it for
> every node on which Flink runs.

OK. I guess there is a nice way to do this with YARN as well. I think it degrades the out-of-the-box experience quite a bit if you want to use these nice features.

What's your stand on this issue?

Re: Testing Apache Flink 0.9.0-rc1

mxm
In reply to this post by Till Rohrmann
Just to clarify: if you write a Flink program and include the Table API as
a dependency, then you have to package your program in a JAR together with the
Table API and submit it to the cluster. IMHO that's ok, but it should be
documented so that users know which libraries are included in the Flink binaries
out of the box. Let's document this and postpone the discussion of whether Flink
should ship all libraries (and all their dependencies, as Till pointed out)
to the future. I think this is too big of a change for a release candidate.
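
A fat jar of this kind is commonly built with the Maven Shade plugin. The fragment below is only a sketch of such a pom.xml setup: the `flink-table` artifact ID and the `0.9.0` version are assumptions for illustration and should be checked against the actual release.

```xml
<!-- Sketch only: the artifactId and version of the Table API dependency are assumptions. -->
<dependencies>
  <dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-table</artifactId>
    <version>0.9.0</version>
  </dependency>
</dependencies>

<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-shade-plugin</artifactId>
      <executions>
        <execution>
          <!-- Bundle the program and its dependencies into one jar at package time. -->
          <phase>package</phase>
          <goals>
            <goal>shade</goal>
          </goals>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
```

Dependencies that the Flink binaries already provide would normally be kept out of the fat jar, for example by giving them `provided` scope.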

On Fri, Jun 12, 2015 at 10:44 AM, Till Rohrmann <[hidden email]>
wrote:

> Yes you're right Ufuk. At the moment the user has to place the jars in the
> lib folder of Flink. If this folder is not shared then he has to do it for
> every node on which Flink runs.
>
> On Fri, Jun 12, 2015 at 10:42 AM Till Rohrmann <[hidden email]>
> wrote:
>
> > I think I found a real release blocker. Currently we don't add license
> > files to our shaded jars. For example
> > the flink-shaded-include-yarn-0.9.0-milestone-1.jar shades hadoop code.
> > This code also includes the `org.apache.util.bloom.*` classes. These
> > classes are licensed under  The European Commission project OneLab. We
> have
> > a notice in the LICENSE file of our binary distribution but I think we
> also
> > have to add them in the shaded jar. There might even be more code bundled
> > as part of some shaded jars which I have not spotted yet.
> >
> > Furthermore, I noticed that we list all Apache License dependencies in
> our
> > LICENSE file of our binary distribution (which we don't have to do).
> > However, we don't do it in our jars which contain for example guava and
> asm
> > as shaded dependencies. Maybe we should be consistent here.
> >
> > But maybe I overlook something here and we don't have to do it.
> >
> > On Fri, Jun 12, 2015 at 10:29 AM Till Rohrmann <[hidden email]>
> > wrote:
> >
> >> Well I think the initial idea was to keep the dist jar as small a
> >> possible and therefore we did not include the libraries. I'm not sure
> >> whether we can decide this here ad-hoc. If the community says that we
> shall
> >> include these libraries then I can add them. But bear in mind that all
> of
> >> them have some transitive dependencies which will be added as well.
> >>
> >>
> >> On Fri, Jun 12, 2015 at 10:15 AM Márton Balassi <
> [hidden email]>
> >> wrote:
> >>
> >>> As for outstanding issues I think streaming is good to go as far as I
> >>> know.
> >>> I am personally against including all libraries - at least speaking for
> >>> the
> >>> streaming connectors. Robert, Stephan and myself had a detailed
> >>> discussion
> >>> on that some time ago and the disadvantage of having all the libraries
> in
> >>> the distribution is the dependency mess that they pull. In this case I
> >>> would rather add documentation on putting them in the user jar then. As
> >>> for
> >>> the other libraries they do not depend on so much external code, so +1
> >>> for
> >>> putting them in.
> >>>
> >>> On Fri, Jun 12, 2015 at 10:02 AM, Maximilian Michels <[hidden email]>
> >>> wrote:
> >>>
> >>> > We should have a nightly cluster test for every library. Let's keep
> >>> that in
> >>> > mind for the future. Very nice find, Till!
> >>> >
> >>> > Since there were not objections, I cherry-picked the proposed commits
> >>> from
> >>> > the document to the release-0.9 branch. If I understand correctly, we
> >>> can
> >>> > create the new release candidate once Till has checked the licenses,
> >>> Ufuk's
> >>> > TableInput fix has been merged, and Fabian's web interface
> improvement
> >>> are
> >>> > in. Plus, we need to include all Flink libraries in flink-dist. Are
> you
> >>> > going to fix that as well, Till?
> >>> >
> >>> > On Fri, Jun 12, 2015 at 9:53 AM, Ufuk Celebi <[hidden email]> wrote:
> >>> >
> >>> > >
> >>> > > On 12 Jun 2015, at 09:45, Till Rohrmann <[hidden email]>
> >>> wrote:
> >>> > >
> >>> > > > Hi guys,
> >>> > > >
> >>> > > > I just noticed while testing the TableAPI on the cluster that it
> >>> is not
> >>> > > > part of the dist module. Therefore, programs using the TableAPI
> >>> will
> >>> > only
> >>> > > > run when you put the TableAPI jar directly on the cluster or if
> you
> >>> > > build a
> >>> > > > fat jar including the TableAPI jar. This is nowhere documented.
> >>> > > > Furthermore, this also applies to Gelly and FlinkML.
> >>> > >
> >>> > > I think all of these should be included in the fat jar. They are
> all
> >>> > > highly advertized components.
> >>> > >
> >>> > > Very good catch, Till! I didn't get around to testing Table API on
> a
> >>> > > cluster, yet.
> >>> >
> >>>
> >>
>

Re: Testing Apache Flink 0.9.0-rc1

Ufuk Celebi-2
In reply to this post by Aljoscha Krettek-2
After thinking about it a bit more, I think that's fine.

+1 to document and keep it as it is.

Re: Testing Apache Flink 0.9.0-rc1

Ufuk Celebi-2
In reply to this post by Ufuk Celebi-2

On 12 Jun 2015, at 00:40, Ufuk Celebi <[hidden email]> wrote:

>
> On 11 Jun 2015, at 20:04, Fabian Hueske <[hidden email]> wrote:
>
>> How about the following issues?
>>
>> 1. The Hbase Hadoop Compat issue, Ufuk is working on
>
> I was not able to reproduce this :( I ran HadoopInputFormats against various sources and confirmed the results and everything was fine so far.

The issue has been resolved as "Not a problem". There was some misconfiguration in the user code.