Where to host user-contributed Flink programs?

Robert Metzger
Hi,

our current repository contains a set of examples that show different
aspects of the system.
We recently got a pull request to add a graph centrality/closeness
implementation to Flink (
https://github.com/stratosphere/stratosphere/pull/904). The program is too
complex to be added as an example. In addition to that, we don't want to
maintain actual applications on top of Flink.
I have a GitHub repository that also contains a few other Flink programs
that need a new home.

I would like to provide users an infrastructure in our project to host such
code and establish a community around it.

What is the Apache way of handling such a situation (infrastructure-wise)?
On GitHub, I would create a new repository called "application-library" and
link it from the website.
We could also create an orphan branch in our current repository.

What do you think?


Robert

Re: Where to host user-contributed Flink programs?

Alan Gates
In Pig we created a separate section of the repository (called piggybank) where we stored user-contributed user-defined functions (UDFs). The contract was that users needed to provide unit tests so that, as changes were made to Pig, the committers could make sure that the UDFs in piggybank still worked. Committers weren't responsible for fixing piggybank code, though if the change was minor they often would. We distribute piggybank with Pig as a convenience to users.

The downside to this is that users can't self-manage it. It requires a committer to review and commit people's code. We always kept the bar quite low (just make sure it has tests and isn't completely nuts), but people still complained that they had to wait for us to review it.

The upside was that since committers had to review it, it didn't become a dumping ground.

Alan.

(In reply to Robert Metzger's original post of June 25, 2014 at 7:50 AM.)

Re: Where to host user-contributed Flink programs?

Robert Metzger
Okay. This approach actually sounds quite good.
So piggybank is just a subdirectory in the Pig repository:
https://github.com/apache/pig/tree/trunk/contrib/piggybank. We have an
"addons" module in our project that could serve a similar purpose.
But I'm still a bit hesitant about adding this kind of code to our main
repository. It implies that the code has been reviewed by us and that we
endorse it.

We could add the code into our repository without shipping it in releases.
What do the others say about this idea?




On Thu, Jun 26, 2014 at 1:40 AM, Alan Gates <[hidden email]> wrote:

> In Pig we created a separate section of the repository (called piggybank)
> where we stored user contributed user defined functions (UDFs).  The
> contract was that users needed to provide unit tests so that as changes
> were made to Pig the committers could make sure that the UDFs in piggybank
> still worked.  Committers weren't responsible for fixing piggybank code,
> though if the change was minor they often would.  We distribute piggybank
> with Pig as a convenience to users.
>
> The downside to this is users can't self-manage it.  It requires a
> committer to review and commit people's code.  Though we always kept the
> bar quite low (just make sure it has tests and isn't completely nuts).  But
> still people complained that they had to wait for us to review it.
>
> The upside was that since committers had to review it, it didn't become a
> dumping ground.
>
> Alan.

Re: Where to host user-contributed Flink programs?

Ufuk Celebi

On 27 Jun 2014, at 09:50, Robert Metzger <[hidden email]> wrote:

> Okay. This approach actually sounds quite good.
> So piggybank is just a subdirectory in the Pig repository:
> https://github.com/apache/pig/tree/trunk/contrib/piggybank. We have an
> "addons" module in our project that could serve a similar purpose.
> But I'm still a bit hesitant about adding this kind of code to our main
> repository. It implies that the code has been reviewed by us and that we
> endorse it.
>
> We could add the code into our repository without shipping it in releases.
> What do the others say about this idea?

+1 to NOT ship it in releases.

I like the idea of a place for such programs (Pig's piggybank is a nice pun btw :P), but would also favor a different repository than our main one. If at some point we think that we have gathered enough good-quality programs, we could then think about moving them into some kind of library in the main code.

Re: Where to host user-contributed Flink programs?

Stephan Ewen
Robert and Ufuk's suggestion would imply that committers review the programs
nonetheless, so it should result in high quality.

The only difference seems to be whether it is part of the core repo, or an
addons repo.

The danger of an addons-repo is that it may get discarded eventually, or
that it is treated with much lower priority.

Re: Where to host user-contributed Flink programs?

Fabian Hueske
+1 for not shipping them.

I like the idea of having additional test programs. If we require that
programs include tests, we could increase the test diversity and coverage.

So I would add them to a separate folder in the main repository, use them
for testing during builds, but not include them in the distro.




2014-06-27 11:46 GMT+02:00 Stephan Ewen <[hidden email]>:

> Robert and Ufuk's suggestion would imply that committers review the programs
> nonetheless, so it should result in high quality.
>
> The only difference seems to be whether it is part of the core repo, or an
> addons repo.
>
> The danger of an addons-repo is that it may get discarded eventually, or
> that it is treated with much lower priority.
>

Re: Where to host user-contributed Flink programs?

Ufuk Celebi

On 27 Jun 2014, at 11:54, Fabian Hueske <[hidden email]> wrote:

> +1 for not shipping them.
>
> I like the idea of having additional test programs. If we require that
> programs include tests, we could increase the test diversity and coverage.
>
> So I would add them to a separate folder in the main repository, use them
> for testing during builds, but not include them in the distro.

That's a nice idea. :) With this, I would also agree to include it in the main repo.

Re: Where to host user-contributed Flink programs?

Robert Metzger
+1


On Fri, Jun 27, 2014 at 11:56 AM, Ufuk Celebi <[hidden email]> wrote:

>
> On 27 Jun 2014, at 11:54, Fabian Hueske <[hidden email]> wrote:
>
> > +1 for not shipping them.
> >
> > I like the idea of having additional test programs. If we require that
> > programs include tests, we could increase the test diversity and
> > coverage.
> >
> > So I would add them to a separate folder in the main repository, use them
> > for testing during builds, but not include them in the distro.
>
> That's a nice idea. :) With this, I would also agree to include it in the
> main repo.