Adding the streaming project to the main repository

Re: Adding the streaming project to the main repository

Márton Balassi
Hi guys,

@Stephan: Thanks for the script. We've gone through the commits with Gabor,
and Gyula is reviewing them right now.
https://github.com/mbalassi/incubator-flink/commits/streamrebase3

@Robert: We've gone through the coding style. The update commit is already
pushed to our old repo, and I'm merging it into my flink fork soon.

@Henry: OK, I'm pinging all the contributors about this; the three of
us have already signed the form.

I'm dealing with the licensing tomorrow.
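
For the header check in the list quoted below, a rough first pass could look like the following sketch. It only assumes that every source file should contain the opening phrase of the standard ASF header; the class and method names are made up for illustration, and a dedicated audit tool such as Apache RAT is the more thorough option.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of the Flink build.
final class HeaderCheck {

    // Opening phrase of the standard ASF source header.
    static final String MARKER = "Licensed to the Apache Software Foundation (ASF)";

    // Returns all regular .java files under root whose content lacks the marker.
    static List<Path> filesMissingHeader(Path root) throws IOException {
        List<Path> candidates;
        try (Stream<Path> paths = Files.walk(root)) {
            candidates = paths
                    .filter(Files::isRegularFile)
                    .filter(p -> p.toString().endsWith(".java"))
                    .sorted()
                    .collect(Collectors.toList());
        }
        List<Path> missing = new ArrayList<>();
        for (Path p : candidates) {
            String text = new String(Files.readAllBytes(p), StandardCharsets.UTF_8);
            if (!text.contains(MARKER)) {
                missing.add(p);
            }
        }
        return missing;
    }

    public static void main(String[] args) throws IOException {
        // Scan the directory given as first argument, or the current one.
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        for (Path p : filesMissingHeader(root)) {
            System.out.println("missing header: " + p);
        }
    }
}
```

This covers only Java sources; poms, scripts, and other file types would need the same treatment.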


On Mon, Jul 14, 2014 at 4:58 PM, Stephan Ewen <[hidden email]> wrote:

> Before adding this contribution to the project, there are some legal things
> to do:
>
>  - Obtain ICLAs from all major contributors. There are 7 in the streaming
> code, out of which three did the largest portion of the work: Márton
> Balassi, Gyula Fóra, Hermann Gábor
>  - @mentors: Should the other 4 also sign and send ICLAs?
>
>  - Licenses: Walk through the code, collect all dependencies and make sure
> they are ASL compatible. Here are some links with information:
>     - http://www.apache.org/legal/resolved.html
>     - http://www.apache.org/foundation/license-faq.html#WhatDoesItMEAN
>
>  - All used licenses must be mentioned in the LICENSE files
>    - under ./LICENSE
>    - under ./flink-dist/src/main/flink-bin/LICENSE
>
>  - Check headers for ASF compliance.
>
>
> This looks manageable. Anything I forgot?
>
> Greetings,
> Stephan
>
>
>
>
> On Mon, Jul 14, 2014 at 4:43 PM, Stephan Ewen <[hidden email]> wrote:
>
> > Hi guys!
> >
> > I made a scripted manual rebase of each commit (basically, each commit is
> > added not via its diff, but such that it reflects the code base after
> > the commit):
> >
> > https://github.com/StephanEwen/incubator-flink/commits/streamrebase
> >
> > No more merge commits that mess things up. You should be able to squash
> > things easily via
> > "git rebase -i 3002258f8a22a8adbdb230e57c972ad17910debf"
> >
> > The commit diffs may be a bit different than before (not too much if I did
> > things correctly), but can you have a quick look at the commits to see
> > whether they make sense?
> >
> > Stephan
> >
> >
> > BTW: I used this way to do it:
> >
> > Have two repositories (clones)
> >   - /data/repositories/flink
> >   - /data/repositories/flinkbak
> >
> > Then do the following for every non-merge commit:
> >  - Check out the state after a commit in the backup (detached head)
> >  - Remove current streaming directory (physically and from the index)
> >  - Add it again (files and index), with the state of the cloned repo
> >  - Commit (git recreates the diffs in a way that they reflect the
> >    original commit plus any merges)
> >
> > ---------------------
> >
> > #!/bin/bash
> >
> > # Replay every commit hash listed in the file "commits".
> > for line in $(cat commits)
> > do
> >   cd /data/repositories/flinkbak
> >   # Keep the original author and subject line.
> >   author=$(git --no-pager show -s --format='%an <%ae>' "$line")
> >   message=$(git --no-pager show -s --format='%s%n' "$line")
> >
> >   echo "picking commit $line from author $author"
> >
> >   git checkout "$line"
> >   cd /data/repositories/flink
> >   rm -rf "/data/repositories/flink/flink-addons/flink-streaming"
> >   git rm -r "/data/repositories/flink/flink-addons/flink-streaming"
> >   cp -r "/data/repositories/flinkbak/flink-addons/flink-streaming" \
> >     "/data/repositories/flink/flink-addons/flink-streaming"
> >   git add /data/repositories/flink/flink-addons/flink-streaming
> >   git commit --author "$author" -m "$message"
> >
> > #  read -rsp $'Press any key to continue...\n' -n1 key
> > done
> >
> >
> >
> >
> >
> > On Mon, Jul 14, 2014 at 1:10 PM, Gyula Fóra <[hidden email]> wrote:
> >
> >> By the way, I forked your repo, switched to the streaming branch, and then
> >> executed the commands (I think this is how it should have been done).
> >>
> >>
> >> On Mon, Jul 14, 2014 at 1:09 PM, Gyula Fóra <[hidden email]> wrote:
> >>
> >>> This is what I get with "rebase -i -p master":
> >>>
> >>> pick 9456624 Merge branch 'master' of file:///data/repositories/streamin into streaming
> >>> pick 89299b8 [streaming] Post-merge cleanups
> >>>
> >>> #Rebase 1fd457d..89299b8 onto 1fd457d
> >>> #......
> >>>
> >>>
> >>> On Mon, Jul 14, 2014 at 12:47 PM, Stephan Ewen <[hidden email]> wrote:
> >>>
> >>>> Can you do "rebase -i -p master"? That should include all commits and
> >>>> might save you the meeting hell.

Re: Adding the streaming project to the main repository

Márton Balassi
Hi all,

We've decided to do our preparations on a fork of the main repo:
https://github.com/mbalassi/incubator-flink/tree/streaming-ready

We've fixed the code to match the coding style and added the modules to the
maven build.
https://github.com/mbalassi/incubator-flink/commit/f8a6b0ecf7f453cad13ed6752051f29783ec0469

As for the licensing:
https://github.com/mbalassi/incubator-flink/commit/5ddcebc6f0cedfcb3ed67a4f53ee1b415dd1d82f

   - Removed JBlas as it is no longer needed
   - Included the information for RabbitMQ
   - Deleted the ZeroMQ package and its dependency as a whole - by the way
   Spark has LGPL licensed packages in its NOTICE
   - Did not include additional information for Apache Kafka

On my machine the project builds with the default hadoop profile on Java
6 and 7, and the tests pass. However, the latest Travis CI build is way
less rosy:
https://travis-ci.org/mbalassi/incubator-flink/builds/30055789

   - The ones with the hadoop-2 profile fail because one of the poms cannot
   be found (?)
   - The ones with the hadoop-1 profile either fail in flink-tests with an
   error in DataSink (maybe the Travis slot ran out of disk...) or with an
   exception in the streaming code that occurred neither when I built the
   project locally with Maven nor in Eclipse

Do you have any suggestions for additional requirements or for fixing the CI
build?

Cheers,

Marton


On Mon, Jul 14, 2014 at 6:28 PM, Márton Balassi <[hidden email]> wrote:
> [...]

Re: Adding the streaming project to the main repository

Robert Metzger
Cool. Thanks for the update.
I think you can host the ZeroMQ connectors on a private repository or so.

"by the way Spark has LGPL licensed packages in its NOTICE" --> did you
find the discussion in their mailing list / JIRA regarding this? Maybe they
contacted the authors of the code or got a special permission to do that?

What is the license of this file?
https://github.com/mbalassi/incubator-flink/blob/streaming-ready/flink-addons/flink-streaming/flink-streaming-core/src/main/java/org/apache/flink/streaming/index/BTreeIndex.java



The build errors for hadoop1 are:

[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) on project flink-streaming-core: Compilation failure: Compilation failure:
[ERROR] /home/travis/build/mbalassi/incubator-flink/flink-addons/flink-streaming/flink-streaming-core/src/main/java/org/apache/flink/streaming/api/streamcomponent/AbstractStreamComponent.java:[190,89] incompatible types: java.lang.Class<org.apache.flink.streaming.api.streamrecord.StreamRecord> cannot be converted to java.lang.Class<? extends org.apache.flink.streaming.api.streamrecord.StreamRecord<IN>>
[ERROR] /home/travis/build/mbalassi/incubator-flink/flink-addons/flink-streaming/flink-streaming-core/src/main/java/org/apache/flink/streaming/api/streamcomponent/AbstractStreamComponent.java:[208,42] incompatible types: java.lang.Class<org.apache.flink.streaming.partitioner.DefaultPartitioner> cannot be converted to java.lang.Class<? extends org.apache.flink.runtime.io.network.api.ChannelSelector<org.apache.flink.streaming.api.streamrecord.StreamRecord<OUT>>>

I guess that's easy to fix?
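
For reference, both errors have the same shape: a class literal of a generic type is raw (Class<StreamRecord>), while the receiving side expects a Class that carries the type argument. Below is a minimal sketch of that situation and one common workaround, an unchecked double cast. Record, describe and recordClass are hypothetical stand-ins and not the actual code in AbstractStreamComponent.

```java
// Hypothetical stand-in for a generic record type; not the Flink class.
class Record<T> {
    final T value;

    Record(T value) {
        this.value = value;
    }
}

final class ClassLiteralDemo {

    // Mirrors the shape of the failing call site: the parameter wants a
    // Class object that carries the generic type argument.
    static <IN> String describe(Class<? extends Record<IN>> type) {
        return type.getSimpleName();
    }

    // Record.class has the raw type Class<Record>. Casting through Class<?>
    // is an unchecked but widely used way to obtain the generic form.
    @SuppressWarnings("unchecked")
    static <IN> Class<Record<IN>> recordClass() {
        return (Class<Record<IN>>) (Class<?>) Record.class;
    }

    public static void main(String[] args) {
        // Passing Record.class directly is the analogue of the call that
        // Travis rejects with "incompatible types":
        // describe(Record.class);
        String name = ClassLiteralDemo.<String>describe(ClassLiteralDemo.<String>recordClass());
        System.out.println(name);
    }
}
```

The cast is safe at runtime because generics are erased, but it silences the compiler, so it is best kept in one small helper method.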

The build errors for hadoop2 are:
I think it's copypasta. You've copied the build-profiles stuff from another
project, so your pom is including the "flink-hbase" and "flink-yarn"
submodules (as submodules of flink-streaming).
https://github.com/mbalassi/incubator-flink/blob/streaming-ready/flink-addons/flink-streaming/pom.xml#L109

Just remove the whole <profiles> .. </profiles> block in the
flink-streaming/pom.xml.




On Wed, Jul 16, 2014 at 1:56 PM, Márton Balassi <[hidden email]> wrote:
> [...]

Re: Adding the streaming project to the main repository

Sean Owen
On Wed, Jul 16, 2014 at 2:02 PM, Robert Metzger <[hidden email]> wrote:
> "by the way Spark has LGPL licensed packages in its NOTICE" --> did you
> find the discussion in their mailing list / JIRA regarding this? Maybe they
> contacted the authors of the code or got a special permission to do that?

I discussed with Marton offline. The reference is to:

https://github.com/apache/spark/blob/master/NOTICE#L60

This is licensed under the GPL, LGPL, and MPL. It's included because
of the MPL, and wouldn't be includable if it were only GPL or LGPL
licensed. The line in NOTICE notes all of its licenses.

Re: Adding the streaming project to the main repository

Márton Balassi
In reply to this post by Robert Metzger
Thanks, Robert:

   - ZeroMQ - thanks, we have it in another repo
   - Spark & LGPL - Sean Owen was kind enough to clarify the situation
   - BTree: The whole org.apache.flink.streaming.index package is somewhat
   legacy code and is currently unused. It was meant for state management, but
   the API has been refactored since then, and we decided to leave some parts
   there that we would have to revisit. It is quite likely that we will not
   use it any more, so I'm removing it.
   - The hadoop-2 profile is indeed copypasta, good call.
   - The hadoop-1 profile was interesting for me, because it builds on "my
   machine" :)




On Wed, Jul 16, 2014 at 3:02 PM, Robert Metzger <[hidden email]> wrote:
> [...]

Re: Adding the streaming project to the main repository

Gyula Fóra
Hey,

I have completely reworked the way we manage tuple serialization for
streaming. Now the user can call .setMutability(true) on an operator to
enable object reuse at tuple deserialization.

What do you think the default mutability setting for operators should be?
We use immutable at the moment.
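
To make the trade-off concrete, here is a minimal sketch of what object reuse means for user code. Pair and ReuseDemo are hypothetical stand-ins and not the streaming API; only the .setMutability(true) switch comes from the change described above. With reuse enabled, every record is deserialized into the same instance, so a function that buffers records ends up holding several references to one object.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical mutable tuple; not the actual Flink tuple class.
class Pair {
    int key;
    int value;
}

// Hypothetical deserializer illustrating the two modes.
final class ReuseDemo {
    private final boolean mutable;          // true = reuse one instance for every record
    private final Pair reused = new Pair();

    ReuseDemo(boolean mutable) {
        this.mutable = mutable;
    }

    // Fills either the shared instance or a fresh one, depending on the mode.
    Pair deserialize(int key, int value) {
        Pair target = mutable ? reused : new Pair();
        target.key = key;
        target.value = value;
        return target;
    }

    // Buffers three records, then reads back the values they hold.
    static List<Integer> bufferThree(boolean mutable) {
        ReuseDemo in = new ReuseDemo(mutable);
        List<Pair> buffer = new ArrayList<>();
        for (int i = 1; i <= 3; i++) {
            buffer.add(in.deserialize(i, i * 10));
        }
        List<Integer> values = new ArrayList<>();
        for (Pair p : buffer) {
            values.add(p.value);
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(bufferThree(false)); // [10, 20, 30]
        System.out.println(bufferThree(true));  // [30, 30, 30]
    }
}
```

Reuse saves one allocation per record, while the immutable mode is the one that cannot surprise a user who stores records, which is an argument for keeping it as the default.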

Cheers,
Gyula


On Wed, Jul 16, 2014 at 3:19 PM, Márton Balassi <[hidden email]> wrote:
> [...]
> > flink-streaming/pom.xml.
> >
> >
> >
> >
> > On Wed, Jul 16, 2014 at 1:56 PM, Márton Balassi <
> [hidden email]>
> > wrote:
> >
> > > Hi all,
> > >
> > > We've decided to do our preparations on a fork of the main repo:
> > > *https://github.com/mbalassi/incubator-flink/tree/streaming-ready
> > > <https://github.com/mbalassi/incubator-flink/tree/streaming-ready>*
> > >
> > > We've fixed the code to match the coding style and added the modules to
> > the
> > > maven build.
> > >
> > >
> >
> https://github.com/mbalassi/incubator-flink/commit/f8a6b0ecf7f453cad13ed6752051f29783ec0469
> > >
> > > As for the licensing:
> > >
> > >
> >
> https://github.com/mbalassi/incubator-flink/commit/5ddcebc6f0cedfcb3ed67a4f53ee1b415dd1d82f
> > >
> > >    - Removed JBlas as it is no longer needed
> > >    - Included the information for RabbitMQ
> > >    - Deleted the ZeroMQ package and it's dependency as a whole - by the
> > way
> > >    Spark has LGPL licensed packages in its NOTICE
> > >    - Did not include additional information for Apache Kafka
> > >
> > > On my machine the project builds with the default hadoop profile and
> Java
> > > 6&7 and the tests are passing, however the Travis CI for the latest
> > travis
> > > build is way less rosy:
> > > *https://travis-ci.org/mbalassi/incubator-flink/builds/30055789
> > > <https://travis-ci.org/mbalassi/incubator-flink/builds/30055789>*
> > >
> > >    - The ones with the hadoop-2 profile fail with not finding one of
> the
> > >    poms (?)
> > >    - The ones with the hadoop-1 profile either fail in flink-tests with
> > an
> > >    error in DataSink (maybe an the travis slot run out of disk...) or
> an
> > >    exception in the streaming code that did not occur when neither
> when I
> > >    built the project locally with maven nor in Eclipse
> > >
> > > Do you have any suggestion for additional requirements or fixing the CI
> > > build?
> > >
> > > Cheers,
> > >
> > > Marton
> > >
> > >
> > > On Mon, Jul 14, 2014 at 6:28 PM, Márton Balassi <
> > [hidden email]>
> > > wrote:
> > >
> > > > Hi guys,
> > > >
> > > > @Stefan: Thanks for the script, we've gone through the commits with
> > > Gabor,
> > > > Gyula is reviewing it right now.
> > > > https://github.com/mbalassi/incubator-flink/commits/streamrebase3
> > > >
> > > > @Robert: We've went through the coding style, the update commit is
> > > already
> > > > pushed to our old repo, I'm merging it to my flink fork soon.
> > > >
> > > > @Henry: Ok, I'm pinging all the contributors with the subject, the
> > three
> > > > of us already signed the form.
> > > >
> > > > I'm dealing with the Licensing tomorrow.
> > > >
> > > >
> > > > On Mon, Jul 14, 2014 at 4:58 PM, Stephan Ewen <[hidden email]>
> > wrote:
> > > >
> > > >> Before adding this contribution to the project, there are some legal
> > > >> things
> > > >> to do:
> > > >>
> > > >>  - Obtain ICLAs from all major contributors. There are 7 in the
> > > streaming
> > > >> code, out of which three did the largest portion of the work: Márton
> > > >> Balassi, Gyula Fóra, Hermann Gábor
> > > >>  - @mentors: Should the other 4 also sign and send ICLAs?
> > > >>
> > > >>  - Licenses: Walk through the code, collect all dependencies and
> make
> > > sure
> > > >> they are ASL compatible.Here are some links with information:
> > > >>     - http://www.apache.org/legal/resolved.html
> > > >>     -
> > http://www.apache.org/foundation/license-faq.html#WhatDoesItMEAN
> > > >>
> > > >>  - All used licenses must be mentioned in the LICENSE files
> > > >>    - under ./LICENSE
> > > >>    - under ./flink-dist/src/main/flink-bin/LICENSE
> > > >>
> > > >>  - Check headers for ASF compliance.
> > > >>
> > > >>
> > > >> This looks manageable. Anything I forgot?
> > > >>
> > > >> Greetings,
> > > >> Stephan
> > > >>
> > > >>
> > > >>
> > > >>
> > > >> On Mon, Jul 14, 2014 at 4:43 PM, Stephan Ewen <[hidden email]>
> > wrote:
> > > >>
> > > >> > Ho guys!
> > > >> >
> > > >> > I made a scripted manual rebase of each commit (basically add the
> > > commit
> > > >> > not via its diff, but such that it reflects the code base after
> the
> > > >> commit)
> > > >> >
> > > >> >
> https://github.com/StephanEwen/incubator-flink/commits/streamrebase
> > > >> >
> > > >> > No more merge commits that mess things up. You should be able to
> > > squash
> > > >> > things easily via "git rebase -i
> > > >> 3002258f8a22a8adbdb230e57c972ad17910debf"
> > > >> >
> > > >> > The commit diffs may be a bit different than before (not too much
> > if I
> > > >> did
> > > >> > things correctly), but can you have a quick look at the commits to
> > see
> > > >> > whether they make sense?
> > > >> >
> > > >> > Stephan
> > > >> >
> > > >> >
> > > >> > BTW: I used this way to do it:
> > > >> >
> > > >> > Have two repositories (clones)
> > > >> >   - /data/repositories/flink
> > > >> >   - /data/repositories/flinkbak
> > > >> >
> > > >> > The do the following for every non-merge commit:
> > > >> >  - Check out the state after a commit in the backup (detached
> head)
> > > >> >  - Remove current streaming directory (physically and from the
> > index)
> > > >> >  - Add it again (files and index), with the state of the cloned
> repo
> > > >> >  - Commit (git recreates the diffs in a way that they reflect the
> > > >> original
> > > >> > commit plus any merges)
> > > >> >
> > > >> > ---------------------
> > > >> >
> > > >> > #!/bin/bash
> > > >> >
> > > >> > for line in $(cat commits)
> > > >> > do
> > > >> >   cd /data/repositories/flinkbak
> > > >> >   author=`git --no-pager show -s --format='%an <%ae>' $line`
> > > >> >   message=`git --no-pager show -s --format='%s%n' $line`
> > > >> >
> > > >> >   echo "picking commit $line from author $author"
> > > >> >
> > > >> >   git checkout $line
> > > >> >   cd /data/repositories/flink
> > > >> >   rm -rf "/data/repositories/flink/flink-addons/flink-streaming"
> > > >> >   git rm -r
> "/data/repositories/flink/flink-addons/flink-streaming"
> > > >> >   cp -r "/data/repositories/flinkbak/flink-addons/flink-streaming"
> > > >> > "/data/repositories/flink/flink-addons/flink-streaming"
> > > >> >   git add /data/repositories/flink/flink-addons/flink-streaming
> > > >> >   git commit --author "$author" --m "$message"
> > > >> >
> > > >> > #  read -rsp $'Press any key to continue...\n' -n1 key
> > > >> > done
> > > >> >
> > > >> >
> > > >> >
> > > >> >
> > > >> >
> > > >> > On Mon, Jul 14, 2014 at 1:10 PM, Gyula Fóra <[hidden email]
> >
> > > >> wrote:
> > > >> >
> > > >> >> By the way, I forked your repo switch to the streaming branch and
> > > then
> > > >> I
> > > >> >> executed the commands (I think this is how it should have been
> > done)
> > > >> >>
> > > >> >>
> > > >> >> On Mon, Jul 14, 2014 at 1:09 PM, Gyula Fóra <
> [hidden email]>
> > > >> wrote:
> > > >> >>
> > > >> >>> This is what I get with "rebase -i -p master":
> > > >> >>>
> > > >> >>> pick 9456624 Merge branch 'master' of
> > > >> file:///data/repositories/streamin
> > > >> >>> into streaming
> > > >> >>> pick 89299b8 [streaming] Post-merge cleanups
> > > >> >>>
> > > >> >>> #Rebase 1fd457d..89299b8 onto 1fd457d
> > > >> >>> #......
> > > >> >>>
> > > >> >>>
> > > >> >>> On Mon, Jul 14, 2014 at 12:47 PM, Stephan Ewen <
> [hidden email]>
> > > >> wrote:
> > > >> >>>
> > > >> >>>> Can you do "rebase -i -p master". That should include all
> commits
> > > and
> > > >> >>>> might save you the meeting hell.
> > > >> >>>>
> > > >> >>>
> > > >> >>>
> > > >> >>
> > > >> >
> > > >>
> > > >
> > > >
> > >
> >
>

Re: Adding the streaming project to the main repository

Stephan Ewen
Very good!

We have an initial effort for that as well (
https://github.com/apache/incubator-flink/pull/66), so that aligns very
well.

Re: Adding the streaming project to the main repository

Stephan Ewen
I suggested going with "immutable" by default, because it is less
error-prone and gives a better initial experience. Mutable objects then
become a switch for performance tuning. I think people agreed with that.
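
To illustrate why immutable is the less error-prone default, here is a
minimal plain-Java sketch of the trade-off. It does not use the streaming
API: MutabilityDemo, IntTuple and Deserializer are made-up names, and the
"mutable" flag only mimics what .setMutability(true) is described as
enabling (object reuse at deserialization). A user function that buffers
records behaves differently in the two modes:

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; none of these names come from the Flink code base. */
final class MutabilityDemo {

    /** Minimal stand-in for a one-field tuple. */
    static final class IntTuple {
        int value;
    }

    /** Stand-in deserializer: reuses one instance (mutable) or allocates per record. */
    static final class Deserializer {
        private final boolean mutable;
        private final IntTuple reused = new IntTuple();

        Deserializer(boolean mutable) {
            this.mutable = mutable;
        }

        IntTuple deserialize(int encoded) {
            // With reuse, every call overwrites and returns the same object.
            IntTuple target = mutable ? reused : new IntTuple();
            target.value = encoded;
            return target;
        }
    }

    /** A "user function" that keeps references to the records it has seen. */
    static List<Integer> collect(boolean mutable, int... input) {
        Deserializer deserializer = new Deserializer(mutable);
        List<IntTuple> buffered = new ArrayList<>();
        for (int encoded : input) {
            buffered.add(deserializer.deserialize(encoded));
        }
        List<Integer> seen = new ArrayList<>();
        for (IntTuple tuple : buffered) {
            seen.add(tuple.value);
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println("immutable: " + collect(false, 1, 2, 3)); // immutable: [1, 2, 3]
        System.out.println("mutable:   " + collect(true, 1, 2, 3));  // mutable:   [3, 3, 3]
    }
}
```

With reuse, the buffered references all point at the same object, so only
the last record survives. That saves allocations but is only safe when the
user code never holds on to a record.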


On Mon, Jul 21, 2014 at 1:48 PM, Stephan Ewen <[hidden email]> wrote:

> Very good!
>
> We have an initial effort for that as well (
> https://github.com/apache/incubator-flink/pull/66), so that aligns very
> well.
>

Re: Adding the streaming project to the main repository

Ufuk Celebi
Great. :)

On 21 Jul 2014, at 13:49, Stephan Ewen <[hidden email]> wrote:

> I suggested to go for "immutable" by default, because it is less error
> prone and gives better initial experience. Mutable objects is a switch for
> performance tuning then. I think people agreed with that.

Yes, everyone was in favour of immutable mode as default.

Re: Adding the streaming project to the main repository

Stephan Ewen
The ICLAs and the SGA are cleared as far as I know.

I think we should merge the code into the current master (not the 0.6
release, but the one after it).

Re: Adding the streaming project to the main repository

Robert Metzger
Cool. If we have the confirmation from the secretary (or, for foundation
members: https://svn.apache.org/repos/private/foundation/officers/iclas.txt),
I vote for adding the code to the "master" branch.


On Wed, Aug 6, 2014 at 5:18 PM, Stephan Ewen <[hidden email]> wrote:

> The ICLA's and the SGA are cleared as far as I know.
>
> I think we should merge the code into the current master (not the 0.6
> release, but the successive one).
>

Re: Adding the streaming project to the main repository

Stephan Ewen
After the Apache Secretary confirmed that the SGA has arrived and the ICLAs
are filed, I have merged the streaming code into the master for the next
release.

A whole bunch of code that was!

Great work, all of you. Looking forward to what this blossoms into... It's
a good day, today :-)



On Wed, Aug 6, 2014 at 7:17 PM, Robert Metzger <[hidden email]> wrote:

> Cool. If we have the confirmation by the secretary (or for foundation
> members:
> https://svn.apache.org/repos/private/foundation/officers/iclas.txt),
> I vote for adding the code to the "master" branch.
>
>
> On Wed, Aug 6, 2014 at 5:18 PM, Stephan Ewen <[hidden email]> wrote:
>
> > The ICLA's and the SGA are cleared as far as I know.
> >
> > I think we should merge the code into the current master (not the 0.6
> > release, but the successive one).
> >
>

Re: Adding the streaming project to the main repository

Henry Saputra
W00t!

- Henry

On Mon, Aug 18, 2014 at 10:55 AM, Stephan Ewen <[hidden email]> wrote:

> After the Apache Secretary confirmed that the SGA has arrived and the ICLAs
> are filed, I have merged the streaming code into the master for the next
> release.
>
> A whole bunch of code that was!
>
> Great work, all of you. Looking forward to what this blossoms into... It's
> a good day, today :-)
>
>
>
> On Wed, Aug 6, 2014 at 7:17 PM, Robert Metzger <[hidden email]> wrote:
>
>> Cool. If we have the confirmation by the secretary (or for foundation
>> members:
>> https://svn.apache.org/repos/private/foundation/officers/iclas.txt),
>> I vote for adding the code to the "master" branch.
>>
>>
>> On Wed, Aug 6, 2014 at 5:18 PM, Stephan Ewen <[hidden email]> wrote:
>>
>> > The ICLA's and the SGA are cleared as far as I know.
>> >
>> > I think we should merge the code into the current master (not the 0.6
>> > release, but the successive one).
>> >
>>

Re: Adding the streaming project to the main repository

Henry Saputra
In reply to this post by Stephan Ewen
Hmm, quick question, I could not find any documentation about the
streaming support. Is it part of the source code or will there be
additional doc included?

- Henry

On Mon, Aug 18, 2014 at 10:55 AM, Stephan Ewen <[hidden email]> wrote:

> After the Apache Secretary confirmed that the SGA has arrived and the ICLAs
> are filed, I have merged the streaming code into the master for the next
> release.
>
> A whole bunch of code that was!
>
> Great work, all of you. Looking forward to what this blossoms into... It's
> a good day, today :-)
>
>
>
> On Wed, Aug 6, 2014 at 7:17 PM, Robert Metzger <[hidden email]> wrote:
>
>> Cool. If we have the confirmation by the secretary (or for foundation
>> members:
>> https://svn.apache.org/repos/private/foundation/officers/iclas.txt),
>> I vote for adding the code to the "master" branch.
>>
>>
>> On Wed, Aug 6, 2014 at 5:18 PM, Stephan Ewen <[hidden email]> wrote:
>>
>> > The ICLA's and the SGA are cleared as far as I know.
>> >
>> > I think we should merge the code into the current master (not the 0.6
>> > release, but the successive one).
>> >
>>

Re: Adding the streaming project to the main repository

Stephan Ewen
The streaming code is in "flink-addons", for new/experimental code.

Documentation should follow over the next days/weeks, definitely before we
make this part of the core.

For now, I would suggest having a look at some of the examples to get a
feeling for the addon, for example this one:
https://github.com/apache/incubator-flink/tree/master/flink-addons/flink-streaming/flink-streaming-examples/src/main/java/org/apache/flink/streaming/examples/wordcount

(The example reads a file for simplicity, but the project also provides
connectors for Kafka, RabbitMQ, ...)

Re: Adding the streaming project to the main repository

Fabian Hueske
Hi folks,

great work!

Looking at the example, I have a quick question: what are the semantics of
the Reduce operator? I guess it's not a window reduce.
Is it backed by a hash table and every input tuple updates the hash table
and returns the updated value?

Cheers, Fabian


2014-08-18 20:53 GMT+02:00 Stephan Ewen <[hidden email]>:

> The streaming code is in "flink-addons", for new/experimental code.
>
> Documents should come over the next days/weeks, definitely before we make
> this part of the core.
>
> Right now, I would suggest to have a look at some of the examples, to get a
> feeling for the addon, check for example this here:
>
> https://github.com/apache/incubator-flink/tree/master/flink-addons/flink-streaming/flink-streaming-examples/src/main/java/org/apache/flink/streaming/examples/wordcount
>
> (The example reads a file for simplicity, but the project also provides
> connectors for Kafka, RabbitMQ, ...)
>

Re: Adding the streaming project to the main repository

Gyula Fóra
Hey,

Yes, the simple reduce works like what you described. There is also a
grouped reduce, which you can use by calling .groupBy(keyposition) and then
reduce.
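
To make the per-key behaviour concrete, here is a minimal plain-Java sketch
of a running reduce backed by a hash table, in the sense of Fabian's
question: every record updates the stored value for its key, and the
updated value is emitted right away. It does not use the streaming API;
RunningReduceDemo and runningCount are made-up names, and word counting is
only an assumed example reducer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of a hash-table-backed running reduce per key. */
final class RunningReduceDemo {

    /** Emits one "key=aggregate" entry per input record, in arrival order. */
    static List<String> runningCount(List<String> words) {
        Map<String, Integer> state = new HashMap<>();
        List<String> emitted = new ArrayList<>();
        for (String word : words) {
            // merge() stores 1 for a new key, otherwise reduces old and new value.
            int updated = state.merge(word, 1, Integer::sum);
            emitted.add(word + "=" + updated);
        }
        return emitted;
    }

    public static void main(String[] args) {
        System.out.println(runningCount(Arrays.asList("a", "b", "a"))); // [a=1, b=1, a=2]
    }
}
```

Unlike a window reduce, the state here is never evicted, so it grows with
the number of distinct keys.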

There are also reduces for windows: batchReduce and windowReduce.
batchReduce gives you a sliding window over a predefined number of records,
and windowReduce gives you the same but by time. (There are also grouped
versions of these.)
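
As a rough illustration of the count-based variant, the following plain-Java
sketch reduces a sliding window over the last N records. It is not the
batchReduce implementation: CountWindowReduceDemo and slidingSum are made-up
names, summing is an assumed example reducer, and both the slide step of one
record and the choice to emit only once the window is full are assumptions
that the description above does not specify.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

/** Hypothetical sketch of a count-based sliding window reduce. */
final class CountWindowReduceDemo {

    /** Sums the last `size` records, emitting one result per record once the window is full. */
    static List<Integer> slidingSum(List<Integer> input, int size) {
        Deque<Integer> window = new ArrayDeque<>();
        List<Integer> emitted = new ArrayList<>();
        int sum = 0;
        for (int value : input) {
            window.addLast(value);
            sum += value;
            if (window.size() > size) {
                // Evict the oldest record so the window never exceeds `size`.
                sum -= window.removeFirst();
            }
            if (window.size() == size) {
                emitted.add(sum);
            }
        }
        return emitted;
    }

    public static void main(String[] args) {
        System.out.println(slidingSum(Arrays.asList(1, 2, 3, 4, 5), 3)); // [6, 9, 12]
    }
}
```

A time-based window would follow the same pattern, with eviction driven by
record timestamps instead of the record count.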

Cheers,
Gyula


On Mon, Aug 18, 2014 at 9:19 PM, Fabian Hueske <[hidden email]> wrote:

> Hi folks,
>
> great work!
>
> Looking at the example, I have a quick question: what are the semantics of
> the Reduce operator? I guess it's not a window reduce.
> Is it backed by a hash table and every input tuple updates the hash table
> and returns the updated value?
>
> Cheers, Fabian
>
>
> 2014-08-18 20:53 GMT+02:00 Stephan Ewen <[hidden email]>:
>
> > The streaming code is in "flink-addons", for new/experimental code.
> >
> > Documents should come over the next days/weeks, definitely before we make
> > this part of the core.
> >
> > Right now, I would suggest to have a look at some of the examples, to
> get a
> > feeling for the addon, check for example this here:
> >
> >
> https://github.com/apache/incubator-flink/tree/master/flink-addons/flink-streaming/flink-streaming-examples/src/main/java/org/apache/flink/streaming/examples/wordcount
> >
> > (The example reads a file for simplicity, but the project also provides
> > connectors for Kafka, RabbitMQ, ...)
> >
>

Re: Adding the streaming project to the main repository

Henry Saputra
In reply to this post by Stephan Ewen
Thanks Stephan. If no one objects, I will create a JIRA ticket as a
reminder to add formal documentation for the streaming feature.

- Henry

On Mon, Aug 18, 2014 at 11:53 AM, Stephan Ewen <[hidden email]> wrote:

> The streaming code is in "flink-addons", for new/experimental code.
>
> Documents should come over the next days/weeks, definitely before we make
> this part of the core.
>
> Right now, I would suggest to have a look at some of the examples, to get a
> feeling for the addon, check for example this here:
> https://github.com/apache/incubator-flink/tree/master/flink-addons/flink-streaming/flink-streaming-examples/src/main/java/org/apache/flink/streaming/examples/wordcount
>
> (The example reads a file for simplicity, but the project also provides
> connectors for Kafka, RabbitMQ, ...)

Re: Adding the streaming project to the main repository

Márton Balassi
Sure, please assign it to me.

On Aug 19, 2014 2:44 AM, "Henry Saputra" <[hidden email]> wrote:

> Thanks Stephan. If no one objects, I will create a JIRA ticket as a
> reminder to add formal documentation for the streaming feature.
>
> - Henry
>
> On Mon, Aug 18, 2014 at 11:53 AM, Stephan Ewen <[hidden email]> wrote:
> > The streaming code is in "flink-addons", for new/experimental code.
> >
> > Documents should come over the next days/weeks, definitely before we make
> > this part of the core.
> >
> > Right now, I would suggest to have a look at some of the examples, to
> get a
> > feeling for the addon, check for example this here:
> >
> https://github.com/apache/incubator-flink/tree/master/flink-addons/flink-streaming/flink-streaming-examples/src/main/java/org/apache/flink/streaming/examples/wordcount
> >
> > (The example reads a file for simplicity, but the project also provides
> > connectors for Kafka, RabbitMQ, ...)
>

Re: Adding the streaming project to the main repository

Henry Saputra
Hi Marton,

I created the JIRA ticket to track the streaming documentation:
https://issues.apache.org/jira/browse/FLINK-1058
Somehow I could not find your ASF JIRA username. Could you tell me
what it is?

- Henry

On Mon, Aug 18, 2014 at 10:29 PM, Márton Balassi
<[hidden email]> wrote:

> Sure, please assign it to me.
>
> On Aug 19, 2014 2:44 AM, "Henry Saputra" <[hidden email]> wrote:
>
>> Thanks Stephan. If no one objects, I will create a JIRA ticket as a
>> reminder to add formal documentation for the streaming feature.
>>
>> - Henry
>>
>> On Mon, Aug 18, 2014 at 11:53 AM, Stephan Ewen <[hidden email]> wrote:
>> > The streaming code is in "flink-addons", for new/experimental code.
>> >
>> > Documents should come over the next days/weeks, definitely before we make
>> > this part of the core.
>> >
>> > Right now, I would suggest to have a look at some of the examples, to
>> get a
>> > feeling for the addon, check for example this here:
>> >
>> https://github.com/apache/incubator-flink/tree/master/flink-addons/flink-streaming/flink-streaming-examples/src/main/java/org/apache/flink/streaming/examples/wordcount
>> >
>> > (The example reads a file for simplicity, but the project also provides
>> > connectors for Kafka, RabbitMQ, ...)
>>