[DISCUSS] Project build time and possible restructuring

Re: [DISCUSS] Project build time and possible restructuring

Stephan Ewen
@Robert - I think once we know that a separate git repo works well, and
that it actually solves problems, I see no reason to not create a
connectors repository later. The infrastructure changes should be identical
for two or more repositories.

On Wed, Mar 15, 2017 at 5:22 PM, Till Rohrmann <[hidden email]> wrote:

> I think it should not be at least the flink-dist but exactly the remaining
> flink-dist module. Otherwise we do redundant work.
>
> On Wed, Mar 15, 2017 at 5:03 PM, Robert Metzger <[hidden email]>
> wrote:
>
> > "flink-core" means the main repository, not the "flink-core" module.
> >
> > When doing a release, we need to build the flink main code first, because
> > the flink-libraries depend on that.
> > Once the "flink-libraries" are built, we need to run the main build again
> > (at least the flink-dist module), so that it is pulling the artifacts
> from
> > the flink-libraries to put them into the opt/ folder of the final
> artifact.
> >
> >
> >
> >
> > On Wed, Mar 15, 2017 at 4:44 PM, Till Rohrmann <[hidden email]>
> > wrote:
> >
> > > I'm ok with point 3.
> > >
> > > Concerning point 8: Why do we have to build flink-core twice after
> having
> > > it built as a dependency for flink-libraries? This seems wrong to me.
> > >
> > > Cheers,
> > > Till
> > >
> > > On Wed, Mar 15, 2017 at 4:23 PM, Robert Metzger <[hidden email]>
> > > wrote:
> > >
> > > > Thank you. Running on AWS is a good idea!
> > > > Let me know if you (or anybody else) wants to help me with the
> > > > infrastructure work! Any help is much appreciated (as I've said
> > before, I
> > > > don't really have time for doing this, but it has to be done :) )
> > > >
> > > > I'm against creating two new repositories. I fear that this
> introduces
> > > too
> > > > much complexity and too many repositories.
> > > > "flink" and "flink-libraries" are hopefully enough to get the build
> > time
> > > > significantly down.
> > > > We can also consider putting the connectors into the
> "flink-libraries"
> > > repo
> > > > if we need to further reduce the build time.
> > > >
> > > > We should probably move "flink-table" out of "flink-libraries" if we
> > want
> > > > to keep "flink-table" in the main repo. (This would eliminate the
> > > > "flink-libraries" module from main.)
> > > >
> > > > Also, I agree that "flink-statebackend-rocksdb" is not correctly
> placed
> > > in
> > > > contrib anymore.
> > > >
> > > >
> > > > On Wed, Mar 15, 2017 at 4:07 PM, Greg Hogan <[hidden email]>
> > wrote:
> > > >
> > > > > Robert, appreciate your kickstarting this task.
> > > > >
> > > > > We should compare the verification time with and without the listed
> > > > > modules. I’ll try to run this by tomorrow on AWS and on Travis.
> > > > >
> > > > > Should we maintain separate repos for flink-contrib and
> > > flink-libraries?
> > > > > Are you intending that we move flink-table out of flink-libraries
> > (and
> > > > > perhaps flink-statebackend-rocksdb out of flink-contrib)?
> > > > >
> > > > > Greg
> > > > >
> > > > >
> > > > > > On Mar 15, 2017, at 9:55 AM, Robert Metzger <[hidden email]
> >
> > > > wrote:
> > > > > >
> > > > > > Thank you for looking into this Till.
> > > > > >
> > > > > > I think we then have to split the repositories.
> > > > > > My main motivation for doing this is that it seems to be the only
> > > > > feasible
> > > > > > way of scaling the community to allow more committers working on
> > the
> > > > > > libraries.
> > > > > >
> > > > > > I'll take care of getting things started.
> > > > > >
> > > > > > As the next steps I propose to:
> > > > > > 1. Ask INFRA to rename https://git-wip-us.apache.org/
> > > > repos/asf?p=flink-
> > > > > > connectors.git;a=summary to "flink-libraries"
> > > > > > 2. Ask INFRA to set up GitHub and travis integration for
> > > > > "flink-libraries"
> > > > > > 3. Put the code of "flink-ml", "flink-gelly", "flink-python",
> > > > > "flink-cep",
> > > > > > "flink-scala-shell", "flink-storm" into the new repository. (I
> > > decided
> > > > > > against moving flink-contrib there, because rocksdb is in the
> > contrib
> > > > > > module, for flink-table, I'm undecided, but I kept it in the main
> > > repo
> > > > > > > because it's probably going to interact more with the core code in
> > the
> > > > > > future)
> > > > > > I try to preserve the history of those modules when splitting
> them
> > > into
> > > > > the
> > > > > > new repo
> > > > > > 4. I'll close all pull requests against those modules in the main
> > > repo.
> > > > > > 5. I'll set up a minimal documentation page for the library
> > > repository,
> > > > > > similar to the main documentation.
> > > > > > 6. I'll update the documentation build process to build both
> > > > > documentations
> > > > > > & link them to each other
> > > > > > 7. I'll update the nightly deployment process to include both
> > > > > repositories
> > > > > > 8. I'll update the release script to create the Flink release out
> > of
> > > > both
> > > > > > repositories. In order to put the libraries into the opt/ dir of
> > the
> > > > > > release, I'll need to change the build of "flink-dist" so that it
> > > first
> > > > > > builds flink core, then the libraries and then the core again
> with
> > > the
> > > > > > libraries as an additional dependency.
> > > > > >
> > > > > > The main question for the community is: do you agree with point
> 3 ?
> > > > Would
> > > > > > you like to include more or less?
> > > > > >
> > > > > > I'll start with 1. and 2. tomorrow morning.
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Wed, Mar 15, 2017 at 1:48 PM, Till Rohrmann <
> > [hidden email]
> > > >
> > > > > wrote:
> > > > > >
> > > > > >> In theory we could have a merging bot which solves the problem
> of
> > > the
> > > > > >> "commit window". Once the PR passes all tests and has enough
> +1s,
> > > the
> > > > > bot
> > > > > >> could do the merging and, thus, it effectively linearizes the
> > merge
> > > > > >> process.
> > > > > >>
> > > > > >> I think the second point is actually a disadvantage because
> there
> > is
> > > > not
> > > > > >> such an immediate incentive/pressure to fix the broken module if
> > it
> > > > > lives
> > > > > >> in a separate repository. Furthermore, breaking API changes in
> the
> > > > core
> > > > > >> will most likely go unnoticed for some time in other modules
> which
> > > are
> > > > > not
> > > > > >> developed so actively. In the worst case these things will only
> be
> > > > > noticed
> > > > > >> when we try to make a release.
> > > > > >>
> > > > > >> But I also agree that we are not Google and we don't have the
> > > > > capacities to
> > > > > > >> maintain such a smooth build process that we can keep all the
> > code
> > > > in
> > > > > a
> > > > > >> single repository.
> > > > > >>
> > > > > >> I looked a bit into Gradle and as far as I can tell it offers
> some
> > > > nice
> > > > > >> features wrt incrementally building projects. This would be
> > > beneficial
> > > > > for
> > > > > >> local development but it would not solve our build time problems
> > on
> > > > > Travis.
> > > > > >> Gradle intends to introduce a task result cache which allows to
> > > reuse
> > > > > >> results across builds. This could help when building on Travis,
> > > > > however, it
> > > > > >> is not yet fully implemented. Moreover, migrating from Maven to
> > > Gradle
> > > > > >> won't come for free (there's simply no free lunch out there) and
> > we
> > > > > might
> > > > > >> risk to introduce new bugs. Therefore, I would vote to split the
> > > > > repository
> > > > > >> in order to mitigate our current problems with Travis and the
> > build
> > > > > time in
> > > > > >> general. Whether to use a different build system or not can then
> > be
> > > > > >> discussed as an orthogonal question.
> > > > > >>
> > > > > >> Cheers,
> > > > > >> Till
> > > > > >>
> > > > > >> On Tue, Mar 14, 2017 at 8:05 PM, Stephan Ewen <[hidden email]
> >
> > > > wrote:
> > > > > >>
> > > > > >>> Some other thoughts on how repository split would help. I am
> not
> > > sure
> > > > > for
> > > > > >>> all of them, so please comment:
> > > > > >>>
> > > > > >>>  - There is less competition for a "commit window". It happens
> a
> > > lot
> > > > > >>> already that you run all tests and want to commit, but there
> was
> > a
> > > > > commit
> > > > > >>> in the meantime. You rebase, need to re-test, again commit in
> the
> > > > > >> meantime.
> > > > > >>>    For a "linear" commit history, this may become a bottleneck
> > > > > >> eventually
> > > > > >>> as well.
> > > > > >>>
> > > > > >>>  - There is less risk of broken master. If one
> repository/modules
> > > > > breaks
> > > > > >>> its master, the others can still continue.
> > > > > >>>
> > > > > >>> Stephan
> > > > > >>>
> > > > > >>>
> > > > > >>> On Fri, Mar 10, 2017 at 12:20 PM, Till Rohrmann <
> > > > [hidden email]>
> > > > > >>> wrote:
> > > > > >>>
> > > > > >>>> Thanks for all your input. In order to wrap the discussion up
> > I'd
> > > > like
> > > > > >> to
> > > > > >>>> summarize the mentioned points:
> > > > > >>>>
> > > > > >>>> The problem of increasing build times and complexity of the
> > > project
> > > > > has
> > > > > >>>> been acknowledged. Ideally we would have everything in one
> > > > repository
> > > > > >>> using
> > > > > >>>> an incremental build tool. Since Maven does not properly
> support
> > > > this
> > > > > >> we
> > > > > >>>> would have to switch our build tool to something like Gradle,
> > for
> > > > > >>> example.
> > > > > >>>>
> > > > > >>>> Another option is introducing build profiles for different
> sets
> > of
> > > > > >>> modules
> > > > > >>>> as well as separating integration and unit tests. The third
> > > > > alternative
> > > > > >>>> would be creating sub-projects with their own repositories. I
> > > > actually
> > > > > > >>>> think that these two proposals are not necessarily exclusive
> and
> > it
> > > > > >> would
> > > > > >>>> also make sense to have a separation between unit and
> > integration
> > > > > tests
> > > > > >>> if
> > > > > > >>>> we split the repository.
> > > > > >>>>
> > > > > >>>> The overall consensus seems to be that we don't want to split
> > the
> > > > > >>> community
> > > > > >>>> and want to keep everything under the same umbrella. I think
> > this
> > > is
> > > > > >> the
> > > > > >>>> right way to go, because otherwise some parts of the project
> > could
> > > > > >> become
> > > > > >>>> second class citizens. Given that and that we continue using
> > > Maven,
> > > > I
> > > > > >>> still
> > > > > >>>> think that creating sub-projects for the libraries, for
> example,
> > > > could
> > > > > >> be
> > > > > >>>> beneficial. A split could reduce the project's complexity and
> > make
> > > > it
> > > > > >>>> potentially easier for libraries to get actively developed.
> The
> > > main
> > > > > >>>> concern is setting up the build infrastructure to aggregate
> docs
> > > > from
> > > > > >>>> multiple repositories and making them publicly available.
> > > > > >>>>
> > > > > >>>> Since I started this thread and I would really like to see
> > Flink's
> > > > ML
> > > > > >>>> library being revived again, I'd volunteer investigating first
> > > > whether
> > > > > >> it
> > > > > >>>> is doable establishing a proper incremental build for Flink.
> If
> > > that
> > > > > >>> should
> > > > > >>>> not be possible, I will look into splitting the repository,
> > first
> > > > only
> > > > > >>> for
> > > > > >>>> the libraries. I'll share my results with the community once
> I'm
> > > > done
> > > > > >>> with
> > > > > >>>> the investigation.
> > > > > >>>>
> > > > > >>>> Cheers,
> > > > > >>>> Till
> > > > > >>>>
> > > > > >>>> On Fri, Feb 24, 2017 at 3:50 PM, Robert Metzger <
> > > > [hidden email]>
> > > > > >>>> wrote:
> > > > > >>>>
> > > > > >>>>> @Jin Mingjian: You can not use the paid travis version for
> open
> > > > > >> source
> > > > > >>>>> projects. It only works for private repositories (at least
> back
> > > > then
> > > > > >>> when
> > > > > >>>>> we've asked them about that).
> > > > > >>>>>
> > > > > >>>>> @Stephan: I don't think that incremental builds will be
> > available
> > > > > >> with
> > > > > >>>>> Maven anytime soon.
> > > > > >>>>>
> > > > > >>>>> I agree that we need to fix the build time issue on Travis.
> > I've
> > > > > >>> recently
> > > > > >>>>> pushed a commit to use now three instead of two test groups.
> > > > > > >>>>> But I don't think that this is a feasible long-term solution.
> > > > > >>>>>
> > > > > >>>>> If this discussion is only about reducing the build and test
> > > time,
> > > > > >>>>> introducing build profiles for different components as
> Aljoscha
> > > > > >>> suggested
> > > > > >>>>> would solve the problem Till mentioned.
> > > > > >>>>> Also, if we decide that travis is not a good tool anymore for
> > the
> > > > > >>>> testing,
> > > > > >>>>> I guess we can find a different solution. There are now
> > > competitors
> > > > > >> to
> > > > > >>>>> Travis that might be willing to offer a paid plan for an open
> > > > source
> > > > > >>>>> project, or we set up our own infra on a server sponsored by
> > one
> > > of
> > > > > >> the
> > > > > >>>>> contributing companies.
> > > > > >>>>> If we want to solve "community issues" with the change as
> well,
> > > > then
> > > > > >> I
> > > > > > >>>>> think it's worth the effort of splitting up Flink into
> different
> > > > > >>>>> repositories.
> > > > > >>>>>
> > > > > >>>>> Splitting up repositories is not a trivial task in my
> opinion.
> > As
> > > > > >>> others
> > > > > >>>>> have mentioned before, we need to consider the following
> > things:
> > > > > > >>>>> - How are we going to build the documentation? Ideally every
> > repo
> > > > > >>> should
> > > > > >>>>> contain its docs, so we would need to pull them together when
> > > > > >> building
> > > > > >>>> the
> > > > > >>>>> main docs.
> > > > > > >>>>> - How do we organize the dependencies? If we have library
> > repository
> > > > > >>> depend
> > > > > >>>> on
> > > > > >>>>> snapshot Flink versions, we need to make sure that the
> snapshot
> > > > > >>>> deployment
> > > > > >>>>> always works. This also means that people working on a
> library
> > > > > >>> repository
> > > > > >>>>> will pull from snapshot OR need to build first locally.
> > > > > >>>>> - We need to update the release scripts
> > > > > >>>>>
> > > > > >>>>> If we commit to do these changes, we need to assign at least
> > one
> > > > > >>>> committer
> > > > > >>>>> (yes, in this case we need somebody who can commit, for
> example
> > > for
> > > > > >>>>> updating the buildbot stuff) who volunteers to do the change.
> > > > > >>>>> I've done a lot of infrastructure work in the past, but I'm
> > > > currently
> > > > > >>>>> pretty booked with many other things, so I don't
> realistically
> > > see
> > > > > >>> myself
> > > > > >>>>> doing that. Max who used to work on these things is taking
> some
> > > > time
> > > > > >>> off.
> > > > > >>>>> I think we need, best case 3 days for the change, worst case
> 5
> > > > days.
> > > > > >>> The
> > > > > >>>>> problem is that there are no "unit tests" for the infra
> stuff,
> > so
> > > > > >> many
> > > > > >>>>> things are "trial and error" (like Apache's buildbot, our
> > release
> > > > > >>>> scripts,
> > > > > >>>>> the doc scripts, maven stuff, nightly builds).
> > > > > >>>>>
> > > > > >>>>>
> > > > > >>>>>
> > > > > >>>>> On Thu, Feb 23, 2017 at 1:33 PM, Stephan Ewen <
> > [hidden email]>
> > > > > >>> wrote:
> > > > > >>>>>
> > > > > > >>>>>> If we can get incremental builds to work, that would
> > actually
> > > be
> > > > > >>> the
> > > > > >>>>>> preferred solution in my opinion.
> > > > > >>>>>>
> > > > > >>>>>> Many companies have invested heavily in making a "single
> > > > > >> repository"
> > > > > >>>> code
> > > > > >>>>>> base work, because it has the advantage of not having to
> > > > > >>> update/publish
> > > > > >>>>>> several repositories first.
> > > > > >>>>>> However, the strong prerequisite for that is an incremental
> > > build
> > > > > >>>> system
> > > > > >>>>>> that builds only (fine grained) what it has to build. I am
> not
> > > > sure
> > > > > >>> how
> > > > > >>>>> we
> > > > > >>>>>> could make that work
> > > > > >>>>>> with Maven and Travis...
> > > > > >>>>>>
> > > > > >>>>>> On Wed, Feb 22, 2017 at 10:42 PM, Greg Hogan <
> > > [hidden email]>
> > > > > >>>> wrote:
> > > > > >>>>>>
> > > > > >>>>>>> An additional option for reducing time to build and test is
> > > > > >>> parallel
> > > > > >>>>>>> execution. This would help users more than on TravisCI
> since
> > > > > >> we're
> > > > > >>>>>>> generally running on multi-core machines rather than VM
> > slices.
> > > > > >>>>>>>
> > > > > >>>>>>> Is the idea that each user would only check out the modules
> > > that
> > > > > >> he
> > > > > >>>> or
> > > > > >>>>>> she
> > > > > >>>>>>> is developing with? For example, if a developer is not
> > working
> > > on
> > > > > >>>>>>> flink-mesos or flink-yarn then the "flink-deploy" module
> > would
> > > > > >> not
> > > > > >>> be
> > > > > > >>>>>> cloned
> > > > > >>>>>>> to their filesystem?
> > > > > >>>>>>>
> > > > > >>>>>>> We can run a TravisCI nightly build on each repo to
> validate
> > > > > >>> against
> > > > > >>>>> API
> > > > > >>>>>>> changes.
> > > > > >>>>>>>
> > > > > >>>>>>> Greg
> > > > > >>>>>>>
> > > > > >>>>>>> On Wed, Feb 22, 2017 at 12:24 PM, Fabian Hueske <
> > > > > >> [hidden email]
> > > > > >>>>
> > > > > >>>>>> wrote:
> > > > > >>>>>>>
> > > > > >>>>>>>> Hi everybody,
> > > > > >>>>>>>>
> > > > > >>>>>>>> I think this should be a discussion about the benefits and
> > > > > >>>> drawbacks
> > > > > >>>>> of
> > > > > >>>>>>>> separating the code into distinct repositories from a
> > > > > >> development
> > > > > >>>>> point
> > > > > >>>>>>> of
> > > > > >>>>>>>> view.
> > > > > >>>>>>>> So I agree with Stephan that we should not divide the
> > > community
> > > > > >>> by
> > > > > >>>>>>> creating
> > > > > >>>>>>>> separate groups of committers.
> > > > > > >>>>>>>> Also the discussion about independent releases is not
> > > > > >> strictly
> > > > > >>>>>> related
> > > > > >>>>>>>> to the decision, IMO.
> > > > > >>>>>>>>
> > > > > >>>>>>>> I see a few pros and cons for splitting the code base into
> > > > > >>> separate
> > > > > >>>>>>>> repositories which (I think) haven't been mentioned
> before:
> > > > > >>>>>>>> pros:
> > > > > >>>>>>>> - IDE setup will be leaner. It is not necessary to compile
> > the
> > > > > >>>> whole
> > > > > >>>>>> code
> > > > > >>>>>>>> base to run a test after switching a branch.
> > > > > >>>>>>>> cons:
> > > > > >>>>>>>> - developing libraries features that require changes in
> the
> > > > > >> core
> > > > > >>> /
> > > > > >>>>> APIs
> > > > > >>>>>>>> become more time consuming due to back-and-forth between
> > code
> > > > > >>>> bases.
> > > > > >>>>>>>> However, I think this is not very often the case.
> > > > > >>>>>>>>
> > > > > >>>>>>>> Aljoscha has good points as well. Many of the build issues
> > > > > >> could
> > > > > >>> be
> > > > > >>>>>>> solved
> > > > > >>>>>>>> by different build profiles and configurations.
> > > > > >>>>>>>>
> > > > > >>>>>>>> Best, Fabian
> > > > > >>>>>>>>
> > > > > >>>>>>>> 2017-02-22 14:59 GMT+01:00 Gábor Hermann <
> > > > > >> [hidden email]
> > > > > >>>> :
> > > > > >>>>>>>>
> > > > > >>>>>>>>> @Stephan:
> > > > > >>>>>>>>>
> > > > > >>>>>>>>> Although I tried to raise some issues about splitting
> > > > > >>> committers,
> > > > > >>>>> I'm
> > > > > >>>>>>>>> still strongly in favor of some kind of restructuring. We
> > > > > >> just
> > > > > >>>> have
> > > > > >>>>>> to
> > > > > >>>>>>> be
> > > > > >>>>>>>>> conscious about the disadvantages.
> > > > > >>>>>>>>>
> > > > > >>>>>>>>> Not splitting the committers could leave the libraries in
> > the
> > > > > >>>> same
> > > > > >>>>>>>>> stalling status, described by Till. Of course, dedicating
> > > > > >>> current
> > > > > >>>>>>>>> committers as shepherds of the libraries could easily
> > resolve
> > > > > >>> the
> > > > > >>>>>>> issue.
> > > > > >>>>>>>>> But that requires time from current committers. It seems
> > like
> > > > > >>>>>>> trade-offs
> > > > > >>>>>>>>> between code quality, speed of development, and committer
> > > > > >>>> efforts.
> > > > > >>>>>>>>>
> > > > > >>>>>>>>> From what I see in the discussion about ML, there are
> many
> > > > > >>> people
> > > > > >>>>>>> willing
> > > > > >>>>>>>>> to contribute as well as production use-cases. This means
> > we
> > > > > >>>> could
> > > > > >>>>>> and
> > > > > >>>>>>>>> should move forward. However, the development speed is
> > > > > >>>>> significantly
> > > > > >>>>>>>> slowed
> > > > > >>>>>>>>> down by stalling PRs. The proposal for contributors
> helping
> > > > > >> the
> > > > > >>>>>> review
> > > > > >>>>>>>>> process did not really work out so far. In my opinion,
> > either
> > > > > >>>> code
> > > > > >>>>>>>> quality
> > > > > >>>>>>>>> (by more easily accepting new committers) or some
> committer
> > > > > >>> time
> > > > > >>>>>>>>> (reviewing/merging) should be sacrificed to move forward.
> > As
> > > > > >>> Till
> > > > > >>>>> has
> > > > > >>>>>>>>> indicated, it would be shameful if we let this
> contribution
> > > > > >>>> effort
> > > > > >>>>>> die.
> > > > > >>>>>>>>>
> > > > > >>>>>>>>> Cheers,
> > > > > >>>>>>>>> Gabor
> > > > > >>>>>>>>>
> > > > > >>>>>>>>>
> > > > > >>>>>>>>
> > > > > >>>>>>>
> > > > > >>>>>>
> > > > > >>>>>
> > > > > >>>>
> > > > > >>>
> > > > > >>
> > > > >
> > > > >
> > > >
> > >
> >
>

Re: [DISCUSS] Project build time and possible restructuring

Greg Hogan
I’d like to use this refactoring opportunity to unsplit the Travis tests. With 51 builds queued up for the weekend (some of which may fail or have been force-pushed), we are at the limit of the number of contributions we can process. Fixing this requires 1) splitting the project, 2) investigating speedups for long-running tests, and 3) staying cognizant of test performance when accepting new code.

I’d like to add one to Stephan’s list of module groups. I like that the modules are generic (“libraries”) so that no one module is alone and independent.

Flink has three “libraries”: cep, ml, and gelly.

“connectors” is a hotspot due to the long-running Kafka tests (and connectors for three Kafka versions).

Both flink-storm and flink-python have a modest number of tests and could live with the miscellaneous modules in “contrib”.

The YARN tests are long-running and problematic (I am unable to successfully run these locally). A “cluster” module could host flink-mesos, flink-yarn, and flink-yarn-tests.

That gets us close to running all tests in a single Travis build.
  https://travis-ci.org/greghogan/flink/builds/212122590

I also tested (https://github.com/greghogan/flink/commits/core_build) with a Maven parallelism of 2 and 4, with the latter giving a 6.4% drop in build time.
  https://travis-ci.org/greghogan/flink/builds/212137659
  https://travis-ci.org/greghogan/flink/builds/212154470

We can run Travis CI builds nightly to guard against breaking changes.

I also wanted to get an idea of how disruptive it would be to developers to divide the project into multiple git repos. I wrote a simple Python script and configured it with the module partitions listed above. The usage string from the top of the file lists commits with files from multiple partitions as well as the modified files.
  https://gist.github.com/greghogan/f38a8efe6b6dd5a162a6b43335ac4897

Accounting for the merging of the batch and streaming connector modules, and assuming that the project structure has not changed much over the past 15 months, for the following date ranges the listed number of commits would have been split across repositories.

since "2017-01-01"
36 of 571 commits were mixed

since "2016-07-01"
155 of 1607 commits were mixed

since "2016-01-01"
272 of 2561 commits were mixed
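
For illustration, the "mixed commit" check can be sketched roughly as follows. This is not the script from the gist: the path prefixes in PARTITIONS and the function names are assumptions made up for this sketch, with anything unmatched counted as "core". A commit is treated as mixed when its modified files fall into more than one partition.

```python
from typing import Dict, Iterable

# Hypothetical prefix -> partition map, loosely following the module groups
# described above; the real configuration lives in the gist and may differ.
PARTITIONS: Dict[str, str] = {
    "flink-libraries/flink-cep/": "libraries",
    "flink-libraries/flink-ml/": "libraries",
    "flink-libraries/flink-gelly/": "libraries",
    "flink-connectors/": "connectors",
    "flink-contrib/": "contrib",
    "flink-mesos/": "cluster",
    "flink-yarn/": "cluster",
    "flink-yarn-tests/": "cluster",
}


def partition_of(path: str) -> str:
    """Return the partition a file path belongs to, defaulting to 'core'."""
    for prefix, name in PARTITIONS.items():
        if path.startswith(prefix):
            return name
    return "core"


def is_mixed(files: Iterable[str]) -> bool:
    """A commit is 'mixed' if its files span more than one partition."""
    return len({partition_of(f) for f in files}) > 1


def mixed_share(mixed: int, total: int) -> float:
    """Percentage of mixed commits, rounded to one decimal place."""
    return round(100.0 * mixed / total, 1)


if __name__ == "__main__":
    # Counts reported above for each date range.
    for since, mixed, total in [
        ("2017-01-01", 36, 571),
        ("2016-07-01", 155, 1607),
        ("2016-01-01", 272, 2561),
    ]:
        print(f"since {since}: {mixed_share(mixed, total)}% mixed")
```

Applied to the counts above, this gives roughly 6.3%, 9.6%, and 10.6% mixed commits for the three date ranges.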

Greg


> On Mar 15, 2017, at 1:13 PM, Stephan Ewen <[hidden email]> wrote:
>
> @Robert - I think once we know that a separate git repo works well, and
> that it actually solves problems, I see no reason to not create a
> connectors repository later. The infrastructure changes should be identical
> for two or more repositories.
>
> On Wed, Mar 15, 2017 at 5:22 PM, Till Rohrmann <[hidden email]> wrote:
>
>> I think it should not be at least the flink-dist but exactly the remaining
>> flink-dist module. Otherwise we do redundant work.
>>
>> On Wed, Mar 15, 2017 at 5:03 PM, Robert Metzger <[hidden email]>
>> wrote:
>>
>>> "flink-core" means the main repository, not the "flink-core" module.
>>>
>>> When doing a release, we need to build the flink main code first, because
>>> the flink-libraries depend on that.
>>> Once the "flink-libraries" are built, we need to run the main build again
>>> (at least the flink-dist module), so that it is pulling the artifacts
>> from
>>> the flink-libraries to put them into the opt/ folder of the final
>> artifact.
>>>
>>>
>>>
>>>
>>> On Wed, Mar 15, 2017 at 4:44 PM, Till Rohrmann <[hidden email]>
>>> wrote:
>>>
>>>> I'm ok with point 3.
>>>>
>>>> Concerning point 8: Why do we have to build flink-core twice after
>> having
>>>> it built as a dependency for flink-libraries? This seems wrong to me.
>>>>
>>>> Cheers,
>>>> Till
>>>>
>>>> On Wed, Mar 15, 2017 at 4:23 PM, Robert Metzger <[hidden email]>
>>>> wrote:
>>>>
>>>>> Thank you. Running on AWS is a good idea!
>>>>> Let me know if you (or anybody else) wants to help me with the
>>>>> infrastructure work! Any help is much appreciated (as I've said
>>> before, I
>>>>> don't really have time for doing this, but it has to be done :) )
>>>>>
>>>>> I'm against creating two new repositories. I fear that this
>> introduces
>>>> too
>>>>> much complexity and too many repositories.
>>>>> "flink" and "flink-libraries" are hopefully enough to get the build
>>> time
>>>>> significantly down.
>>>>> We can also consider putting the connectors into the
>> "flink-libraries"
>>>> repo
>>>>> if we need to further reduce the build time.
>>>>>
>>>>> We should probably move "flink-table" out of "flink-libraries" if we
>>> want
>>>>> to keep "flink-table" in the main repo. (This would eliminate the
>>>>> "flink-libraries" module from main.)
>>>>>
>>>>> Also, I agree that "flink-statebackend-rocksdb" is not correctly
>> placed
>>>> in
>>>>> contrib anymore.
>>>>>
>>>>>
>>>>> On Wed, Mar 15, 2017 at 4:07 PM, Greg Hogan <[hidden email]>
>>> wrote:
>>>>>
>>>>>> Robert, appreciate your kickstarting this task.
>>>>>>
>>>>>> We should compare the verification time with and without the listed
>>>>>> modules. I’ll try to run this by tomorrow on AWS and on Travis.
>>>>>>
>>>>>> Should we maintain separate repos for flink-contrib and
>>>> flink-libraries?
>>>>>> Are you intending that we move flink-table out of flink-libraries
>>> (and
>>>>>> perhaps flink-statebackend-rocksdb out of flink-contrib)?
>>>>>>
>>>>>> Greg
>>>>>>
>>>>>>
>>>>>>> On Mar 15, 2017, at 9:55 AM, Robert Metzger <[hidden email]
>>>
>>>>> wrote:
>>>>>>>
>>>>>>> Thank you for looking into this Till.
>>>>>>>
>>>>>>> I think we then have to split the repositories.
>>>>>>> My main motivation for doing this is that it seems to be the only
>>>>>> feasible
>>>>>>> way of scaling the community to allow more committers working on
>>> the
>>>>>>> libraries.
>>>>>>>
>>>>>>> I'll take care of getting things started.
>>>>>>>
>>>>>>> As the next steps I propose to:
>>>>>>> 1. Ask INFRA to rename https://git-wip-us.apache.org/
>>>>> repos/asf?p=flink-
>>>>>>> connectors.git;a=summary to "flink-libraries"
>>>>>>> 2. Ask INFRA to set up GitHub and travis integration for
>>>>>> "flink-libraries"
>>>>>>> 3. Put the code of "flink-ml", "flink-gelly", "flink-python",
>>>>>> "flink-cep",
>>>>>>> "flink-scala-shell", "flink-storm" into the new repository. (I
>>>> decided
>>>>>>> against moving flink-contrib there, because rocksdb is in the
>>> contrib
>>>>>>> module, for flink-table, I'm undecided, but I kept it in the main
>>>> repo
>>>>>>> because it's probably going to interact more with the core code in
>>> the
>>>>>>> future)
>>>>>>> I try to preserve the history of those modules when splitting
>> them
>>>> into
>>>>>> the
>>>>>>> new repo
>>>>>>> 4. I'll close all pull requests against those modules in the main
>>>> repo.
>>>>>>> 5. I'll set up a minimal documentation page for the library
>>>> repository,
>>>>>>> similar to the main documentation.
>>>>>>> 6. I'll update the documentation build process to build both
>>>>>> documentations
>>>>>>> & link them to each other
>>>>>>> 7. I'll update the nightly deployment process to include both
>>>>>> repositories
>>>>>>> 8. I'll update the release script to create the Flink release out
>>> of
>>>>> both
>>>>>>> repositories. In order to put the libraries into the opt/ dir of
>>> the
>>>>>>> release, I'll need to change the build of "flink-dist" so that it
>>>> first
>>>>>>> builds flink core, then the libraries and then the core again
>> with
>>>> the
>>>>>>> libraries as an additional dependency.
>>>>>>>
>>>>>>> The main question for the community is: do you agree with point
>> 3 ?
>>>>> Would
>>>>>>> you like to include more or less?
>>>>>>>
>>>>>>> I'll start with 1. and 2. tomorrow morning.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Mar 15, 2017 at 1:48 PM, Till Rohrmann <
>>> [hidden email]
>>>>>
>>>>>> wrote:
>>>>>>>
>>>>>>>> In theory we could have a merging bot which solves the problem
>> of
>>>> the
>>>>>>>> "commit window". Once the PR passes all tests and has enough
>> +1s,
>>>> the
>>>>>> bot
>>>>>>>> could do the merging and, thus, it effectively linearizes the
>>> merge
>>>>>>>> process.
>>>>>>>>
>>>>>>>> I think the second point is actually a disadvantage because
>> there
>>> is
>>>>> not
>>>>>>>> such an immediate incentive/pressure to fix the broken module if
>>> it
>>>>>> lives
>>>>>>>> in a separate repository. Furthermore, breaking API changes in
>> the
>>>>> core
>>>>>>>> will most likely go unnoticed for some time in other modules
>> which
>>>> are
>>>>>> not
>>>>>>>> developed so actively. In the worst case these things will only
>> be
>>>>>> noticed
>>>>>>>> when we try to make a release.
>>>>>>>>
>>>>>>>> But I also agree that we are not Google and we don't have the
>>>>>> capacities to
>>>>>>>> maintain such a smooth build process that we can keep all the
>>> code
>>>>> in
>>>>>> a
>>>>>>>> single repository.
>>>>>>>>
>>>>>>>> I looked a bit into Gradle and as far as I can tell it offers
>> some
>>>>> nice
>>>>>>>> features wrt incrementally building projects. This would be
>>>> beneficial
>>>>>> for
>>>>>>>> local development but it would not solve our build time problems
>>> on
>>>>>> Travis.
>>>>>>>> Gradle intends to introduce a task result cache which allows to
>>>> reuse
>>>>>>>> results across builds. This could help when building on Travis,
>>>>>> however, it
>>>>>>>> is not yet fully implemented. Moreover, migrating from Maven to
>>>> Gradle
>>>>>>>> won't come for free (there's simply no free lunch out there) and
>>> we
>>>>>> might
>>>>>>>> risk to introduce new bugs. Therefore, I would vote to split the
>>>>>> repository
>>>>>>>> in order to mitigate our current problems with Travis and the
>>> build
>>>>>> time in
>>>>>>>> general. Whether to use a different build system or not can then
>>> be
>>>>>>>> discussed as an orthogonal question.
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Till
>>>>>>>>
>>>>>>>> On Tue, Mar 14, 2017 at 8:05 PM, Stephan Ewen <[hidden email]
>>>
>>>>> wrote:
>>>>>>>>
>>>>>>>>> Some other thoughts on how repository split would help. I am
>> not
>>>> sure
>>>>>> for
>>>>>>>>> all of them, so please comment:
>>>>>>>>>
>>>>>>>>> - There is less competition for a "commit window". It happens
>> a
>>>> lot
>>>>>>>>> already that you run all tests and want to commit, but there
>> was
>>> a
>>>>>> commit
>>>>>>>>> in the meantime. You rebase, need to re-test, again commit in
>> the
>>>>>>>> meantime.
>>>>>>>>>   For a "linear" commit history, this may become a bottleneck
>>>>>>>> eventually
>>>>>>>>> as well.
>>>>>>>>>
>>>>>>>>> - There is less risk of broken master. If one
>> repository/modules
>>>>>> breaks
>>>>>>>>> its master, the others can still continue.
>>>>>>>>>
>>>>>>>>> Stephan
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Fri, Mar 10, 2017 at 12:20 PM, Till Rohrmann <
>>>>> [hidden email]>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Thanks for all your input. In order to wrap the discussion up
>>> I'd
>>>>> like
>>>>>>>> to
>>>>>>>>>> summarize the mentioned points:
>>>>>>>>>>
>>>>>>>>>> The problem of increasing build times and complexity of the
>>>> project
>>>>>> has
>>>>>>>>>> been acknowledged. Ideally we would have everything in one
>>>>> repository
>>>>>>>>> using
>>>>>>>>>> an incremental build tool. Since Maven does not properly
>> support
>>>>> this
>>>>>>>> we
>>>>>>>>>> would have to switch our build tool to something like Gradle,
>>> for
>>>>>>>>> example.
>>>>>>>>>>
>>>>>>>>>> Another option is introducing build profiles for different
>> sets
>>> of
>>>>>>>>> modules
>>>>>>>>>> as well as separating integration and unit tests. The third
>>>>>> alternative
>>>>>>>>>> would be creating sub-projects with their own repositories. I
>>>>> actually
>>>>>>>>>> think that these two proposals are not necessarily exclusive
>> and
>>> it
>>>>>>>> would
>>>>>>>>>> also make sense to have a separation between unit and
>>> integration
>>>>>> tests
>>>>>>>>> if
>>>>>>>>>> we split the repository.
>>>>>>>>>>
>>>>>>>>>> The overall consensus seems to be that we don't want to split
>>> the
>>>>>>>>> community
>>>>>>>>>> and want to keep everything under the same umbrella. I think
>>> this
>>>> is
>>>>>>>> the
>>>>>>>>>> right way to go, because otherwise some parts of the project
>>> could
>>>>>>>> become
>>>>>>>>>> second class citizens. Given that and that we continue using
>>>> Maven,
>>>>> I
>>>>>>>>> still
>>>>>>>>>> think that creating sub-projects for the libraries, for
>> example,
>>>>> could
>>>>>>>> be
>>>>>>>>>> beneficial. A split could reduce the project's complexity and
>>> make
>>>>> it
>>>>>>>>>> potentially easier for libraries to get actively developed.
>> The
>>>> main
>>>>>>>>>> concern is setting up the build infrastructure to aggregate
>> docs
>>>>> from
>>>>>>>>>> multiple repositories and making them publicly available.
>>>>>>>>>>
>>>>>>>>>> Since I started this thread and I would really like to see
>>> Flink's
>>>>> ML
>>>>>>>>>> library being revived again, I'd volunteer investigating first
>>>>> whether
>>>>>>>> it
>>>>>>>>>> is doable establishing a proper incremental build for Flink.
>> If
>>>> that
>>>>>>>>> should
>>>>>>>>>> not be possible, I will look into splitting the repository,
>>> first
>>>>> only
>>>>>>>>> for
>>>>>>>>>> the libraries. I'll share my results with the community once
>> I'm
>>>>> done
>>>>>>>>> with
>>>>>>>>>> the investigation.
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>> Till
>>>>>>>>>>
>>>>>>>>>> On Fri, Feb 24, 2017 at 3:50 PM, Robert Metzger <
>>>>> [hidden email]>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> @Jin Mingjian: You can not use the paid travis version for
>> open
>>>>>>>> source
>>>>>>>>>>> projects. It only works for private repositories (at least
>> back
>>>>> then
>>>>>>>>> when
>>>>>>>>>>> we've asked them about that).
>>>>>>>>>>>
>>>>>>>>>>> @Stephan: I don't think that incremental builds will be
>>> available
>>>>>>>> with
>>>>>>>>>>> Maven anytime soon.
>>>>>>>>>>>
>>>>>>>>>>> I agree that we need to fix the build time issue on Travis.
>>> I've
>>>>>>>>> recently
>>>>>>>>>>> pushed a commit to use now three instead of two test groups.
>>>>>>>>>>> But I don't think that this is a feasible long-term solution.
>>>>>>>>>>>
>>>>>>>>>>> If this discussion is only about reducing the build and test
>>>> time,
>>>>>>>>>>> introducing build profiles for different components as
>> Aljoscha
>>>>>>>>> suggested
>>>>>>>>>>> would solve the problem Till mentioned.
>>>>>>>>>>> Also, if we decide that travis is not a good tool anymore for
>>> the
>>>>>>>>>> testing,
>>>>>>>>>>> I guess we can find a different solution. There are now
>>>> competitors
>>>>>>>> to
>>>>>>>>>>> Travis that might be willing to offer a paid plan for an open
>>>>> source
>>>>>>>>>>> project, or we set up our own infra on a server sponsored by
>>> one
>>>> of
>>>>>>>> the
>>>>>>>>>>> contributing companies.
>>>>>>>>>>> If we want to solve "community issues" with the change as
>> well,
>>>>> then
>>>>>>>> I
>>>>>>>>>>> think it's worth the effort of splitting up Flink into
>> different
>>>>>>>>>>> repositories.
>>>>>>>>>>>
>>>>>>>>>>> Splitting up repositories is not a trivial task in my
>> opinion.
>>> As
>>>>>>>>> others
>>>>>>>>>>> have mentioned before, we need to consider the following
>>> things:
>>>>>>>>>>> - How are we going to build the documentation? Ideally every
>>> repo
>>>>>>>>> should
>>>>>>>>>>> contain its docs, so we would need to pull them together when
>>>>>>>> building
>>>>>>>>>> the
>>>>>>>>>>> main docs.
>>>>>>>>>>> - How do we organize the dependencies? If we have a library
>>> repository
>>>>>>>>> depend
>>>>>>>>>> on
>>>>>>>>>>> snapshot Flink versions, we need to make sure that the
>> snapshot
>>>>>>>>>> deployment
>>>>>>>>>>> always works. This also means that people working on a
>> library
>>>>>>>>> repository
>>>>>>>>>>> will pull from snapshot OR need to build first locally.
>>>>>>>>>>> - We need to update the release scripts
>>>>>>>>>>>
>>>>>>>>>>> If we commit to do these changes, we need to assign at least
>>> one
>>>>>>>>>> committer
>>>>>>>>>>> (yes, in this case we need somebody who can commit, for
>> example
>>>> for
>>>>>>>>>>> updating the buildbot stuff) who volunteers to do the change.
>>>>>>>>>>> I've done a lot of infrastructure work in the past, but I'm
>>>>> currently
>>>>>>>>>>> pretty booked with many other things, so I don't
>> realistically
>>>> see
>>>>>>>>> myself
>>>>>>>>>>> doing that. Max who used to work on these things is taking
>> some
>>>>> time
>>>>>>>>> off.
>>>>>>>>>>> I think we need, best case 3 days for the change, worst case
>> 5
>>>>> days.
>>>>>>>>> The
>>>>>>>>>>> problem is that there are no "unit tests" for the infra
>> stuff,
>>> so
>>>>>>>> many
>>>>>>>>>>> things are "trial and error" (like Apache's buildbot, our
>>> release
>>>>>>>>>> scripts,
>>>>>>>>>>> the doc scripts, maven stuff, nightly builds).
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Feb 23, 2017 at 1:33 PM, Stephan Ewen <
>>> [hidden email]>
>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> If we can get incremental builds to work, that would
>>> actually
>>>> be
>>>>>>>>> the
>>>>>>>>>>>> preferred solution in my opinion.
>>>>>>>>>>>>
>>>>>>>>>>>> Many companies have invested heavily in making a "single
>>>>>>>> repository"
>>>>>>>>>> code
>>>>>>>>>>>> base work, because it has the advantage of not having to
>>>>>>>>> update/publish
>>>>>>>>>>>> several repositories first.
>>>>>>>>>>>> However, the strong prerequisite for that is an incremental
>>>> build
>>>>>>>>>> system
>>>>>>>>>>>> that builds only (fine grained) what it has to build. I am
>> not
>>>>> sure
>>>>>>>>> how
>>>>>>>>>>> we
>>>>>>>>>>>> could make that work
>>>>>>>>>>>> with Maven and Travis...
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Feb 22, 2017 at 10:42 PM, Greg Hogan <
>>>> [hidden email]>
>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> An additional option for reducing time to build and test is
>>>>>>>>> parallel
>>>>>>>>>>>>> execution. This would help users more than on TravisCI
>> since
>>>>>>>> we're
>>>>>>>>>>>>> generally running on multi-core machines rather than VM
>>> slices.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Is the idea that each user would only check out the modules
>>>> that
>>>>>>>> he
>>>>>>>>>> or
>>>>>>>>>>>> she
>>>>>>>>>>>>> is developing with? For example, if a developer is not
>>> working
>>>> on
>>>>>>>>>>>>> flink-mesos or flink-yarn then the "flink-deploy" module
>>> would
>>>>>>>> not
>>>>>>>>> be
>>>>>>>>>>>> cloned
>>>>>>>>>>>>> to their filesystem?
>>>>>>>>>>>>>
>>>>>>>>>>>>> We can run a TravisCI nightly build on each repo to
>> validate
>>>>>>>>> against
>>>>>>>>>>> API
>>>>>>>>>>>>> changes.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Greg
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Feb 22, 2017 at 12:24 PM, Fabian Hueske <
>>>>>>>> [hidden email]
>>>>>>>>>>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi everybody,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I think this should be a discussion about the benefits and
>>>>>>>>>> drawbacks
>>>>>>>>>>> of
>>>>>>>>>>>>>> separating the code into distinct repositories from a
>>>>>>>> development
>>>>>>>>>>> point
>>>>>>>>>>>>> of
>>>>>>>>>>>>>> view.
>>>>>>>>>>>>>> So I agree with Stephan that we should not divide the
>>>> community
>>>>>>>>> by
>>>>>>>>>>>>> creating
>>>>>>>>>>>>>> separate groups of committers.
>>>>>>>>>>>>>> Also the discussion about independent releases is not
>>>>>>>> strictly
>>>>>>>>>>>> related
>>>>>>>>>>>>>> to the decision, IMO.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I see a few pros and cons for splitting the code base into
>>>>>>>>> separate
>>>>>>>>>>>>>> repositories which (I think) haven't been mentioned
>> before:
>>>>>>>>>>>>>> pros:
>>>>>>>>>>>>>> - IDE setup will be leaner. It is not necessary to compile
>>> the
>>>>>>>>>> whole
>>>>>>>>>>>> code
>>>>>>>>>>>>>> base to run a test after switching a branch.
>>>>>>>>>>>>>> cons:
>>>>>>>>>>>>>> - developing library features that require changes in
>> the
>>>>>>>> core
>>>>>>>>> /
>>>>>>>>>>> APIs
>>>>>>>>>>>>>> becomes more time consuming due to back-and-forth between
>>> code
>>>>>>>>>> bases.
>>>>>>>>>>>>>> However, I think this is not very often the case.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Aljoscha has good points as well. Many of the build issues
>>>>>>>> could
>>>>>>>>> be
>>>>>>>>>>>>> solved
>>>>>>>>>>>>>> by different build profiles and configurations.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Best, Fabian
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 2017-02-22 14:59 GMT+01:00 Gábor Hermann <
>>>>>>>> [hidden email]
>>>>>>>>>> :
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> @Stephan:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Although I tried to raise some issues about splitting
>>>>>>>>> committers,
>>>>>>>>>>> I'm
>>>>>>>>>>>>>>> still strongly in favor of some kind of restructuring. We
>>>>>>>> just
>>>>>>>>>> have
>>>>>>>>>>>> to
>>>>>>>>>>>>> be
>>>>>>>>>>>>>>> conscious about the disadvantages.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Not splitting the committers could leave the libraries in
>>> the
>>>>>>>>>> same
>>>>>>>>>>>>>>> stalling status, described by Till. Of course, dedicating
>>>>>>>>> current
>>>>>>>>>>>>>>> committers as shepherds of the libraries could easily
>>> resolve
>>>>>>>>> the
>>>>>>>>>>>>> issue.
>>>>>>>>>>>>>>> But that requires time from current committers. It seems
>>> like
>>>>>>>>>>>>> trade-offs
>>>>>>>>>>>>>>> between code quality, speed of development, and committer
>>>>>>>>>> efforts.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> From what I see in the discussion about ML, there are
>> many
>>>>>>>>> people
>>>>>>>>>>>>> willing
>>>>>>>>>>>>>>> to contribute as well as production use-cases. This means
>>> we
>>>>>>>>>> could
>>>>>>>>>>>> and
>>>>>>>>>>>>>>> should move forward. However, the development speed is
>>>>>>>>>>> significantly
>>>>>>>>>>>>>> slowed
>>>>>>>>>>>>>>> down by stalling PRs. The proposal for contributors
>> helping
>>>>>>>> the
>>>>>>>>>>>> review
>>>>>>>>>>>>>>> process did not really work out so far. In my opinion,
>>> either
>>>>>>>>>> code
>>>>>>>>>>>>>> quality
>>>>>>>>>>>>>>> (by more easily accepting new committers) or some
>> committer
>>>>>>>>> time
>>>>>>>>>>>>>>> (reviewing/merging) should be sacrificed to move forward.
>>> As
>>>>>>>>> Till
>>>>>>>>>>> has
>>>>>>>>>>>>>>> indicated, it would be shameful if we let this
>> contribution
>>>>>>>>>> effort
>>>>>>>>>>>> die.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>> Gabor
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>


Re: [DISCUSS] Project build time and possible restructuring

Stephan Ewen
@Greg

I am personally in favor of splitting "connectors" and "contrib" out as
well. I know that @rmetzger has some reservations about the connectors, but
we may be able to convince him.

For the cluster tests (yarn / mesos) - in the past there were many cases
where these tests caught cases that other tests did not, because they are
the only tests that actually use the "flink-dist.jar" and thus discover
many dependency and configuration issues. For that reason, my feeling would
be that they are valuable in the core repository.

I would actually suggest doing only the library split initially, to see
what the challenges are in setting up the multi-repo build and release
tooling. Once we have gathered experience there, we can probably easily see
what else we can split out.

Stephan


On Fri, Mar 17, 2017 at 8:37 PM, Greg Hogan <[hidden email]> wrote:

> I’d like to use this refactoring opportunity to unsplit the Travis tests.
> With 51 builds queued up for the weekend (some of which may fail or have
> been force pushed) we are at the limit of the number of contributions we
> can process. Fixing this requires 1) splitting the project, 2)
> investigating speedups for long-running tests, and 3) staying cognizant of
> test performance when accepting new code.
>
> I’d like to add one to Stephan’s list of module groups. I like that the
> modules are generic (“libraries”) so that no one module is alone and
> independent.
>
> Flink has three “libraries”: cep, ml, and gelly.
>
> “connectors” is a hotspot due to the long-running Kafka tests (and
> connectors for three Kafka versions).
>
> Both flink-storm and flink-python have a modest number of tests
> and could live with the miscellaneous modules in “contrib”.
>
> The YARN tests are long-running and problematic (I am unable to
> successfully run these locally). A “cluster” module could host flink-mesos,
> flink-yarn, and flink-yarn-tests.
>
> That gets us close to running all tests in a single Travis build.
>   https://travis-ci.org/greghogan/flink/builds/212122590
>
> I also tested (https://github.com/greghogan/flink/commits/core_build) with a
> Maven parallelism of 2 and 4; the latter gave a 6.4% drop in build time.
>   https://travis-ci.org/greghogan/flink/builds/212137659
>   https://travis-ci.org/greghogan/flink/builds/212154470
>
> We can run Travis CI builds nightly to guard against breaking changes.
>
> I also wanted to get an idea of how disruptive it would be to developers
> to divide the project into multiple git repos. I wrote a simple Python
> script and configured it with the module partitions listed above. The usage
> string from the top of the file lists commits with files from multiple
> partitions as well as the modified files.
>   https://gist.github.com/greghogan/f38a8efe6b6dd5a162a6b43335ac4897
>
> Accounting for the merging of the batch and streaming connector modules,
> and assuming that the project structure has not changed much over the past
> 15 months, for the following date ranges the listed number of commits would
> have been split across repositories.
>
> since "2017-01-01"
> 36 of 571 commits were mixed
>
> since "2016-07-01"
> 155 of 1607 commits were mixed
>
> since "2016-01-01"
> 272 of 2561 commits were mixed
>
> Greg
>
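
A minimal sketch of the kind of check Greg describes, not his actual script: it groups the files touched by each commit into partitions and reports commits that span more than one. The partition prefixes, the "core" fallback, and the function names are assumptions made for illustration; the input is assumed to be the text produced by `git log --since=<date> --name-only --pretty=format:'commit %H'`.

```python
# Hypothetical partition prefixes; the real script's configuration may differ.
PARTITIONS = {
    "libraries": ("flink-libraries/",),
    "connectors": ("flink-connectors/",),
    "contrib": ("flink-contrib/",),
    "cluster": ("flink-mesos/", "flink-yarn/", "flink-yarn-tests/"),
}


def partition_of(path):
    """Return the partition a file belongs to; anything unmatched counts as core."""
    for name, prefixes in PARTITIONS.items():
        if path.startswith(prefixes):
            return name
    return "core"


def parse_log(text):
    """Map each commit hash to the list of files it modified.

    Expects lines of the form 'commit <hash>' followed by file paths.
    """
    commits = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("commit "):
            current = line[len("commit "):]
            commits[current] = []
        elif current is not None:
            commits[current].append(line)
    return commits


def mixed_commits(commits):
    """Return commits whose files fall into more than one partition."""
    mixed = {}
    for sha, files in commits.items():
        parts = {partition_of(f) for f in files}
        if len(parts) > 1:
            mixed[sha] = sorted(parts)
    return mixed


def mixed_share(mixed, total):
    """Percentage of commits that were mixed."""
    return 100.0 * mixed / total
```

Applied to the counts in the message, `mixed_share(36, 571)`, `mixed_share(155, 1607)` and `mixed_share(272, 2561)` come out at roughly 6.3%, 9.6% and 10.6%, so about one commit in ten to sixteen would have touched more than one repository.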
>
> > On Mar 15, 2017, at 1:13 PM, Stephan Ewen <[hidden email]> wrote:
> >
> > @Robert - I think once we know that a separate git repo works well, and
> > that it actually solves problems, I see no reason to not create a
> > connectors repository later. The infrastructure changes should be
> identical
> > for two or more repositories.
> >
> > On Wed, Mar 15, 2017 at 5:22 PM, Till Rohrmann <[hidden email]>
> wrote:
> >
> >> I think it should not be at least the flink-dist but exactly the
> remaining
> >> flink-dist module. Otherwise we do redundant work.
> >>
> >> On Wed, Mar 15, 2017 at 5:03 PM, Robert Metzger <[hidden email]>
> >> wrote:
> >>
> >>> "flink-core" means the main repository, not the "flink-core" module.
> >>>
> >>> When doing a release, we need to build the flink main code first,
> because
> >>> the flink-libraries depend on that.
> >>> Once the "flink-libraries" are built, we need to run the main build
> again
> >>> (at least the flink-dist module), so that it is pulling the artifacts
> >> from
> >>> the flink-libraries to put them into the opt/ folder of the final
> >> artifact.
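
The release order discussed above can be sketched as a small dependency graph. This is only an illustration under assumed names: "flink-main" stands for the main repository without flink-dist, and the three build units are hypothetical labels, not actual Maven module coordinates. If flink-dist is treated as its own unit that depends on both the main code and the libraries, a topological sort builds every unit exactly once, which is the point Till makes about avoiding redundant work.

```python
from graphlib import TopologicalSorter

# Hypothetical build units: each key maps to the set of units it depends on.
DEPENDS_ON = {
    "flink-main": set(),
    "flink-libraries": {"flink-main"},
    "flink-dist": {"flink-main", "flink-libraries"},
}


def build_order(depends_on=DEPENDS_ON):
    """Return the units in an order where every dependency is built first."""
    return list(TopologicalSorter(depends_on).static_order())
```

Calling `build_order()` yields the main code, then the libraries, then flink-dist, so the main code is not built a second time.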
> >>>
> >>>
> >>>
> >>>
> >>> On Wed, Mar 15, 2017 at 4:44 PM, Till Rohrmann <[hidden email]>
> >>> wrote:
> >>>
> >>>> I'm ok with point 3.
> >>>>
> >>>> Concerning point 8: Why do we have to build flink-core twice after
> >> having
> >>>> it built as a dependency for flink-libraries? This seems wrong to me.
> >>>>
> >>>> Cheers,
> >>>> Till
> >>>>
> >>>> On Wed, Mar 15, 2017 at 4:23 PM, Robert Metzger <[hidden email]>
> >>>> wrote:
> >>>>
> >>>>> Thank you. Running on AWS is a good idea!
> >>>>> Let me know if you (or anybody else) want to help me with the
> >>>>> infrastructure work! Any help is much appreciated (as I've said
> >>> before, I
> >>>>> don't really have time for doing this, but it has to be done :) )
> >>>>>
> >>>>> I'm against creating two new repositories. I fear that this
> >> introduces
> >>>> too
> >>>>> much complexity and too many repositories.
> >>>>> "flink" and "flink-libraries" are hopefully enough to get the build
> >>> time
> >>>>> significantly down.
> >>>>> We can also consider putting the connectors into the
> >> "flink-libraries"
> >>>> repo
> >>>>> if we need to further reduce the build time.
> >>>>>
> >>>>> We should probably move "flink-table" out of "flink-libraries" if we
> >>> want
> >>>>> to keep "flink-table" in the main repo. (This would eliminate the
> >>>>> "flink-libraries" module from main.)
> >>>>>
> >>>>> Also, I agree that "flink-statebackend-rocksdb" is not correctly
> >> placed
> >>>> in
> >>>>> contrib anymore.
> >>>>>
> >>>>>
> >>>>> On Wed, Mar 15, 2017 at 4:07 PM, Greg Hogan <[hidden email]>
> >>> wrote:
> >>>>>
> >>>>>> Robert, appreciate your kickstarting this task.
> >>>>>>
> >>>>>> We should compare the verification time with and without the listed
> >>>>>> modules. I’ll try to run this by tomorrow on AWS and on Travis.
> >>>>>>
> >>>>>> Should we maintain separate repos for flink-contrib and
> >>>> flink-libraries?
> >>>>>> Are you intending that we move flink-table out of flink-libraries
> >>> (and
> >>>>>> perhaps flink-statebackend-rocksdb out of flink-contrib)?
> >>>>>>
> >>>>>> Greg
> >>>>>>
> >>>>>>
> >>>>>>> On Mar 15, 2017, at 9:55 AM, Robert Metzger <[hidden email]
> >>>
> >>>>> wrote:
> >>>>>>>
> >>>>>>> Thank you for looking into this Till.
> >>>>>>>
> >>>>>>> I think we then have to split the repositories.
> >>>>>>> My main motivation for doing this is that it seems to be the only
> >>>>>> feasible
> >>>>>>> way of scaling the community to allow more committers working on
> >>> the
> >>>>>>> libraries.
> >>>>>>>
> >>>>>>> I'll take care of getting things started.
> >>>>>>>
> >>>>>>> As the next steps I propose to:
> >>>>>>> 1. Ask INFRA to rename https://git-wip-us.apache.org/
> >>>>> repos/asf?p=flink-
> >>>>>>> connectors.git;a=summary to "flink-libraries"
> >>>>>>> 2. Ask INFRA to set up GitHub and travis integration for
> >>>>>> "flink-libraries"
> >>>>>>> 3. Put the code of "flink-ml", "flink-gelly", "flink-python",
> >>>>>> "flink-cep",
> >>>>>>> "flink-scala-shell", "flink-storm" into the new repository. (I
> >>>> decided
> >>>>>>> against moving flink-contrib there, because rocksdb is in the
> >>> contrib
> >>>>>>> module, for flink-table, I'm undecided, but I kept it in the main
> >>>> repo
> >>>>>>> because it's probably going to interact more with the core code in
> >>> the
> >>>>>>> future)
> >>>>>>> I'll try to preserve the history of those modules when splitting
> >> them
> >>>> into
> >>>>>> the
> >>>>>>> new repo
> >>>>>>> 4. I'll close all pull requests against those modules in the main
> >>>> repo.
> >>>>>>> 5. I'll set up a minimal documentation page for the library
> >>>> repository,
> >>>>>>> similar to the main documentation.
> >>>>>>> 6. I'll update the documentation build process to build both
> >>>>>> documentations
> >>>>>>> & link them to each other
> >>>>>>> 7. I'll update the nightly deployment process to include both
> >>>>>> repositories
> >>>>>>> 8. I'll update the release script to create the Flink release out
> >>> of
> >>>>> both
> >>>>>>> repositories. In order to put the libraries into the opt/ dir of
> >>> the
> >>>>>>> release, I'll need to change the build of "flink-dist" so that it
> >>>> first
> >>>>>>> builds flink core, then the libraries and then the core again
> >> with
> >>>> the
> >>>>>>> libraries as an additional dependency.
> >>>>>>>
> >>>>>>> The main question for the community is: do you agree with point
> >> 3 ?
> >>>>> Would
> >>>>>>> you like to include more or less?
> >>>>>>>
> >>>>>>> I'll start with 1. and 2. tomorrow morning.
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> On Wed, Mar 15, 2017 at 1:48 PM, Till Rohrmann <
> >>> [hidden email]
> >>>>>
> >>>>>> wrote:
> >>>>>>>
> >>>>>>>> In theory we could have a merging bot which solves the problem
> >> of
> >>>> the
> >>>>>>>> "commit window". Once the PR passes all tests and has enough
> >> +1s,
> >>>> the
> >>>>>> bot
> >>>>>>>> could do the merging and, thus, it effectively linearizes the
> >>> merge
> >>>>>>>> process.
> >>>>>>>>
> >>>>>>>> I think the second point is actually a disadvantage because
> >> there
> >>> is
> >>>>> not
> >>>>>>>> such an immediate incentive/pressure to fix the broken module if
> >>> it
> >>>>>> lives
> >>>>>>>> in a separate repository. Furthermore, breaking API changes in
> >> the
> >>>>> core
> >>>>>>>> will most likely go unnoticed for some time in other modules
> >> which
> >>>> are
> >>>>>> not
> >>>>>>>> developed so actively. In the worst case these things will only
> >> be
> >>>>>> noticed
> >>>>>>>> when we try to make a release.
> >>>>>>>>
> >>>>>>>> But I also agree that we are not Google and we don't have the
> >>>>>> capacities to
> >>>>>>>> maintain such a smooth build process that we can keep all the
> >>> code
> >>>>> in
> >>>>>> a
> >>>>>>>> single repository.
> >>>>>>>>
> >>>>>>>> I looked a bit into Gradle and as far as I can tell it offers
> >> some
> >>>>> nice
> >>>>>>>> features wrt incrementally building projects. This would be
> >>>> beneficial
> >>>>>> for
> >>>>>>>> local development but it would not solve our build time problems
> >>> on
> >>>>>> Travis.
> >>>>>>>> Gradle intends to introduce a task result cache which allows us to
> >>>> reuse
> >>>>>>>> results across builds. This could help when building on Travis,
> >>>>>> however, it
> >>>>>>>> is not yet fully implemented. Moreover, migrating from Maven to
> >>>> Gradle
> >>>>>>>> won't come for free (there's simply no free lunch out there) and
> >>> we
> >>>>>> might
> >>>>>>>> risk introducing new bugs. Therefore, I would vote to split the
> >>>>>> repository
> >>>>>>>> in order to mitigate our current problems with Travis and the
> >>> build
> >>>>>> time in
> >>>>>>>> general. Whether to use a different build system or not can then
> >>> be
> >>>>>>>> discussed as an orthogonal question.
> >>>>>>>>
> >>>>>>>> Cheers,
> >>>>>>>> Till
> >>>>>>>>
> >>>>>>>> On Tue, Mar 14, 2017 at 8:05 PM, Stephan Ewen <[hidden email]
> >>>
> >>>>> wrote:
> >>>>>>>>
> >>>>>>>>> Some other thoughts on how repository split would help. I am
> >> not
> >>>> sure
> >>>>>> for
> >>>>>>>>> all of them, so please comment:
> >>>>>>>>>
> >>>>>>>>> - There is less competition for a "commit window". It happens
> >> a
> >>>> lot
> >>>>>>>>> already that you run all tests and want to commit, but there
> >> was
> >>> a
> >>>>>> commit
> >>>>>>>>> in the meantime. You rebase, need to re-test, again commit in
> >> the
> >>>>>>>> meantime.
> >>>>>>>>>   For a "linear" commit history, this may become a bottleneck
> >>>>>>>> eventually
> >>>>>>>>> as well.
> >>>>>>>>>
> >>>>>>>>> - There is less risk of broken master. If one
> >> repository/modules
> >>>>>> breaks
> >>>>>>>>> its master, the others can still continue.
> >>>>>>>>>
> >>>>>>>>> Stephan
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On Fri, Mar 10, 2017 at 12:20 PM, Till Rohrmann <
> >>>>> [hidden email]>
> >>>>>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>>> Thanks for all your input. In order to wrap the discussion up
> >>> I'd
> >>>>> like
> >>>>>>>> to
> >>>>>>>>>> summarize the mentioned points:
> >>>>>>>>>>
> >>>>>>>>>> The problem of increasing build times and complexity of the
> >>>> project
> >>>>>> has
> >>>>>>>>>> been acknowledged. Ideally we would have everything in one
> >>>>> repository
> >>>>>>>>> using
> >>>>>>>>>> an incremental build tool. Since Maven does not properly
> >> support
> >>>>> this
> >>>>>>>> we
> >>>>>>>>>> would have to switch our build tool to something like Gradle,
> >>> for
> >>>>>>>>> example.
> >>>>>>>>>>
> >>>>>>>>>> Another option is introducing build profiles for different
> >> sets
> >>> of
> >>>>>>>>> modules
> >>>>>>>>>> as well as separating integration and unit tests. The third
> >>>>>> alternative
> >>>>>>>>>> would be creating sub-projects with their own repositories. I
> >>>>> actually
> >>>>>>>>>> think that these two proposals are not necessarily exclusive
> >> and
> >>> it
> >>>>>>>> would
> >>>>>>>>>> also make sense to have a separation between unit and
> >>> integration
> >>>>>> tests
> >>>>>>>>> if
> >>>>>>>>>> we split the repository.
> >>>>>>>>>>
> >>>>>>>>>> The overall consensus seems to be that we don't want to split
> >>> the
> >>>>>>>>> community
> >>>>>>>>>> and want to keep everything under the same umbrella. I think
> >>> this
> >>>> is
> >>>>>>>> the
> >>>>>>>>>> right way to go, because otherwise some parts of the project
> >>> could
> >>>>>>>> become
> >>>>>>>>>> second class citizens. Given that and that we continue using
> >>>> Maven,
> >>>>> I
> >>>>>>>>> still
> >>>>>>>>>> think that creating sub-projects for the libraries, for
> >> example,
> >>>>> could
> >>>>>>>> be
> >>>>>>>>>> beneficial. A split could reduce the project's complexity and
> >>> make
> >>>>> it
> >>>>>>>>>> potentially easier for libraries to get actively developed.
> >> The
> >>>> main
> >>>>>>>>>> concern is setting up the build infrastructure to aggregate
> >> docs
> >>>>> from
> >>>>>>>>>> multiple repositories and making them publicly available.
> >>>>>>>>>>
> >>>>>>>>>> Since I started this thread and I would really like to see
> >>> Flink's
> >>>>> ML
> >>>>>>>>>> library being revived again, I'd volunteer investigating first
> >>>>> whether
> >>>>>>>> it
> >>>>>>>>>> is doable establishing a proper incremental build for Flink.
> >> If
> >>>> that
> >>>>>>>>> should
> >>>>>>>>>> not be possible, I will look into splitting the repository,
> >>> first
> >>>>> only
> >>>>>>>>> for
> >>>>>>>>>> the libraries. I'll share my results with the community once
> >> I'm
> >>>>> done
> >>>>>>>>> with
> >>>>>>>>>> the investigation.
> >>>>>>>>>>
> >>>>>>>>>> Cheers,
> >>>>>>>>>> Till
> >>>>>>>>>>
> >>>>>>>>>> On Fri, Feb 24, 2017 at 3:50 PM, Robert Metzger <
> >>>>> [hidden email]>
> >>>>>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> @Jin Mingjian: You can not use the paid travis version for
> >> open
> >>>>>>>> source
> >>>>>>>>>>> projects. It only works for private repositories (at least
> >> back
> >>>>> then
> >>>>>>>>> when
> >>>>>>>>>>> we've asked them about that).
> >>>>>>>>>>>
> >>>>>>>>>>> @Stephan: I don't think that incremental builds will be
> >>> available
> >>>>>>>> with
> >>>>>>>>>>> Maven anytime soon.
> >>>>>>>>>>>
> >>>>>>>>>>> I agree that we need to fix the build time issue on Travis.
> >>> I've
> >>>>>>>>> recently
> >>>>>>>>>>> pushed a commit to use now three instead of two test groups.
> >>>>>>>>>>> But I don't think that this is a feasible long-term solution.
> >>>>>>>>>>>
> >>>>>>>>>>> If this discussion is only about reducing the build and test
> >>>> time,
> >>>>>>>>>>> introducing build profiles for different components as
> >> Aljoscha
> >>>>>>>>> suggested
> >>>>>>>>>>> would solve the problem Till mentioned.
> >>>>>>>>>>> Also, if we decide that travis is not a good tool anymore for
> >>> the
> >>>>>>>>>> testing,
> >>>>>>>>>>> I guess we can find a different solution. There are now
> >>>> competitors
> >>>>>>>> to
> >>>>>>>>>>> Travis that might be willing to offer a paid plan for an open
> >>>>> source
> >>>>>>>>>>> project, or we set up our own infra on a server sponsored by
> >>> one
> >>>> of
> >>>>>>>> the
> >>>>>>>>>>> contributing companies.
> >>>>>>>>>>> If we want to solve "community issues" with the change as
> >> well,
> >>>>> then
> >>>>>>>> I
> >>>>>>>>>>> think it's worth the effort of splitting up Flink into
> >> different
> >>>>>>>>>>> repositories.
> >>>>>>>>>>>
> >>>>>>>>>>> Splitting up repositories is not a trivial task in my
> >> opinion.
> >>> As
> >>>>>>>>> others
> >>>>>>>>>>> have mentioned before, we need to consider the following
> >>> things:
> >>>>>>>>>>> - How are we going to build the documentation? Ideally every
> >>> repo
> >>>>>>>>> should
> >>>>>>>>>>> contain its docs, so we would need to pull them together when
> >>>>>>>> building
> >>>>>>>>>> the
> >>>>>>>>>>> main docs.
> >>>>>>>>>>> - How do we organize the dependencies? If we have a library
> >>> repository
> >>>>>>>>> depend
> >>>>>>>>>> on
> >>>>>>>>>>> snapshot Flink versions, we need to make sure that the
> >> snapshot
> >>>>>>>>>> deployment
> >>>>>>>>>>> always works. This also means that people working on a
> >> library
> >>>>>>>>> repository
> >>>>>>>>>>> will pull from snapshot OR need to build first locally.
> >>>>>>>>>>> - We need to update the release scripts
> >>>>>>>>>>>
> >>>>>>>>>>> If we commit to do these changes, we need to assign at least
> >>> one
> >>>>>>>>>> committer
> >>>>>>>>>>> (yes, in this case we need somebody who can commit, for
> >> example
> >>>> for
> >>>>>>>>>>> updating the buildbot stuff) who volunteers to do the change.
> >>>>>>>>>>> I've done a lot of infrastructure work in the past, but I'm
> >>>>> currently
> >>>>>>>>>>> pretty booked with many other things, so I don't
> >> realistically
> >>>> see
> >>>>>>>>> myself
> >>>>>>>>>>> doing that. Max who used to work on these things is taking
> >> some
> >>>>> time
> >>>>>>>>> off.
> >>>>>>>>>>> I think we need, best case 3 days for the change, worst case
> >> 5
> >>>>> days.
> >>>>>>>>> The
> >>>>>>>>>>> problem is that there are no "unit tests" for the infra
> >> stuff,
> >>> so
> >>>>>>>> many
> >>>>>>>>>>> things are "trial and error" (like Apache's buildbot, our
> >>> release
> >>>>>>>>>> scripts,
> >>>>>>>>>>> the doc scripts, maven stuff, nightly builds).
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> On Thu, Feb 23, 2017 at 1:33 PM, Stephan Ewen <
> >>> [hidden email]>
> >>>>>>>>> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> If we can get incremental builds to work, that would
> >>> actually
> >>>> be
> >>>>>>>>> the
> >>>>>>>>>>>> preferred solution in my opinion.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Many companies have invested heavily in making a "single
> >>>>>>>> repository"
> >>>>>>>>>> code
> >>>>>>>>>>>> base work, because it has the advantage of not having to
> >>>>>>>>> update/publish
> >>>>>>>>>>>> several repositories first.
> >>>>>>>>>>>> However, the strong prerequisite for that is an incremental
> >>>> build
> >>>>>>>>>> system
> >>>>>>>>>>>> that builds only (fine grained) what it has to build. I am
> >> not
> >>>>> sure
> >>>>>>>>> how
> >>>>>>>>>>> we
> >>>>>>>>>>>> could make that work
> >>>>>>>>>>>> with Maven and Travis...
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Wed, Feb 22, 2017 at 10:42 PM, Greg Hogan <[hidden email]> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> An additional option for reducing time to build and test is parallel
> >>>>>>>>>>>>> execution. This would help users more than it helps on TravisCI, since we're
> >>>>>>>>>>>>> generally running on multi-core machines rather than VM slices.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Is the idea that each user would only check out the modules that he or she
> >>>>>>>>>>>>> is developing with? For example, if a developer is not working on
> >>>>>>>>>>>>> flink-mesos or flink-yarn then the "flink-deploy" module would not be cloned
> >>>>>>>>>>>>> to their filesystem?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> We can run a TravisCI nightly build on each repo to validate against API
> >>>>>>>>>>>>> changes.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Greg
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Wed, Feb 22, 2017 at 12:24 PM, Fabian Hueske <[hidden email]> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> Hi everybody,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I think this should be a discussion about the benefits and drawbacks of
> >>>>>>>>>>>>>> separating the code into distinct repositories from a development point of
> >>>>>>>>>>>>>> view.
> >>>>>>>>>>>>>> So I agree with Stephan that we should not divide the community by creating
> >>>>>>>>>>>>>> separate groups of committers.
> >>>>>>>>>>>>>> Also the discussion about independent releases is not strictly related
> >>>>>>>>>>>>>> to the decision, IMO.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I see a few pros and cons for splitting the code base into separate
> >>>>>>>>>>>>>> repositories which (I think) haven't been mentioned before:
> >>>>>>>>>>>>>> pros:
> >>>>>>>>>>>>>> - IDE setup will be leaner. It is not necessary to compile the whole code
> >>>>>>>>>>>>>> base to run a test after switching a branch.
> >>>>>>>>>>>>>> cons:
> >>>>>>>>>>>>>> - developing library features that require changes in the core / APIs
> >>>>>>>>>>>>>> becomes more time consuming due to back-and-forth between code bases.
> >>>>>>>>>>>>>> However, I think this is not very often the case.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Aljoscha has good points as well. Many of the build issues could be solved
> >>>>>>>>>>>>>> by different build profiles and configurations.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Best, Fabian
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> 2017-02-22 14:59 GMT+01:00 Gábor Hermann <[hidden email]>:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> @Stephan:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Although I tried to raise some issues about splitting committers, I'm
> >>>>>>>>>>>>>>> still strongly in favor of some kind of restructuring. We just have to be
> >>>>>>>>>>>>>>> conscious about the disadvantages.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Not splitting the committers could leave the libraries in the same
> >>>>>>>>>>>>>>> stalling status, described by Till. Of course, dedicating current
> >>>>>>>>>>>>>>> committers as shepherds of the libraries could easily resolve the issue.
> >>>>>>>>>>>>>>> But that requires time from current committers. It seems like trade-offs
> >>>>>>>>>>>>>>> between code quality, speed of development, and committer efforts.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> From what I see in the discussion about ML, there are many people willing
> >>>>>>>>>>>>>>> to contribute as well as production use-cases. This means we could and
> >>>>>>>>>>>>>>> should move forward. However, the development speed is significantly slowed
> >>>>>>>>>>>>>>> down by stalling PRs. The proposal for contributors helping the review
> >>>>>>>>>>>>>>> process did not really work out so far. In my opinion, either code quality
> >>>>>>>>>>>>>>> (by more easily accepting new committers) or some committer time
> >>>>>>>>>>>>>>> (reviewing/merging) should be sacrificed to move forward. As Till has
> >>>>>>>>>>>>>>> indicated, it would be shameful if we let this contribution effort die.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Cheers,
> >>>>>>>>>>>>>>> Gabor

Re: [DISCUSS] Project build time and possible restructuring

Robert Metzger
Thank you for looking into the build times.

I didn't know that the build time situation is so bad. Even with yarn,
mesos, connectors and libraries removed, we are still running into the
build timeout :(

Aljoscha told me that the Beam community is using Jenkins for running the
tests, and they are planning to completely move away from Travis. I wonder
whether we should do the same, as having our own Jenkins servers would
allow us to run tests for more than 50 minutes.

I agree with Stephan that we should keep the yarn and mesos tests in the
core for stability / testing quality purposes.


On Mon, Mar 20, 2017 at 11:27 AM, Stephan Ewen <[hidden email]> wrote:

> @Greg
>
> I am personally in favor of splitting "connectors" and "contrib" out as
> well. I know that @rmetzger has some reservations about the connectors, but
> we may be able to convince him.
>
> For the cluster tests (yarn / mesos) - in the past there were many cases
> where these tests caught cases that other tests did not, because they are
> the only tests that actually use the "flink-dist.jar" and thus discover
> many dependency and configuration issues. For that reason, my feeling would
> be that they are valuable in the core repository.
>
> I would actually suggest doing only the library split initially, to see
> what the challenges are in setting up the multi-repo build and release
> tooling. Once we have gathered experience there, we can probably easily see what
> else we can split out.
>
> Stephan
>
>
> On Fri, Mar 17, 2017 at 8:37 PM, Greg Hogan <[hidden email]> wrote:
>
> > I’d like to use this refactoring opportunity to unsplit the Travis tests.
> > With 51 builds queued up for the weekend (some of which may fail or have
> > been force pushed) we are at the limit of the number of contributions we
> > can process. Fixing this requires 1) splitting the project, 2)
> > investigating speedups for long-running tests, and 3) staying cognizant of
> > test performance when accepting new code.
> >
> > I’d like to add one to Stephan’s list of module groups. I like that the
> > modules are generic (“libraries”) so that no one module is alone and
> > independent.
> >
> > Flink has three “libraries”: cep, ml, and gelly.
> >
> > “connectors” is a hotspot due to the long-running Kafka tests (and
> > connectors for three Kafka versions).
> >
> > Both flink-storm and flink-python have a modest number of tests
> > and could live with the miscellaneous modules in “contrib”.
> >
> > The YARN tests are long-running and problematic (I am unable to
> > successfully run these locally). A “cluster” module could host flink-mesos,
> > flink-yarn, and flink-yarn-tests.
> >
> > That gets us close to running all tests in a single Travis build.
> >   https://travis-ci.org/greghogan/flink/builds/212122590
> >
> > I also tested (https://github.com/greghogan/flink/commits/core_build) with a maven
> > parallelism of 2 and 4, with the latter giving a 6.4% drop in build time.
> >   https://travis-ci.org/greghogan/flink/builds/212137659
> >   https://travis-ci.org/greghogan/flink/builds/212154470
> >
> > We can run Travis CI builds nightly to guard against breaking changes.
> >
> > I also wanted to get an idea of how disruptive it would be to developers
> > to divide the project into multiple git repos. I wrote a simple python
> > script and configured it with the module partitions listed above. The usage
> > string from the top of the file lists commits with files from multiple
> > partitions as well as the modified files.
> >   https://gist.github.com/greghogan/f38a8efe6b6dd5a162a6b43335ac4897
> >
> > Accounting for the merging of the batch and streaming connector modules,
> > and assuming that the project structure has not changed much over the past
> > 15 months, for the following date ranges the listed number of commits would
> > have been split across repositories.
> >
> > since "2017-01-01"
> > 36 of 571 commits were mixed
> >
> > since "2016-07-01"
> > 155 of 1607 commits were mixed
> >
> > since "2016-01-01"
> > 272 of 2561 commits were mixed
> >
> > Greg
> >
> >
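For readers who want to try the mixed-commit analysis Greg describes in his mail, here is a minimal sketch of the idea. It is not the script from his gist. The path prefixes in PARTITIONS are assumptions made for illustration (the real module layout may differ), and the function names are hypothetical. The sketch assigns each changed file to a partition, calls a commit "mixed" when its files land in more than one partition, and turns the counts quoted in his mail into percentages.

```python
# Hypothetical path prefixes for each proposed repository. These are
# placeholders for illustration and may not match the real module layout.
PARTITIONS = {
    "libraries": (
        "flink-libraries/flink-cep/",
        "flink-libraries/flink-ml/",
        "flink-libraries/flink-gelly/",
    ),
    "connectors": ("flink-connectors/",),
    "contrib": ("flink-contrib/", "flink-libraries/flink-python/"),
    "cluster": ("flink-mesos/", "flink-yarn/", "flink-yarn-tests/"),
}


def partition_of(path):
    """Return the partition a file path belongs to, defaulting to 'core'."""
    for name, prefixes in PARTITIONS.items():
        # str.startswith accepts a tuple of candidate prefixes.
        if path.startswith(prefixes):
            return name
    return "core"


def is_mixed(paths):
    """A commit is 'mixed' if its files fall into more than one partition."""
    return len({partition_of(p) for p in paths}) > 1


def mixed_share(mixed, total):
    """Percentage of commits that touch more than one partition."""
    return 100.0 * mixed / total


if __name__ == "__main__":
    # Made-up commits, only to exercise the classification.
    commits = {
        "a1": ["flink-runtime/src/Foo.java", "flink-libraries/flink-ml/src/Bar.scala"],
        "b2": ["flink-yarn/src/A.java", "flink-yarn-tests/src/B.java"],
        "c3": ["flink-core/src/C.java"],
    }
    print([sha for sha, paths in commits.items() if is_mixed(paths)])

    # Counts quoted in Greg's mail: (since, mixed commits, total commits).
    for since, mixed, total in [
        ("2017-01-01", 36, 571),
        ("2016-07-01", 155, 1607),
        ("2016-01-01", 272, 2561),
    ]:
        print(since, f"{mixed_share(mixed, total):.1f}%")
```

With the counts from the mail this comes to roughly 6.3%, 9.6% and 10.6% of commits touching more than one partition. The per-commit file lists could be taken from something like `git log --name-only`; parsing that output is left out here to keep the sketch short.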

Re: [DISCUSS] Project build time and possible restructuring

Aljoscha Krettek-2
I prefer Jenkins to Travis by far. Working on Beam, where we have good Jenkins integration, has opened my eyes to what is possible with good CI integration.

For example, look at this recent Beam PR: https://github.com/apache/beam/pull/2263. The Jenkins-GitHub integration will tell you exactly which tests failed, and if you click on the links you can look at the log output/stdout of the tests in question.

This is the overview page of one of the Jenkins jobs that we have in Beam: https://builds.apache.org/job/beam_PostCommit_Java_RunnableOnService_Flink/. This is an example of a stable build: https://builds.apache.org/job/beam_PostCommit_Java_RunnableOnService_Flink/lastStableBuild/. Notice how it gives you fine-grained information about the Maven run. This is an unstable run: https://builds.apache.org/job/beam_PostCommit_Java_RunnableOnService_Flink/lastUnstableBuild/. There you can see which tests failed and you can easily drill down.

Best,
Aljoscha

> On 20 Mar 2017, at 11:46, Robert Metzger <[hidden email]> wrote:
>
> Thank you for looking into the build times.
>
> I didn't know that the build time situation is so bad. Even with yarn, mesos, connectors and libraries removed, we are still running into the build timeout :(
>
> Aljoscha told me that the Beam community is using Jenkins for running the tests, and they are planning to completely move away from Travis. I wonder whether we should do the same, as having our own Jenkins servers would allow us to run tests for more than 50 minutes.
>
> I agree with Stephan that we should keep the yarn and mesos tests in the core for stability / testing quality purposes.
>
>
> On Mon, Mar 20, 2017 at 11:27 AM, Stephan Ewen <[hidden email]> wrote:
> @Greg
>
> I am personally in favor of splitting "connectors" and "contrib" out as
> well. I know that @rmetzger has some reservations about the connectors, but
> we may be able to convince him.
>
> For the cluster tests (yarn / mesos) - in the past there were many cases
> where these tests caught cases that other tests did not, because they are
> the only tests that actually use the "flink-dist.jar" and thus discover
> many dependency and configuration issues. For that reason, my feeling would
> be that they are valuable in the core repository.
>
> I would actually suggest to do only the library split initially, to see
> what the challenges are in setting up the multi-repo build and release
> tooling. Once we gathered experience there, we can probably easily see what
> else we can split out.
>
> Stephan
>
>
> On Fri, Mar 17, 2017 at 8:37 PM, Greg Hogan <[hidden email]> wrote:
>
> > I’d like to use this refactoring opportunity to unsplit the Travis tests.
> > With 51 builds queued up for the weekend (some of which may fail or have
> > been force pushed) we are at the limit of the number of contributions we
> > can process. Fixing this requires 1) splitting the project, 2)
> > investigating speedups for long-running tests, and 3) staying cognizant of
> > test performance when accepting new code.
> >
> > I’d like to add one to Stephan’s list of module groups. I like that the
> > modules are generic (“libraries”) so that no one module is alone and
> > independent.
> >
> > Flink has three “libraries”: cep, ml, and gelly.
> >
> > “connectors” is a hotspot due to the long-running Kafka tests (and
> > connectors for three Kafka versions).
> >
> > Both flink-storm and flink-python have a modest number of tests
> > and could live with the miscellaneous modules in “contrib”.
> >
> > The YARN tests are long-running and problematic (I am unable to
> > successfully run these locally). A “cluster” module could host flink-mesos,
> > flink-yarn, and flink-yarn-tests.
> >
> > That gets us close to running all tests in a single Travis build.
> >   https://travis-ci.org/greghogan/flink/builds/212122590
> >
> > I also tested (https://github.com/greghogan/flink/commits/core_build) with a Maven
> > parallelism of 2 and 4, with the latter a 6.4% drop in build time.
> >   https://travis-ci.org/greghogan/flink/builds/212137659
> >   https://travis-ci.org/greghogan/flink/builds/212154470
> >
> > We can run Travis CI builds nightly to guard against breaking changes.
> >
> > I also wanted to get an idea of how disruptive it would be to developers
> > to divide the project into multiple git repos. I wrote a simple python
> > script and configured it with the module partitions listed above. The usage
> > string from the top of the file lists commits with files from multiple
> > partitions as well as the modified files.
> >   https://gist.github.com/greghogan/f38a8efe6b6dd5a162a6b43335ac4897
> >
> > Accounting for the merging of the batch and streaming connector modules,
> > and assuming that the project structure has not changed much over the past
> > 15 months, for the following date ranges the listed number of commits would
> > have been split across repositories.
> >
> > since "2017-01-01"
> > 36 of 571 commits were mixed
> >
> > since "2016-07-01"
> > 155 of 1607 commits were mixed
> >
> > since "2016-01-01"
> > 272 of 2561 commits were mixed
> >
> > Greg
> >
> >
> > > On Mar 15, 2017, at 1:13 PM, Stephan Ewen <[hidden email] <mailto:[hidden email]>> wrote:
> > >
> > > @Robert - I think once we know that a separate git repo works well, and
> > > that it actually solves problems, I see no reason to not create a
> > > connectors repository later. The infrastructure changes should be
> > identical
> > > for two or more repositories.
> > >
> > > On Wed, Mar 15, 2017 at 5:22 PM, Till Rohrmann <[hidden email] <mailto:[hidden email]>>
> > wrote:
> > >
> > >> I think it should not be at least the flink-dist but exactly the
> > remaining
> > >> flink-dist module. Otherwise we do redundant work.
> > >>
> > >> On Wed, Mar 15, 2017 at 5:03 PM, Robert Metzger <[hidden email] <mailto:[hidden email]>>
> > >> wrote:
> > >>
> > >>> "flink-core" means the main repository, not the "flink-core" module.
> > >>>
> > >>> When doing a release, we need to build the flink main code first,
> > because
> > >>> the flink-libraries depend on that.
> > >>> Once the "flink-libraries" are built, we need to run the main build
> > again
> > >>> (at least the flink-dist module), so that it is pulling the artifacts
> > >> from
> > >>> the flink-libraries to put them into the opt/ folder of the final
> > >> artifact.
> > >>>
> > >>>
> > >>>
> > >>>
> > >>> On Wed, Mar 15, 2017 at 4:44 PM, Till Rohrmann <[hidden email] <mailto:[hidden email]>>
> > >>> wrote:
> > >>>
> > >>>> I'm ok with point 3.
> > >>>>
> > >>>> Concerning point 8: Why do we have to build flink-core twice after
> > >> having
> > >>>> it built as a dependency for flink-libraries? This seems wrong to me.
> > >>>>
> > >>>> Cheers,
> > >>>> Till
> > >>>>
> > >>>> On Wed, Mar 15, 2017 at 4:23 PM, Robert Metzger <[hidden email] <mailto:[hidden email]>>
> > >>>> wrote:
> > >>>>
> > >>>>> Thank you. Running on AWS is a good idea!
> > >>>>> Let me know if you (or anybody else) wants to help me with the
> > >>>>> infrastructure work! Any help is much appreciated (as I've said
> > >>> before, I
> > >>>>> don't really have time for doing this, but it has to be done :) )
> > >>>>>
> > >>>>> I'm against creating two new repositories. I fear that this
> > >> introduces
> > >>>> too
> > >>>>> much complexity and too many repositories.
> > >>>>> "flink" and "flink-libraries" are hopefully enough to get the build
> > >>> time
> > >>>>> significantly down.
> > >>>>> We can also consider putting the connectors into the
> > >> "flink-libraries"
> > >>>> repo
> > >>>>> if we need to further reduce the build time.
> > >>>>>
> > >>>>> We should probably move "flink-table" out of "flink-libraries" if we
> > >>> want
> > >>>>> to keep "flink-table" in the main repo. (This would eliminate the
> > >>>>> "flink-libraries" module from main.
> > >>>>>
> > >>>>> Also, I agree that "flink-statebackend-rocksdb" is not correctly
> > >> placed
> > >>>> in
> > >>>>> contrib anymore.
> > >>>>>
> > >>>>>
> > >>>>> On Wed, Mar 15, 2017 at 4:07 PM, Greg Hogan <[hidden email] <mailto:[hidden email]>>
> > >>> wrote:
> > >>>>>
> > >>>>>> Robert, appreciate your kickstarting this task.
> > >>>>>>
> > >>>>>> We should compare the verification time with and without the listed
> > >>>>>> modules. I’ll try to run this by tomorrow on AWS and on Travis.
> > >>>>>>
> > >>>>>> Should we maintain separate repos for flink-contrib and
> > >>>> flink-libraries?
> > >>>>>> Are you intending that we move flink-table out of flink-libraries
> > >>> (and
> > >>>>>> perhaps flink-statebackend-rocksdb out of flink-contrib)?
> > >>>>>>
> > >>>>>> Greg
> > >>>>>>
> > >>>>>>
> > >>>>>>> On Mar 15, 2017, at 9:55 AM, Robert Metzger <[hidden email] <mailto:[hidden email]>
> > >>>
> > >>>>> wrote:
> > >>>>>>>
> > >>>>>>> Thank you for looking into this Till.
> > >>>>>>>
> > >>>>>>> I think we then have to split the repositories.
> > >>>>>>> My main motivation for doing this is that it seems to be the only
> > >>>>>> feasible
> > >>>>>>> way of scaling the community to allow more committers working on
> > >>> the
> > >>>>>>> libraries.
> > >>>>>>>
> > >>>>>>> I'll take care of getting things started.
> > >>>>>>>
> > >>>>>>> As the next steps I propose to:
> > >>>>>>> 1. Ask INFRA to rename https://git-wip-us.apache.org/ <https://git-wip-us.apache.org/>
> > >>>>> repos/asf?p=flink-
> > >>>>>>> connectors.git;a=summary to "flink-libraries"
> > >>>>>>> 2. Ask INFRA to set up GitHub and travis integration for
> > >>>>>> "flink-libraries"
> > >>>>>>> 3. Put the code of "flink-ml", "flink-gelly", "flink-python",
> > >>>>>> "flink-cep",
> > >>>>>>> "flink-scala-shell", "flink-storm" into the new repository. (I
> > >>>> decided
> > >>>>>>> against moving flink-contrib there, because rocksdb is in the
> > >>> contrib
> > >>>>>>> module, for flink-table, I'm undecided, but I kept it in the main
> > >>>> repo
> > >>>>>>> because it's probably going to interact more with the core code in
> > >>> the
> > >>>>>>> future)
> > >>>>>>> I try to preserve the history of those modules when splitting
> > >> them
> > >>>> into
> > >>>>>> the
> > >>>>>>> new repo
> > >>>>>>> 4. I'll close all pull requests against those modules in the main
> > >>>> repo.
> > >>>>>>> 5. I'll set up a minimal documentation page for the library
> > >>>> repository,
> > >>>>>>> similar to the main documentation.
> > >>>>>>> 6. I'll update the documentation build process to build both
> > >>>>>> documentations
> > >>>>>>> & link them to each other
> > >>>>>>> 7. I'll update the nightly deployment process to include both
> > >>>>>> repositories
> > >>>>>>> 8. I'll update the release script to create the Flink release out
> > >>> of
> > >>>>> both
> > >>>>>>> repositories. In order to put the libraries into the opt/ dir of
> > >>> the
> > >>>>>>> release, I'll need to change the build of "flink-dist" so that it
> > >>>> first
> > >>>>>>> builds flink core, then the libraries and then the core again
> > >> with
> > >>>> the
> > >>>>>>> libraries as an additional dependency.
> > >>>>>>>
> > >>>>>>> The main question for the community is: do you agree with point
> > >> 3 ?
> > >>>>> Would
> > >>>>>>> you like to include more or less?
> > >>>>>>>
> > >>>>>>> I'll start with 1. and 2. tomorrow morning.
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> On Wed, Mar 15, 2017 at 1:48 PM, Till Rohrmann <
> > >>> [hidden email] <mailto:[hidden email]>
> > >>>>>
> > >>>>>> wrote:
> > >>>>>>>
> > >>>>>>>> In theory we could have a merging bot which solves the problem
> > >> of
> > >>>> the
> > >>>>>>>> "commit window". Once the PR passes all tests and has enough
> > >> +1s,
> > >>>> the
> > >>>>>> bot
> > >>>>>>>> could do the merging and, thus, it effectively linearizes the
> > >>> merge
> > >>>>>>>> process.
> > >>>>>>>>
> > >>>>>>>> I think the second point is actually a disadvantage because
> > >> there
> > >>> is
> > >>>>> not
> > >>>>>>>> such an immediate incentive/pressure to fix the broken module if
> > >>> it
> > >>>>>> lives
> > >>>>>>>> in a separate repository. Furthermore, breaking API changes in
> > >> the
> > >>>>> core
> > >>>>>>>> will most likely go unnoticed for some time in other modules
> > >> which
> > >>>> are
> > >>>>>> not
> > >>>>>>>> developed so actively. In the worst case these things will only
> > >> be
> > >>>>>> noticed
> > >>>>>>>> when we try to make a release.
> > >>>>>>>>
> > >>>>>>>> But I also agree that we are not Google and we don't have the
> > >>>>>> capacities to
> > >>>>>>>> maintain such a smooth build process that we can keep all the
> > >>> code
> > >>>>> in
> > >>>>>> a
> > >>>>>>>> single repository.
> > >>>>>>>>
> > >>>>>>>> I looked a bit into Gradle and as far as I can tell it offers
> > >> some
> > >>>>> nice
> > >>>>>>>> features wrt incrementally building projects. This would be
> > >>>> beneficial
> > >>>>>> for
> > >>>>>>>> local development but it would not solve our build time problems
> > >>> on
> > >>>>>> Travis.
> > >>>>>>>> Gradle intends to introduce a task result cache which allows to
> > >>>> reuse
> > >>>>>>>> results across builds. This could help when building on Travis,
> > >>>>>> however, it
> > >>>>>>>> is not yet fully implemented. Moreover, migrating from Maven to
> > >>>> Gradle
> > >>>>>>>> won't come for free (there's simply no free lunch out there) and
> > >>> we
> > >>>>>> might
> > >>>>>>>> risk to introduce new bugs. Therefore, I would vote to split the
> > >>>>>> repository
> > >>>>>>>> in order to mitigate our current problems with Travis and the
> > >>> build
> > >>>>>> time in
> > >>>>>>>> general. Whether to use a different build system or not can then
> > >>> be
> > >>>>>>>> discussed as an orthogonal question.
> > >>>>>>>>
> > >>>>>>>> Cheers,
> > >>>>>>>> Till
> > >>>>>>>>
> > >>>>>>>> On Tue, Mar 14, 2017 at 8:05 PM, Stephan Ewen <[hidden email] <mailto:[hidden email]>
> > >>>
> > >>>>> wrote:
> > >>>>>>>>
> > >>>>>>>>> Some other thoughts on how repository split would help. I am
> > >> not
> > >>>> sure
> > >>>>>> for
> > >>>>>>>>> all of them, so please comment:
> > >>>>>>>>>
> > >>>>>>>>> - There is less competition for a "commit window". It happens
> > >> a
> > >>>> lot
> > >>>>>>>>> already that you run all tests and want to commit, but there
> > >> was
> > >>> a
> > >>>>>> commit
> > >>>>>>>>> in the meantime. You rebase, need to re-test, again commit in
> > >> the
> > >>>>>>>> meantime.
> > >>>>>>>>>   For a "linear" commit history, this may become a bottleneck
> > >>>>>>>> eventually
> > >>>>>>>>> as well.
> > >>>>>>>>>
> > >>>>>>>>> - There is less risk of broken master. If one
> > >> repository/modules
> > >>>>>> breaks
> > >>>>>>>>> its master, the others can still continue.
> > >>>>>>>>>
> > >>>>>>>>> Stephan
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>> On Fri, Mar 10, 2017 at 12:20 PM, Till Rohrmann <
> > >>>>> [hidden email] <mailto:[hidden email]>>
> > >>>>>>>>> wrote:
> > >>>>>>>>>
> > >>>>>>>>>> Thanks for all your input. In order to wrap the discussion up
> > >>> I'd
> > >>>>> like
> > >>>>>>>> to
> > >>>>>>>>>> summarize the mentioned points:
> > >>>>>>>>>>
> > >>>>>>>>>> The problem of increasing build times and complexity of the
> > >>>> project
> > >>>>>> has
> > >>>>>>>>>> been acknowledged. Ideally we would have everything in one
> > >>>>> repository
> > >>>>>>>>> using
> > >>>>>>>>>> an incremental build tool. Since Maven does not properly
> > >> support
> > >>>>> this
> > >>>>>>>> we
> > >>>>>>>>>> would have to switch our build tool to something like Gradle,
> > >>> for
> > >>>>>>>>> example.
> > >>>>>>>>>>
> > >>>>>>>>>> Another option is introducing build profiles for different
> > >> sets
> > >>> of
> > >>>>>>>>> modules
> > >>>>>>>>>> as well as separating integration and unit tests. The third
> > >>>>>> alternative
> > >>>>>>>>>> would be creating sub-projects with their own repositories. I
> > >>>>> actually
> > >>>>>>>>>> think that these two proposals are not necessarily exclusive
> > >> and
> > >>> it
> > >>>>>>>> would
> > >>>>>>>>>> also make sense to have a separation between unit and
> > >>> integration
> > >>>>>> tests
> > >>>>>>>>> if
> > >>>>>>>>>> we split the repository.
> > >>>>>>>>>>
> > >>>>>>>>>> The overall consensus seems to be that we don't want to split
> > >>> the
> > >>>>>>>>> community
> > >>>>>>>>>> and want to keep everything under the same umbrella. I think
> > >>> this
> > >>>> is
> > >>>>>>>> the
> > >>>>>>>>>> right way to go, because otherwise some parts of the project
> > >>> could
> > >>>>>>>> become
> > >>>>>>>>>> second class citizens. Given that and that we continue using
> > >>>> Maven,
> > >>>>> I
> > >>>>>>>>> still
> > >>>>>>>>>> think that creating sub-projects for the libraries, for
> > >> example,
> > >>>>> could
> > >>>>>>>> be
> > >>>>>>>>>> beneficial. A split could reduce the project's complexity and
> > >>> make
> > >>>>> it
> > >>>>>>>>>> potentially easier for libraries to get actively developed.
> > >> The
> > >>>> main
> > >>>>>>>>>> concern is setting up the build infrastructure to aggregate
> > >> docs
> > >>>>> from
> > >>>>>>>>>> multiple repositories and making them publicly available.
> > >>>>>>>>>>
> > >>>>>>>>>> Since I started this thread and I would really like to see
> > >>> Flink's
> > >>>>> ML
> > >>>>>>>>>> library being revived again, I'd volunteer investigating first
> > >>>>> whether
> > >>>>>>>> it
> > >>>>>>>>>> is doable establishing a proper incremental build for Flink.
> > >> If
> > >>>> that
> > >>>>>>>>> should
> > >>>>>>>>>> not be possible, I will look into splitting the repository,
> > >>> first
> > >>>>> only
> > >>>>>>>>> for
> > >>>>>>>>>> the libraries. I'll share my results with the community once
> > >> I'm
> > >>>>> done
> > >>>>>>>>> with
> > >>>>>>>>>> the investigation.
> > >>>>>>>>>>
> > >>>>>>>>>> Cheers,
> > >>>>>>>>>> Till
> > >>>>>>>>>>
> > >>>>>>>>>> On Fri, Feb 24, 2017 at 3:50 PM, Robert Metzger <
> > >>>>> [hidden email] <mailto:[hidden email]>>
> > >>>>>>>>>> wrote:
> > >>>>>>>>>>
> > >>>>>>>>>>> @Jin Mingjian: You can not use the paid travis version for
> > >> open
> > >>>>>>>> source
> > >>>>>>>>>>> projects. It only works for private repositories (at least
> > >> back
> > >>>>> then
> > >>>>>>>>> when
> > >>>>>>>>>>> we've asked them about that).
> > >>>>>>>>>>>
> > >>>>>>>>>>> @Stephan: I don't think that incremental builds will be
> > >>> available
> > >>>>>>>> with
> > >>>>>>>>>>> Maven anytime soon.
> > >>>>>>>>>>>
> > >>>>>>>>>>> I agree that we need to fix the build time issue on Travis.
> > >>> I've
> > >>>>>>>>> recently
> > >>>>>>>>>>> pushed a commit to use now three instead of two test groups.
> > >>>>>>>>>>> But I don't think that this is a feasible long-term solution.
> > >>>>>>>>>>>
> > >>>>>>>>>>> If this discussion is only about reducing the build and test
> > >>>> time,
> > >>>>>>>>>>> introducing build profiles for different components as
> > >> Aljoscha
> > >>>>>>>>> suggested
> > >>>>>>>>>>> would solve the problem Till mentioned.
> > >>>>>>>>>>> Also, if we decide that travis is not a good tool anymore for
> > >>> the
> > >>>>>>>>>> testing,
> > >>>>>>>>>>> I guess we can find a different solution. There are now
> > >>>> competitors
> > >>>>>>>> to
> > >>>>>>>>>>> Travis that might be willing to offer a paid plan for an open
> > >>>>> source
> > >>>>>>>>>>> project, or we set up our own infra on a server sponsored by
> > >>> one
> > >>>> of
> > >>>>>>>> the
> > >>>>>>>>>>> contributing companies.
> > >>>>>>>>>>> If we want to solve "community issues" with the change as
> > >> well,
> > >>>>> then
> > >>>>>>>> I
> > >>>>>>>>>>> think it's worth the effort of splitting up Flink into
> > >> different
> > >>>>>>>>>>> repositories.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Splitting up repositories is not a trivial task in my
> > >> opinion.
> > >>> As
> > >>>>>>>>> others
> > >>>>>>>>>>> have mentioned before, we need to consider the following
> > >>> things:
> > >>>>>>>>>>> - How are we going to build the documentation? Ideally every
> > >>> repo
> > >>>>>>>>> should
> > >>>>>>>>>>> contain its docs, so we would need to pull them together when
> > >>>>>>>> building
> > >>>>>>>>>> the
> > >>>>>>>>>>> main docs.
> > >>>>>>>>>>> - How do we organize the dependencies? If we have library
> > >>> repository
> > >>>>>>>>> depend
> > >>>>>>>>>> on
> > >>>>>>>>>>> snapshot Flink versions, we need to make sure that the
> > >> snapshot
> > >>>>>>>>>> deployment
> > >>>>>>>>>>> always works. This also means that people working on a
> > >> library
> > >>>>>>>>> repository
> > >>>>>>>>>>> will pull from snapshot OR need to build first locally.
> > >>>>>>>>>>> - We need to update the release scripts
> > >>>>>>>>>>>
> > >>>>>>>>>>> If we commit to do these changes, we need to assign at least
> > >>> one
> > >>>>>>>>>> committer
> > >>>>>>>>>>> (yes, in this case we need somebody who can commit, for
> > >> example
> > >>>> for
> > >>>>>>>>>>> updating the buildbot stuff) who volunteers to do the change.
> > >>>>>>>>>>> I've done a lot of infrastructure work in the past, but I'm
> > >>>>> currently
> > >>>>>>>>>>> pretty booked with many other things, so I don't
> > >> realistically
> > >>>> see
> > >>>>>>>>> myself
> > >>>>>>>>>>> doing that. Max who used to work on these things is taking
> > >> some
> > >>>>> time
> > >>>>>>>>> off.
> > >>>>>>>>>>> I think we need, best case 3 days for the change, worst case
> > >> 5
> > >>>>> days.
> > >>>>>>>>> The
> > >>>>>>>>>>> problem is that there are no "unit tests" for the infra
> > >> stuff,
> > >>> so
> > >>>>>>>> many
> > >>>>>>>>>>> things are "trial and error" (like Apache's buildbot, our
> > >>> release
> > >>>>>>>>>> scripts,
> > >>>>>>>>>>> the doc scripts, maven stuff, nightly builds).
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>> On Thu, Feb 23, 2017 at 1:33 PM, Stephan Ewen <
> > >>> [hidden email] <mailto:[hidden email]>>
> > >>>>>>>>> wrote:
> > >>>>>>>>>>>
> > >>>>>>>>>>>> If we can get incremental builds to work, that would
> > >>> actually
> > >>>> be
> > >>>>>>>>> the
> > >>>>>>>>>>>> preferred solution in my opinion.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Many companies have invested heavily in making a "single
> > >>>>>>>> repository"
> > >>>>>>>>>> code
> > >>>>>>>>>>>> base work, because it has the advantage of not having to
> > >>>>>>>>> update/publish
> > >>>>>>>>>>>> several repositories first.
> > >>>>>>>>>>>> However, the strong prerequisite for that is an incremental
> > >>>> build
> > >>>>>>>>>> system
> > >>>>>>>>>>>> that builds only (fine grained) what it has to build. I am
> > >> not
> > >>>>> sure
> > >>>>>>>>> how
> > >>>>>>>>>>> we
> > >>>>>>>>>>>> could make that work
> > >>>>>>>>>>>> with Maven and Travis...
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> On Wed, Feb 22, 2017 at 10:42 PM, Greg Hogan <
> > >>>> [hidden email] <mailto:[hidden email]>>
> > >>>>>>>>>> wrote:
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>> An additional option for reducing time to build and test is
> > >>>>>>>>> parallel
> > >>>>>>>>>>>>> execution. This would help users more than on TravisCI
> > >> since
> > >>>>>>>> we're
> > >>>>>>>>>>>>> generally running on multi-core machines rather than VM
> > >>> slices.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> Is the idea that each user would only check out the modules
> > >>>> that
> > >>>>>>>> he
> > >>>>>>>>>> or
> > >>>>>>>>>>>> she
> > >>>>>>>>>>>>> is developing with? For example, if a developer is not
> > >>> working
> > >>>> on
> > >>>>>>>>>>>>> flink-mesos or flink-yarn then the "flink-deploy" module
> > >>> would
> > >>>>>>>> not
> > >>>>>>>>> be
> > >>>>>>>>>>>> clone
> > >>>>>>>>>>>>> to their filesystem?
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> We can run a TravisCI nightly build on each repo to
> > >> validate
> > >>>>>>>>> against
> > >>>>>>>>>>> API
> > >>>>>>>>>>>>> changes.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> Greg
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> On Wed, Feb 22, 2017 at 12:24 PM, Fabian Hueske <
> > >>>>>>>> [hidden email] <mailto:[hidden email]>
> > >>>>>>>>>>
> > >>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Hi everybody,
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> I think this should be a discussion about the benefits and
> > >>>>>>>>>> drawbacks
> > >>>>>>>>>>> of
> > >>>>>>>>>>>>>> separating the code into distinct repositories from a
> > >>>>>>>> development
> > >>>>>>>>>>> point
> > >>>>>>>>>>>>> of
> > >>>>>>>>>>>>>> view.
> > >>>>>>>>>>>>>> So I agree with Stephan that we should not divide the
> > >>>> community
> > >>>>>>>>> by
> > >>>>>>>>>>>>> creating
> > >>>>>>>>>>>>>> separate groups of committers.
> > >>>>>>>>>>>>>> Also the discussion about independent releases is not
> > >>>>>>>> strictly
> > >>>>>>>>>>>> related
> > >>>>>>>>>>>>>> to the decision, IMO.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> I see a few pros and cons for splitting the code base into
> > >>>>>>>>> separate
> > >>>>>>>>>>>>>> repositories which (I think) haven't been mentioned
> > >> before:
> > >>>>>>>>>>>>>> pros:
> > >>>>>>>>>>>>>> - IDE setup will be leaner. It is not necessary to compile
> > >>> the
> > >>>>>>>>>> whole
> > >>>>>>>>>>>> code
> > >>>>>>>>>>>>>> base to run a test after switching a branch.
> > >>>>>>>>>>>>>> cons:
> > >>>>>>>>>>>>>> - developing libraries features that require changes in
> > >> the
> > >>>>>>>> core
> > >>>>>>>>> /
> > >>>>>>>>>>> APIs
> > >>>>>>>>>>>>>> become more time consuming due to back-and-forth between
> > >>> code
> > >>>>>>>>>> bases.
> > >>>>>>>>>>>>>> However, I think this is not very often the case.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Aljoscha has good points as well. Many of the build issues
> > >>>>>>>> could
> > >>>>>>>>> be
> > >>>>>>>>>>>>> solved
> > >>>>>>>>>>>>>> by different build profiles and configurations.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Best, Fabian
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> 2017-02-22 14:59 GMT+01:00 Gábor Hermann <
> > >>>>>>>> [hidden email] <mailto:[hidden email]>
> > >>>>>>>>>> :
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> @Stephan:
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> Although I tried to raise some issues about splitting
> > >>>>>>>>> committers,
> > >>>>>>>>>>> I'm
> > >>>>>>>>>>>>>>> still strongly in favor of some kind of restructuring. We
> > >>>>>>>> just
> > >>>>>>>>>> have
> > >>>>>>>>>>>> to
> > >>>>>>>>>>>>> be
> > >>>>>>>>>>>>>>> conscious about the disadvantages.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> Not splitting the committers could leave the libraries in
> > >>> the
> > >>>>>>>>>> same
> > >>>>>>>>>>>>>>> stalling status, described by Till. Of course, dedicating
> > >>>>>>>>> current
> > >>>>>>>>>>>>>>> committers as shepherds of the libraries could easily
> > >>> resolve
> > >>>>>>>>> the
> > >>>>>>>>>>>>> issue.
> > >>>>>>>>>>>>>>> But that requires time from current committers. It seems
> > >>> like
> > >>>>>>>>>>>>> trade-offs
> > >>>>>>>>>>>>>>> between code quality, speed of development, and committer
> > >>>>>>>>>> efforts.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> From what I see in the discussion about ML, there are
> > >> many
> > >>>>>>>>> people
> > >>>>>>>>>>>>> willing
> > >>>>>>>>>>>>>>> to contribute as well as production use-cases. This means
> > >>> we
> > >>>>>>>>>> could
> > >>>>>>>>>>>> and
> > >>>>>>>>>>>>>>> should move forward. However, the development speed is
> > >>>>>>>>>>> significantly
> > >>>>>>>>>>>>>> slowed
> > >>>>>>>>>>>>>>> down by stalling PRs. The proposal for contributors
> > >> helping
> > >>>>>>>> the
> > >>>>>>>>>>>> review
> > >>>>>>>>>>>>>>> process did not really work out so far. In my opinion,
> > >>> either
> > >>>>>>>>>> code
> > >>>>>>>>>>>>>> quality
> > >>>>>>>>>>>>>>> (by more easily accepting new committers) or some
> > >> committer
> > >>>>>>>>> time
> > >>>>>>>>>>>>>>> (reviewing/merging) should be sacrificed to move forward.
> > >>> As
> > >>>>>>>>> Till
> > >>>>>>>>>>> has
> > >>>>>>>>>>>>>>> indicated, it would be shameful if we let this
> > >> contribution
> > >>>>>>>>>> effort
> > >>>>>>>>>>>> die.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> Cheers,
> > >>>>>>>>>>>>>>> Gabor
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>
> > >>>>
> > >>>
> > >>
> >
> >
>
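
The commit analysis described in Greg's quoted message (counting commits whose files fall into more than one module partition) can be sketched roughly as follows. This is not Greg's script, which is linked in his message; it is a minimal illustration under assumptions. The partition table, the repository names on its right-hand side, and the function names `partition_of`, `partitions_touched`, and `count_mixed` are all hypothetical, and the example commits are made up.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical partition table: path prefix -> target repository.
# The grouping loosely follows the one proposed in the thread; the names
# are placeholders, not agreed repository names.
PARTITIONS: Dict[str, str] = {
    "flink-libraries/": "libraries",
    "flink-connectors/": "connectors",
    "flink-contrib/": "contrib",
    "flink-yarn/": "cluster",
    "flink-yarn-tests/": "cluster",
    "flink-mesos/": "cluster",
}
DEFAULT_PARTITION = "core"  # everything not listed stays in the main repo


def partition_of(path: str) -> str:
    """Return the partition owning `path`, using the longest matching prefix."""
    best = ""
    for prefix in PARTITIONS:
        if path.startswith(prefix) and len(prefix) > len(best):
            best = prefix
    return PARTITIONS[best] if best else DEFAULT_PARTITION


def partitions_touched(paths: Iterable[str]) -> Set[str]:
    """Return the set of partitions that a commit's changed files belong to."""
    return {partition_of(p) for p in paths}


def count_mixed(commits: Iterable[List[str]]) -> int:
    """Count commits whose changed files span more than one partition."""
    return sum(1 for paths in commits if len(partitions_touched(paths)) > 1)


# Made-up example: each inner list is the set of files changed by one commit.
EXAMPLE_COMMITS: List[List[str]] = [
    ["flink-core/src/main/java/A.java"],
    ["flink-core/src/main/java/A.java", "flink-libraries/flink-gelly/pom.xml"],
    ["flink-yarn/pom.xml", "flink-mesos/pom.xml"],
]

if __name__ == "__main__":
    total = len(EXAMPLE_COMMITS)
    print(f"{count_mixed(EXAMPLE_COMMITS)} of {total} commits were mixed")
```

In practice the per-commit file lists would come from the repository history, for example by parsing the output of `git log --name-only`; that step is left out here to keep the sketch self-contained.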


Re: [DISCUSS] Project build time and possible restructuring

Timo Walther-2
I agree with Aljoscha that we might consider moving from Travis to
Jenkins. Is there any disadvantage in using Jenkins?

I think we should structure the project according to release management
(e.g. more frequent releases of libraries) or other criteria (e.g. core
and non-core) instead of build time. What would happen if the build of
another submodule became too long? Would we split/restructure again and
again? If Jenkins solves all our problems, we should use it.

Regards,
Timo



On 20/03/17 at 12:21, Aljoscha Krettek wrote:

> I prefer Jenkins to Travis by far. Working on Beam, where we have good Jenkins integration, has opened my eyes to what is possible with good CI integration.
>
> For example, look at this recent Beam PR: https://github.com/apache/beam/pull/2263. The Jenkins-GitHub integration will tell you exactly which tests failed, and if you click on the links you can look at the log output/stdout of the tests in question.
>
> This is the overview page of one of the Jenkins jobs that we have in Beam: https://builds.apache.org/job/beam_PostCommit_Java_RunnableOnService_Flink/. This is an example of a stable build: https://builds.apache.org/job/beam_PostCommit_Java_RunnableOnService_Flink/lastStableBuild/. Notice how it gives you fine-grained information about the Maven run. This is an unstable run: https://builds.apache.org/job/beam_PostCommit_Java_RunnableOnService_Flink/lastUnstableBuild/. There you can see which tests failed and you can easily drill down.
>
> Best,
> Aljoscha
>
>> On 20 Mar 2017, at 11:46, Robert Metzger <[hidden email]> wrote:
>>
>> Thank you for looking into the build times.
>>
>> I didn't know that the build time situation is so bad. Even with yarn, mesos, connectors and libraries removed, we are still running into the build timeout :(
>>
>> Aljoscha told me that the Beam community is using Jenkins for running the tests, and they are planning to completely move away from Travis. I wonder whether we should do the same, as having our own Jenkins servers would allow us to run tests for more than 50 minutes.
>>
>> I agree with Stephan that we should keep the yarn and mesos tests in the core for stability / testing quality purposes.
>>
>>
>> On Mon, Mar 20, 2017 at 11:27 AM, Stephan Ewen <[hidden email] <mailto:[hidden email]>> wrote:
>> @Greg
>>
>> I am personally in favor of splitting "connectors" and "contrib" out as
>> well. I know that @rmetzger has some reservations about the connectors, but
>> we may be able to convince him.
>>
>> For the cluster tests (yarn / mesos) - in the past there were many cases
>> where these tests caught cases that other tests did not, because they are
>> the only tests that actually use the "flink-dist.jar" and thus discover
>> many dependency and configuration issues. For that reason, my feeling would
>> be that they are valuable in the core repository.
>>
>> I would actually suggest to do only the library split initially, to see
>> what the challenges are in setting up the multi-repo build and release
>> tooling. Once we gathered experience there, we can probably easily see what
>> else we can split out.
>>
>> Stephan
>>
>>
>> On Fri, Mar 17, 2017 at 8:37 PM, Greg Hogan <[hidden email] <mailto:[hidden email]>> wrote:
>>
>>> I’d like to use this refactoring opportunity to unsplit the Travis tests.
>>> With 51 builds queued up for the weekend (some of which may fail or have
>>> been force pushed) we are at the limit of the number of contributions we
>>> can process. Fixing this requires 1) splitting the project, 2)
>>> investigating speedups for long-running tests, and 3) staying cognizant of
>>> test performance when accepting new code.
>>>
>>> I’d like to add one to Stephan’s list of module groups. I like that the
>>> modules are generic (“libraries”) so that no one module is alone and
>>> independent.
>>>
>>> Flink has three “libraries”: cep, ml, and gelly.
>>>
>>> “connectors” is a hotspot due to the long-running Kafka tests (and
>>> connectors for three Kafka versions).
>>>
>>> Both flink-storm and flink-python have a modest number of tests
>>> and could live with the miscellaneous modules in “contrib”.
>>>
>>> The YARN tests are long-running and problematic (I am unable to
>>> successfully run these locally). A “cluster” module could host flink-mesos,
>>> flink-yarn, and flink-yarn-tests.
>>>
>>> That gets us close to running all tests in a single Travis build.
>>>    https://travis-ci.org/greghogan/flink/builds/212122590
>>>
>>> I also tested (https://github.com/greghogan/flink/commits/core_build) with a maven
>>> parallelism of 2 and 4, with the latter a 6.4% drop in build time.
>>>    https://travis-ci.org/greghogan/flink/builds/212137659
>>>    https://travis-ci.org/greghogan/flink/builds/212154470
>>>
>>> We can run Travis CI builds nightly to guard against breaking changes.
>>>
>>> I also wanted to get an idea of how disruptive it would be to developers
>>> to divide the project into multiple git repos. I wrote a simple python
>>> script and configured it with the module partitions listed above. The usage
>>> string from the top of the file lists commits with files from multiple
>>> partitions as well as the modified files.
>>>    https://gist.github.com/greghogan/f38a8efe6b6dd5a162a6b43335ac4897
>>>
>>> Accounting for the merging of the batch and streaming connector modules,
>>> and assuming that the project structure has not changed much over the past
>>> 15 months, for the following date ranges the listed number of commits would
>>> have been split across repositories.
>>>
>>> since "2017-01-01"
>>> 36 of 571 commits were mixed
>>>
>>> since "2016-07-01"
>>> 155 of 1607 commits were mixed
>>>
>>> since "2016-01-01"
>>> 272 of 2561 commits were mixed
>>>
>>> Greg
>>>
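
The "mixed commit" count quoted above can be illustrated with a short sketch. This is not Greg's script (that is only linked in the gist); it is a minimal reconstruction of the idea under stated assumptions. The directory prefixes in PARTITIONS are hypothetical stand-ins for the module groups named in the mail, and the per-commit file lists are assumed to come from something like `git log --since=2017-01-01 --name-only`.

```python
from typing import Dict, Iterable, List

# Hypothetical partition map (an assumption, not taken from the gist):
# path prefix -> name of the repository the module would move to.
# Any path that matches no prefix is treated as part of "core".
PARTITIONS: Dict[str, str] = {
    "flink-libraries/": "libraries",
    "flink-connectors/": "connectors",
    "flink-contrib/": "contrib",
    "flink-yarn": "cluster",      # covers flink-yarn/ and flink-yarn-tests/
    "flink-mesos/": "cluster",
}


def partition_of(path: str) -> str:
    """Return the partition a changed file would belong to after a split."""
    for prefix, name in PARTITIONS.items():
        if path.startswith(prefix):
            return name
    return "core"


def is_mixed(files: Iterable[str]) -> bool:
    """A commit is 'mixed' if its files fall into more than one partition."""
    return len({partition_of(f) for f in files}) > 1


def count_mixed(commits: List[List[str]]) -> int:
    """Count commits that would have been split across repositories."""
    return sum(1 for files in commits if is_mixed(files))
```

For scale, the counts reported above correspond to roughly 6.3% (36 of 571), 9.6% (155 of 1607) and 10.6% (272 of 2561) of commits touching more than one partition.
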
>>>
>>>> On Mar 15, 2017, at 1:13 PM, Stephan Ewen <[hidden email] <mailto:[hidden email]>> wrote:
>>>>
>>>> @Robert - I think once we know that a separate git repo works well, and
>>>> that it actually solves problems, I see no reason to not create a
>>>> connectors repository later. The infrastructure changes should be
>>> identical
>>>> for two or more repositories.
>>>>
>>>> On Wed, Mar 15, 2017 at 5:22 PM, Till Rohrmann <[hidden email] <mailto:[hidden email]>>
>>> wrote:
>>>>> I think it should not be at least the flink-dist but exactly the
>>> remaining
>>>>> flink-dist module. Otherwise we do redundant work.
>>>>>
>>>>> On Wed, Mar 15, 2017 at 5:03 PM, Robert Metzger <[hidden email] <mailto:[hidden email]>>
>>>>> wrote:
>>>>>
>>>>>> "flink-core" means the main repository, not the "flink-core" module.
>>>>>>
>>>>>> When doing a release, we need to build the flink main code first,
>>> because
>>>>>> the flink-libraries depend on that.
>>>>>> Once the "flink-libraries" are built, we need to run the main build
>>> again
>>>>>> (at least the flink-dist module), so that it is pulling the artifacts
>>>>> from
>>>>>> the flink-libraries to put them into the opt/ folder of the final
>>>>> artifact.
>>>>>>
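
The build order discussed here (main repository first, then the libraries, then only flink-dist again) could be written down roughly as below. This is a sketch under assumptions, not the project's release script: the checkout directory names "flink" and "flink-libraries" and the exact Maven goals are illustrative. The last step uses Maven's `-pl` option to rebuild a single module, which matches Till's point that only flink-dist should be built a second time.

```python
from typing import List, Tuple

# One step = (working directory, command to run there).
Step = Tuple[str, List[str]]


def release_build_steps(main_repo: str = "flink",
                        libraries_repo: str = "flink-libraries") -> List[Step]:
    """Return the three build steps in the order described in the thread.

    The repository directory names are hypothetical defaults.
    """
    return [
        # 1. Build and install the main code; the libraries depend on it.
        (main_repo, ["mvn", "clean", "install", "-DskipTests"]),
        # 2. Build and install the libraries against the artifacts from step 1.
        (libraries_repo, ["mvn", "clean", "install", "-DskipTests"]),
        # 3. Rebuild only flink-dist so it can pick up the library artifacts
        #    for the opt/ folder, instead of repeating the whole main build.
        (main_repo, ["mvn", "install", "-pl", "flink-dist", "-DskipTests"]),
    ]
```

Each step could then be executed with `subprocess.run(cmd, cwd=directory, check=True)`, stopping at the first failure.
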
>>>>>>
>>>>>>
>>>>>> On Wed, Mar 15, 2017 at 4:44 PM, Till Rohrmann <[hidden email] <mailto:[hidden email]>>
>>>>>> wrote:
>>>>>>
>>>>>>> I'm ok with point 3.
>>>>>>>
>>>>>>> Concerning point 8: Why do we have to build flink-core twice after
>>>>> having
>>>>>>> it built as a dependency for flink-libraries? This seems wrong to me.
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Till
>>>>>>>
>>>>>>> On Wed, Mar 15, 2017 at 4:23 PM, Robert Metzger <[hidden email] <mailto:[hidden email]>>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Thank you. Running on AWS is a good idea!
>>>>>>>> Let me know if you (or anybody else) wants to help me with the
>>>>>>>> infrastructure work! Any help is much appreciated (as I've said
>>>>>> before, I
>>>>>>>> don't really have time for doing this, but it has to be done :) )
>>>>>>>>
>>>>>>>> I'm against creating two new repositories. I fear that this
>>>>> introduces
>>>>>>> too
>>>>>>>> much complexity and too many repositories.
>>>>>>>> "flink" and "flink-libraries" are hopefully enough to get the build
>>>>>> time
>>>>>>>> significantly down.
>>>>>>>> We can also consider putting the connectors into the
>>>>> "flink-libraries"
>>>>>>> repo
>>>>>>>> if we need to further reduce the build time.
>>>>>>>>
>>>>>>>> We should probably move "flink-table" out of "flink-libraries" if we
>>>>>> want
>>>>>>>> to keep "flink-table" in the main repo. (This would eliminate the
>>>>>>>> "flink-libraries" module from main.)
>>>>>>>>
>>>>>>>> Also, I agree that "flink-statebackend-rocksdb" is not correctly
>>>>> placed
>>>>>>> in
>>>>>>>> contrib anymore.
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Mar 15, 2017 at 4:07 PM, Greg Hogan <[hidden email] <mailto:[hidden email]>>
>>>>>> wrote:
>>>>>>>>> Robert, appreciate your kickstarting this task.
>>>>>>>>>
>>>>>>>>> We should compare the verification time with and without the listed
>>>>>>>>> modules. I’ll try to run this by tomorrow on AWS and on Travis.
>>>>>>>>>
>>>>>>>>> Should we maintain separate repos for flink-contrib and
>>>>>>> flink-libraries?
>>>>>>>>> Are you intending that we move flink-table out of flink-libraries
>>>>>> (and
>>>>>>>>> perhaps flink-statebackend-rocksdb out of flink-contrib)?
>>>>>>>>>
>>>>>>>>> Greg
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> On Mar 15, 2017, at 9:55 AM, Robert Metzger <[hidden email] <mailto:[hidden email]>
>>>>>>>> wrote:
>>>>>>>>>> Thank you for looking into this Till.
>>>>>>>>>>
>>>>>>>>>> I think we then have to split the repositories.
>>>>>>>>>> My main motivation for doing this is that it seems to be the only
>>>>>>>>> feasible
>>>>>>>>>> way of scaling the community to allow more committers working on
>>>>>> the
>>>>>>>>>> libraries.
>>>>>>>>>>
>>>>>>>>>> I'll take care of getting things started.
>>>>>>>>>>
>>>>>>>>>> As the next steps I propose to:
>>>>>>>>>> 1. Ask INFRA to rename https://git-wip-us.apache.org/ <https://git-wip-us.apache.org/>
>>>>>>>> repos/asf?p=flink-
>>>>>>>>>> connectors.git;a=summary to "flink-libraries"
>>>>>>>>>> 2. Ask INFRA to set up GitHub and travis integration for
>>>>>>>>> "flink-libraries"
>>>>>>>>>> 3. Put the code of "flink-ml", "flink-gelly", "flink-python",
>>>>>>>>> "flink-cep",
>>>>>>>>>> "flink-scala-shell", "flink-storm" into the new repository. (I
>>>>>>> decided
>>>>>>>>>> against moving flink-contrib there, because rocksdb is in the
>>>>>> contrib
>>>>>>>>>> module, for flink-table, I'm undecided, but I kept it in the main
>>>>>>> repo
>>>>>>>>>> because its probably going to interact more with the core code in
>>>>>> the
>>>>>>>>>> future)
>>>>>>>>>> I try to preserve the history of those modules when splitting
>>>>> them
>>>>>>> into
>>>>>>>>> the
>>>>>>>>>> new repo
>>>>>>>>>> 4. I'll close all pull requests against those modules in the main
>>>>>>> repo.
>>>>>>>>>> 5. I'll set up a minimal documentation page for the library
>>>>>>> repository,
>>>>>>>>>> similar to the main documentation.
>>>>>>>>>> 6. I'll update the documentation build process to build both
>>>>>>>>> documentations
>>>>>>>>>> & link them to each other
>>>>>>>>>> 7. I'll update the nightly deployment process to include both
>>>>>>>>> repositories
>>>>>>>>>> 8. I'll update the release script to create the Flink release out
>>>>>> of
>>>>>>>> both
>>>>>>>>>> repositories. In order to put the libraries into the opt/ dir of
>>>>>> the
>>>>>>>>>> release, I'll need to change the build of "flink-dist" so that it
>>>>>>> first
>>>>>>>>>> builds flink core, then the libraries and then the core again
>>>>> with
>>>>>>> the
>>>>>>>>>> libraries as an additional dependency.
>>>>>>>>>>
>>>>>>>>>> The main question for the community is: do you agree with point
>>>>> 3 ?
>>>>>>>> Would
>>>>>>>>>> you like to include more or less?
>>>>>>>>>>
>>>>>>>>>> I'll start with 1. and 2. tomorrow morning.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Wed, Mar 15, 2017 at 1:48 PM, Till Rohrmann <
>>>>>> [hidden email] <mailto:[hidden email]>
>>>>>>>>> wrote:
>>>>>>>>>>> In theory we could have a merging bot which solves the problem
>>>>> of
>>>>>>> the
>>>>>>>>>>> "commit window". Once the PR passes all tests and has enough
>>>>> +1s,
>>>>>>> the
>>>>>>>>> bot
>>>>>>>>>>> could do the merging and, thus, it effectively linearizes the
>>>>>> merge
>>>>>>>>>>> process.
>>>>>>>>>>>
>>>>>>>>>>> I think the second point is actually a disadvantage because
>>>>> there
>>>>>> is
>>>>>>>> not
>>>>>>>>>>> such an immediate incentive/pressure to fix the broken module if
>>>>>> it
>>>>>>>>> lives
>>>>>>>>>>> in a separate repository. Furthermore, breaking API changes in
>>>>> the
>>>>>>>> core
>>>>>>>>>>> will most likely go unnoticed for some time in other modules
>>>>> which
>>>>>>> are
>>>>>>>>> not
>>>>>>>>>>> developed so actively. In the worst case these things will only
>>>>> be
>>>>>>>>> noticed
>>>>>>>>>>> when we try to make a release.
>>>>>>>>>>>
>>>>>>>>>>> But I also agree that we are not Google and we don't have the
>>>>>>>>> capacities to
>>>>>>>>>>> maintain such a smooth build process that we can keep all the
>>>>>> code
>>>>>>>> in
>>>>>>>>> a
>>>>>>>>>>> single repository.
>>>>>>>>>>>
>>>>>>>>>>> I looked a bit into Gradle and as far as I can tell it offers
>>>>> some
>>>>>>>> nice
>>>>>>>>>>> features wrt incrementally building projects. This would be
>>>>>>> beneficial
>>>>>>>>> for
>>>>>>>>>>> local development but it would not solve our build time problems
>>>>>> on
>>>>>>>>> Travis.
>>>>>>>>>>> Gradle intends to introduce a task result cache which allows to
>>>>>>> reuse
>>>>>>>>>>> results across builds. This could help when building on Travis,
>>>>>>>>> however, it
>>>>>>>>>>> is not yet fully implemented. Moreover, migrating from Maven to
>>>>>>> Gradle
>>>>>>>>>>> won't come for free (there's simply no free lunch out there) and
>>>>>> we
>>>>>>>>> might
>>>>>>>>>>> risk to introduce new bugs. Therefore, I would vote to split the
>>>>>>>>> repository
>>>>>>>>>>> in order to mitigate our current problems with Travis and the
>>>>>> build
>>>>>>>>> time in
>>>>>>>>>>> general. Whether to use a different build system or not can then
>>>>>> be
>>>>>>>>>>> discussed as an orthogonal question.
>>>>>>>>>>>
>>>>>>>>>>> Cheers,
>>>>>>>>>>> Till
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Mar 14, 2017 at 8:05 PM, Stephan Ewen <[hidden email] <mailto:[hidden email]>
>>>>>>>> wrote:
>>>>>>>>>>>> Some other thoughts on how repository split would help. I am
>>>>> not
>>>>>>> sure
>>>>>>>>> for
>>>>>>>>>>>> all of them, so please comment:
>>>>>>>>>>>>
>>>>>>>>>>>> - There is less competition for a "commit window". It happens
>>>>> a
>>>>>>> lot
>>>>>>>>>>>> already that you run all tests and want to commit, but there
>>>>> was
>>>>>> a
>>>>>>>>> commit
>>>>>>>>>>>> in the meantime. You rebase, need to re-test, again commit in
>>>>> the
>>>>>>>>>>> meantime.
>>>>>>>>>>>>    For a "linear" commit history, this may become a bottleneck
>>>>>>>>>>> eventually
>>>>>>>>>>>> as well.
>>>>>>>>>>>>
>>>>>>>>>>>> - There is less risk of broken master. If one
>>>>> repository/modules
>>>>>>>>> breaks
>>>>>>>>>>>> its master, the others can still continue.
>>>>>>>>>>>>
>>>>>>>>>>>> Stephan
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Mar 10, 2017 at 12:20 PM, Till Rohrmann <
>>>>>>>> [hidden email] <mailto:[hidden email]>>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks for all your input. In order to wrap the discussion up
>>>>>> I'd
>>>>>>>> like
>>>>>>>>>>> to
>>>>>>>>>>>>> summarize the mentioned points:
>>>>>>>>>>>>>
>>>>>>>>>>>>> The problem of increasing build times and complexity of the
>>>>>>> project
>>>>>>>>> has
>>>>>>>>>>>>> been acknowledged. Ideally we would have everything in one
>>>>>>>> repository
>>>>>>>>>>>> using
>>>>>>>>>>>>> an incremental build tool. Since Maven does not properly
>>>>> support
>>>>>>>> this
>>>>>>>>>>> we
>>>>>>>>>>>>> would have to switch our build tool to something like Gradle,
>>>>>> for
>>>>>>>>>>>> example.
>>>>>>>>>>>>> Another option is introducing build profiles for different
>>>>> sets
>>>>>> of
>>>>>>>>>>>> modules
>>>>>>>>>>>>> as well as separating integration and unit tests. The third
>>>>>>>>> alternative
>>>>>>>>>>>>> would be creating sub-projects with their own repositories. I
>>>>>>>> actually
>>>>>>>>>>>>> think that these two proposals are not necessarily exclusive
>>>>> and
>>>>>> it
>>>>>>>>>>> would
>>>>>>>>>>>>> also make sense to have a separation between unit and
>>>>>> integration
>>>>>>>>> tests
>>>>>>>>>>>> if
>>>>>>>>>>>>> we split the repository.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The overall consensus seems to be that we don't want to split
>>>>>> the
>>>>>>>>>>>> community
>>>>>>>>>>>>> and want to keep everything under the same umbrella. I think
>>>>>> this
>>>>>>> is
>>>>>>>>>>> the
>>>>>>>>>>>>> right way to go, because otherwise some parts of the project
>>>>>> could
>>>>>>>>>>> become
>>>>>>>>>>>>> second class citizens. Given that and that we continue using
>>>>>>> Maven,
>>>>>>>> I
>>>>>>>>>>>> still
>>>>>>>>>>>>> think that creating sub-projects for the libraries, for
>>>>> example,
>>>>>>>> could
>>>>>>>>>>> be
>>>>>>>>>>>>> beneficial. A split could reduce the project's complexity and
>>>>>> make
>>>>>>>> it
>>>>>>>>>>>>> potentially easier for libraries to get actively developed.
>>>>> The
>>>>>>> main
>>>>>>>>>>>>> concern is setting up the build infrastructure to aggregate
>>>>> docs
>>>>>>>> from
>>>>>>>>>>>>> multiple repositories and making them publicly available.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Since I started this thread and I would really like to see
>>>>>> Flink's
>>>>>>>> ML
>>>>>>>>>>>>> library being revived again, I'd volunteer investigating first
>>>>>>>> whether
>>>>>>>>>>> it
>>>>>>>>>>>>> is doable establishing a proper incremental build for Flink.
>>>>> If
>>>>>>> that
>>>>>>>>>>>> should
>>>>>>>>>>>>> not be possible, I will look into splitting the repository,
>>>>>> first
>>>>>>>> only
>>>>>>>>>>>> for
>>>>>>>>>>>>> the libraries. I'll share my results with the community once
>>>>> I'm
>>>>>>>> done
>>>>>>>>>>>> with
>>>>>>>>>>>>> the investigation.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>> Till
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Fri, Feb 24, 2017 at 3:50 PM, Robert Metzger <
>>>>>>>> [hidden email] <mailto:[hidden email]>>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> @Jin Mingjian: You can not use the paid travis version for
>>>>> open
>>>>>>>>>>> source
>>>>>>>>>>>>>> projects. It only works for private repositories (at least
>>>>> back
>>>>>>>> then
>>>>>>>>>>>> when
>>>>>>>>>>>>>> we've asked them about that).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> @Stephan: I don't think that incremental builds will be
>>>>>> available
>>>>>>>>>>> with
>>>>>>>>>>>>>> Maven anytime soon.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I agree that we need to fix the build time issue on Travis.
>>>>>> I've
>>>>>>>>>>>> recently
>>>>>>>>>>>>>> pushed a commit to use now three instead of two test groups.
>>>>>>>>>>>>>> But I don't think that this is feasible long-term solution.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> If this discussion is only about reducing the build and test
>>>>>>> time,
>>>>>>>>>>>>>> introducing build profiles for different components as
>>>>> Aljoscha
>>>>>>>>>>>> suggested
>>>>>>>>>>>>>> would solve the problem Till mentioned.
>>>>>>>>>>>>>> Also, if we decide that travis is not a good tool anymore for
>>>>>> the
>>>>>>>>>>>>> testing,
>>>>>>>>>>>>>> I guess we can find a different solution. There are now
>>>>>>> competitors
>>>>>>>>>>> to
>>>>>>>>>>>>>> Travis that might be willing to offer a paid plan for an open
>>>>>>>> source
>>>>>>>>>>>>>> project, or we set up our own infra on a server sponsored by
>>>>>> one
>>>>>>> of
>>>>>>>>>>> the
>>>>>>>>>>>>>> contributing companies.
>>>>>>>>>>>>>> If we want to solve "community issues" with the change as
>>>>> well,
>>>>>>>> then
>>>>>>>>>>> I
>>>>>>>>>>>>>> think it's worth the effort of splitting up Flink into
>>>>> different
>>>>>>>>>>>>>> repositories.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Splitting up repositories is not a trivial task in my
>>>>> opinion.
>>>>>> As
>>>>>>>>>>>> others
>>>>>>>>>>>>>> have mentioned before, we need to consider the following
>>>>>> things:
>>>>>>>>>>>>>> - How are we going to build the documentation? Ideally every
>>>>>> repo
>>>>>>>>>>>> should
>>>>>>>>>>>>>> contain its docs, so we would need to pull them together when
>>>>>>>>>>> building
>>>>>>>>>>>>> the
>>>>>>>>>>>>>> main docs.
>>>>>>>>>>>>>> - How do we organize the dependencies? If we have library
>>>>>> repository
>>>>>>>>>>>> depend
>>>>>>>>>>>>> on
>>>>>>>>>>>>>> snapshot Flink versions, we need to make sure that the
>>>>> snapshot
>>>>>>>>>>>>> deployment
>>>>>>>>>>>>>> always works. This also means that people working on a
>>>>> library
>>>>>>>>>>>> repository
>>>>>>>>>>>>>> will pull from snapshot OR need to build first locally.
>>>>>>>>>>>>>> - We need to update the release scripts
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> If we commit to do these changes, we need to assign at least
>>>>>> one
>>>>>>>>>>>>> committer
>>>>>>>>>>>>>> (yes, in this case we need somebody who can commit, for
>>>>> example
>>>>>>> for
>>>>>>>>>>>>>> updating the buildbot stuff) who volunteers to do the change.
>>>>>>>>>>>>>> I've done a lot of infrastructure work in the past, but I'm
>>>>>>>> currently
>>>>>>>>>>>>>> pretty booked with many other things, so I don't
>>>>> realistically
>>>>>>> see
>>>>>>>>>>>> myself
>>>>>>>>>>>>>> doing that. Max who used to work on these things is taking
>>>>> some
>>>>>>>> time
>>>>>>>>>>>> off.
>>>>>>>>>>>>>> I think we need, best case 3 days for the change, worst case
>>>>> 5
>>>>>>>> days.
>>>>>>>>>>>> The
>>>>>>>>>>>>>> problem is that there are no "unit tests" for the infra
>>>>> stuff,
>>>>>> so
>>>>>>>>>>> many
>>>>>>>>>>>>>> things are "trial and error" (like Apache's buildbot, our
>>>>>> release
>>>>>>>>>>>>> scripts,
>>>>>>>>>>>>>> the doc scripts, maven stuff, nightly builds).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Thu, Feb 23, 2017 at 1:33 PM, Stephan Ewen <
>>>>>> [hidden email] <mailto:[hidden email]>>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>> If we can get a incremental builds to work, that would
>>>>>> actually
>>>>>>> be
>>>>>>>>>>>> the
>>>>>>>>>>>>>>> preferred solution in my opinion.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Many companies have invested heavily in making a "single
>>>>>>>>>>> repository"
>>>>>>>>>>>>> code
>>>>>>>>>>>>>>> base work, because it has the advantage of not having to
>>>>>>>>>>>> update/publish
>>>>>>>>>>>>>>> several repositories first.
>>>>>>>>>>>>>>> However, the strong prerequisite for that is an incremental
>>>>>>> build
>>>>>>>>>>>>> system
>>>>>>>>>>>>>>> that builds only (fine grained) what it has to build. I am
>>>>> not
>>>>>>>> sure
>>>>>>>>>>>> how
>>>>>>>>>>>>>> we
>>>>>>>>>>>>>>> could make that work
>>>>>>>>>>>>>>> with Maven and Travis...
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wed, Feb 22, 2017 at 10:42 PM, Greg Hogan <
>>>>>>> [hidden email] <mailto:[hidden email]>>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>> An additional option for reducing time to build and test is
>>>>>>>>>>>> parallel
>>>>>>>>>>>>>>>> execution. This would help users more than on TravisCI
>>>>> since
>>>>>>>>>>> we're
>>>>>>>>>>>>>>>> generally running on multi-core machines rather than VM
>>>>>> slices.
>>>>>>>>>>>>>>>> Is the idea that each user would only check out the modules
>>>>>>> that
>>>>>>>>>>> he
>>>>>>>>>>>>> or
>>>>>>>>>>>>>>> she
>>>>>>>>>>>>>>>> is developing with? For example, if a developer is not
>>>>>> working
>>>>>>> on
>>>>>>>>>>>>>>>> flink-mesos or flink-yarn then the "flink-deploy" module
>>>>>> would
>>>>>>>>>>> not
>>>>>>>>>>>> be
>>>>>>>>>>>>>>> clone
>>>>>>>>>>>>>>>> to their filesystem?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> We can run a TravisCI nightly build on each repo to
>>>>> validate
>>>>>>>>>>>> against
>>>>>>>>>>>>>> API
>>>>>>>>>>>>>>>> changes.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Greg
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Wed, Feb 22, 2017 at 12:24 PM, Fabian Hueske <
>>>>>>>>>>> [hidden email] <mailto:[hidden email]>
>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>> Hi everybody,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I think this should be a discussion about the benefits and
>>>>>>>>>>>>> drawbacks
>>>>>>>>>>>>>> of
>>>>>>>>>>>>>>>>> separating the code into distinct repositories from a
>>>>>>>>>>> development
>>>>>>>>>>>>>> point
>>>>>>>>>>>>>>>> of
>>>>>>>>>>>>>>>>> view.
>>>>>>>>>>>>>>>>> So I agree with Stephan that we should not divide the
>>>>>>> community
>>>>>>>>>>>> by
>>>>>>>>>>>>>>>> creating
>>>>>>>>>>>>>>>>> separate groups of committers.
>>>>>>>>>>>>>>>>> Also the discussion about independent releases is not
>>>>>>>>>>> strictly
>>>>>>>>>>>>>>> related
>>>>>>>>>>>>>>>>> to the decision, IMO.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I see a few pros and cons for splitting the code base into
>>>>>>>>>>>> separate
>>>>>>>>>>>>>>>>> repositories which (I think) haven't been mentioned
>>>>> before:
>>>>>>>>>>>>>>>>> pros:
>>>>>>>>>>>>>>>>> - IDE setup will be leaner. It is not necessary to compile
>>>>>> the
>>>>>>>>>>>>> whole
>>>>>>>>>>>>>>> code
>>>>>>>>>>>>>>>>> base to run a test after switching a branch.
>>>>>>>>>>>>>>>>> cons:
>>>>>>>>>>>>>>>>> - developing libraries features that require changes in
>>>>> the
>>>>>>>>>>> core
>>>>>>>>>>>> /
>>>>>>>>>>>>>> APIs
>>>>>>>>>>>>>>>>> become more time consuming due to back-and-forth between
>>>>>> code
>>>>>>>>>>>>> bases.
>>>>>>>>>>>>>>>>> However, I think this is not very often the case.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Aljoscha has good points as well. Many of the build issues
>>>>>>>>>>> could
>>>>>>>>>>>> be
>>>>>>>>>>>>>>>> solved
>>>>>>>>>>>>>>>>> by different build profiles and configurations.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Best, Fabian
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 2017-02-22 14:59 GMT+01:00 Gábor Hermann <
>>>>>>>>>>> [hidden email] <mailto:[hidden email]>
>>>>>>>>>>>>> :
>>>>>>>>>>>>>>>>>> @Stephan:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Although I tried to raise some issues about splitting
>>>>>>>>>>>> committers,
>>>>>>>>>>>>>> I'm
>>>>>>>>>>>>>>>>>> still strongly in favor of some kind of restructuring. We
>>>>>>>>>>> just
>>>>>>>>>>>>> have
>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>> be
>>>>>>>>>>>>>>>>>> conscious about the disadvantages.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Not splitting the committers could leave the libraries in
>>>>>> the
>>>>>>>>>>>>> same
>>>>>>>>>>>>>>>>>> stalling status, described by Till. Of course, dedicating
>>>>>>>>>>>> current
>>>>>>>>>>>>>>>>>> committers as shepherds of the libraries could easily
>>>>>> resolve
>>>>>>>>>>>> the
>>>>>>>>>>>>>>>> issue.
>>>>>>>>>>>>>>>>>> But that requires time from current committers. It seems
>>>>>> like
>>>>>>>>>>>>>>>> trade-offs
>>>>>>>>>>>>>>>>>> between code quality, speed of development, and committer
>>>>>>>>>>>>> efforts.
>>>>>>>>>>>>>>>>>>  From what I see in the discussion about ML, there are
>>>>> many
>>>>>>>>>>>> people
>>>>>>>>>>>>>>>> willing
>>>>>>>>>>>>>>>>>> to contribute as well as production use-cases. This means
>>>>>> we
>>>>>>>>>>>>> could
>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>> should move forward. However, the development speed is
>>>>>>>>>>>>>> significantly
>>>>>>>>>>>>>>>>> slowed
>>>>>>>>>>>>>>>>>> down by stalling PRs. The proposal for contributors
>>>>> helping
>>>>>>>>>>> the
>>>>>>>>>>>>>>> review
>>>>>>>>>>>>>>>>>> process did not really work out so far. In my opinion,
>>>>>> either
>>>>>>>>>>>>> code
>>>>>>>>>>>>>>>>> quality
>>>>>>>>>>>>>>>>>> (by more easily accepting new committers) or some
>>>>> committer
>>>>>>>>>>>> time
>>>>>>>>>>>>>>>>>> (reviewing/merging) should be sacrificed to move forward.
>>>>>> As
>>>>>>>>>>>> Till
>>>>>>>>>>>>>> has
>>>>>>>>>>>>>>>>>> indicated, it would be shameful if we let this
>>>>> contribution
>>>>>>>>>>>>> effort
>>>>>>>>>>>>>>> die.
>>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>>> Gabor
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>
>>>
>


Re: [DISCUSS] Project build time and possible restructuring

Robert Metzger
Aljoscha, do you know how to configure Jenkins?
Is Apache INFRA doing that, or are the Beam people doing that themselves?

One downside of Jenkins is that we probably need some machines that execute
the tests. A Travis container has 2 CPU cores and 4 GB main memory. We
currently have 10 such containers available on Travis concurrently. I think
we would need at least the same amount on Jenkins.
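
As a rough back-of-the-envelope check of what "the same amount" means, the aggregate capacity can be computed from the figures above. This is only a sketch: the Executor type and function name are made up for illustration, and it assumes all containers are identical and fully usable in parallel.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Executor:
    """Resources of a single build container (hypothetical helper type)."""
    cpu_cores: int
    memory_gb: int


def total_capacity(executor: Executor, count: int) -> Tuple[int, int]:
    """Return (total CPU cores, total memory in GB) for `count` executors."""
    return executor.cpu_cores * count, executor.memory_gb * count


# Figures from the mail: 2 cores and 4 GB per container, 10 containers.
travis_container = Executor(cpu_cores=2, memory_gb=4)
cores, memory_gb = total_capacity(travis_container, 10)
print(f"{cores} cores, {memory_gb} GB")  # prints "20 cores, 40 GB"
```

So a Jenkins setup matching the current Travis capacity would need about 20 cores and 40 GB of memory in total.
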


On Mon, Mar 20, 2017 at 1:48 PM, Timo Walther <[hidden email]> wrote:

> I agree with Aljoscha that we might consider moving from Travis to
> Jenkins. Is there any disadvantage in using Jenkins?
>
> I think we should structure the project according to release management
> (e.g. more frequent releases of libraries) or other criteria (e.g. core and
> non-core) instead of build time. What would happen if the build of another
> submodule became too long? Would we split/restructure again and
> again? If Jenkins solves all our problems we should use it.
>
> Regards,
> Timo
>
>
>
> On 20/03/17 at 12:21, Aljoscha Krettek wrote:
>
>> I prefer Jenkins to Travis by far. Working on Beam, where we have good
>> Jenkins integration, has opened my eyes to what is possible with good CI
>> integration.
>>
>> For example, look at this recent Beam PR:
>> https://github.com/apache/beam/pull/2263. The
>> Jenkins-Github integration will tell you exactly which tests failed and if
>> you click on the links you can look at the log output/std out of the tests
>> in question.
>>
>> This is the overview page of one of the Jenkins Jobs that we have in
>> Beam: https://builds.apache.org/job/beam_PostCommit_Java_RunnableOnService_Flink/.
>> This is an example of a stable build:
>> https://builds.apache.org/job/beam_PostCommit_Java_RunnableOnService_Flink/lastStableBuild/.
>> Notice how it gives you fine grained information about the Maven run.
>> This is an unstable run:
>> https://builds.apache.org/job/beam_PostCommit_Java_RunnableOnService_Flink/lastUnstableBuild/.
>> There you can see which tests failed and you can easily drill down.
>>
>> Best,
>> Aljoscha
>>
>> On 20 Mar 2017, at 11:46, Robert Metzger <[hidden email]> wrote:
>>>
>>> Thank you for looking into the build times.
>>>
>>> I didn't know that the build time situation is so bad. Even with yarn,
>>> mesos, connectors and libraries removed, we are still running into the
>>> build timeout :(
>>>
>>> Aljoscha told me that the Beam community is using Jenkins for running
>>> the tests, and they are planning to completely move away from Travis. I
>>> wonder whether we should do the same, as having our own Jenkins servers
>>> would allow us to run tests for more than 50 minutes.
>>>
>>> I agree with Stephan that we should keep the yarn and mesos tests in the
>>> core for stability / testing quality purposes.
>>>
>>>
>>> On Mon, Mar 20, 2017 at 11:27 AM, Stephan Ewen <[hidden email]
>>> <mailto:[hidden email]>> wrote:
>>> @Greg
>>>
>>> I am personally in favor of splitting "connectors" and "contrib" out as
>>> well. I know that @rmetzger has some reservations about the connectors,
>>> but
>>> we may be able to convince him.
>>>
>>> For the cluster tests (yarn / mesos) - in the past there were many cases
>>> where these tests caught cases that other tests did not, because they are
>>> the only tests that actually use the "flink-dist.jar" and thus discover
>>> many dependency and configuration issues. For that reason, my feeling
>>> would
>>> be that they are valuable in the core repository.
>>>
>>> I would actually suggest to do only the library split initially, to see
>>> what the challenges are in setting up the multi-repo build and release
>>> tooling. Once we gathered experience there, we can probably easily see
>>> what
>>> else we can split out.
>>>
>>> Stephan
>>>
>>>
>>> On Fri, Mar 17, 2017 at 8:37 PM, Greg Hogan <[hidden email] <mailto:
>>> [hidden email]>> wrote:
>>>
>>> I’d like to use this refactoring opportunity to unsplit the Travis tests.
>>>> With 51 builds queued up for the weekend (some of which may fail or have
>>>> been force pushed) we are at the limit of the number of contributions we
>>>> can process. Fixing this requires 1) splitting the project, 2)
>>>> investigating speedups for long-running tests, and 3) staying cognizant
>>>> of
>>>> test performance when accepting new code.
>>>>
>>>> I’d like to add one to Stephan’s list of module groups. I like that the
>>>> modules are generic (“libraries”) so that no one module is alone and
>>>> independent.
>>>>
>>>> Flink has three “libraries”: cep, ml, and gelly.
>>>>
>>>> “connectors” is a hotspot due to the long-running Kafka tests (and
>>>> connectors for three Kafka versions).
>>>>
>>>> Both flink-storm and flink-python have a modest number of
>>>> tests
>>>> and could live with the miscellaneous modules in “contrib”.
>>>>
>>>> The YARN tests are long-running and problematic (I am unable to
>>>> successfully run these locally). A “cluster” module could host
>>>> flink-mesos,
>>>> flink-yarn, and flink-yarn-tests.
>>>>
>>>> That gets us close to running all tests in a single Travis build.
>>>>    https://travis-ci.org/greghogan/flink/builds/212122590
>>>>
>>>> I also tested (https://github.com/greghogan/flink/commits/core_build)
>>>> with a maven parallelism of 2 and 4, with the latter a 6.4% drop in
>>>> build time.
>>>>    https://travis-ci.org/greghogan/flink/builds/212137659
>>>>    https://travis-ci.org/greghogan/flink/builds/212154470
>>>>
>>>> We can run Travis CI builds nightly to guard against breaking changes.
>>>>
>>>> I also wanted to get an idea of how disruptive it would be to developers
>>>> to divide the project into multiple git repos. I wrote a simple python
>>>> script and configured it with the module partitions listed above. The
>>>> usage
>>>> string from the top of the file lists commits with files from multiple
>>>> partitions as well as the modified files.
>>>>    https://gist.github.com/greghogan/f38a8efe6b6dd5a162a6b43335ac4897
>>>>
>>>> Accounting for the merging of the batch and streaming connector modules,
>>>> and assuming that the project structure has not changed much over the
>>>> past
>>>> 15 months, for the following date ranges the listed number of commits
>>>> would
>>>> have been split across repositories.
>>>>
>>>> since "2017-01-01"
>>>> 36 of 571 commits were mixed
>>>>
>>>> since "2016-07-01"
>>>> 155 of 1607 commits were mixed
>>>>
>>>> since "2016-01-01"
>>>> 272 of 2561 commits were mixed
>>>>
>>>> Greg
>>>>
>>>>
>>>> On Mar 15, 2017, at 1:13 PM, Stephan Ewen <[hidden email] <mailto:
>>>>> [hidden email]>> wrote:
>>>>>
>>>>> @Robert - I think once we know that a separate git repo works well, and
>>>>> that it actually solves problems, I see no reason to not create a
>>>>> connectors repository later. The infrastructure changes should be
>>>>>
>>>> identical
>>>>
>>>>> for two or more repositories.
>>>>>
>>>>> On Wed, Mar 15, 2017 at 5:22 PM, Till Rohrmann <[hidden email]
>>>>> <mailto:[hidden email]>>
>>>>>
>>>> wrote:
>>>>
>>>>> I think it should not be at least the flink-dist but exactly the
>>>>>>
>>>>> remaining
>>>>
>>>>> flink-dist module. Otherwise we do redundant work.
>>>>>>
>>>>>> On Wed, Mar 15, 2017 at 5:03 PM, Robert Metzger <[hidden email]
>>>>>> <mailto:[hidden email]>>
>>>>>> wrote:
>>>>>>
>>>>>> "flink-core" means the main repository, not the "flink-core" module.
>>>>>>>
>>>>>>> When doing a release, we need to build the flink main code first,
>>>>>>>
>>>>>> because
>>>>
>>>>> the flink-libraries depend on that.
>>>>>>> Once the "flink-libraries" are built, we need to run the main build
>>>>>>>
>>>>>> again
>>>>
>>>>> (at least the flink-dist module), so that it is pulling the artifacts
>>>>>>>
>>>>>> from
>>>>>>
>>>>>>> the flink-libraries to put them into the opt/ folder of the final
>>>>>>>
>>>>>> artifact.
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Mar 15, 2017 at 4:44 PM, Till Rohrmann <[hidden email]
>>>>>>> <mailto:[hidden email]>>
>>>>>>> wrote:
>>>>>>>
>>>>>>> I'm ok with point 3.
>>>>>>>>
>>>>>>>> Concerning point 8: Why do we have to build flink-core twice after
>>>>>>>>
>>>>>>> having
>>>>>>
>>>>>>> it built as a dependency for flink-libraries? This seems wrong to me.
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Till
>>>>>>>>
>>>>>>>> On Wed, Mar 15, 2017 at 4:23 PM, Robert Metzger <
>>>>>>>> [hidden email] <mailto:[hidden email]>>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Thank you. Running on AWS is a good idea!
>>>>>>>>> Let me know if you (or anybody else) wants to help me with the
>>>>>>>>> infrastructure work! Any help is much appreciated (as I've said
>>>>>>>>>
>>>>>>>> before, I
>>>>>>>
>>>>>>>> don't really have time for doing this, but it has to be done :) )
>>>>>>>>>
>>>>>>>>> I'm against creating two new repositories. I fear that this
>>>>>>>>>
>>>>>>>> introduces
>>>>>>
>>>>>>> too
>>>>>>>>
>>>>>>>>> much complexity and too many repositories.
>>>>>>>>> "flink" and "flink-libraries" are hopefully enough to get the build
>>>>>>>>>
>>>>>>>> time
>>>>>>>
>>>>>>>> significantly down.
>>>>>>>>> We can also consider putting the connectors into the
>>>>>>>>>
>>>>>>>> "flink-libraries"
>>>>>>
>>>>>>> repo
>>>>>>>>
>>>>>>>>> if we need to further reduce the build time.
>>>>>>>>>
>>>>>>>>> We should probably move "flink-table" out of "flink-libraries" if
>>>>>>>>> we
>>>>>>>>>
>>>>>>>> want
>>>>>>>
>>>>>>>> to keep "flink-table" in the main repo. (This would eliminate the
>>>>>>>>> "flink-libraries" module from main.)
>>>>>>>>>
>>>>>>>>> Also, I agree that "flink-statebackend-rocksdb" is not correctly
>>>>>>>>>
>>>>>>>> placed
>>>>>>
>>>>>>> in
>>>>>>>>
>>>>>>>>> contrib anymore.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, Mar 15, 2017 at 4:07 PM, Greg Hogan <[hidden email]
>>>>>>>>> <mailto:[hidden email]>>
>>>>>>>>>
>>>>>>>> wrote:
>>>>>>>
>>>>>>>> Robert, appreciate your kickstarting this task.
>>>>>>>>>>
>>>>>>>>>> We should compare the verification time with and without the
>>>>>>>>>> listed
>>>>>>>>>> modules. I’ll try to run this by tomorrow on AWS and on Travis.
>>>>>>>>>>
>>>>>>>>>> Should we maintain separate repos for flink-contrib and
>>>>>>>>>>
>>>>>>>>> flink-libraries?
>>>>>>>>
>>>>>>>>> Are you intending that we move flink-table out of flink-libraries
>>>>>>>>>>
>>>>>>>>> (and
>>>>>>>
>>>>>>>> perhaps flink-statebackend-rocksdb out of flink-contrib)?
>>>>>>>>>>
>>>>>>>>>> Greg
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Mar 15, 2017, at 9:55 AM, Robert Metzger <[hidden email]
>>>>>>>>>>> <mailto:[hidden email]>
>>>>>>>>>>>
>>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Thank you for looking into this Till.
>>>>>>>>>>>
>>>>>>>>>>> I think we then have to split the repositories.
>>>>>>>>>>> My main motivation for doing this is that it seems to be the only
>>>>>>>>>>>
>>>>>>>>>> feasible
>>>>>>>>>>
>>>>>>>>>>> way of scaling the community to allow more committers working on
>>>>>>>>>>>
>>>>>>>>>> the
>>>>>>>
>>>>>>>> libraries.
>>>>>>>>>>>
>>>>>>>>>>> I'll take care of getting things started.
>>>>>>>>>>>
>>>>>>>>>>> As the next steps I propose to:
>>>>>>>>>>> 1. Ask INFRA to rename https://git-wip-us.apache.org/ <
>>>>>>>>>>> https://git-wip-us.apache.org/>
>>>>>>>>>>>
>>>>>>>>>> repos/asf?p=flink-
>>>>>>>>>
>>>>>>>>>> connectors.git;a=summary to "flink-libraries"
>>>>>>>>>>> 2. Ask INFRA to set up GitHub and travis integration for
>>>>>>>>>>>
>>>>>>>>>> "flink-libraries"
>>>>>>>>>>
>>>>>>>>>>> 3. Put the code of "flink-ml", "flink-gelly", "flink-python",
>>>>>>>>>>>
>>>>>>>>>> "flink-cep",
>>>>>>>>>>
>>>>>>>>>>> "flink-scala-shell", "flink-storm" into the new repository. (I
>>>>>>>>>>>
>>>>>>>>>> decided
>>>>>>>>
>>>>>>>>> against moving flink-contrib there, because rocksdb is in the
>>>>>>>>>>>
>>>>>>>>>> contrib
>>>>>>>
>>>>>>>> module, for flink-table, I'm undecided, but I kept it in the main
>>>>>>>>>>>
>>>>>>>>>> repo
>>>>>>>>
>>>>>>>>> because it's probably going to interact more with the core code in
>>>>>>>>>>>
>>>>>>>>>> the
>>>>>>>
>>>>>>>> future)
>>>>>>>>>>> I try to preserve the history of those modules when splitting
>>>>>>>>>>>
>>>>>>>>>> them
>>>>>>
>>>>>>> into
>>>>>>>>
>>>>>>>>> the
>>>>>>>>>>
>>>>>>>>>>> new repo
>>>>>>>>>>> 4. I'll close all pull requests against those modules in the main
>>>>>>>>>>>
>>>>>>>>>> repo.
>>>>>>>>
>>>>>>>>> 5. I'll set up a minimal documentation page for the library
>>>>>>>>>>>
>>>>>>>>>> repository,
>>>>>>>>
>>>>>>>>> similar to the main documentation.
>>>>>>>>>>> 6. I'll update the documentation build process to build both
>>>>>>>>>>>
>>>>>>>>>> documentations
>>>>>>>>>>
>>>>>>>>>>> & link them to each other
>>>>>>>>>>> 7. I'll update the nightly deployment process to include both
>>>>>>>>>>>
>>>>>>>>>> repositories
>>>>>>>>>>
>>>>>>>>>>> 8. I'll update the release script to create the Flink release out
>>>>>>>>>>>
>>>>>>>>>> of
>>>>>>>
>>>>>>>> both
>>>>>>>>>
>>>>>>>>>> repositories. In order to put the libraries into the opt/ dir of
>>>>>>>>>>>
>>>>>>>>>> the
>>>>>>>
>>>>>>>> release, I'll need to change the build of "flink-dist" so that it
>>>>>>>>>>>
>>>>>>>>>> first
>>>>>>>>
>>>>>>>>> builds flink core, then the libraries and then the core again
>>>>>>>>>>>
>>>>>>>>>> with
>>>>>>
>>>>>>> the
>>>>>>>>
>>>>>>>>> libraries as an additional dependency.
>>>>>>>>>>>
>>>>>>>>>>> The main question for the community is: do you agree with point
>>>>>>>>>>>
>>>>>>>>>> 3 ?
>>>>>>
>>>>>>> Would
>>>>>>>>>
>>>>>>>>>> you like to include more or less?
>>>>>>>>>>>
>>>>>>>>>>> I'll start with 1. and 2. tomorrow morning.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Mar 15, 2017 at 1:48 PM, Till Rohrmann <
>>>>>>>>>>>
>>>>>>>>>> [hidden email] <mailto:[hidden email]>
>>>>>>>
>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> In theory we could have a merging bot which solves the problem
>>>>>>>>>>>>
>>>>>>>>>>> of
>>>>>>
>>>>>>> the
>>>>>>>>
>>>>>>>>> "commit window". Once the PR passes all tests and has enough
>>>>>>>>>>>>
>>>>>>>>>>> +1s,
>>>>>>
>>>>>>> the
>>>>>>>>
>>>>>>>>> bot
>>>>>>>>>>
>>>>>>>>>>> could do the merging and, thus, it effectively linearizes the
>>>>>>>>>>>>
>>>>>>>>>>> merge
>>>>>>>
>>>>>>>> process.
>>>>>>>>>>>>
>>>>>>>>>>>> I think the second point is actually a disadvantage because
>>>>>>>>>>>>
>>>>>>>>>>> there
>>>>>>
>>>>>>> is
>>>>>>>
>>>>>>>> not
>>>>>>>>>
>>>>>>>>>> such an immediate incentive/pressure to fix the broken module if
>>>>>>>>>>>>
>>>>>>>>>>> it
>>>>>>>
>>>>>>>> lives
>>>>>>>>>>
>>>>>>>>>>> in a separate repository. Furthermore, breaking API changes in
>>>>>>>>>>>>
>>>>>>>>>>> the
>>>>>>
>>>>>>> core
>>>>>>>>>
>>>>>>>>>> will most likely go unnoticed for some time in other modules
>>>>>>>>>>>>
>>>>>>>>>>> which
>>>>>>
>>>>>>> are
>>>>>>>>
>>>>>>>>> not
>>>>>>>>>>
>>>>>>>>>>> developed so actively. In the worst case these things will only
>>>>>>>>>>>>
>>>>>>>>>>> be
>>>>>>
>>>>>>> noticed
>>>>>>>>>>
>>>>>>>>>>> when we try to make a release.
>>>>>>>>>>>>
>>>>>>>>>>>> But I also agree that we are not Google and we don't have the
>>>>>>>>>>>>
>>>>>>>>>>> capacities to
>>>>>>>>>>
>>>>>>>>>>> maintain such a smooth build process that we can keep all the
>>>>>>>>>>>>
>>>>>>>>>>> code
>>>>>>>
>>>>>>>> in
>>>>>>>>>
>>>>>>>>>> a
>>>>>>>>>>
>>>>>>>>>>> single repository.
>>>>>>>>>>>>
>>>>>>>>>>>> I looked a bit into Gradle and as far as I can tell it offers
>>>>>>>>>>>>
>>>>>>>>>>> some
>>>>>>
>>>>>>> nice
>>>>>>>>>
>>>>>>>>>> features wrt incrementally building projects. This would be
>>>>>>>>>>>>
>>>>>>>>>>> beneficial
>>>>>>>>
>>>>>>>>> for
>>>>>>>>>>
>>>>>>>>>>> local development but it would not solve our build time problems
>>>>>>>>>>>>
>>>>>>>>>>> on
>>>>>>>
>>>>>>>> Travis.
>>>>>>>>>>
>>>>>>>>>>> Gradle intends to introduce a task result cache which allows to
>>>>>>>>>>>>
>>>>>>>>>>> reuse
>>>>>>>>
>>>>>>>>> results across builds. This could help when building on Travis,
>>>>>>>>>>>>
>>>>>>>>>>> however, it
>>>>>>>>>>
>>>>>>>>>>> is not yet fully implemented. Moreover, migrating from Maven to
>>>>>>>>>>>>
>>>>>>>>>>> Gradle
>>>>>>>>
>>>>>>>>> won't come for free (there's simply no free lunch out there) and
>>>>>>>>>>>>
>>>>>>>>>>> we
>>>>>>>
>>>>>>>> might
>>>>>>>>>>
>>>>>>>>>>> risk introducing new bugs. Therefore, I would vote to split the
>>>>>>>>>>>>
>>>>>>>>>>> repository
>>>>>>>>>>
>>>>>>>>>>> in order to mitigate our current problems with Travis and the
>>>>>>>>>>>>
>>>>>>>>>>> build
>>>>>>>
>>>>>>>> time in
>>>>>>>>>>
>>>>>>>>>>> general. Whether to use a different build system or not can then
>>>>>>>>>>>>
>>>>>>>>>>> be
>>>>>>>
>>>>>>>> discussed as an orthogonal question.
>>>>>>>>>>>>
>>>>>>>>>>>> Cheers,
>>>>>>>>>>>> Till
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Mar 14, 2017 at 8:05 PM, Stephan Ewen <[hidden email]
>>>>>>>>>>>> <mailto:[hidden email]>
>>>>>>>>>>>>
>>>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Some other thoughts on how repository split would help. I am
>>>>>>>>>>>>>
>>>>>>>>>>>> not
>>>>>>
>>>>>>> sure
>>>>>>>>
>>>>>>>>> for
>>>>>>>>>>
>>>>>>>>>>> all of them, so please comment:
>>>>>>>>>>>>>
>>>>>>>>>>>>> - There is less competition for a "commit window". It happens
>>>>>>>>>>>>>
>>>>>>>>>>>> a
>>>>>>
>>>>>>> lot
>>>>>>>>
>>>>>>>>> already that you run all tests and want to commit, but there
>>>>>>>>>>>>>
>>>>>>>>>>>> was
>>>>>>
>>>>>>> a
>>>>>>>
>>>>>>>> commit
>>>>>>>>>>
>>>>>>>>>>> in the meantime. You rebase, need to re-test, again commit in
>>>>>>>>>>>>>
>>>>>>>>>>>> the
>>>>>>
>>>>>>> meantime.
>>>>>>>>>>>>
>>>>>>>>>>>>>    For a "linear" commit history, this may become a bottleneck
>>>>>>>>>>>>>
>>>>>>>>>>>> eventually
>>>>>>>>>>>>
>>>>>>>>>>>>> as well.
>>>>>>>>>>>>>
>>>>>>>>>>>>> - There is less risk of broken master. If one
>>>>>>>>>>>>>
>>>>>>>>>>>> repository/modules
>>>>>>
>>>>>>> breaks
>>>>>>>>>>
>>>>>>>>>>> its master, the others can still continue.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Stephan
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Fri, Mar 10, 2017 at 12:20 PM, Till Rohrmann <
>>>>>>>>>>>>>
>>>>>>>>>>>> [hidden email] <mailto:[hidden email]>>
>>>>>>>>>
>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks for all your input. In order to wrap the discussion up
>>>>>>>>>>>>>>
>>>>>>>>>>>>> I'd
>>>>>>>
>>>>>>>> like
>>>>>>>>>
>>>>>>>>>> to
>>>>>>>>>>>>
>>>>>>>>>>>>> summarize the mentioned points:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The problem of increasing build times and complexity of the
>>>>>>>>>>>>>>
>>>>>>>>>>>>> project
>>>>>>>>
>>>>>>>>> has
>>>>>>>>>>
>>>>>>>>>>> been acknowledged. Ideally we would have everything in one
>>>>>>>>>>>>>>
>>>>>>>>>>>>> repository
>>>>>>>>>
>>>>>>>>>> using
>>>>>>>>>>>>>
>>>>>>>>>>>>>> an incremental build tool. Since Maven does not properly
>>>>>>>>>>>>>>
>>>>>>>>>>>>> support
>>>>>>
>>>>>>> this
>>>>>>>>>
>>>>>>>>>> we
>>>>>>>>>>>>
>>>>>>>>>>>>> would have to switch our build tool to something like Gradle,
>>>>>>>>>>>>>>
>>>>>>>>>>>>> for
>>>>>>>
>>>>>>>> example.
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Another option is introducing build profiles for different
>>>>>>>>>>>>>>
>>>>>>>>>>>>> sets
>>>>>>
>>>>>>> of
>>>>>>>
>>>>>>>> modules
>>>>>>>>>>>>>
>>>>>>>>>>>>>> as well as separating integration and unit tests. The third
>>>>>>>>>>>>>>
>>>>>>>>>>>>> alternative
>>>>>>>>>>
>>>>>>>>>>> would be creating sub-projects with their own repositories. I
>>>>>>>>>>>>>>
>>>>>>>>>>>>> actually
>>>>>>>>>
>>>>>>>>>> think that these two proposals are not necessarily exclusive
>>>>>>>>>>>>>>
>>>>>>>>>>>>> and
>>>>>>
>>>>>>> it
>>>>>>>
>>>>>>>> would
>>>>>>>>>>>>
>>>>>>>>>>>>> also make sense to have a separation between unit and
>>>>>>>>>>>>>>
>>>>>>>>>>>>> integration
>>>>>>>
>>>>>>>> tests
>>>>>>>>>>
>>>>>>>>>>> if
>>>>>>>>>>>>>
>>>>>>>>>>>>>> we split the repository.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The overall consensus seems to be that we don't want to split
>>>>>>>>>>>>>>
>>>>>>>>>>>>> the
>>>>>>>
>>>>>>>> community
>>>>>>>>>>>>>
>>>>>>>>>>>>>> and want to keep everything under the same umbrella. I think
>>>>>>>>>>>>>>
>>>>>>>>>>>>> this
>>>>>>>
>>>>>>>> is
>>>>>>>>
>>>>>>>>> the
>>>>>>>>>>>>
>>>>>>>>>>>>> right way to go, because otherwise some parts of the project
>>>>>>>>>>>>>>
>>>>>>>>>>>>> could
>>>>>>>
>>>>>>>> become
>>>>>>>>>>>>
>>>>>>>>>>>>> second class citizens. Given that and that we continue using
>>>>>>>>>>>>>>
>>>>>>>>>>>>> Maven,
>>>>>>>>
>>>>>>>>> I
>>>>>>>>>
>>>>>>>>>> still
>>>>>>>>>>>>>
>>>>>>>>>>>>>> think that creating sub-projects for the libraries, for
>>>>>>>>>>>>>>
>>>>>>>>>>>>> example,
>>>>>>
>>>>>>> could
>>>>>>>>>
>>>>>>>>>> be
>>>>>>>>>>>>
>>>>>>>>>>>>> beneficial. A split could reduce the project's complexity and
>>>>>>>>>>>>>>
>>>>>>>>>>>>> make
>>>>>>>
>>>>>>>> it
>>>>>>>>>
>>>>>>>>>> potentially easier for libraries to get actively developed.
>>>>>>>>>>>>>>
>>>>>>>>>>>>> The
>>>>>>
>>>>>>> main
>>>>>>>>
>>>>>>>>> concern is setting up the build infrastructure to aggregate
>>>>>>>>>>>>>>
>>>>>>>>>>>>> docs
>>>>>>
>>>>>>> from
>>>>>>>>>
>>>>>>>>>> multiple repositories and making them publicly available.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Since I started this thread and I would really like to see
>>>>>>>>>>>>>>
>>>>>>>>>>>>> Flink's
>>>>>>>
>>>>>>>> ML
>>>>>>>>>
>>>>>>>>>> library being revived again, I'd volunteer investigating first
>>>>>>>>>>>>>>
>>>>>>>>>>>>> whether
>>>>>>>>>
>>>>>>>>>> it
>>>>>>>>>>>>
>>>>>>>>>>>>> is doable establishing a proper incremental build for Flink.
>>>>>>>>>>>>>>
>>>>>>>>>>>>> If
>>>>>>
>>>>>>> that
>>>>>>>>
>>>>>>>>> should
>>>>>>>>>>>>>
>>>>>>>>>>>>>> not be possible, I will look into splitting the repository,
>>>>>>>>>>>>>>
>>>>>>>>>>>>> first
>>>>>>>
>>>>>>>> only
>>>>>>>>>
>>>>>>>>>> for
>>>>>>>>>>>>>
>>>>>>>>>>>>>> the libraries. I'll share my results with the community once
>>>>>>>>>>>>>>
>>>>>>>>>>>>> I'm
>>>>>>
>>>>>>> done
>>>>>>>>>
>>>>>>>>>> with
>>>>>>>>>>>>>
>>>>>>>>>>>>>> the investigation.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>> Till
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Fri, Feb 24, 2017 at 3:50 PM, Robert Metzger <
>>>>>>>>>>>>>>
>>>>>>>>>>>>> [hidden email] <mailto:[hidden email]>>
>>>>>>>>>
>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> @Jin Mingjian: You can not use the paid travis version for
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> open
>>>>>>
>>>>>>> source
>>>>>>>>>>>>
>>>>>>>>>>>>> projects. It only works for private repositories (at least
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> back
>>>>>>
>>>>>>> then
>>>>>>>>>
>>>>>>>>>> when
>>>>>>>>>>>>>
>>>>>>>>>>>>>> we've asked them about that).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> @Stephan: I don't think that incremental builds will be
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> available
>>>>>>>
>>>>>>>> with
>>>>>>>>>>>>
>>>>>>>>>>>>> Maven anytime soon.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I agree that we need to fix the build time issue on Travis.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I've
>>>>>>>
>>>>>>>> recently
>>>>>>>>>>>>>
>>>>>>>>>>>>>> pushed a commit to use now three instead of two test groups.
>>>>>>>>>>>>>>> But I don't think that this is a feasible long-term solution.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> If this discussion is only about reducing the build and test
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> time,
>>>>>>>>
>>>>>>>>> introducing build profiles for different components as
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Aljoscha
>>>>>>
>>>>>>> suggested
>>>>>>>>>>>>>
>>>>>>>>>>>>>> would solve the problem Till mentioned.
>>>>>>>>>>>>>>> Also, if we decide that travis is not a good tool anymore for
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> the
>>>>>>>
>>>>>>>> testing,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I guess we can find a different solution. There are now
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> competitors
>>>>>>>>
>>>>>>>>> to
>>>>>>>>>>>>
>>>>>>>>>>>>> Travis that might be willing to offer a paid plan for an open
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> source
>>>>>>>>>
>>>>>>>>>> project, or we set up our own infra on a server sponsored by
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> one
>>>>>>>
>>>>>>>> of
>>>>>>>>
>>>>>>>>> the
>>>>>>>>>>>>
>>>>>>>>>>>>> contributing companies.
>>>>>>>>>>>>>>> If we want to solve "community issues" with the change as
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> well,
>>>>>>
>>>>>>> then
>>>>>>>>>
>>>>>>>>>> I
>>>>>>>>>>>>
>>>>>>>>>>>>> think it's worth the effort of splitting up Flink into
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> different
>>>>>>
>>>>>>> repositories.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Splitting up repositories is not a trivial task in my
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> opinion.
>>>>>>
>>>>>>> As
>>>>>>>
>>>>>>>> others
>>>>>>>>>>>>>
>>>>>>>>>>>>>> have mentioned before, we need to consider the following
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> things:
>>>>>>>
>>>>>>>> - How are we going to build the documentation? Ideally every
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> repo
>>>>>>>
>>>>>>>> should
>>>>>>>>>>>>>
>>>>>>>>>>>>>> contain its docs, so we would need to pull them together when
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> building
>>>>>>>>>>>>
>>>>>>>>>>>>> the
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> main docs.
>>>>>>>>>>>>>>> - How do we organize the dependencies? If we have library
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> repository
>>>>>>>
>>>>>>>> depend
>>>>>>>>>>>>>
>>>>>>>>>>>>>> on
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> snapshot Flink versions, we need to make sure that the
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> snapshot
>>>>>>
>>>>>>> deployment
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> always works. This also means that people working on a
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> library
>>>>>>
>>>>>>> repository
>>>>>>>>>>>>>
>>>>>>>>>>>>>> will pull from snapshot OR need to build first locally.
>>>>>>>>>>>>>>> - We need to update the release scripts
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> If we commit to do these changes, we need to assign at least
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> one
>>>>>>>
>>>>>>>> committer
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> (yes, in this case we need somebody who can commit, for
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> example
>>>>>>
>>>>>>> for
>>>>>>>>
>>>>>>>>> updating the buildbot stuff) who volunteers to do the change.
>>>>>>>>>>>>>>> I've done a lot of infrastructure work in the past, but I'm
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> currently
>>>>>>>>>
>>>>>>>>>> pretty booked with many other things, so I don't
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> realistically
>>>>>>
>>>>>>> see
>>>>>>>>
>>>>>>>>> myself
>>>>>>>>>>>>>
>>>>>>>>>>>>>> doing that. Max who used to work on these things is taking
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> some
>>>>>>
>>>>>>> time
>>>>>>>>>
>>>>>>>>>> off.
>>>>>>>>>>>>>
>>>>>>>>>>>>>> I think we need, best case 3 days for the change, worst case
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 5
>>>>>>
>>>>>>> days.
>>>>>>>>>
>>>>>>>>>> The
>>>>>>>>>>>>>
>>>>>>>>>>>>>> problem is that there are no "unit tests" for the infra
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> stuff,
>>>>>
>>>>>

Re: [DISCUSS] Project build time and possible restructuring

Timo Walther-2
Another solution would be to make the Travis builds more efficient. For
example, we could write a script that determines the modified Maven
modules and only runs the tests for those modules (and maybe transitive
dependencies). PRs for libraries such as Gelly, Table, CEP or connectors
would no longer trigger a compilation of the entire stack. Of course
this would not solve all problems, but many of them.
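
To make the idea more concrete, here is a minimal sketch of such a script in Python. It is only an illustration under stated assumptions: the function names (modules_for_changes, maven_command) and the way the module directories are passed in are hypothetical, and the list of changed files is assumed to come from something like "git diff --name-only". The only Maven options used are -pl (build the listed modules) and -amd (also build the modules that depend on them).

```python
from pathlib import PurePosixPath


def modules_for_changes(changed_files, module_dirs):
    """Map each changed file to the deepest known module directory above it.

    changed_files: iterable of repo-relative paths (assumed input).
    module_dirs:   set of repo-relative directories that contain a pom.xml
                   (assumed to be collected beforehand).
    Returns a sorted list of module directories; "." means "whole project".
    """
    modules = set()
    for path in changed_files:
        parts = PurePosixPath(path).parts
        # Try the longest directory prefix first so nested modules win.
        for end in range(len(parts) - 1, 0, -1):
            candidate = "/".join(parts[:end])
            if candidate in module_dirs:
                modules.add(candidate)
                break
        else:
            # No module owns this file (e.g. the root pom.xml or docs):
            # fall back to building everything.
            modules.add(".")
    return sorted(modules)


def maven_command(modules):
    """Build the Maven invocation for the affected modules."""
    if not modules or "." in modules:
        return ["mvn", "verify"]
    # -pl selects the modules, -amd adds the modules depending on them.
    return ["mvn", "verify", "-pl", ",".join(modules), "-amd"]


if __name__ == "__main__":
    known = {"flink-libraries", "flink-libraries/flink-gelly",
             "flink-libraries/flink-cep"}
    changed = ["flink-libraries/flink-gelly/src/main/java/Foo.java"]
    print(" ".join(maven_command(modules_for_changes(changed, known))))
```

A change that touches a file outside any known module falls back to the full build, so the script can only skip work, never tests that a full build would have run for that change. The upstream modules would still have to be available to the build, for example from the snapshot repository or by adding -am.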

What do you think about this?



Am 20/03/17 um 14:02 schrieb Robert Metzger:

> Aljoscha, do you know how to configure jenkins?
> Is Apache INFRA doing that, or are the beam people doing that themselves?
>
> One downside of Jenkins is that we probably need some machines that execute
> the tests. A Travis container has 2 CPU cores and 4 GB main memory. We
> currently have 10 such containers available on travis concurrently. I think
> we would need at least the same amount on Jenkins.
>
>
> On Mon, Mar 20, 2017 at 1:48 PM, Timo Walther <[hidden email]> wrote:
>
>> I agree with Aljoscha that we might consider moving from Travis to
>> Jenkins. Is there any disadvantage in using Jenkins?
>>
>> I think we should structure the project according to release management
>> (e.g. more frequent releases of libraries) or other criteria (e.g. core and
>> non-core) instead of build time. What would happen if the build of another
>> submodule would become too long, would we split/restructure again and
>> again? If Jenkins solves all our problems we should use it.
>>
>> Regards,
>> Timo
>>
>>
>>
>> Am 20/03/17 um 12:21 schrieb Aljoscha Krettek:
>>
>>> I prefer Jenkins to Travis by far. Working on Beam, where we have good
>>> Jenkins integration, has opened my eyes to what is possible with good CI
>>> integration.
>>>
>>> For example, look at this recent Beam PR:
>>> https://github.com/apache/beam/pull/2263. The
>>> Jenkins-Github integration will tell you exactly which tests failed and if
>>> you click on the links you can look at the log output/std out of the tests
>>> in question.
>>>
>>> This is the overview page of one of the Jenkins Jobs that we have in
>>> Beam:
>>> https://builds.apache.org/job/beam_PostCommit_Java_RunnableOnService_Flink/
>>> This is an example of a stable build:
>>> https://builds.apache.org/job/beam_PostCommit_Java_RunnableOnService_Flink/lastStableBuild/
>>> Notice how it gives you fine grained information about the Maven run.
>>> This is an unstable run:
>>> https://builds.apache.org/job/beam_PostCommit_Java_RunnableOnService_Flink/lastUnstableBuild/
>>> There you can see which tests failed and you can easily drill down.
>>>
>>> Best,
>>> Aljoscha
>>>
>>> On 20 Mar 2017, at 11:46, Robert Metzger <[hidden email]> wrote:
>>>> Thank you for looking into the build times.
>>>>
>>>> I didn't know that the build time situation is so bad. Even with yarn,
>>>> mesos, connectors and libraries removed, we are still running into the
>>>> build timeout :(
>>>>
>>>> Aljoscha told me that the Beam community is using Jenkins for running
>>>> the tests, and they are planning to completely move away from Travis. I
>>>> wonder whether we should do the same, as having our own Jenkins servers
>>>> would allow us to run tests for more than 50 minutes.
>>>>
>>>> I agree with Stephan that we should keep the yarn and mesos tests in the
>>>> core for stability / testing quality purposes.
>>>>
>>>>
>>>> On Mon, Mar 20, 2017 at 11:27 AM, Stephan Ewen <[hidden email]
>>>> <mailto:[hidden email]>> wrote:
>>>> @Greg
>>>>
>>>> I am personally in favor of splitting "connectors" and "contrib" out as
>>>> well. I know that @rmetzger has some reservations about the connectors,
>>>> but
>>>> we may be able to convince him.
>>>>
>>>> For the cluster tests (yarn / mesos) - in the past there were many cases
>>>> where these tests caught cases that other tests did not, because they are
>>>> the only tests that actually use the "flink-dist.jar" and thus discover
>>>> many dependency and configuration issues. For that reason, my feeling
>>>> would
>>>> be that they are valuable in the core repository.
>>>>
>>>> I would actually suggest to do only the library split initially, to see
>>>> what the challenges are in setting up the multi-repo build and release
>>>> tooling. Once we gathered experience there, we can probably easily see
>>>> what
>>>> else we can split out.
>>>>
>>>> Stephan
>>>>
>>>>
>>>> On Fri, Mar 17, 2017 at 8:37 PM, Greg Hogan <[hidden email] <mailto:
>>>> [hidden email]>> wrote:
>>>>
>>>> I’d like to use this refactoring opportunity to unsplit the Travis tests.
>>>>> With 51 builds queued up for the weekend (some of which may fail or have
>>>>> been force pushed) we are at the limit of the number of contributions we
>>>>> can process. Fixing this requires 1) splitting the project, 2)
>>>>> investigating speedups for long-running tests, and 3) staying cognizant
>>>>> of
>>>>> test performance when accepting new code.
>>>>>
>>>>> I’d like to add one to Stephan’s list of module groups. I like that the
>>>>> modules are generic (“libraries”) so that no one module is alone and
>>>>> independent.
>>>>>
>>>>> Flink has three “libraries”: cep, ml, and gelly.
>>>>>
>>>>> “connectors” is a hotspot due to the long-running Kafka tests (and
>>>>> connectors for three Kafka versions).
>>>>>
>>>>> Both flink-storm and flink-python have a modest number of
>>>>> tests
>>>>> and could live with the miscellaneous modules in “contrib”.
>>>>>
>>>>> The YARN tests are long-running and problematic (I am unable to
>>>>> successfully run these locally). A “cluster” module could host
>>>>> flink-mesos,
>>>>> flink-yarn, and flink-yarn-tests.
>>>>>
>>>>> That gets us close to running all tests in a single Travis build.
>>>>>     https://travis-ci.org/greghogan/flink/builds/212122590
>>>>>
>>>>> I also tested (https://github.com/greghogan/flink/commits/core_build)
>>>>> with a maven parallelism of 2 and 4, with the latter a 6.4% drop in
>>>>> build time.
>>>>>     https://travis-ci.org/greghogan/flink/builds/212137659
>>>>>     https://travis-ci.org/greghogan/flink/builds/212154470
>>>>>
>>>>> We can run Travis CI builds nightly to guard against breaking changes.
>>>>>
>>>>> I also wanted to get an idea of how disruptive it would be to developers
>>>>> to divide the project into multiple git repos. I wrote a simple python
>>>>> script and configured it with the module partitions listed above. The
>>>>> usage
>>>>> string from the top of the file lists commits with files from multiple
>>>>> partitions as well as the modified files.
>>>>>     https://gist.github.com/greghogan/f38a8efe6b6dd5a162a6b43335ac4897
>>>>>
>>>>> Accounting for the merging of the batch and streaming connector modules,
>>>>> and assuming that the project structure has not changed much over the
>>>>> past
>>>>> 15 months, for the following date ranges the listed number of commits
>>>>> would
>>>>> have been split across repositories.
>>>>>
>>>>> since "2017-01-01"
>>>>> 36 of 571 commits were mixed
>>>>>
>>>>> since "2016-07-01"
>>>>> 155 of 1607 commits were mixed
>>>>>
>>>>> since "2016-01-01"
>>>>> 272 of 2561 commits were mixed
>>>>>
>>>>> Greg
>>>>>
>>>>>
>>>>> On Mar 15, 2017, at 1:13 PM, Stephan Ewen <[hidden email]> wrote:
>>>>>
>>>>>> @Robert - I think once we know that a separate git repo works well, and
>>>>>> that it actually solves problems, I see no reason to not create a
>>>>>> connectors repository later. The infrastructure changes should be identical
>>>>>> for two or more repositories.
>>>>>>
>>>>>> On Wed, Mar 15, 2017 at 5:22 PM, Till Rohrmann <[hidden email]
>>>>>> <mailto:[hidden email]>>
>>>>>>
>>>>> wrote:
>>>>>
>>>>>> I think it should not be at least the flink-dist but exactly the
>>>>>> remaining
>>>>>> flink-dist module. Otherwise we do redundant work.
>>>>>>> On Wed, Mar 15, 2017 at 5:03 PM, Robert Metzger <[hidden email]
>>>>>>> <mailto:[hidden email]>>
>>>>>>> wrote:
>>>>>>>
>>>>>>> "flink-core" means the main repository, not the "flink-core" module.
>>>>>>>> When doing a release, we need to build the flink main code first,
>>>>>>>>
>>>>>>> because
>>>>>> the flink-libraries depend on that.
>>>>>>>> Once the "flink-libraries" are built, we need to run the main build
>>>>>>>>
>>>>>>> again
>>>>>> (at least the flink-dist module), so that it is pulling the artifacts
>>>>>>> from
>>>>>>>
>>>>>>>> the flink-libraries to put them into the opt/ folder of the final
>>>>>>>>
>>>>>>> artifact.
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Mar 15, 2017 at 4:44 PM, Till Rohrmann <[hidden email]
>>>>>>>> <mailto:[hidden email]>>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> I'm ok with point 3.
>>>>>>>>> Concerning point 8: Why do we have to build flink-core twice after
>>>>>>>>>
>>>>>>>> having
>>>>>>>> it built as a dependency for flink-libraries? This seems wrong to me.
>>>>>>>>> Cheers,
>>>>>>>>> Till
>>>>>>>>>
>>>>>>>>> On Wed, Mar 15, 2017 at 4:23 PM, Robert Metzger <
>>>>>>>>> [hidden email] <mailto:[hidden email]>>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> Thank you. Running on AWS is a good idea!
>>>>>>>>>> Let me know if you (or anybody else) wants to help me with the
>>>>>>>>>> infrastructure work! Any help is much appreciated (as I've said
>>>>>>>>>>
>>>>>>>>> before, I
>>>>>>>>> don't really have time for doing this, but it has to be done :) )
>>>>>>>>>> I'm against creating two new repositories. I fear that this
>>>>>>>>>>
>>>>>>>>> introduces
>>>>>>>> too
>>>>>>>>>> much complexity and too many repositories.
>>>>>>>>>> "flink" and "flink-libraries" are hopefully enough to get the build
>>>>>>>>>>
>>>>>>>>> time
>>>>>>>>> significantly down.
>>>>>>>>>> We can also consider putting the connectors into the
>>>>>>>>>>
>>>>>>>>> "flink-libraries"
>>>>>>>> repo
>>>>>>>>>> if we need to further reduce the build time.
>>>>>>>>>>
>>>>>>>>>> We should probably move "flink-table" out of "flink-libraries" if
>>>>>>>>>> we
>>>>>>>>>>
>>>>>>>>> want
>>>>>>>>> to keep "flink-table" in the main repo. (This would eliminate the
>>>>>>>>>> "flink-libraries" module from main.
>>>>>>>>>>
>>>>>>>>>> Also, I agree that "flink-statebackend-rocksdb" is not correctly
>>>>>>>>>>
>>>>>>>>> placed
>>>>>>>> in
>>>>>>>>>> contrib anymore.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Wed, Mar 15, 2017 at 4:07 PM, Greg Hogan <[hidden email]
>>>>>>>>>> <mailto:[hidden email]>>
>>>>>>>>>>
>>>>>>>>> wrote:
>>>>>>>>> Robert, appreciate your kickstarting this task.
>>>>>>>>>>> We should compare the verification time with and without the
>>>>>>>>>>> listed
>>>>>>>>>>> modules. I’ll try to run this by tomorrow on AWS and on Travis.
>>>>>>>>>>>
>>>>>>>>>>> Should we maintain separate repos for flink-contrib and
>>>>>>>>>>>
>>>>>>>>>> flink-libraries?
>>>>>>>>>> Are you intending that we move flink-table out of flink-libraries
>>>>>>>>>> (and
>>>>>>>>> perhaps flink-statebackend-rocksdb out of flink-contrib)?
>>>>>>>>>>> Greg
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Mar 15, 2017, at 9:55 AM, Robert Metzger <[hidden email]
>>>>>>>>>>>> <mailto:[hidden email]>
>>>>>>>>>>>>
>>>>>>>>>>> wrote:
>>>>>>>>>>> Thank you for looking into this Till.
>>>>>>>>>>>> I think we then have to split the repositories.
>>>>>>>>>>>> My main motivation for doing this is that it seems to be the only
>>>>>>>>>>>>
>>>>>>>>>>> feasible
>>>>>>>>>>>
>>>>>>>>>>>> way of scaling the community to allow more committers working on
>>>>>>>>>>>>
>>>>>>>>>>> the
>>>>>>>>> libraries.
>>>>>>>>>>>> I'll take care of getting things started.
>>>>>>>>>>>>
>>>>>>>>>>>> As the next steps I propose to:
>>>>>>>>>>>> 1. Ask INFRA to rename https://git-wip-us.apache.org/ <
>>>>>>>>>>>> https://git-wip-us.apache.org/>
>>>>>>>>>>>>
>>>>>>>>>>> repos/asf?p=flink-
>>>>>>>>>>> connectors.git;a=summary to "flink-libraries"
>>>>>>>>>>>> 2. Ask INFRA to set up GitHub and travis integration for
>>>>>>>>>>>>
>>>>>>>>>>> "flink-libraries"
>>>>>>>>>>>
>>>>>>>>>>>> 3. Put the code of "flink-ml", "flink-gelly", "flink-python",
>>>>>>>>>>>>
>>>>>>>>>>> "flink-cep",
>>>>>>>>>>>
>>>>>>>>>>>> "flink-scala-shell", "flink-storm" into the new repository. (I
>>>>>>>>>>>>
>>>>>>>>>>> decided
>>>>>>>>>> against moving flink-contrib there, because rocksdb is in the
>>>>>>>>>>> contrib
>>>>>>>>> module, for flink-table, I'm undecided, but I kept it in the main
>>>>>>>>>>> repo
>>>>>>>>>> because its probably going to interact more with the core code in
>>>>>>>>>>> the
>>>>>>>>> future)
>>>>>>>>>>>> I try to preserve the history of those modules when splitting
>>>>>>>>>>>>
>>>>>>>>>>> them
>>>>>>>> into
>>>>>>>>>> the
>>>>>>>>>>>> new repo
>>>>>>>>>>>> 4. I'll close all pull requests against those modules in the main
>>>>>>>>>>>>
>>>>>>>>>>> repo.
>>>>>>>>>> 5. I'll set up a minimal documentation page for the library
>>>>>>>>>>> repository,
>>>>>>>>>> similar to the main documentation.
>>>>>>>>>>>> 6. I'll update the documentation build process to build both
>>>>>>>>>>>>
>>>>>>>>>>> documentations
>>>>>>>>>>>
>>>>>>>>>>>> & link them to each other
>>>>>>>>>>>> 7. I'll update the nightly deployment process to include both
>>>>>>>>>>>>
>>>>>>>>>>> repositories
>>>>>>>>>>>
>>>>>>>>>>>> 8. I'll update the release script to create the Flink release out
>>>>>>>>>>>>
>>>>>>>>>>> of
>>>>>>>>> both
>>>>>>>>>>> repositories. In order to put the libraries into the opt/ dir of
>>>>>>>>>>> the
>>>>>>>>> release, I'll need to change the build of "flink-dist" so that it
>>>>>>>>>>> first
>>>>>>>>>> builds flink core, then the libraries and then the core again
>>>>>>>>>>> with
>>>>>>>> the
>>>>>>>>>> libraries as an additional dependency.
>>>>>>>>>>>> The main question for the community is: do you agree with point
>>>>>>>>>>>>
>>>>>>>>>>> 3 ?
>>>>>>>> Would
>>>>>>>>>>> you like to include more or less?
>>>>>>>>>>>> I'll start with 1. and 2. tomorrow morning.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Mar 15, 2017 at 1:48 PM, Till Rohrmann <
>>>>>>>>>>>>
>>>>>>>>>>> [hidden email] <mailto:[hidden email]>
>>>>>>>>> wrote:
>>>>>>>>>>>> In theory we could have a merging bot which solves the problem
>>>>>>>>>>>> of
>>>>>>>> the
>>>>>>>>>> "commit window". Once the PR passes all tests and has enough
>>>>>>>>>>>> +1s,
>>>>>>>> the
>>>>>>>>>> bot
>>>>>>>>>>>> could do the merging and, thus, it effectively linearizes the
>>>>>>>>>>>> merge
>>>>>>>>> process.
>>>>>>>>>>>>> I think the second point is actually a disadvantage because
>>>>>>>>>>>>>
>>>>>>>>>>>> there
>>>>>>>> is
>>>>>>>>
>>>>>>>>> not
>>>>>>>>>>> such an immediate incentive/pressure to fix the broken module if
>>>>>>>>>>>> it
>>>>>>>>> lives
>>>>>>>>>>>> in a separate repository. Furthermore, breaking API changes in
>>>>>>>>>>>> the
>>>>>>>> core
>>>>>>>>>>> will most likely go unnoticed for some time in other modules
>>>>>>>>>>>> which
>>>>>>>> are
>>>>>>>>>> not
>>>>>>>>>>>> developed so actively. In the worst case these things will only
>>>>>>>>>>>> be
>>>>>>>> noticed
>>>>>>>>>>>> when we try to make a release.
>>>>>>>>>>>>> But I also agree that we are not Google and we don't have the
>>>>>>>>>>>>>
>>>>>>>>>>>> capacities to
>>>>>>>>>>>> maintain such a smooth build process that we can keep all the
>>>>>>>>>>>> code
>>>>>>>>> in
>>>>>>>>>>> a
>>>>>>>>>>>
>>>>>>>>>>>> single repository.
>>>>>>>>>>>>> I looked a bit into Gradle and as far as I can tell it offers
>>>>>>>>>>>>>
>>>>>>>>>>>> some
>>>>>>>> nice
>>>>>>>>>>> features wrt incrementally building projects. This would be
>>>>>>>>>>>> beneficial
>>>>>>>>>> for
>>>>>>>>>>>> local development but it would not solve our build time problems
>>>>>>>>>>>> on
>>>>>>>>> Travis.
>>>>>>>>>>>> Gradle intends to introduce a task result cache which allows to
>>>>>>>>>>>> reuse
>>>>>>>>>> results across builds. This could help when building on Travis,
>>>>>>>>>>>> however, it
>>>>>>>>>>>> is not yet fully implemented. Moreover, migrating from Maven to
>>>>>>>>>>>> Gradle
>>>>>>>>>> won't come for free (there's simply no free lunch out there) and
>>>>>>>>>>>> we
>>>>>>>>> might
>>>>>>>>>>>> risk to introduce new bugs. Therefore, I would vote to split the
>>>>>>>>>>>> repository
>>>>>>>>>>>> in order to mitigate our current problems with Travis and the
>>>>>>>>>>>> build
>>>>>>>>> time in
>>>>>>>>>>>> general. Whether to use a different build system or not can then
>>>>>>>>>>>> be
>>>>>>>>> discussed as an orthogonal question.
>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>> Till
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Tue, Mar 14, 2017 at 8:05 PM, Stephan Ewen <[hidden email]
>>>>>>>>>>>>> <mailto:[hidden email]>
>>>>>>>>>>>>>
>>>>>>>>>>>> wrote:
>>>>>>>>>>> Some other thoughts on how repository split would help. I am
>>>>>>>>>>>>> not
>>>>>>>> sure
>>>>>>>>>> for
>>>>>>>>>>>> all of them, so please comment:
>>>>>>>>>>>>>> - There is less competition for a "commit window". It happens
>>>>>>>>>>>>>>
>>>>>>>>>>>>> a
>>>>>>>> lot
>>>>>>>>>> already that you run all tests and want to commit, but there
>>>>>>>>>>>>> was
>>>>>>>> a
>>>>>>>>
>>>>>>>>> commit
>>>>>>>>>>>> in the meantime. You rebase, need to re-test, again commit in
>>>>>>>>>>>>> the
>>>>>>>> meantime.
>>>>>>>>>>>>>>     For a "linear" commit history, this may become a bottleneck
>>>>>>>>>>>>>>
>>>>>>>>>>>>> eventually
>>>>>>>>>>>>>
>>>>>>>>>>>>>> as well.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> - There is less risk of broken master. If one
>>>>>>>>>>>>>>
>>>>>>>>>>>>> repository/modules
>>>>>>>> breaks
>>>>>>>>>>>> its master, the others can still continue.
>>>>>>>>>>>>>> Stephan
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Fri, Mar 10, 2017 at 12:20 PM, Till Rohrmann <
>>>>>>>>>>>>>>
>>>>>>>>>>>>> [hidden email] <mailto:[hidden email]>>
>>>>>>>>>>> wrote:
>>>>>>>>>>>>>> Thanks for all your input. In order to wrap the discussion up
>>>>>>>>>>>>>> I'd
>>>>>>>>> like
>>>>>>>>>>> to
>>>>>>>>>>>>>> summarize the mentioned points:
>>>>>>>>>>>>>>> The problem of increasing build times and complexity of the
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> project
>>>>>>>>>> has
>>>>>>>>>>>> been acknowledged. Ideally we would have everything in one
>>>>>>>>>>>>>> repository
>>>>>>>>>>> using
>>>>>>>>>>>>>>> an incremental build tool. Since Maven does not properly
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> support
>>>>>>>> this
>>>>>>>>>>> we
>>>>>>>>>>>>>> would have to switch our build tool to something like Gradle,
>>>>>>>>>>>>>> for
>>>>>>>>> example.
>>>>>>>>>>>>>>> Another option is introducing build profiles for different
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> sets
>>>>>>>> of
>>>>>>>>
>>>>>>>>> modules
>>>>>>>>>>>>>>> as well as separating integration and unit tests. The third
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> alternative
>>>>>>>>>>>> would be creating sub-projects with their own repositories. I
>>>>>>>>>>>>>> actually
>>>>>>>>>>> think that these two proposals are not necessarily exclusive
>>>>>>>>>>>>>> and
>>>>>>>> it
>>>>>>>>
>>>>>>>>> would
>>>>>>>>>>>>>> also make sense to have a separation between unit and
>>>>>>>>>>>>>> integration
>>>>>>>>> tests
>>>>>>>>>>>> if
>>>>>>>>>>>>>>> we split the repository.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The overall consensus seems to be that we don't want to split
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> the
>>>>>>>>> community
>>>>>>>>>>>>>>> and want to keep everything under the same umbrella. I think
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> this
>>>>>>>>> is
>>>>>>>>>
>>>>>>>>>> the
>>>>>>>>>>>>>> right way to go, because otherwise some parts of the project
>>>>>>>>>>>>>> could
>>>>>>>>> become
>>>>>>>>>>>>>> second class citizens. Given that and that we continue using
>>>>>>>>>>>>>> Maven,
>>>>>>>>>> I
>>>>>>>>>>
>>>>>>>>>>> still
>>>>>>>>>>>>>>> think that creating sub-projects for the libraries, for
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> example,
>>>>>>>> could
>>>>>>>>>>> be
>>>>>>>>>>>>>> beneficial. A split could reduce the project's complexity and
>>>>>>>>>>>>>> make
>>>>>>>>> it
>>>>>>>>>>> potentially easier for libraries to get actively developed.
>>>>>>>>>>>>>> The
>>>>>>>> main
>>>>>>>>>> concern is setting up the build infrastructure to aggregate
>>>>>>>>>>>>>> docs
>>>>>>>> from
>>>>>>>>>>> multiple repositories and making them publicly available.
>>>>>>>>>>>>>>> Since I started this thread and I would really like to see
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Flink's
>>>>>>>>> ML
>>>>>>>>>>> library being revived again, I'd volunteer investigating first
>>>>>>>>>>>>>> whether
>>>>>>>>>>> it
>>>>>>>>>>>>>> is doable establishing a proper incremental build for Flink.
>>>>>>>>>>>>>> If
>>>>>>>> that
>>>>>>>>>> should
>>>>>>>>>>>>>>> not be possible, I will look into splitting the repository,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> first
>>>>>>>>> only
>>>>>>>>>>> for
>>>>>>>>>>>>>>> the libraries. I'll share my results with the community once
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I'm
>>>>>>>> done
>>>>>>>>>>> with
>>>>>>>>>>>>>>> the investigation.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>> Till
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Fri, Feb 24, 2017 at 3:50 PM, Robert Metzger <
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> [hidden email] <mailto:[hidden email]>>
>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>> @Jin Mingjian: You can not use the paid travis version for
>>>>>>>>>>>>>>> open
>>>>>>>> source
>>>>>>>>>>>>>> projects. It only works for private repositories (at least
>>>>>>>>>>>>>>> back
>>>>>>>> then
>>>>>>>>>>> when
>>>>>>>>>>>>>>> we've asked them about that).
>>>>>>>>>>>>>>>> @Stephan: I don't think that incremental builds will be
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> available
>>>>>>>>> with
>>>>>>>>>>>>>> Maven anytime soon.
>>>>>>>>>>>>>>>> I agree that we need to fix the build time issue on Travis.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I've
>>>>>>>>> recently
>>>>>>>>>>>>>>> pushed a commit to use now three instead of two test groups.
>>>>>>>>>>>>>>>> But I don't think that this is a feasible long-term solution.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> If this discussion is only about reducing the build and test
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> time,
>>>>>>>>>> introducing build profiles for different components as
>>>>>>>>>>>>>>> Aljoscha
>>>>>>>> suggested
>>>>>>>>>>>>>>> would solve the problem Till mentioned.
>>>>>>>>>>>>>>>> Also, if we decide that travis is not a good tool anymore for
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> the
>>>>>>>>> testing,
>>>>>>>>>>>>>>>> I guess we can find a different solution. There are now
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> competitors
>>>>>>>>>> to
>>>>>>>>>>>>>> Travis that might be willing to offer a paid plan for an open
>>>>>>>>>>>>>>> source
>>>>>>>>>>> project, or we set up our own infra on a server sponsored by
>>>>>>>>>>>>>>> one
>>>>>>>>> of
>>>>>>>>>
>>>>>>>>>> the
>>>>>>>>>>>>>> contributing companies.
>>>>>>>>>>>>>>>> If we want to solve "community issues" with the change as
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> well,
>>>>>>>> then
>>>>>>>>>>> I
>>>>>>>>>>>>>> think it's worth the effort of splitting up Flink into
>>>>>>>>>>>>>>> different
>>>>>>>> repositories.
>>>>>>>>>>>>>>>> Splitting up repositories is not a trivial task in my
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> opinion.
>>>>>>>> As
>>>>>>>>
>>>>>>>>> others
>>>>>>>>>>>>>>> have mentioned before, we need to consider the following
>>>>>>>>>>>>>>> things:
>>>>>>>>> - How are we going to build the documentation? Ideally every
>>>>>>>>>>>>>>> repo
>>>>>>>>> should
>>>>>>>>>>>>>>> contain its docs, so we would need to pull them together when
>>>>>>>>>>>>>>> building
>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>> main docs.
>>>>>>>>>>>>>>>> - How do we organize the dependencies? If we have library
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> repository
>>>>>>>>> depend
>>>>>>>>>>>>>>> on
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> snapshot Flink versions, we need to make sure that the
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> snapshot
>>>>>>>> deployment
>>>>>>>>>>>>>>>> always works. This also means that people working on a
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> library
>>>>>>>> repository
>>>>>>>>>>>>>>> will pull from snapshot OR need to build first locally.
>>>>>>>>>>>>>>>> - We need to update the release scripts
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> If we commit to do these changes, we need to assign at least
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> one
>>>>>>>>> committer
>>>>>>>>>>>>>>>> (yes, in this case we need somebody who can commit, for
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> example
>>>>>>>> for
>>>>>>>>>> updating the buildbot stuff) who volunteers to do the change.
>>>>>>>>>>>>>>>> I've done a lot of infrastructure work in the past, but I'm
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> currently
>>>>>>>>>>> pretty booked with many other things, so I don't
>>>>>>>>>>>>>>> realistically
>>>>>>>> see
>>>>>>>>>> myself
>>>>>>>>>>>>>>> doing that. Max who used to work on these things is taking
>>>>>>>>>>>>>>> some
>>>>>>>> time
>>>>>>>>>>> off.
>>>>>>>>>>>>>>> I think we need, best case 3 days for the change, worst case
>>>>>>>>>>>>>>> 5
>>>>>>>> days.
>>>>>>>>>>> The
>>>>>>>>>>>>>>> problem is that there are no "unit tests" for the infra
>>>>>>>>>>>>>>> stuff,
>>>>>>


Re: [DISCUSS] Project build time and possible restructuring

Robert Metzger
It looks like JetBrains TeamCity supports something in that direction:
https://blog.jetbrains.com/teamcity/2012/03/incremental-building-with-maven-and-teamcity/
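
For reference, the script idea from Timo's message quoted below could look roughly like this. It is only a minimal sketch under assumptions: the function names, the `origin/master` base ref and the module directories in the usage note are made up for illustration, and nothing like this exists in the Flink repository. `-pl`, `-am` and `-amd` are the standard Maven options for selecting modules, their upstream dependencies and their downstream dependents.

```python
import subprocess
from pathlib import PurePosixPath


def changed_files(base="origin/master"):
    """Return paths changed between `base` and HEAD (needs a git checkout).

    The default base ref is an assumption; a CI job would pass its own.
    """
    out = subprocess.run(
        ["git", "diff", "--name-only", f"{base}...HEAD"],
        check=True, capture_output=True, text=True,
    ).stdout
    return [line for line in out.splitlines() if line]


def owning_module(path, module_dirs):
    """Return the deepest module directory that contains `path`, or None."""
    # PurePosixPath.parents is ordered from the deepest directory upwards.
    for parent in PurePosixPath(path).parents:
        if str(parent) in module_dirs:
            return str(parent)
    return None


def modules_to_build(paths, module_dirs):
    """Map changed paths to modules; None means 'build everything'."""
    modules = set()
    for path in paths:
        module = owning_module(path, module_dirs)
        if module is None:
            # A file outside every known module changed (e.g. the root pom.xml).
            return None
        modules.add(module)
    return sorted(modules)


def maven_command(modules):
    """Build the Maven invocation for the selected modules."""
    if not modules:
        return ["mvn", "verify"]  # fall back to the full build
    return ["mvn", "verify", "-pl", ",".join(modules), "-am", "-amd"]
```

A Travis job could then call `maven_command(modules_to_build(changed_files(), module_dirs))`, where `module_dirs` is a set such as `{"flink-core", "flink-libraries/flink-gelly"}`. Any change outside a known module falls back to the full build, which is the conservative choice.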


On Mon, Mar 20, 2017 at 2:40 PM, Timo Walther <[hidden email]> wrote:

> Another solution would be to make the Travis builds more efficient. For
> example, we could write a script that determines the modified Maven module
> and only runs the tests for this module (and maybe transitive dependencies).
> PRs for libraries such as Gelly, Table, CEP or connectors would not trigger
> a compilation of the entire stack anymore. Of course this would not solve
> all problems, but it would solve many of them.
>
> What do you think about this?
>
>
>
> On 20/03/17 at 14:02, Robert Metzger wrote:
>
>> Aljoscha, do you know how to configure Jenkins?
>> Is Apache INFRA doing that, or are the Beam people doing that themselves?
>>
>> One downside of Jenkins is that we probably need some machines that execute
>> the tests. A Travis container has 2 CPU cores and 4 GB main memory. We
>> currently have 10 such containers available on Travis concurrently. I think
>> we would need at least the same amount on Jenkins.
>>
>>
>> On Mon, Mar 20, 2017 at 1:48 PM, Timo Walther <[hidden email]> wrote:
>>
>>> I agree with Aljoscha that we might consider moving from Travis to
>>> Jenkins. Is there any disadvantage in using Jenkins?
>>>
>>> I think we should structure the project according to release management
>>> (e.g. more frequent releases of libraries) or other criteria (e.g. core and
>>> non-core) instead of build time. What would happen if the build of another
>>> submodule became too long? Would we split/restructure again and again? If
>>> Jenkins solves all our problems we should use it.
>>>
>>> Regards,
>>> Timo
>>>
>>>
>>>
>>> On 20/03/17 at 12:21, Aljoscha Krettek wrote:
>>>
>>>> I prefer Jenkins to Travis by far. Working on Beam, where we have good
>>>> Jenkins integration, has opened my eyes to what is possible with good CI
>>>> integration.
>>>>
>>>> For example, look at this recent Beam PR:
>>>> https://github.com/apache/beam/pull/2263. The Jenkins-GitHub integration
>>>> will tell you exactly which tests failed, and if you click on the links you
>>>> can look at the log output/stdout of the tests in question.
>>>>
>>>> This is the overview page of one of the Jenkins jobs that we have in Beam:
>>>> https://builds.apache.org/job/beam_PostCommit_Java_RunnableOnService_Flink/
>>>> This is an example of a stable build:
>>>> https://builds.apache.org/job/beam_PostCommit_Java_RunnableOnService_Flink/lastStableBuild/
>>>> Notice how it gives you fine-grained information about the Maven run. This
>>>> is an unstable run:
>>>> https://builds.apache.org/job/beam_PostCommit_Java_RunnableOnService_Flink/lastUnstableBuild/
>>>> There you can see which tests failed and you can easily drill down.
>>>>
>>>> Best,
>>>> Aljoscha
>>>>
>>>> On 20 Mar 2017, at 11:46, Robert Metzger <[hidden email]> wrote:
>>>>
>>>>> Thank you for looking into the build times.
>>>>>
>>>>> I didn't know that the build time situation is so bad. Even with yarn,
>>>>> mesos, connectors and libraries removed, we are still running into the
>>>>> build timeout :(
>>>>>
>>>>> Aljoscha told me that the Beam community is using Jenkins for running
>>>>> the tests, and they are planning to completely move away from Travis. I
>>>>> wonder whether we should do the same, as having our own Jenkins servers
>>>>> would allow us to run tests for more than 50 minutes.
>>>>>
>>>>> I agree with Stephan that we should keep the yarn and mesos tests in the
>>>>> core for stability / testing quality purposes.
>>>>>
>>>>>
>>>>> On Mon, Mar 20, 2017 at 11:27 AM, Stephan Ewen <[hidden email]> wrote:
>>>>>
>>>>> @Greg
>>>>>
>>>>> I am personally in favor of splitting "connectors" and "contrib" out as
>>>>> well. I know that @rmetzger has some reservations about the connectors, but
>>>>> we may be able to convince him.
>>>>>
>>>>> For the cluster tests (yarn / mesos) - in the past there were many cases
>>>>> where these tests caught problems that other tests did not, because they are
>>>>> the only tests that actually use the "flink-dist.jar" and thus discover
>>>>> many dependency and configuration issues. For that reason, my feeling is
>>>>> that they are valuable in the core repository.
>>>>>
>>>>> I would actually suggest doing only the library split initially, to see
>>>>> what the challenges are in setting up the multi-repo build and release
>>>>> tooling. Once we have gathered experience there, we can probably easily see
>>>>> what else we can split out.
>>>>>
>>>>> Stephan
>>>>>
>>>>>
>>>>> On Fri, Mar 17, 2017 at 8:37 PM, Greg Hogan <[hidden email]> wrote:
>>>>>
>>>>>> I’d like to use this refactoring opportunity to unsplit the Travis tests.
>>>>>> With 51 builds queued up for the weekend (some of which may fail or have
>>>>>> been force pushed) we are at the limit of the number of contributions we
>>>>>> can process. Fixing this requires 1) splitting the project, 2)
>>>>>> investigating speedups for long-running tests, and 3) staying cognizant of
>>>>>> test performance when accepting new code.
>>>>>>
>>>>>> I’d like to add one to Stephan’s list of module groups. I like that the
>>>>>> modules are generic (“libraries”) so that no one module is alone and
>>>>>> independent.
>>>>>>
>>>>>> Flink has three “libraries”: cep, ml, and gelly.
>>>>>>
>>>>>> “connectors” is a hotspot due to the long-running Kafka tests (and
>>>>>> connectors for three Kafka versions).
>>>>>>
>>>>>> Both flink-storm and flink-python have a modest number of tests
>>>>>> and could live with the miscellaneous modules in “contrib”.
>>>>>>
>>>>>> The YARN tests are long-running and problematic (I am unable to
>>>>>> successfully run these locally). A “cluster” module could host flink-mesos,
>>>>>> flink-yarn, and flink-yarn-tests.
>>>>>>
>>>>>> That gets us close to running all tests in a single Travis build.
>>>>>>     https://travis-ci.org/greghogan/flink/builds/212122590
>>>>>>
>>>>>> I also tested (https://github.com/greghogan/flink/commits/core_build)
>>>>>> with a Maven parallelism of 2 and 4, with the latter a 6.4% drop in build
>>>>>> time.
>>>>>>     https://travis-ci.org/greghogan/flink/builds/212137659
>>>>>>     https://travis-ci.org/greghogan/flink/builds/212154470
>>>>>>
>>>>>> We can run Travis CI builds nightly to guard against breaking changes.
>>>>>>
>>>>>> I also wanted to get an idea of how disruptive it would be to developers
>>>>>> to divide the project into multiple git repos. I wrote a simple Python
>>>>>> script and configured it with the module partitions listed above. As
>>>>>> described in the usage string at the top of the file, it lists commits with
>>>>>> files from multiple partitions as well as the modified files.
>>>>>>     https://gist.github.com/greghogan/f38a8efe6b6dd5a162a6b43335ac4897
>>>>>>
>>>>>> Accounting for the merging of the batch and streaming connector modules,
>>>>>> and assuming that the project structure has not changed much over the past
>>>>>> 15 months, for the following date ranges the listed number of commits would
>>>>>> have been split across repositories.
>>>>>>
>>>>>> since "2017-01-01"
>>>>>> 36 of 571 commits were mixed
>>>>>>
>>>>>> since "2016-07-01"
>>>>>> 155 of 1607 commits were mixed
>>>>>>
>>>>>> since "2016-01-01"
>>>>>> 272 of 2561 commits were mixed
>>>>>>
>>>>>> Greg
>>>>>>
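
The "mixed commit" count quoted above can be reproduced with something along these lines. This is only a sketch of the idea and not Greg's script (that one is in the gist linked above); the partition names and path prefixes are hypothetical placeholders, and everything that matches no prefix is treated as the core repository.

```python
import subprocess

# Hypothetical partitions for illustration; the real configuration is in the gist.
PARTITIONS = {
    "libraries": ("flink-libraries/",),
    "connectors": ("flink-connectors/",),
    "contrib": ("flink-contrib/",),
}


def partition_of(path, partitions=PARTITIONS):
    """Return the partition a path belongs to; everything else is 'core'."""
    for name, prefixes in partitions.items():
        if path.startswith(prefixes):  # str.startswith accepts a tuple
            return name
    return "core"


def is_mixed(paths, partitions=PARTITIONS):
    """A commit is mixed if its files fall into more than one partition."""
    return len({partition_of(p, partitions) for p in paths}) > 1


def commits_since(date):
    """Yield the list of changed files per commit since `date`.

    Needs a git checkout. Commits without listed files (e.g. merges) are skipped.
    """
    out = subprocess.run(
        ["git", "log", f"--since={date}", "--name-only", "--pretty=format:%x00"],
        check=True, capture_output=True, text=True,
    ).stdout
    for chunk in out.split("\x00"):
        paths = [line for line in chunk.splitlines() if line]
        if paths:
            yield paths


def mixed_share(commits):
    """Return (number of mixed commits, total number of commits)."""
    commits = list(commits)
    mixed = sum(1 for paths in commits if is_mixed(paths))
    return mixed, len(commits)
```

Calling `mixed_share(commits_since("2017-01-01"))` gives a pair like the "36 of 571" above. For the quoted figures the share of mixed commits works out to roughly 6.3%, 9.6% and 10.6% for the three date ranges.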
>>>>>>
>>>>>> On Mar 15, 2017, at 1:13 PM, Stephan Ewen <[hidden email]> wrote:
>>>>>>
>>>>>>> @Robert - I think once we know that a separate git repo works well, and
>>>>>>> that it actually solves problems, I see no reason to not create a
>>>>>>> connectors repository later. The infrastructure changes should be identical
>>>>>>> for two or more repositories.
>>>>>>>
>>>>>>> On Wed, Mar 15, 2017 at 5:22 PM, Till Rohrmann <[hidden email]> wrote:
>>>>>>>
>>>>>>>> I think it should not be at least the flink-dist but exactly the remaining
>>>>>>>> flink-dist module. Otherwise we do redundant work.
>>>>>>>
>>>>>>>> On Wed, Mar 15, 2017 at 5:03 PM, Robert Metzger <[hidden email]> wrote:
>>>>>>>>
>>>>>>>>> "flink-core" means the main repository, not the "flink-core" module.
>>>>>>>>>
>>>>>>>>> When doing a release, we need to build the flink main code first, because
>>>>>>>>> the flink-libraries depend on that.
>>>>>>>>> Once the "flink-libraries" are built, we need to run the main build again
>>>>>>>>> (at least the flink-dist module), so that it pulls the artifacts from
>>>>>>>>> the flink-libraries to put them into the opt/ folder of the final artifact.
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, Mar 15, 2017 at 4:44 PM, Till Rohrmann <[hidden email]> wrote:
>>>>>>>>>
>>>>>>>>>> I'm ok with point 3.
>>>>>>>>>>
>>>>>>>>>> Concerning point 8: Why do we have to build flink-core twice after having
>>>>>>>>>> it built as a dependency for flink-libraries? This seems wrong to me.
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>> Till
>>>>>>>>>>
>>>>>>>>>> On Wed, Mar 15, 2017 at 4:23 PM, Robert Metzger <[hidden email]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Thank you. Running on AWS is a good idea!
>>>>>>>>>>> Let me know if you (or anybody else) wants to help me with the
>>>>>>>>>>> infrastructure work! Any help is much appreciated (as I've said before, I
>>>>>>>>>>> don't really have time for doing this, but it has to be done :) )
>>>>>>>>>>>
>>>>>>>>>>> I'm against creating two new repositories. I fear that this introduces too
>>>>>>>>>>> much complexity and too many repositories.
>>>>>>>>>>> "flink" and "flink-libraries" are hopefully enough to get the build time
>>>>>>>>>>> significantly down.
>>>>>>>>>>> We can also consider putting the connectors into the "flink-libraries" repo
>>>>>>>>>>> if we need to further reduce the build time.
>>>>>>>>>>>
>>>>>>>>>>> We should probably move "flink-table" out of "flink-libraries" if we want
>>>>>>>>>>> to keep "flink-table" in the main repo. (This would eliminate the
>>>>>>>>>>> "flink-libraries" module from main.)
>>>>>>>>>>>
>>>>>>>>>>> Also, I agree that "flink-statebackend-rocksdb" is not correctly placed in
>>>>>>>>>>> contrib anymore.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Mar 15, 2017 at 4:07 PM, Greg Hogan <[hidden email]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Robert, appreciate your kickstarting this task.
>>>>>>>>>>>>
>>>>>>>>>>>> We should compare the verification time with and without the listed
>>>>>>>>>>>> modules. I’ll try to run this by tomorrow on AWS and on Travis.
>>>>>>>>>>>>
>>>>>>>>>>>> Should we maintain separate repos for flink-contrib and flink-libraries?
>>>>>>>>>>>> Are you intending that we move flink-table out of flink-libraries (and
>>>>>>>>>>>> perhaps flink-statebackend-rocksdb out of flink-contrib)?
>>>>>>>>>>>>
>>>>>>>>>>>> Greg
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Mar 15, 2017, at 9:55 AM, Robert Metzger <[hidden email]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Thank you for looking into this Till.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I think we then have to split the repositories.
>>>>>>>>>>>>> My main motivation for doing this is that it seems to be the only feasible
>>>>>>>>>>>>> way of scaling the community to allow more committers working on the
>>>>>>>>>>>>> libraries.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'll take care of getting things started.
>>>>>>>>>>>>>
>>>>>>>>>>>>> As the next steps I propose to:
>>>>>>>>>>>>> 1. Ask INFRA to rename
>>>>>>>>>>>>> https://git-wip-us.apache.org/repos/asf?p=flink-connectors.git;a=summary
>>>>>>>>>>>>> to "flink-libraries"
>>>>>>>>>>>>> 2. Ask INFRA to set up GitHub and travis integration for "flink-libraries"
>>>>>>>>>>>>> 3. Put the code of "flink-ml", "flink-gelly", "flink-python", "flink-cep",
>>>>>>>>>>>>> "flink-scala-shell", "flink-storm" into the new repository. (I decided
>>>>>>>>>>>>> against moving flink-contrib there, because rocksdb is in the contrib
>>>>>>>>>>>>> module; for flink-table, I'm undecided, but I kept it in the main repo
>>>>>>>>>>>>> because it's probably going to interact more with the core code in the
>>>>>>>>>>>>> future)
>>>>>>>>>>>>> I'll try to preserve the history of those modules when splitting them into
>>>>>>>>>>>>> the new repo
>>>>>>>>>>>>> 4. I'll close all pull requests against those modules in the main repo.
>>>>>>>>>>>>> 5. I'll set up a minimal documentation page for the library repository,
>>>>>>>>>>>>> similar to the main documentation.
>>>>>>>>>>>>> 6. I'll update the documentation build process to build both documentations
>>>>>>>>>>>>> & link them to each other
>>>>>>>>>>>>> 7. I'll update the nightly deployment process to include both repositories
>>>>>>>>>>>>> 8. I'll update the release script to create the Flink release out of both
>>>>>>>>>>>>> repositories. In order to put the libraries into the opt/ dir of the
>>>>>>>>>>>>> release, I'll need to change the build of "flink-dist" so that it first
>>>>>>>>>>>>> builds flink core, then the libraries and then the core again with the
>>>>>>>>>>>>> libraries as an additional dependency.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The main question for the community is: do you agree with point 3? Would
>>>>>>>>>>>>> you like to include more or less?
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'll start with 1. and 2. tomorrow morning.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Mar 15, 2017 at 1:48 PM, Till Rohrmann <[hidden email]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> In theory we could have a merging bot which solves the problem of the
>>>>>>>>>>>>>> "commit window". Once the PR passes all tests and has enough +1s, the bot
>>>>>>>>>>>>>> could do the merging and, thus, it effectively linearizes the merge
>>>>>>>>>>>>>> process.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I think the second point is actually a disadvantage because there is not
>>>>>>>>>>>>>> such an immediate incentive/pressure to fix the broken module if it lives
>>>>>>>>>>>>>> in a separate repository. Furthermore, breaking API changes in the core
>>>>>>>>>>>>>> will most likely go unnoticed for some time in other modules which are not
>>>>>>>>>>>>>> developed so actively. In the worst case these things will only be noticed
>>>>>>>>>>>>>> when we try to make a release.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> But I also agree that we are not Google and we don't have the capacities to
>>>>>>>>>>>>>> maintain such a smooth build process that we can keep all the code in a
>>>>>>>>>>>>>> single repository.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I looked a bit into Gradle and as far as I can tell it offers some nice
>>>>>>>>>>>>>> features wrt incrementally building projects. This would be beneficial for
>>>>>>>>>>>>>> local development but it would not solve our build time problems on Travis.
>>>>>>>>>>>>>> Gradle intends to introduce a task result cache which allows reusing
>>>>>>>>>>>>>> results across builds. This could help when building on Travis; however, it
>>>>>>>>>>>>>> is not yet fully implemented. Moreover, migrating from Maven to Gradle
>>>>>>>>>>>>>> won't come for free (there's simply no free lunch out there) and we might
>>>>>>>>>>>>>> risk introducing new bugs. Therefore, I would vote to split the repository
>>>>>>>>>>>>>> in order to mitigate our current problems with Travis and the build time in
>>>>>>>>>>>>>> general. Whether to use a different build system or not can then be
>>>>>>>>>>>>>> discussed as an orthogonal question.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>> Till
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Tue, Mar 14, 2017 at 8:05 PM, Stephan Ewen <[hidden email]> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Some other thoughts on how repository split would help. I am not sure for
>>>>>>>>>>>>>>> all of them, so please comment:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> - There is less competition for a "commit window". It happens a lot
>>>>>>>>>>>>>>> already that you run all tests and want to commit, but there was a commit
>>>>>>>>>>>>>>> in the meantime. You rebase, need to re-test, again commit in the meantime.
>>>>>>>>>>>>>>>   For a "linear" commit history, this may become a bottleneck eventually
>>>>>>>>>>>>>>> as well.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> - There is less risk of broken master. If one repository/modules breaks
>>>>>>>>>>>>>>> its master, the others can still continue.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Stephan
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Fri, Mar 10, 2017 at 12:20 PM, Till Rohrmann <[hidden email]> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks for all your input. In order to wrap the discussion up I'd like to
>>>>>>>>>>>>>>>> summarize the mentioned points:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The problem of increasing build times and complexity of the project has
>>>>>>>>>>>>>>>> been acknowledged. Ideally we would have everything in one repository using
>>>>>>>>>>>>>>>> an incremental build tool. Since Maven does not properly support this we
>>>>>>>>>>>>>>>> would have to switch our build tool to something like Gradle, for example.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Another option is introducing build profiles for different sets of modules
>>>>>>>>>>>>>>>> as well as separating integration and unit tests. The third alternative
>>>>>>>>>>>>>>>> would be creating sub-projects with their own repositories. I actually
>>>>>>>>>>>>>>>> think that these two proposals are not necessarily exclusive and it would
>>>>>>>>>>>>>>>> also make sense to have a separation between unit and integration tests if
>>>>>>>>>>>>>>>> we split the repository.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The overall consensus seems to be that we don't want to split the community
>>>>>>>>>>>>>>>> and want to keep everything under the same umbrella. I think this is the
>>>>>>>>>>>>>>>> right way to go, because otherwise some parts of the project could become
>>>>>>>>>>>>>>>> second class citizens. Given that and that we continue using Maven, I still
>>>>>>>>>>>>>>>> think that creating sub-projects for the libraries, for example, could be
>>>>>>>>>>>>>>>> beneficial. A split could reduce the project's complexity and make it
>>>>>>>>>>>>>>>> potentially easier for libraries to get actively developed. The main
>>>>>>>>>>>>>>>> concern is setting up the build infrastructure to aggregate docs from
>>>>>>>>>>>>>>>> multiple repositories and making them publicly available.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Since I started this thread and I would really like to see Flink's ML
>>>>>>>>>>>>>>>> library being revived again, I'd volunteer investigating first whether it
>>>>>>>>>>>>>>>> is doable establishing a proper incremental build for Flink. If that should
>>>>>>>>>>>>>>>> not be possible, I will look into splitting the repository, first only for
>>>>>>>>>>>>>>>> the libraries. I'll share my results with the community once I'm done with
>>>>>>>>>>>>>>>> the investigation.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>> Till
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Fri, Feb 24, 2017 at 3:50 PM, Robert Metzger <[hidden email]> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> @Jin Mingjian: You can not use the paid travis version for open source
>>>>>>>>>>>>>>>>> projects. It only works for private repositories (at least back then when
>>>>>>>>>>>>>>>>> we've asked them about that).
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> @Stephan: I don't think that incremental builds will be available with
>>>>>>>>>>>>>>>>> Maven anytime soon.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I agree that we need to fix the build time issue on Travis. I've recently
>>>>>>>>>>>>>>>>> pushed a commit to use now three instead of two test groups.
>>>>>>>>>>>>>>>>> But I don't think that this is a feasible long-term solution.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> If this discussion is only about reducing the build and test time,
>>>>>>>>>>>>>>>>> introducing build profiles for different components as Aljoscha suggested
>>>>>>>>>>>>>>>>> would solve the problem Till mentioned.
>>>>>>>>>>>>>>>>> Also, if we decide that travis is not a good tool anymore for the testing,
>>>>>>>>>>>>>>>>> I guess we can find a different solution. There are now competitors to
>>>>>>>>>>>>>>>>> Travis that might be willing to offer a paid plan for an open source
>>>>>>>>>>>>>>>>> project, or we set up our own infra on a server sponsored by one of the
>>>>>>>>>>>>>>>>> contributing companies.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> If we want to solve "community issues" with the change as well, then I
>>>>>>>>>>>>>>>>> think it's worth the effort of splitting up Flink into different
>>>>>>>>>>>>>>>>> repositories.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Splitting up repositories is not a trivial task in my opinion. As others
>>>>>>>>>>>>>>>>> have mentioned before, we need to consider the following things:
>>>>>>>>>>>>>>>>> - How are we going to build the documentation? Ideally every repo should
>>>>>>>>>>>>>>>>> contain its docs, so we would need to pull them together when building the
>>>>>>>>>>>>>>>>> main docs.
>>>>>>>>>>>>>>>>> - How do we organize the dependencies? If we have library repositories
>>>>>>>>>>>>>>>>> depend on snapshot Flink versions, we need to make sure that the snapshot
>>>>>>>>>>>>>>>>> deployment always works. This also means that people working on a library
>>>>>>>>>>>>>>>>> repository will pull from snapshot OR need to build first locally.
>>>>>>>>>>>>>>>>> - We need to update the release scripts
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> If we commit to do these changes, we need to assign at least one committer
>>>>>>>>>>>>>>>>> (yes, in this case we need somebody who can commit, for example for
>>>>>>>>>>>>>>>>> updating the buildbot stuff) who volunteers to do the change.
>>>>>>>>>>>>>>>>> I've done a lot of infrastructure work in the past, but I'm currently
>>>>>>>>>>>>>>>>> pretty booked with many other things, so I don't realistically see myself
>>>>>>>>>>>>>>>>> doing that. Max who used to work on these things is taking some time off.
>>>>>>>>>>>>>>>>> I think we need, best case 3 days for the change, worst case 5 days. The
>>>>>>>>>>>>>>>>> problem is that there are no "unit tests" for the infra stuff,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>
>

Re: [DISCUSS] Project build time and possible restructuring

Greg Hogan
In reply to this post by Stephan Ewen
We can add cluster tests using the distribution jar, and will need to do so to remove Flink’s dependency on Hadoop. The YARN and Mesos tests would still run nightly and running cluster tests should be much faster. As troublesome as TravisCI has been, a major driver for this change has been local build time.

I agree with splitting off one repo at a time, but if we use git submodules we’ll first need to reorganize the core repo, since flink-python and flink-table would have to be moved first. So I think planning this out first is a healthy idea, with the understanding that the plan will be reevaluated.

Any changes to the project structure need a scheduled period, perhaps a week, for existing pull requests to be reviewed and accepted or closed and later migrated.


> On Mar 20, 2017, at 6:27 AM, Stephan Ewen <[hidden email]> wrote:
>
> @Greg
>
> I am personally in favor of splitting "connectors" and "contrib" out as
> well. I know that @rmetzger has some reservations about the connectors, but
> we may be able to convince him.
>
> For the cluster tests (yarn / mesos) - in the past there were many cases
> where these tests caught cases that other tests did not, because they are
> the only tests that actually use the "flink-dist.jar" and thus discover
> many dependency and configuration issues. For that reason, my feeling would
> be that they are valuable in the core repository.
>
> I would actually suggest to do only the library split initially, to see
> what the challenges are in setting up the multi-repo build and release
> tooling. Once we gathered experience there, we can probably easily see what
> else we can split out.
>
> Stephan
>
>
> On Fri, Mar 17, 2017 at 8:37 PM, Greg Hogan <[hidden email]> wrote:
>
>> I’d like to use this refactoring opportunity to unsplit the Travis tests.
>> With 51 builds queued up for the weekend (some of which may fail or have
>> been force pushed) we are at the limit of the number of contributions we
>> can process. Fixing this requires 1) splitting the project, 2)
>> investigating speedups for long-running tests, and 3) staying cognizant of
>> test performance when accepting new code.
>>
>> I’d like to add one to Stephan’s list of module groups. I like that the
>> modules are generic (“libraries”) so that no one module is alone and
>> independent.
>>
>> Flink has three “libraries”: cep, ml, and gelly.
>>
>> “connectors” is a hotspot due to the long-running Kafka tests (and
>> connectors for three Kafka versions).
>>
>> Both flink-storm and flink-python have a modest number of tests
>> and could live with the miscellaneous modules in “contrib”.
>>
>> The YARN tests are long-running and problematic (I am unable to
>> successfully run these locally). A “cluster” module could host flink-mesos,
>> flink-yarn, and flink-yarn-tests.
>>
>> That gets us close to running all tests in a single Travis build.
>>  https://travis-ci.org/greghogan/flink/builds/212122590
>>
>> I also tested (https://github.com/greghogan/flink/commits/core_build) with a
>> maven parallelism of 2 and 4, with the latter a 6.4% drop in build time.
>>  https://travis-ci.org/greghogan/flink/builds/212137659
>>  https://travis-ci.org/greghogan/flink/builds/212154470
>>
>> We can run Travis CI builds nightly to guard against breaking changes.
>>
>> I also wanted to get an idea of how disruptive it would be to developers
>> to divide the project into multiple git repos. I wrote a simple python
>> script and configured it with the module partitions listed above. The usage
>> string from the top of the file lists commits with files from multiple
>> partitions as well as the modified files.
>>  https://gist.github.com/greghogan/f38a8efe6b6dd5a162a6b43335ac4897
>>
>> Accounting for the merging of the batch and streaming connector modules,
>> and assuming that the project structure has not changed much over the past
>> 15 months, for the following date ranges the listed number of commits would
>> have been split across repositories.
>>
>> since "2017-01-01"
>> 36 of 571 commits were mixed
>>
>> since "2016-07-01"
>> 155 of 1607 commits were mixed
>>
>> since "2016-01-01"
>> 272 of 2561 commits were mixed
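
The kind of check described above can be sketched roughly as follows. This is only an illustration and not the script from the gist: the partition table, the function names, and the "core" default are assumptions made for this example, based on the module groups listed earlier in this message.

```python
from typing import Dict, Iterable, Set

# Hypothetical partition layout (path prefix -> partition name), loosely
# following the module groups discussed in this thread. The configuration
# used by the actual gist script is not reproduced here.
PARTITIONS: Dict[str, str] = {
    "flink-libraries/": "libraries",
    "flink-connectors/": "connectors",
    "flink-contrib/": "contrib",
    "flink-yarn/": "cluster",
    "flink-yarn-tests/": "cluster",
    "flink-mesos/": "cluster",
}
DEFAULT_PARTITION = "core"  # assumption: anything unmatched stays in the main repo


def partition_of(path: str) -> str:
    """Map a repository-relative file path to its partition."""
    for prefix, name in PARTITIONS.items():
        if path.startswith(prefix):
            return name
    return DEFAULT_PARTITION


def partitions_touched(paths: Iterable[str]) -> Set[str]:
    """Return the set of partitions that a commit's files fall into."""
    return {partition_of(p) for p in paths}


def is_mixed(paths: Iterable[str]) -> bool:
    """A commit is 'mixed' if its files span more than one partition."""
    return len(partitions_touched(paths)) > 1


def mixed_share(mixed: int, total: int) -> float:
    """Percentage of mixed commits, rounded to one decimal place."""
    return round(100.0 * mixed / total, 1)
```

Applied to the counts above, `mixed_share` gives roughly 6.3% (36 of 571), 9.6% (155 of 1607), and 10.6% (272 of 2561), so on the order of one commit in ten would have had to be split across repositories.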
>>
>> Greg
>>
>>
>>> On Mar 15, 2017, at 1:13 PM, Stephan Ewen <[hidden email]> wrote:
>>>
>>> @Robert - I think once we know that a separate git repo works well, and
>>> that it actually solves problems, I see no reason to not create a
>>> connectors repository later. The infrastructure changes should be
>> identical
>>> for two or more repositories.
>>>
>>> On Wed, Mar 15, 2017 at 5:22 PM, Till Rohrmann <[hidden email]>
>> wrote:
>>>
>>>> I think it should not be at least the flink-dist but exactly the
>> remaining
>>>> flink-dist module. Otherwise we do redundant work.
>>>>
>>>> On Wed, Mar 15, 2017 at 5:03 PM, Robert Metzger <[hidden email]>
>>>> wrote:
>>>>
>>>>> "flink-core" means the main repository, not the "flink-core" module.
>>>>>
>>>>> When doing a release, we need to build the flink main code first,
>> because
>>>>> the flink-libraries depend on that.
>>>>> Once the "flink-libraries" are built, we need to run the main build
>> again
>>>>> (at least the flink-dist module), so that it is pulling the artifacts
>>>> from
>>>>> the flink-libraries to put them into the opt/ folder of the final
>>>> artifact.
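
The release build order discussed here can be sketched as a small dependency graph. This is an illustration only: the step names are labels invented for this example (they are not Maven module or profile names), and it assumes Python 3.9+ for `graphlib`. It follows the suggestion above that only the remaining flink-dist module is built after the libraries, rather than the whole main build a second time.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set

# Each key depends on the steps in its set. The names are hypothetical labels
# for the three release steps discussed in the thread.
BUILD_DEPENDENCIES: Dict[str, Set[str]] = {
    "flink-main-without-dist": set(),
    "flink-libraries": {"flink-main-without-dist"},
    "flink-dist": {"flink-main-without-dist", "flink-libraries"},
}


def release_build_order(deps: Dict[str, Set[str]]) -> List[str]:
    """Return the build steps in an order that respects all dependencies."""
    return list(TopologicalSorter(deps).static_order())
```

With these dependencies the only valid order is the main repository first, then the libraries, then flink-dist, which picks up the library artifacts for the opt/ folder.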
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Wed, Mar 15, 2017 at 4:44 PM, Till Rohrmann <[hidden email]>
>>>>> wrote:
>>>>>
>>>>>> I'm ok with point 3.
>>>>>>
>>>>>> Concerning point 8: Why do we have to build flink-core twice after
>>>> having
>>>>>> it built as a dependency for flink-libraries? This seems wrong to me.
>>>>>>
>>>>>> Cheers,
>>>>>> Till
>>>>>>
>>>>>> On Wed, Mar 15, 2017 at 4:23 PM, Robert Metzger <[hidden email]>
>>>>>> wrote:
>>>>>>
>>>>>>> Thank you. Running on AWS is a good idea!
>>>>>>> Let me know if you (or anybody else) wants to help me with the
>>>>>>> infrastructure work! Any help is much appreciated (as I've said
>>>>> before, I
>>>>>>> don't really have time for doing this, but it has to be done :) )
>>>>>>>
>>>>>>> I'm against creating two new repositories. I fear that this
>>>> introduces
>>>>>> too
>>>>>>> much complexity and too many repositories.
>>>>>>> "flink" and "flink-libraries" are hopefully enough to get the build
>>>>> time
>>>>>>> significantly down.
>>>>>>> We can also consider putting the connectors into the
>>>> "flink-libraries"
>>>>>> repo
>>>>>>> if we need to further reduce the build time.
>>>>>>>
>>>>>>> We should probably move "flink-table" out of "flink-libraries" if we want
>>>>>>> to keep "flink-table" in the main repo. (This would eliminate the
>>>>>>> "flink-libraries" module from main.)
>>>>>>>
>>>>>>> Also, I agree that "flink-statebackend-rocksdb" is not correctly
>>>> placed
>>>>>> in
>>>>>>> contrib anymore.
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Mar 15, 2017 at 4:07 PM, Greg Hogan <[hidden email]>
>>>>> wrote:
>>>>>>>
>>>>>>>> Robert, appreciate your kickstarting this task.
>>>>>>>>
>>>>>>>> We should compare the verification time with and without the listed
>>>>>>>> modules. I’ll try to run this by tomorrow on AWS and on Travis.
>>>>>>>>
>>>>>>>> Should we maintain separate repos for flink-contrib and
>>>>>> flink-libraries?
>>>>>>>> Are you intending that we move flink-table out of flink-libraries
>>>>> (and
>>>>>>>> perhaps flink-statebackend-rocksdb out of flink-contrib)?
>>>>>>>>
>>>>>>>> Greg
>>>>>>>>
>>>>>>>>
>>>>>>>>> On Mar 15, 2017, at 9:55 AM, Robert Metzger <[hidden email]
>>>>>
>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> Thank you for looking into this Till.
>>>>>>>>>
>>>>>>>>> I think we then have to split the repositories.
>>>>>>>>> My main motivation for doing this is that it seems to be the only
>>>>>>>> feasible
>>>>>>>>> way of scaling the community to allow more committers working on
>>>>> the
>>>>>>>>> libraries.
>>>>>>>>>
>>>>>>>>> I'll take care of getting things started.
>>>>>>>>>
>>>>>>>>> As the next steps I propose to:
>>>>>>>>> 1. Ask INFRA to rename https://git-wip-us.apache.org/
>>>>>>> repos/asf?p=flink-
>>>>>>>>> connectors.git;a=summary to "flink-libraries"
>>>>>>>>> 2. Ask INFRA to set up GitHub and travis integration for
>>>>>>>> "flink-libraries"
>>>>>>>>> 3. Put the code of "flink-ml", "flink-gelly", "flink-python",
>>>>>>>> "flink-cep",
>>>>>>>>> "flink-scala-shell", "flink-storm" into the new repository. (I
>>>>>> decided
>>>>>>>>> against moving flink-contrib there, because rocksdb is in the
>>>>> contrib
>>>>>>>>> module, for flink-table, I'm undecided, but I kept it in the main
>>>>>> repo
>>>>>>>>> because it's probably going to interact more with the core code in
>>>>> the
>>>>>>>>> future)
>>>>>>>>> I try to preserve the history of those modules when splitting
>>>> them
>>>>>> into
>>>>>>>> the
>>>>>>>>> new repo
>>>>>>>>> 4. I'll close all pull requests against those modules in the main
>>>>>> repo.
>>>>>>>>> 5. I'll set up a minimal documentation page for the library
>>>>>> repository,
>>>>>>>>> similar to the main documentation.
>>>>>>>>> 6. I'll update the documentation build process to build both
>>>>>>>> documentations
>>>>>>>>> & link them to each other
>>>>>>>>> 7. I'll update the nightly deployment process to include both
>>>>>>>> repositories
>>>>>>>>> 8. I'll update the release script to create the Flink release out
>>>>> of
>>>>>>> both
>>>>>>>>> repositories. In order to put the libraries into the opt/ dir of
>>>>> the
>>>>>>>>> release, I'll need to change the build of "flink-dist" so that it
>>>>>> first
>>>>>>>>> builds flink core, then the libraries and then the core again
>>>> with
>>>>>> the
>>>>>>>>> libraries as an additional dependency.
>>>>>>>>>
>>>>>>>>> The main question for the community is: do you agree with point
>>>> 3 ?
>>>>>>> Would
>>>>>>>>> you like to include more or less?
>>>>>>>>>
>>>>>>>>> I'll start with 1. and 2. tomorrow morning.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, Mar 15, 2017 at 1:48 PM, Till Rohrmann <
>>>>> [hidden email]
>>>>>>>
>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> In theory we could have a merging bot which solves the problem
>>>> of
>>>>>> the
>>>>>>>>>> "commit window". Once the PR passes all tests and has enough
>>>> +1s,
>>>>>> the
>>>>>>>> bot
>>>>>>>>>> could do the merging and, thus, it effectively linearizes the
>>>>> merge
>>>>>>>>>> process.
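
A merging bot of this kind could look roughly like the sketch below. It is a toy model under stated assumptions, not an existing tool: `PullRequest`, `run_merge_queue`, and the `tests_pass` callback are hypothetical names, and a real bot would rebase each PR onto the current head and run CI instead of calling a local function.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, Iterable, List


@dataclass
class PullRequest:
    number: int
    approvals: int  # number of +1s the PR has received


def run_merge_queue(
    queue: Iterable[PullRequest],
    tests_pass: Callable[[PullRequest, List[int]], bool],
    required_approvals: int = 1,
) -> List[int]:
    """Merge PRs strictly one at a time, in queue order.

    tests_pass receives the PR and the list of PR numbers merged so far, so
    every PR is validated against the state it will actually be merged into.
    Processing one PR at a time is what linearizes the merges and removes the
    race for a "commit window".
    """
    merged: List[int] = []
    pending: Deque[PullRequest] = deque(queue)
    while pending:
        pr = pending.popleft()
        if pr.approvals >= required_approvals and tests_pass(pr, merged):
            merged.append(pr.number)
    return merged
```

Because each PR is tested against everything merged before it, contributors no longer need to rebase and re-test by hand when another commit lands first; the trade-off is that total merge throughput is bounded by the test time of a single build.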
>>>>>>>>>>
>>>>>>>>>> I think the second point is actually a disadvantage because
>>>> there
>>>>> is
>>>>>>> not
>>>>>>>>>> such an immediate incentive/pressure to fix the broken module if
>>>>> it
>>>>>>>> lives
>>>>>>>>>> in a separate repository. Furthermore, breaking API changes in
>>>> the
>>>>>>> core
>>>>>>>>>> will most likely go unnoticed for some time in other modules
>>>> which
>>>>>> are
>>>>>>>> not
>>>>>>>>>> developed so actively. In the worst case these things will only
>>>> be
>>>>>>>> noticed
>>>>>>>>>> when we try to make a release.
>>>>>>>>>>
>>>>>>>>>> But I also agree that we are not Google and we don't have the
>>>>>>>> capacities to
>>>>>>>>>> maintain such a smooth build process that we can keep all the
>>>>> code
>>>>>>> in
>>>>>>>> a
>>>>>>>>>> single repository.
>>>>>>>>>>
>>>>>>>>>> I looked a bit into Gradle and as far as I can tell it offers
>>>> some
>>>>>>> nice
>>>>>>>>>> features wrt incrementally building projects. This would be
>>>>>> beneficial
>>>>>>>> for
>>>>>>>>>> local development but it would not solve our build time problems
>>>>> on
>>>>>>>> Travis.
>>>>>>>>>> Gradle intends to introduce a task result cache which allows to
>>>>>> reuse
>>>>>>>>>> results across builds. This could help when building on Travis,
>>>>>>>> however, it
>>>>>>>>>> is not yet fully implemented. Moreover, migrating from Maven to
>>>>>> Gradle
>>>>>>>>>> won't come for free (there's simply no free lunch out there) and
>>>>> we
>>>>>>>> might
>>>>>>>>>> risk to introduce new bugs. Therefore, I would vote to split the
>>>>>>>> repository
>>>>>>>>>> in order to mitigate our current problems with Travis and the
>>>>> build
>>>>>>>> time in
>>>>>>>>>> general. Whether to use a different build system or not can then
>>>>> be
>>>>>>>>>> discussed as an orthogonal question.
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>> Till
>>>>>>>>>>
>>>>>>>>>> On Tue, Mar 14, 2017 at 8:05 PM, Stephan Ewen <[hidden email]
>>>>>
>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Some other thoughts on how repository split would help. I am
>>>> not
>>>>>> sure
>>>>>>>> for
>>>>>>>>>>> all of them, so please comment:
>>>>>>>>>>>
>>>>>>>>>>> - There is less competition for a "commit window". It happens
>>>> a
>>>>>> lot
>>>>>>>>>>> already that you run all tests and want to commit, but there
>>>> was
>>>>> a
>>>>>>>> commit
>>>>>>>>>>> in the meantime. You rebase, need to re-test, again commit in
>>>> the
>>>>>>>>>> meantime.
>>>>>>>>>>>  For a "linear" commit history, this may become a bottleneck
>>>>>>>>>> eventually
>>>>>>>>>>> as well.
>>>>>>>>>>>
>>>>>>>>>>> - There is less risk of broken master. If one
>>>> repository/modules
>>>>>>>> breaks
>>>>>>>>>>> its master, the others can still continue.
>>>>>>>>>>>
>>>>>>>>>>> Stephan
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Mar 10, 2017 at 12:20 PM, Till Rohrmann <
>>>>>>> [hidden email]>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Thanks for all your input. In order to wrap the discussion up
>>>>> I'd
>>>>>>> like
>>>>>>>>>> to
>>>>>>>>>>>> summarize the mentioned points:
>>>>>>>>>>>>
>>>>>>>>>>>> The problem of increasing build times and complexity of the
>>>>>> project
>>>>>>>> has
>>>>>>>>>>>> been acknowledged. Ideally we would have everything in one
>>>>>>> repository
>>>>>>>>>>> using
>>>>>>>>>>>> an incremental build tool. Since Maven does not properly
>>>> support
>>>>>>> this
>>>>>>>>>> we
>>>>>>>>>>>> would have to switch our build tool to something like Gradle,
>>>>> for
>>>>>>>>>>> example.
>>>>>>>>>>>>
>>>>>>>>>>>> Another option is introducing build profiles for different
>>>> sets
>>>>> of
>>>>>>>>>>> modules
>>>>>>>>>>>> as well as separating integration and unit tests. The third
>>>>>>>> alternative
>>>>>>>>>>>> would be creating sub-projects with their own repositories. I
>>>>>>> actually
>>>>>>>>>>>> think that these two proposals are not necessarily exclusive
>>>> and
>>>>> it
>>>>>>>>>> would
>>>>>>>>>>>> also make sense to have a separation between unit and
>>>>> integration
>>>>>>>> tests
>>>>>>>>>>> if
>>>>>>>>>>>> we split the repository.
>>>>>>>>>>>>
>>>>>>>>>>>> The overall consensus seems to be that we don't want to split
>>>>> the
>>>>>>>>>>> community
>>>>>>>>>>>> and want to keep everything under the same umbrella. I think
>>>>> this
>>>>>> is
>>>>>>>>>> the
>>>>>>>>>>>> right way to go, because otherwise some parts of the project
>>>>> could
>>>>>>>>>> become
>>>>>>>>>>>> second class citizens. Given that and that we continue using
>>>>>> Maven,
>>>>>>> I
>>>>>>>>>>> still
>>>>>>>>>>>> think that creating sub-projects for the libraries, for
>>>> example,
>>>>>>> could
>>>>>>>>>> be
>>>>>>>>>>>> beneficial. A split could reduce the project's complexity and
>>>>> make
>>>>>>> it
>>>>>>>>>>>> potentially easier for libraries to get actively developed.
>>>> The
>>>>>> main
>>>>>>>>>>>> concern is setting up the build infrastructure to aggregate
>>>> docs
>>>>>>> from
>>>>>>>>>>>> multiple repositories and making them publicly available.
>>>>>>>>>>>>
>>>>>>>>>>>> Since I started this thread and I would really like to see
>>>>> Flink's
>>>>>>> ML
>>>>>>>>>>>> library being revived again, I'd volunteer investigating first
>>>>>>> whether
>>>>>>>>>> it
>>>>>>>>>>>> is doable establishing a proper incremental build for Flink.
>>>> If
>>>>>> that
>>>>>>>>>>> should
>>>>>>>>>>>> not be possible, I will look into splitting the repository,
>>>>> first
>>>>>>> only
>>>>>>>>>>> for
>>>>>>>>>>>> the libraries. I'll share my results with the community once
>>>> I'm
>>>>>>> done
>>>>>>>>>>> with
>>>>>>>>>>>> the investigation.
>>>>>>>>>>>>
>>>>>>>>>>>> Cheers,
>>>>>>>>>>>> Till
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Feb 24, 2017 at 3:50 PM, Robert Metzger <
>>>>>>> [hidden email]>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> @Jin Mingjian: You can not use the paid travis version for open source
>>>>>>>>>>>>> projects. It only works for private repositories (at least back then when
>>>>>>>>>>>>> we asked them about that).
>>>>>>>>>>>>>
>>>>>>>>>>>>> @Stephan: I don't think that incremental builds will be available with
>>>>>>>>>>>>> Maven anytime soon.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I agree that we need to fix the build time issue on Travis. I've recently
>>>>>>>>>>>>> pushed a commit to use three instead of two test groups.
>>>>>>>>>>>>> But I don't think that this is a feasible long-term solution.
>>>>>>>>>>>>>
>>>>>>>>>>>>> If this discussion is only about reducing the build and test time,
>>>>>>>>>>>>> introducing build profiles for different components as Aljoscha suggested
>>>>>>>>>>>>> would solve the problem Till mentioned.
>>>>>>>>>>>>> Also, if we decide that travis is not a good tool anymore for the testing,
>>>>>>>>>>>>> I guess we can find a different solution. There are now competitors to
>>>>>>>>>>>>> Travis that might be willing to offer a paid plan for an open source
>>>>>>>>>>>>> project, or we set up our own infra on a server sponsored by one of the
>>>>>>>>>>>>> contributing companies.
>>>>>>>>>>>>> If we want to solve "community issues" with the change as well, then I
>>>>>>>>>>>>> think it's worth the effort of splitting up Flink into different
>>>>>>>>>>>>> repositories.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Splitting up repositories is not a trivial task in my opinion. As others
>>>>>>>>>>>>> have mentioned before, we need to consider the following things:
>>>>>>>>>>>>> - How are we going to build the documentation? Ideally every repo should
>>>>>>>>>>>>> contain its docs, so we would need to pull them together when building the
>>>>>>>>>>>>> main docs.
>>>>>>>>>>>>> - How do we organize the dependencies? If we have a library repository
>>>>>>>>>>>>> depend on snapshot Flink versions, we need to make sure that the snapshot
>>>>>>>>>>>>> deployment always works. This also means that people working on a library
>>>>>>>>>>>>> repository will pull from snapshot OR need to build first locally.
>>>>>>>>>>>>> - We need to update the release scripts
>>>>>>>>>>>>>
>>>>>>>>>>>>> If we commit to do these changes, we need to assign at least one committer
>>>>>>>>>>>>> (yes, in this case we need somebody who can commit, for example for
>>>>>>>>>>>>> updating the buildbot stuff) who volunteers to do the change.
>>>>>>>>>>>>> I've done a lot of infrastructure work in the past, but I'm currently
>>>>>>>>>>>>> pretty booked with many other things, so I don't realistically see myself
>>>>>>>>>>>>> doing that. Max, who used to work on these things, is taking some time off.
>>>>>>>>>>>>> I think we need, best case, 3 days for the change, worst case 5 days. The
>>>>>>>>>>>>> problem is that there are no "unit tests" for the infra stuff, so many
>>>>>>>>>>>>> things are "trial and error" (like Apache's buildbot, our release scripts,
>>>>>>>>>>>>> the doc scripts, maven stuff, nightly builds).
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Thu, Feb 23, 2017 at 1:33 PM, Stephan Ewen <[hidden email]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> If we can get incremental builds to work, that would actually be the
>>>>>>>>>>>>>> preferred solution in my opinion.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Many companies have invested heavily in making a "single repository" code
>>>>>>>>>>>>>> base work, because it has the advantage of not having to update/publish
>>>>>>>>>>>>>> several repositories first.
>>>>>>>>>>>>>> However, the strong prerequisite for that is an incremental build system
>>>>>>>>>>>>>> that builds only (fine grained) what it has to build. I am not sure how we
>>>>>>>>>>>>>> could make that work with Maven and Travis...
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Feb 22, 2017 at 10:42 PM, Greg Hogan <[hidden email]> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> An additional option for reducing time to build and test is parallel
>>>>>>>>>>>>>>> execution. This would help users more than it would on TravisCI since
>>>>>>>>>>>>>>> we're generally running on multi-core machines rather than VM slices.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Is the idea that each user would only check out the modules that he or she
>>>>>>>>>>>>>>> is developing with? For example, if a developer is not working on
>>>>>>>>>>>>>>> flink-mesos or flink-yarn then the "flink-deploy" module would not be
>>>>>>>>>>>>>>> cloned to their filesystem?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> We can run a TravisCI nightly build on each repo to validate against API
>>>>>>>>>>>>>>> changes.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Greg
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wed, Feb 22, 2017 at 12:24 PM, Fabian Hueske <[hidden email]> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi everybody,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I think this should be a discussion about the benefits and drawbacks of
>>>>>>>>>>>>>>>> separating the code into distinct repositories from a development point of
>>>>>>>>>>>>>>>> view.
>>>>>>>>>>>>>>>> So I agree with Stephan that we should not divide the community by creating
>>>>>>>>>>>>>>>> separate groups of committers.
>>>>>>>>>>>>>>>> Also the discussion about independent releases is not strictly related
>>>>>>>>>>>>>>>> to the decision, IMO.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I see a few pros and cons for splitting the code base into separate
>>>>>>>>>>>>>>>> repositories which (I think) haven't been mentioned before:
>>>>>>>>>>>>>>>> pros:
>>>>>>>>>>>>>>>> - IDE setup will be leaner. It is not necessary to compile the whole code
>>>>>>>>>>>>>>>> base to run a test after switching a branch.
>>>>>>>>>>>>>>>> cons:
>>>>>>>>>>>>>>>> - developing library features that require changes in the core / APIs
>>>>>>>>>>>>>>>> becomes more time consuming due to back-and-forth between code bases.
>>>>>>>>>>>>>>>> However, I think this is not very often the case.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Aljoscha has good points as well. Many of the build issues could be solved
>>>>>>>>>>>>>>>> by different build profiles and configurations.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Best, Fabian
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 2017-02-22 14:59 GMT+01:00 Gábor Hermann <[hidden email]>:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> @Stephan:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Although I tried to raise some issues about splitting committers, I'm
>>>>>>>>>>>>>>>>> still strongly in favor of some kind of restructuring. We just have to be
>>>>>>>>>>>>>>>>> conscious about the disadvantages.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Not splitting the committers could leave the libraries in the same
>>>>>>>>>>>>>>>>> stalling status, described by Till. Of course, dedicating current
>>>>>>>>>>>>>>>>> committers as shepherds of the libraries could easily resolve the issue.
>>>>>>>>>>>>>>>>> But that requires time from current committers. It seems like trade-offs
>>>>>>>>>>>>>>>>> between code quality, speed of development, and committer efforts.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> From what I see in the discussion about ML, there are many people willing
>>>>>>>>>>>>>>>>> to contribute as well as production use-cases. This means we could and
>>>>>>>>>>>>>>>>> should move forward. However, the development speed is significantly slowed
>>>>>>>>>>>>>>>>> down by stalling PRs. The proposal for contributors helping the review
>>>>>>>>>>>>>>>>> process did not really work out so far. In my opinion, either code quality
>>>>>>>>>>>>>>>>> (by more easily accepting new committers) or some committer time
>>>>>>>>>>>>>>>>> (reviewing/merging) should be sacrificed to move forward. As Till has
>>>>>>>>>>>>>>>>> indicated, it would be shameful if we let this contribution effort die.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>> Gabor
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>
>>


Re: [DISCUSS] Project build time and possible restructuring

Aljoscha Krettek-2
In reply to this post by Timo Walther-2
The Beam Jenkins jobs are configured inside the Beam src repo itself. For example: https://github.com/apache/beam/blob/master/.jenkins/job_beam_PostCommit_Java_RunnableOnService_Flink.groovy

For initial setup of the seed job you need admin rights on Jenkins, as described here: https://cwiki.apache.org/confluence/display/INFRA/Jenkins.

The somewhat annoying thing is setting up our own “flink” build slaves and maintaining them. There are some general purpose build slaves but high-throughput projects usually have their own build slaves to ensure speedy processing of Jenkins jobs: https://cwiki.apache.org/confluence/display/INFRA/Jenkins+node+labels

> On 20 Mar 2017, at 14:40, Timo Walther <[hidden email]> wrote:
>
> Another solution would be to make the Travis builds more efficient. For example, we could write a script that determines the modified Maven modules and runs only the tests for those modules (and maybe their transitive dependencies). PRs for libraries such as Gelly, Table, CEP or connectors would no longer trigger a compilation of the entire stack. Of course this would not solve all problems, but many of them.
>
> What do you think about this?
>
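Timo's suggestion of a script that maps a change to the affected Maven modules could look roughly like the sketch below. It is only an illustration under stated assumptions: the function names are hypothetical, the list of changed files is assumed to come from something like `git diff --name-only`, the set of module directories and the `dependents` map would have to be derived from the pom.xml files, and the sketch follows modules that depend on a changed module, which is one possible reading of "transitive dependencies".

```python
from pathlib import PurePosixPath


def owning_module(changed_file, module_dirs):
    """Return the deepest known module directory containing the file, or None."""
    # parents is ordered from the closest directory up to the repository root,
    # so the first hit is the most specific (deepest) module.
    for parent in PurePosixPath(changed_file).parents:
        if str(parent) in module_dirs:
            return str(parent)
    return None


def modules_to_test(changed_files, module_dirs, dependents=None):
    """Select the modules whose tests should run for a set of changed files.

    dependents maps a module to the modules that directly depend on it
    (hypothetical input; it would have to be derived from the pom.xml files).
    """
    dependents = dependents or {}
    pending = []
    for changed_file in changed_files:
        module = owning_module(changed_file, module_dirs)
        if module is None:
            # Root pom.xml, build scripts, etc.: play it safe and test everything.
            return sorted(module_dirs)
        pending.append(module)

    # Follow the dependents transitively so that downstream breakage is caught.
    selected = set()
    while pending:
        module = pending.pop()
        if module not in selected:
            selected.add(module)
            pending.extend(dependents.get(module, []))
    return sorted(selected)
```

The resulting list could then be handed to Maven's `-pl` (`--projects`) option; Maven's own `-amd` (`--also-make-dependents`) flag might make the hand-written dependents map unnecessary.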
>
>
> On 20/03/17 at 14:02, Robert Metzger wrote:
>> Aljoscha, do you know how to configure jenkins?
>> Is Apache INFRA doing that, or are the beam people doing that themselves?
>>
>> One downside of Jenkins is that we probably need some machines that execute
>> the tests. A Travis container has 2 CPU cores and 4 GB main memory. We
>> currently have 10 such containers available on travis concurrently. I think
>> we would need at least the same amount on Jenkins.
>>
>>
>> On Mon, Mar 20, 2017 at 1:48 PM, Timo Walther <[hidden email]> wrote:
>>
>>> I agree with Aljoscha that we might consider moving from Travis to
>>> Jenkins. Is there any disadvantage in using Jenkins?
>>>
>>> I think we should structure the project according to release management
>>> (e.g. more frequent releases of libraries) or other criteria (e.g. core and
>>> non-core) instead of build time. What would happen if the build of another
>>> submodule became too long? Would we split/restructure again and
>>> again? If Jenkins solves all our problems we should use it.
>>>
>>> Regards,
>>> Timo
>>>
>>>
>>>
>>> On 20/03/17 at 12:21, Aljoscha Krettek wrote:
>>>
>>>> I prefer Jenkins to Travis by far. Working on Beam, where we have good
>>>> Jenkins integration, has opened my eyes to what is possible with good CI
>>>> integration.
>>>>
>>>> For example, look at this recent Beam PR:
>>>> https://github.com/apache/beam/pull/2263
>>>> The Jenkins-Github integration will tell you exactly which tests failed and
>>>> if you click on the links you can look at the log output/std out of the
>>>> tests in question.
>>>>
>>>> This is the overview page of one of the Jenkins Jobs that we have in Beam:
>>>> https://builds.apache.org/job/beam_PostCommit_Java_RunnableOnService_Flink/
>>>> This is an example of a stable build:
>>>> https://builds.apache.org/job/beam_PostCommit_Java_RunnableOnService_Flink/lastStableBuild/
>>>> Notice how it gives you fine grained information about the Maven run.
>>>> This is an unstable run:
>>>> https://builds.apache.org/job/beam_PostCommit_Java_RunnableOnService_Flink/lastUnstableBuild/
>>>> There you can see which tests failed and you can easily drill down.
>>>>
>>>> Best,
>>>> Aljoscha
>>>>
>>>> On 20 Mar 2017, at 11:46, Robert Metzger <[hidden email]> wrote:
>>>>> Thank you for looking into the build times.
>>>>>
>>>>> I didn't know that the build time situation is so bad. Even with yarn,
>>>>> mesos, connectors and libraries removed, we are still running into the
>>>>> build timeout :(
>>>>>
>>>>> Aljoscha told me that the Beam community is using Jenkins for running
>>>>> the tests, and they are planning to completely move away from Travis. I
>>>>> wonder whether we should do the same, as having our own Jenkins servers
>>>>> would allow us to run tests for more than 50 minutes.
>>>>>
>>>>> I agree with Stephan that we should keep the yarn and mesos tests in the
>>>>> core for stability / testing quality purposes.
>>>>>
>>>>>
>>>>> On Mon, Mar 20, 2017 at 11:27 AM, Stephan Ewen <[hidden email]
>>>>> <mailto:[hidden email]>> wrote:
>>>>> @Greg
>>>>>
>>>>> I am personally in favor of splitting "connectors" and "contrib" out as
>>>>> well. I know that @rmetzger has some reservations about the connectors,
>>>>> but we may be able to convince him.
>>>>>
>>>>> For the cluster tests (yarn / mesos) - in the past there were many cases
>>>>> where these tests caught cases that other tests did not, because they are
>>>>> the only tests that actually use the "flink-dist.jar" and thus discover
>>>>> many dependency and configuration issues. For that reason, my feeling
>>>>> would be that they are valuable in the core repository.
>>>>>
>>>>> I would actually suggest to do only the library split initially, to see
>>>>> what the challenges are in setting up the multi-repo build and release
>>>>> tooling. Once we gathered experience there, we can probably easily see
>>>>> what else we can split out.
>>>>>
>>>>> Stephan
>>>>>
>>>>>
>>>>> On Fri, Mar 17, 2017 at 8:37 PM, Greg Hogan <[hidden email] <mailto:
>>>>> [hidden email]>> wrote:
>>>>>
>>>>>> I’d like to use this refactoring opportunity to unsplit the Travis tests.
>>>>>> With 51 builds queued up for the weekend (some of which may fail or have
>>>>>> been force pushed) we are at the limit of the number of contributions we
>>>>>> can process. Fixing this requires 1) splitting the project, 2)
>>>>>> investigating speedups for long-running tests, and 3) staying cognizant
>>>>>> of test performance when accepting new code.
>>>>>>
>>>>>> I’d like to add one to Stephan’s list of module groups. I like that the
>>>>>> modules are generic (“libraries”) so that no one module is alone and
>>>>>> independent.
>>>>>>
>>>>>> Flink has three “libraries”: cep, ml, and gelly.
>>>>>>
>>>>>> “connectors” is a hotspot due to the long-running Kafka tests (and
>>>>>> connectors for three Kafka versions).
>>>>>>
>>>>>> Both flink-storm and flink-python have a modest number of tests
>>>>>> and could live with the miscellaneous modules in “contrib”.
>>>>>>
>>>>>> The YARN tests are long-running and problematic (I am unable to
>>>>>> successfully run these locally). A “cluster” module could host
>>>>>> flink-mesos,
>>>>>> flink-yarn, and flink-yarn-tests.
>>>>>>
>>>>>> That gets us close to running all tests in a single Travis build.
>>>>>>    https://travis-ci.org/greghogan/flink/builds/212122590
>>>>>>
>>>>>> I also tested (https://github.com/greghogan/flink/commits/core_build)
>>>>>> with a maven parallelism of 2 and 4, with the latter a 6.4% drop in
>>>>>> build time.
>>>>>>    https://travis-ci.org/greghogan/flink/builds/212137659
>>>>>>    https://travis-ci.org/greghogan/flink/builds/212154470
>>>>>>
>>>>>> We can run Travis CI builds nightly to guard against breaking changes.
>>>>>>
>>>>>> I also wanted to get an idea of how disruptive it would be to developers
>>>>>> to divide the project into multiple git repos. I wrote a simple python
>>>>>> script and configured it with the module partitions listed above. The
>>>>>> usage string from the top of the file lists commits with files from
>>>>>> multiple partitions as well as the modified files.
>>>>>>    https://gist.github.com/greghogan/f38a8efe6b6dd5a162a6b43335ac4897
>>>>>>
>>>>>> Accounting for the merging of the batch and streaming connector modules,
>>>>>> and assuming that the project structure has not changed much over the
>>>>>> past 15 months, for the following date ranges the listed number of
>>>>>> commits would have been split across repositories.
>>>>>>
>>>>>> since "2017-01-01"
>>>>>> 36 of 571 commits were mixed
>>>>>>
>>>>>> since "2016-07-01"
>>>>>> 155 of 1607 commits were mixed
>>>>>>
>>>>>> since "2016-01-01"
>>>>>> 272 of 2561 commits were mixed
>>>>>>
>>>>>> Greg
>>>>>>
>>>>>>
>>>>>> On Mar 15, 2017, at 1:13 PM, Stephan Ewen <[hidden email] <mailto:
>>>>>>> [hidden email]>> wrote:
>>>>>>>
>>>>>>> @Robert - I think once we know that a separate git repo works well, and
>>>>>>> that it actually solves problems, I see no reason to not create a
>>>>>>> connectors repository later. The infrastructure changes should be
>>>>>>> identical for two or more repositories.
>>>>>>>
>>>>>>> On Wed, Mar 15, 2017 at 5:22 PM, Till Rohrmann <[hidden email]> wrote:
>>>>>>>
>>>>>>>> I think it should not be at least the flink-dist but exactly the
>>>>>>>> remaining flink-dist module. Otherwise we do redundant work.
>>>>>>>> On Wed, Mar 15, 2017 at 5:03 PM, Robert Metzger <[hidden email]> wrote:
>>>>>>>>
>>>>>>>>> "flink-core" means the main repository, not the "flink-core" module.
>>>>>>>>>
>>>>>>>>> When doing a release, we need to build the flink main code first, because
>>>>>>>>> the flink-libraries depend on that.
>>>>>>>>> Once the "flink-libraries" are built, we need to run the main build again
>>>>>>>>> (at least the flink-dist module), so that it is pulling the artifacts from
>>>>>>>>> the flink-libraries to put them into the opt/ folder of the final artifact.
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, Mar 15, 2017 at 4:44 PM, Till Rohrmann <[hidden email]> wrote:
>>>>>>>>>
>>>>>>>>>> I'm ok with point 3.
>>>>>>>>>>
>>>>>>>>>> Concerning point 8: Why do we have to build flink-core twice after having
>>>>>>>>>> it built as a dependency for flink-libraries? This seems wrong to me.
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>> Till
>>>>>>>>>>
>>>>>>>>>> On Wed, Mar 15, 2017 at 4:23 PM, Robert Metzger <[hidden email]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Thank you. Running on AWS is a good idea!
>>>>>>>>>>> Let me know if you (or anybody else) wants to help me with the
>>>>>>>>>>> infrastructure work! Any help is much appreciated (as I've said before, I
>>>>>>>>>>> don't really have time for doing this, but it has to be done :) )
>>>>>>>>>>>
>>>>>>>>>>> I'm against creating two new repositories. I fear that this introduces too
>>>>>>>>>>> much complexity and too many repositories.
>>>>>>>>>>> "flink" and "flink-libraries" are hopefully enough to get the build time
>>>>>>>>>>> significantly down.
>>>>>>>>>>> We can also consider putting the connectors into the "flink-libraries" repo
>>>>>>>>>>> if we need to further reduce the build time.
>>>>>>>>>>>
>>>>>>>>>>> We should probably move "flink-table" out of "flink-libraries" if we want
>>>>>>>>>>> to keep "flink-table" in the main repo. (This would eliminate the
>>>>>>>>>>> "flink-libraries" module from main.)
>>>>>>>>>>>
>>>>>>>>>>> Also, I agree that "flink-statebackend-rocksdb" is not correctly placed in
>>>>>>>>>>> contrib anymore.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Mar 15, 2017 at 4:07 PM, Greg Hogan <[hidden email]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Robert, appreciate your kickstarting this task.
>>>>>>>>>>>>
>>>>>>>>>>>> We should compare the verification time with and without the listed
>>>>>>>>>>>> modules. I’ll try to run this by tomorrow on AWS and on Travis.
>>>>>>>>>>>>
>>>>>>>>>>>> Should we maintain separate repos for flink-contrib and flink-libraries?
>>>>>>>>>>>> Are you intending that we move flink-table out of flink-libraries (and
>>>>>>>>>>>> perhaps flink-statebackend-rocksdb out of flink-contrib)?
>>>>>>>>>>>>
>>>>>>>>>>>> Greg
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Mar 15, 2017, at 9:55 AM, Robert Metzger <[hidden email]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Thank you for looking into this Till.
>>>>>>>>>>>>> I think we then have to split the repositories.
>>>>>>>>>>>>> My main motivation for doing this is that it seems to be the only feasible
>>>>>>>>>>>>> way of scaling the community to allow more committers working on the
>>>>>>>>>>>>> libraries.
>>>>>>>>>>>>> I'll take care of getting things started.
>>>>>>>>>>>>>
>>>>>>>>>>>>> As the next steps I propose to:
>>>>>>>>>>>>> 1. Ask INFRA to rename
>>>>>>>>>>>>> https://git-wip-us.apache.org/repos/asf?p=flink-connectors.git;a=summary
>>>>>>>>>>>>> to "flink-libraries"
>>>>>>>>>>>>> 2. Ask INFRA to set up GitHub and travis integration for "flink-libraries"
>>>>>>>>>>>>> 3. Put the code of "flink-ml", "flink-gelly", "flink-python", "flink-cep",
>>>>>>>>>>>>> "flink-scala-shell", "flink-storm" into the new repository. (I decided
>>>>>>>>>>>>> against moving flink-contrib there, because rocksdb is in the contrib
>>>>>>>>>>>>> module. For flink-table, I'm undecided, but I kept it in the main repo
>>>>>>>>>>>>> because it's probably going to interact more with the core code in the
>>>>>>>>>>>>> future.)
>>>>>>>>>>>>> I'll try to preserve the history of those modules when splitting them into
>>>>>>>>>>>>> the new repo
>>>>>>>>>>>>> 4. I'll close all pull requests against those modules in the main repo.
>>>>>>>>>>>>> 5. I'll set up a minimal documentation page for the library repository,
>>>>>>>>>>>>> similar to the main documentation.
>>>>>>>>>>>>> 6. I'll update the documentation build process to build both documentations
>>>>>>>>>>>>> & link them to each other
>>>>>>>>>>>>> 7. I'll update the nightly deployment process to include both repositories
>>>>>>>>>>>>> 8. I'll update the release script to create the Flink release out of both
>>>>>>>>>>>>> repositories. In order to put the libraries into the opt/ dir of the
>>>>>>>>>>>>> release, I'll need to change the build of "flink-dist" so that it first
>>>>>>>>>>>>> builds flink core, then the libraries and then the core again with the
>>>>>>>>>>>>> libraries as an additional dependency.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The main question for the community is: do you agree with point 3? Would
>>>>>>>>>>>>> you like to include more or less?
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'll start with 1. and 2. tomorrow morning.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Mar 15, 2017 at 1:48 PM, Till Rohrmann <[hidden email]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> In theory we could have a merging bot which solves the problem of the
>>>>>>>>>>>>>> "commit window". Once the PR passes all tests and has enough +1s, the bot
>>>>>>>>>>>>>> could do the merging and, thus, it effectively linearizes the merge
>>>>>>>>>>>>>> process.
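The merging bot Till describes can be sketched as a simple serial queue. This is only an illustration under assumptions: the `PullRequest` type, the approval threshold of two +1s and the `tests_pass_on` hook (which stands in for re-running the tests against the current master) are all hypothetical and not part of any existing Flink or Apache tooling.

```python
from dataclasses import dataclass


@dataclass
class PullRequest:
    number: int
    approvals: int


def run_merge_queue(prs, tests_pass_on, required_approvals=2):
    """Merge pull requests strictly one after another.

    tests_pass_on(pr_number, merged_so_far) is a hypothetical hook that
    re-runs the tests of a PR on top of everything merged before it.
    Because only the bot merges, nobody else can slip a commit in between
    the test run and the merge, which removes the race for the commit window.
    """
    merged, rejected = [], []
    for pr in prs:  # first come, first served
        if pr.approvals < required_approvals:
            rejected.append(pr.number)
        elif tests_pass_on(pr.number, tuple(merged)):
            merged.append(pr.number)
        else:
            rejected.append(pr.number)
    return merged, rejected
```

The price of this serialization is throughput: every merge costs one full test run on the latest master, so the bot helps with the commit-window race but not with the build time itself.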
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I think the second point is actually a disadvantage because there is not
>>>>>>>>>>>>>> such an immediate incentive/pressure to fix the broken module if it lives
>>>>>>>>>>>>>> in a separate repository. Furthermore, breaking API changes in the core
>>>>>>>>>>>>>> will most likely go unnoticed for some time in other modules which are not
>>>>>>>>>>>>>> developed so actively. In the worst case these things will only be noticed
>>>>>>>>>>>>>> when we try to make a release.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> But I also agree that we are not Google and we don't have the capacities to
>>>>>>>>>>>>>> maintain such a smooth build process that we can keep all the code in a
>>>>>>>>>>>>>> single repository.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I looked a bit into Gradle and as far as I can tell it offers some nice
>>>>>>>>>>>>>> features wrt incrementally building projects. This would be beneficial for
>>>>>>>>>>>>>> local development but it would not solve our build time problems on Travis.
>>>>>>>>>>>>>> Gradle intends to introduce a task result cache which allows reusing
>>>>>>>>>>>>>> results across builds. This could help when building on Travis, however, it
>>>>>>>>>>>>>> is not yet fully implemented. Moreover, migrating from Maven to Gradle
>>>>>>>>>>>>>> won't come for free (there's simply no free lunch out there) and we might
>>>>>>>>>>>>>> risk introducing new bugs. Therefore, I would vote to split the repository
>>>>>>>>>>>>>> in order to mitigate our current problems with Travis and the build time in
>>>>>>>>>>>>>> general. Whether to use a different build system or not can then be
>>>>>>>>>>>>>> discussed as an orthogonal question.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>> Till
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Tue, Mar 14, 2017 at 8:05 PM, Stephan Ewen <[hidden email]> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Some other thoughts on how repository split would help. I am not sure for
>>>>>>>>>>>>>>> all of them, so please comment:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> - There is less competition for a "commit window". It happens a lot
>>>>>>>>>>>>>>> already that you run all tests and want to commit, but there was a commit
>>>>>>>>>>>>>>> in the meantime. You rebase, need to re-test, again commit in the meantime.
>>>>>>>>>>>>>>>    For a "linear" commit history, this may become a bottleneck eventually
>>>>>>>>>>>>>>> as well.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> - There is less risk of broken master. If one repository/module breaks
>>>>>>>>>>>>>>> its master, the others can still continue.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Stephan
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Fri, Mar 10, 2017 at 12:20 PM, Till Rohrmann <[hidden email]> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks for all your input. In order to wrap the discussion up I'd like to
>>>>>>>>>>>>>>>> summarize the mentioned points:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The problem of increasing build times and complexity of the project has
>>>>>>>>>>>>>>>> been acknowledged. Ideally we would have everything in one repository using
>>>>>>>>>>>>>>>> an incremental build tool. Since Maven does not properly support this, we
>>>>>>>>>>>>>>>> would have to switch our build tool to something like Gradle, for example.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Another option is introducing build profiles for different sets of modules
>>>>>>>>>>>>>>>> as well as separating integration and unit tests. The third alternative
>>>>>>>>>>>>>>>> would be creating sub-projects with their own repositories. I actually
>>>>>>>>>>>>>>>> think that these two proposals are not necessarily exclusive and it would
>>>>>>>>>>>>>>>> also make sense to have a separation between unit and integration tests if
>>>>>>>>>>>>>>>> we split the repository.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The overall consensus seems to be that we don't want to split the community
>>>>>>>>>>>>>>>> and want to keep everything under the same umbrella. I think this is the
>>>>>>>>>>>>>>>> right way to go, because otherwise some parts of the project could become
>>>>>>>>>>>>>>>> second class citizens. Given that and that we continue using Maven, I still
>>>>>>>>>>>>>>>> think that creating sub-projects for the libraries, for example, could be
>>>>>>>>>>>>>>>> beneficial. A split could reduce the project's complexity and make it
>>>>>>>>>>>>>>>> potentially easier for libraries to get actively developed. The main
>>>>>>>>>>>>>>>> concern is setting up the build infrastructure to aggregate docs from
>>>>>>>>>>>>>>>> multiple repositories and making them publicly available.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Since I started this thread and I would really like to see Flink's ML
>>>>>>>>>>>>>>>> library being revived again, I'd volunteer to investigate first whether it
>>>>>>>>>>>>>>>> is feasible to establish a proper incremental build for Flink. If that
>>>>>>>>>>>>>>>> should not be possible, I will look into splitting the repository, first
>>>>>>>>>>>>>>>> only for the libraries. I'll share my results with the community once I'm
>>>>>>>>>>>>>>>> done with the investigation.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>> Till
>>>>>>>>>>>>>>>> On Fri, Feb 24, 2017 at 3:50 PM, Robert Metzger <
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> [hidden email] <mailto:[hidden email]>>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>> @Jin Mingjian: You can not use the paid travis version for
>>>>>>>>>>>>>>>> open
>>>>>>>>> source
>>>>>>>>>>>>>>> projects. It only works for private repositories (at least
>>>>>>>>>>>>>>>> back
>>>>>>>>> then
>>>>>>>>>>>> when
>>>>>>>>>>>>>>>> we've asked them about that).
>>>>>>>>>>>>>>>>> @Stephan: I don't think that incremental builds will be
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> available
>>>>>>>>>> with
>>>>>>>>>>>>>>> Maven anytime soon.
>>>>>>>>>>>>>>>>> I agree that we need to fix the build time issue on Travis.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I've
>>>>>>>>>> recently
>>>>>>>>>>>>>>>> pushed a commit to use now three instead of two test groups.
>>>>>>>>>>>>>>>>> But I don't think that this is feasible long-term solution.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> If this discussion is only about reducing the build and test
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> time,
>>>>>>>>>>> introducing build profiles for different components as
>>>>>>>>>>>>>>>> Aljoscha
>>>>>>>>> suggested
>>>>>>>>>>>>>>>> would solve the problem Till mentioned.
>>>>>>>>>>>>>>>>> Also, if we decide that travis is not a good tool anymore for
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> the
>>>>>>>>>> testing,
>>>>>>>>>>>>>>>>> I guess we can find a different solution. There are now
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> competitors
>>>>>>>>>>> to
>>>>>>>>>>>>>>> Travis that might be willing to offer a paid plan for an open
>>>>>>>>>>>>>>>> source
>>>>>>>>>>>> project, or we set up our own infra on a server sponsored by
>>>>>>>>>>>>>>>> one
>>>>>>>>>> of
>>>>>>>>>>
>>>>>>>>>>> the
>>>>>>>>>>>>>>> contributing companies.
>>>>>>>>>>>>>>>>> If we want to solve "community issues" with the change as
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> well,
>>>>>>>>> then
>>>>>>>>>>>> I
>>>>>>>>>>>>>>> think its work the effort of splitting up Flink into
>>>>>>>>>>>>>>>> different
>>>>>>>>> repositories.
>>>>>>>>>>>>>>>>> Splitting up repositories is not a trivial task in my
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> opinion.
>>>>>>>>> As
>>>>>>>>>
>>>>>>>>>> others
>>>>>>>>>>>>>>>> have mentioned before, we need to consider the following
>>>>>>>>>>>>>>>> things:
>>>>>>>>>> - How are we doing to build the documentation? Ideally every
>>>>>>>>>>>>>>>> repo
>>>>>>>>>> should
>>>>>>>>>>>>>>>> contain its docs, so we would need to pull them together when
>>>>>>>>>>>>>>>> building
>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>> main docs.
>>>>>>>>>>>>>>>>> - How do organize the dependencies? If we have library
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> repository
>>>>>>>>>> depend
>>>>>>>>>>>>>>>> on
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> snapshot Flink versions, we need to make sure that the
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> snapshot
>>>>>>>>> deployment
>>>>>>>>>>>>>>>>> always works. This also means that people working on a
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> library
>>>>>>>>> repository
>>>>>>>>>>>>>>>> will pull from snapshot OR need to build first locally.
>>>>>>>>>>>>>>>>> - We need to update the release scripts
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> If we commit to do these changes, we need to assign at least
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> one
>>>>>>>>>> committer
>>>>>>>>>>>>>>>>> (yes, in this case we need somebody who can commit, for
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> example
>>>>>>>>> for
>>>>>>>>>>> updating the buildbot stuff) who volunteers to do the change.
>>>>>>>>>>>>>>>>> I've done a lot of infrastructure work in the past, but I'm
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> currently
>>>>>>>>>>>> pretty booked with many other things, so I don't
>>>>>>>>>>>>>>>> realistically
>>>>>>>>> see
>>>>>>>>>>> myself
>>>>>>>>>>>>>>>> doing that. Max who used to work on these things is taking
>>>>>>>>>>>>>>>> some
>>>>>>>>> time
>>>>>>>>>>>> off.
>>>>>>>>>>>>>>>> I think we need, best case 3 days for the change, worst case
>>>>>>>>>>>>>>>> 5
>>>>>>>>> days.
>>>>>>>>>>>> The
>>>>>>>>>>>>>>>> problem is that there are no "unit tests" for the infra
>>>>>>>>>>>>>>>> stuff,
>>>>>>>
>


Re: [DISCUSS] Project build time and possible restructuring

Timo Walther-2
In reply to this post by Greg Hogan
So what do we want to move to the libraries repository?

I would propose to move these modules first:

flink-cep-scala
flink-cep
flink-gelly-examples
flink-gelly-scala
flink-gelly
flink-ml

All other modules (e.g. in flink-contrib) are rather connectors. I think
it would be better to move those into a connectors repository later.

If we are not in a rush, we could do the move after the
feature freeze. By then, most of the PRs will have been merged.

Timo
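
To get a feeling for how often a single commit would have to be split across the main repository and the libraries repository, the module list above can be checked against the commit history. The following is only a minimal sketch, not the script Greg mentions in the quoted mail below: the function names are made up for illustration, the example paths are hypothetical, and it assumes that a file belongs to a library module whenever one of its directory components equals a module name.

```python
from pathlib import PurePosixPath

# Modules proposed for the first move (taken from the list above).
LIBRARY_MODULES = {
    "flink-cep-scala",
    "flink-cep",
    "flink-gelly-examples",
    "flink-gelly-scala",
    "flink-gelly",
    "flink-ml",
}


def partition_of(path):
    """Return "libraries" if any component of the path is one of the
    proposed library modules, otherwise "core" (assumed naming)."""
    parts = PurePosixPath(path).parts
    return "libraries" if LIBRARY_MODULES.intersection(parts) else "core"


def is_mixed(changed_paths):
    """A commit is mixed if its files fall into more than one partition."""
    return len({partition_of(p) for p in changed_paths}) > 1


def count_mixed(commits):
    """commits: iterable of lists of changed file paths, one list per commit.
    Returns a (mixed, total) tuple."""
    commits = list(commits)
    mixed = sum(1 for paths in commits if is_mixed(paths))
    return mixed, len(commits)


if __name__ == "__main__":
    # Hypothetical commits, for illustration only.
    example = [
        ["flink-libraries/flink-ml/pom.xml"],
        ["flink-runtime/src/Foo.java", "flink-libraries/flink-cep/src/Bar.java"],
        ["docs/index.md"],
    ]
    mixed, total = count_mixed(example)
    print(f"{mixed} of {total} commits were mixed")
```

The per-commit file lists could be collected from something like `git log --name-only`; parsing that output is left out here. For comparison, the counts Greg reports below (for a coarser partitioning) work out to roughly 6.3% (36 of 571), 9.6% (155 of 1607) and 10.6% (272 of 2561) mixed commits.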


On 20/03/17 at 15:00, Greg Hogan wrote:

> We can add cluster tests using the distribution jar, and will need to do so to remove Flink’s dependency on Hadoop. The YARN and Mesos tests would still run nightly and running cluster tests should be much faster. As troublesome as TravisCI has been, a major driver for this change has been local build time.
>
> I agree with splitting off one repo at a time, but we’ll first need to reorganize the core repo if using git submodules as flink-python and flink-table would need to first be moved. So I think planning this out first is a healthy idea, with the understanding that the plan will be reevaluated.
>
> Any changes to the project structure need a scheduled period, perhaps a week, for existing pull requests to be reviewed and accepted or closed and later migrated.
>
>
>> On Mar 20, 2017, at 6:27 AM, Stephan Ewen <[hidden email]> wrote:
>>
>> @Greg
>>
>> I am personally in favor of splitting "connectors" and "contrib" out as
>> well. I know that @rmetzger has some reservations about the connectors, but
>> we may be able to convince him.
>>
>> For the cluster tests (yarn / mesos) - in the past there were many cases
>> where these tests caught cases that other tests did not, because they are
>> the only tests that actually use the "flink-dist.jar" and thus discover
>> many dependency and configuration issues. For that reason, my feeling would
>> be that they are valuable in the core repository.
>>
>> I would actually suggest to do only the library split initially, to see
>> what the challenges are in setting up the multi-repo build and release
>> tooling. Once we gathered experience there, we can probably easily see what
>> else we can split out.
>>
>> Stephan
>>
>>
>> On Fri, Mar 17, 2017 at 8:37 PM, Greg Hogan <[hidden email]> wrote:
>>
>>> I’d like to use this refactoring opportunity to unsplit the Travis tests.
>>> With 51 builds queued up for the weekend (some of which may fail or have
>>> been force pushed) we are at the limit of the number of contributions we
>>> can process. Fixing this requires 1) splitting the project, 2)
>>> investigating speedups for long-running tests, and 3) staying cognizant of
>>> test performance when accepting new code.
>>>
>>> I’d like to add one to Stephan’s list of module groups. I like that the
>>> modules are generic (“libraries”) so that no one module is alone and
>>> independent.
>>>
>>> Flink has three “libraries”: cep, ml, and gelly.
>>>
>>> “connectors” is a hotspot due to the long-running Kafka tests (and
>>> connectors for three Kafka versions).
>>>
>>> Both flink-storm and flink-python have a modest number of tests
>>> and could live with the miscellaneous modules in “contrib”.
>>>
>>> The YARN tests are long-running and problematic (I am unable to
>>> successfully run these locally). A “cluster” module could host flink-mesos,
>>> flink-yarn, and flink-yarn-tests.
>>>
>>> That gets us close to running all tests in a single Travis build.
>>>   https://travis-ci.org/greghogan/flink/builds/212122590
>>>
>>> I also tested (https://github.com/greghogan/flink/commits/core_build)
>>> with a maven parallelism of 2 and 4, with the latter giving a 6.4% drop
>>> in build time.
>>>   https://travis-ci.org/greghogan/flink/builds/212137659
>>>   https://travis-ci.org/greghogan/flink/builds/212154470
>>>
>>> We can run Travis CI builds nightly to guard against breaking changes.
>>>
>>> I also wanted to get an idea of how disruptive it would be to developers
>>> to divide the project into multiple git repos. I wrote a simple python
>>> script and configured it with the module partitions listed above. The usage
>>> string from the top of the file lists commits with files from multiple
>>> partitions as well as the modified files.
>>>   https://gist.github.com/greghogan/f38a8efe6b6dd5a162a6b43335ac4897
>>>
>>> Accounting for the merging of the batch and streaming connector modules,
>>> and assuming that the project structure has not changed much over the past
>>> 15 months, for the following date ranges the listed number of commits would
>>> have been split across repositories.
>>>
>>> since "2017-01-01"
>>> 36 of 571 commits were mixed
>>>
>>> since "2016-07-01"
>>> 155 of 1607 commits were mixed
>>>
>>> since "2016-01-01"
>>> 272 of 2561 commits were mixed
>>>
>>> Greg


Re: [DISCUSS] Project build time and possible restructuring

Robert Metzger
I think your selection of modules is okay.
Moving out storm and the scala shell would be nice as well. But storm is
not really maintained, so maybe we should consider moving it out of the
Flink repo entirely.
And the scala shell is not a library, but it also doesn't really belong
in the main repo.

Regarding the feature freeze: We either do it with a lot of time in
advance to avoid any delays for the release, OR we do it right after the
release branch has been forked off.



On Tue, Mar 21, 2017 at 1:09 PM, Timo Walther <[hidden email]> wrote:

> So what do we want to move to the libraries repository?
>
> I would propose to move these modules first:
>
> flink-cep-scala
> flink-cep
> flink-gelly-examples
> flink-gelly-scala
> flink-gelly
> flink-ml
>
> All other modules (e.g. in flink-contrib) are rather connectors. I think
> it would be better to move those into a connectors repository later.
>
> If we are not in a rush, we could do the moving after the feature freeze.
> By then, most of the PRs will have been merged.
>
> Timo
>
>
> On 20/03/17 at 15:00, Greg Hogan wrote:
>
>> We can add cluster tests using the distribution jar, and will need to do
>> so to remove Flink’s dependency on Hadoop. The YARN and Mesos tests would
>> still run nightly and running cluster tests should be much faster. As
>> troublesome as TravisCI has been, a major driver for this change has been
>> local build time.
>>
>> I agree with splitting off one repo at a time, but we’ll first need to
>> reorganize the core repo if using git submodules as flink-python and
>> flink-table would need to first be moved. So I think planning this out
>> first is a healthy idea, with the understanding that the plan will be
>> reevaluated.
>>
>> Any changes to the project structure need a scheduled period, perhaps a
>> week, for existing pull requests to be reviewed and accepted or closed and
>> later migrated.
>>
>>
>> On Mar 20, 2017, at 6:27 AM, Stephan Ewen <[hidden email]> wrote:
>>>
>>> @Greg
>>>
>>> I am personally in favor of splitting "connectors" and "contrib" out as
>>> well. I know that @rmetzger has some reservations about the connectors, but
>>> we may be able to convince him.
>>>
>>> For the cluster tests (yarn / mesos) - in the past there were many cases
>>> where these tests caught cases that other tests did not, because they are
>>> the only tests that actually use the "flink-dist.jar" and thus discover
>>> many dependency and configuration issues. For that reason, my feeling would
>>> be that they are valuable in the core repository.
>>>
>>> I would actually suggest doing only the library split initially, to see
>>> what the challenges are in setting up the multi-repo build and release
>>> tooling. Once we have gathered experience there, we can probably easily see
>>> what else we can split out.
>>>
>>> Stephan
>>>
>>>
>>> On Fri, Mar 17, 2017 at 8:37 PM, Greg Hogan <[hidden email]> wrote:
>>>
>>>> I’d like to use this refactoring opportunity to unsplit the Travis tests.
>>>> With 51 builds queued up for the weekend (some of which may fail or have
>>>> been force pushed) we are at the limit of the number of contributions we
>>>> can process. Fixing this requires 1) splitting the project, 2)
>>>> investigating speedups for long-running tests, and 3) staying cognizant of
>>>> test performance when accepting new code.
>>>>
>>>> I’d like to add one to Stephan’s list of module groups. I like that the
>>>> modules are generic (“libraries”) so that no one module is alone and
>>>> independent.
>>>>
>>>> Flink has three “libraries”: cep, ml, and gelly.
>>>>
>>>> “connectors” is a hotspot due to the long-running Kafka tests (and
>>>> connectors for three Kafka versions).
>>>>
>>>> Both flink-storm and flink-python have a modest number of tests
>>>> and could live with the miscellaneous modules in “contrib”.
>>>>
>>>> The YARN tests are long-running and problematic (I am unable to
>>>> successfully run these locally). A “cluster” module could host flink-mesos,
>>>> flink-yarn, and flink-yarn-tests.
>>>>
>>>> That gets us close to running all tests in a single Travis build.
>>>>   https://travis-ci.org/greghogan/flink/builds/212122590
>>>>
>>>> I also tested (https://github.com/greghogan/flink/commits/core_build) with a
>>>> Maven parallelism of 2 and 4, with the latter giving a 6.4% drop in build time.
>>>>   https://travis-ci.org/greghogan/flink/builds/212137659
>>>>   https://travis-ci.org/greghogan/flink/builds/212154470
>>>>
>>>> We can run Travis CI builds nightly to guard against breaking changes.
>>>>
>>>> I also wanted to get an idea of how disruptive it would be to developers
>>>> to divide the project into multiple git repos. I wrote a simple python
>>>> script and configured it with the module partitions listed above. The usage
>>>> string from the top of the file lists commits with files from multiple
>>>> partitions as well as the modified files.
>>>>   https://gist.github.com/greghogan/f38a8efe6b6dd5a162a6b43335ac4897
>>>>
>>>> Accounting for the merging of the batch and streaming connector modules,
>>>> and assuming that the project structure has not changed much over the past
>>>> 15 months, for the following date ranges the listed number of commits would
>>>> have been split across repositories.
>>>>
>>>> since "2017-01-01"
>>>> 36 of 571 commits were mixed
>>>>
>>>> since "2016-07-01"
>>>> 155 of 1607 commits were mixed
>>>>
>>>> since "2016-01-01"
>>>> 272 of 2561 commits were mixed
>>>>
>>>> Greg
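
For illustration, the kind of analysis described above can be sketched in a few lines of Python. This is not the script from the gist, only a minimal sketch under assumptions: the path prefixes in PARTITIONS, the "core" default, and all function names are hypothetical. It assigns each file of a commit to a partition and counts the commits that touch more than one partition.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical mapping of path prefixes to repositories; a real run would
# use the module groups proposed in this thread.
PARTITIONS: Dict[str, str] = {
    "flink-libraries/": "libraries",
    "flink-connectors/": "connectors",
    "flink-contrib/": "contrib",
    "flink-yarn": "cluster",   # also matches flink-yarn-tests/
    "flink-mesos/": "cluster",
}
DEFAULT_PARTITION = "core"  # assumption: everything else stays in the main repo


def partition_of(path: str) -> str:
    """Return the partition owning a file path (longest matching prefix wins)."""
    best = ""
    for prefix in PARTITIONS:
        if path.startswith(prefix) and len(prefix) > len(best):
            best = prefix
    return PARTITIONS[best] if best else DEFAULT_PARTITION


def partitions_touched(files: Iterable[str]) -> Set[str]:
    """Return the set of partitions that a commit's files belong to."""
    return {partition_of(f) for f in files}


def count_mixed(commits: List[List[str]]) -> int:
    """Count commits whose files fall into more than one partition."""
    return sum(1 for files in commits if len(partitions_touched(files)) > 1)


def mixed_share(mixed: int, total: int) -> float:
    """Percentage of mixed commits, rounded to one decimal place."""
    return round(100.0 * mixed / total, 1)
```

The per-commit file lists could, for example, be taken from `git log --name-only --since=...`. Applied to the counts quoted above, `mixed_share` gives roughly 6.3% (36 of 571), 9.6% (155 of 1607) and 10.6% (272 of 2561).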
>>>>
>>>>
>>>> On Mar 15, 2017, at 1:13 PM, Stephan Ewen <[hidden email]> wrote:
>>>>>
>>>>> @Robert - I think once we know that a separate git repo works well, and
>>>>> that it actually solves problems, I see no reason to not create a
>>>>> connectors repository later. The infrastructure changes should be identical
>>>>> for two or more repositories.
>>>>>
>>>>> On Wed, Mar 15, 2017 at 5:22 PM, Till Rohrmann <[hidden email]> wrote:
>>>>>
>>>>>> I think it should not be at least the flink-dist but exactly the remaining
>>>>>> flink-dist module. Otherwise we do redundant work.
>>>>>>
>>>>>> On Wed, Mar 15, 2017 at 5:03 PM, Robert Metzger <[hidden email]>
>>>>>> wrote:
>>>>>>
>>>>>>> "flink-core" means the main repository, not the "flink-core" module.
>>>>>>>
>>>>>>> When doing a release, we need to build the flink main code first, because
>>>>>>> the flink-libraries depend on that.
>>>>>>> Once the "flink-libraries" are built, we need to run the main build again
>>>>>>> (at least the flink-dist module), so that it is pulling the artifacts from
>>>>>>> the flink-libraries to put them into the opt/ folder of the final artifact.
>>>>>>>
>>>>>>> On Wed, Mar 15, 2017 at 4:44 PM, Till Rohrmann <[hidden email]> wrote:
>>>>>>>
>>>>>>>> I'm ok with point 3.
>>>>>>>>
>>>>>>>> Concerning point 8: Why do we have to build flink-core twice after having
>>>>>>>> it built as a dependency for flink-libraries? This seems wrong to me.
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Till
>>>>>>>>
>>>>>>>> On Wed, Mar 15, 2017 at 4:23 PM, Robert Metzger <[hidden email]> wrote:
>>>>>>>>
>>>>>>>>> Thank you. Running on AWS is a good idea!
>>>>>>>>> Let me know if you (or anybody else) wants to help me with the
>>>>>>>>> infrastructure work! Any help is much appreciated (as I've said before, I
>>>>>>>>> don't really have time for doing this, but it has to be done :) )
>>>>>>>>>
>>>>>>>>> I'm against creating two new repositories. I fear that this introduces too
>>>>>>>>> much complexity and too many repositories.
>>>>>>>>> "flink" and "flink-libraries" are hopefully enough to get the build time
>>>>>>>>> significantly down.
>>>>>>>>> We can also consider putting the connectors into the "flink-libraries" repo
>>>>>>>>> if we need to further reduce the build time.
>>>>>>>>>
>>>>>>>>> We should probably move "flink-table" out of "flink-libraries" if we want
>>>>>>>>> to keep "flink-table" in the main repo. (This would eliminate the
>>>>>>>>> "flink-libraries" module from main.)
>>>>>>>>>
>>>>>>>>> Also, I agree that "flink-statebackend-rocksdb" is not correctly placed in
>>>>>>>>> contrib anymore.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, Mar 15, 2017 at 4:07 PM, Greg Hogan <[hidden email]> wrote:
>>>>>>>>>
>>>>>>>>>> Robert, appreciate your kickstarting this task.
>>>>>>>>>>
>>>>>>>>>> We should compare the verification time with and without the listed
>>>>>>>>>> modules. I’ll try to run this by tomorrow on AWS and on Travis.
>>>>>>>>>>
>>>>>>>>>> Should we maintain separate repos for flink-contrib and flink-libraries?
>>>>>>>>>> Are you intending that we move flink-table out of flink-libraries (and
>>>>>>>>>> perhaps flink-statebackend-rocksdb out of flink-contrib)?
>>>>>>>>>>
>>>>>>>>>> Greg
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Mar 15, 2017, at 9:55 AM, Robert Metzger <[hidden email]> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Thank you for looking into this Till.
>>>>>>>>>>>
>>>>>>>>>>> I think we then have to split the repositories.
>>>>>>>>>>> My main motivation for doing this is that it seems to be the only feasible
>>>>>>>>>>> way of scaling the community to allow more committers working on the
>>>>>>>>>>> libraries.
>>>>>>>>>>>
>>>>>>>>>>> I'll take care of getting things started.
>>>>>>>>>>>
>>>>>>>>>>> As the next steps I propose to:
>>>>>>>>>>> 1. Ask INFRA to rename
>>>>>>>>>>> https://git-wip-us.apache.org/repos/asf?p=flink-connectors.git;a=summary
>>>>>>>>>>> to "flink-libraries"
>>>>>>>>>>> 2. Ask INFRA to set up GitHub and travis integration for "flink-libraries"
>>>>>>>>>>> 3. Put the code of "flink-ml", "flink-gelly", "flink-python", "flink-cep",
>>>>>>>>>>> "flink-scala-shell", "flink-storm" into the new repository. (I decided
>>>>>>>>>>> against moving flink-contrib there, because rocksdb is in the contrib
>>>>>>>>>>> module. For flink-table, I'm undecided, but I kept it in the main repo
>>>>>>>>>>> because it's probably going to interact more with the core code in the
>>>>>>>>>>> future.)
>>>>>>>>>>> I'll try to preserve the history of those modules when splitting them into
>>>>>>>>>>> the new repo
>>>>>>>>>>> 4. I'll close all pull requests against those modules in the main repo.
>>>>>>>>>>> 5. I'll set up a minimal documentation page for the library repository,
>>>>>>>>>>> similar to the main documentation.
>>>>>>>>>>> 6. I'll update the documentation build process to build both documentations
>>>>>>>>>>> & link them to each other
>>>>>>>>>>> 7. I'll update the nightly deployment process to include both repositories
>>>>>>>>>>> 8. I'll update the release script to create the Flink release out of both
>>>>>>>>>>> repositories. In order to put the libraries into the opt/ dir of the
>>>>>>>>>>> release, I'll need to change the build of "flink-dist" so that it first
>>>>>>>>>>> builds flink core, then the libraries and then the core again with the
>>>>>>>>>>> libraries as an additional dependency.
>>>>>>>>>>>
>>>>>>>>>>> The main question for the community is: do you agree with point 3? Would
>>>>>>>>>>> you like to include more or less?
>>>>>>>>>>>
>>>>>>>>>>> I'll start with 1. and 2. tomorrow morning.
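
The build order from point 8 can be sketched as a small Python helper. This is only an illustration under assumptions: the function name is hypothetical, the Maven command lines are plausible invocations rather than the project's actual release script, and the final step assumes that rebuilding only the flink-dist module is enough to pull the library artifacts into the opt/ folder.

```python
from typing import List, Tuple


def release_build_plan(rebuild_dist_only: bool = True) -> List[Tuple[str, str]]:
    """Return (repository, command) steps in the order they have to run."""
    plan = [
        # 1. Build the main repository first; the libraries depend on it.
        ("flink", "mvn clean install -DskipTests"),
        # 2. Build the libraries against the freshly installed core artifacts.
        ("flink-libraries", "mvn clean install -DskipTests"),
    ]
    if rebuild_dist_only:
        # 3a. Rebuild only flink-dist so it can pick up the library artifacts.
        plan.append(("flink", "mvn clean install -DskipTests -pl flink-dist"))
    else:
        # 3b. Rebuild the whole main repository again (redundant work).
        plan.append(("flink", "mvn clean install -DskipTests"))
    return plan
```

Calling `release_build_plan()` yields three steps; the last one differs from the first only by restricting the build to flink-dist via Maven's `-pl` option.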
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Mar 15, 2017 at 1:48 PM, Till Rohrmann <[hidden email]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> In theory we could have a merging bot which solves the problem of the
>>>>>>>>>>>> "commit window". Once the PR passes all tests and has enough +1s, the bot
>>>>>>>>>>>> could do the merging and, thus, it effectively linearizes the merge
>>>>>>>>>>>> process.
>>>>>>>>>>>>
>>>>>>>>>>>> I think the second point is actually a disadvantage because there is not
>>>>>>>>>>>> such an immediate incentive/pressure to fix the broken module if it lives
>>>>>>>>>>>> in a separate repository. Furthermore, breaking API changes in the core
>>>>>>>>>>>> will most likely go unnoticed for some time in other modules which are not
>>>>>>>>>>>> developed so actively. In the worst case these things will only be noticed
>>>>>>>>>>>> when we try to make a release.
>>>>>>>>>>>>
>>>>>>>>>>>> But I also agree that we are not Google and we don't have the capacities to
>>>>>>>>>>>> maintain such a smooth build process that we can keep all the code in a
>>>>>>>>>>>> single repository.
>>>>>>>>>>>>
>>>>>>>>>>>> I looked a bit into Gradle and as far as I can tell it offers some nice
>>>>>>>>>>>> features wrt incrementally building projects. This would be beneficial for
>>>>>>>>>>>> local development but it would not solve our build time problems on
>>>>>>>>>>>> Travis.
>>>>>>>>>>>> Gradle intends to introduce a task result cache which allows reusing
>>>>>>>>>>>> results across builds. This could help when building on Travis, however, it
>>>>>>>>>>>> is not yet fully implemented. Moreover, migrating from Maven to Gradle
>>>>>>>>>>>> won't come for free (there's simply no free lunch out there) and we might
>>>>>>>>>>>> risk introducing new bugs. Therefore, I would vote to split the repository
>>>>>>>>>>>> in order to mitigate our current problems with Travis and the build
>>>>>>>>>>>> time in general. Whether to use a different build system or not can then be
>>>>>>>>>>>> discussed as an orthogonal question.
>>>>>>>>>>>>
>>>>>>>>>>>> Cheers,
>>>>>>>>>>>> Till
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Mar 14, 2017 at 8:05 PM, Stephan Ewen <[hidden email]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Some other thoughts on how repository split would help. I am not sure for
>>>>>>>>>>>>> all of them, so please comment:
>>>>>>>>>>>>>
>>>>>>>>>>>>> - There is less competition for a "commit window". It happens a lot
>>>>>>>>>>>>> already that you run all tests and want to commit, but there was a commit
>>>>>>>>>>>>> in the meantime. You rebase, need to re-test, again commit in the meantime.
>>>>>>>>>>>>>   For a "linear" commit history, this may become a bottleneck eventually
>>>>>>>>>>>>> as well.
>>>>>>>>>>>>>
>>>>>>>>>>>>> - There is less risk of broken master. If one repository/modules breaks
>>>>>>>>>>>>> its master, the others can still continue.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Stephan
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Fri, Mar 10, 2017 at 12:20 PM, Till Rohrmann <[hidden email]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks for all your input. In order to wrap the discussion up I'd like to
>>>>>>>>>>>>>> summarize the mentioned points:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The problem of increasing build times and complexity of the project has
>>>>>>>>>>>>>> been acknowledged. Ideally we would have everything in one repository using
>>>>>>>>>>>>>> an incremental build tool. Since Maven does not properly support this we
>>>>>>>>>>>>>> would have to switch our build tool to something like Gradle, for example.
>>>>>>>>>>>>>> Another option is introducing build profiles for different sets of modules
>>>>>>>>>>>>>> as well as separating integration and unit tests. The third alternative
>>>>>>>>>>>>>> would be creating sub-projects with their own repositories. I actually
>>>>>>>>>>>>>> think that these two proposals are not necessarily exclusive and it would
>>>>>>>>>>>>>> also make sense to have a separation between unit and integration tests if
>>>>>>>>>>>>>> we split the repository.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The overall consensus seems to be that we don't want to split the community
>>>>>>>>>>>>>> and want to keep everything under the same umbrella. I think this is the
>>>>>>>>>>>>>> right way to go, because otherwise some parts of the project could become
>>>>>>>>>>>>>> second class citizens. Given that and that we continue using Maven, I still
>>>>>>>>>>>>>> think that creating sub-projects for the libraries, for example, could be
>>>>>>>>>>>>>> beneficial. A split could reduce the project's complexity and make it
>>>>>>>>>>>>>> potentially easier for libraries to get actively developed. The main
>>>>>>>>>>>>>> concern is setting up the build infrastructure to aggregate docs from
>>>>>>>>>>>>>> multiple repositories and making them publicly available.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Since I started this thread and I would really like to see Flink's ML
>>>>>>>>>>>>>> library being revived again, I'd volunteer investigating first whether it
>>>>>>>>>>>>>> is doable establishing a proper incremental build for Flink. If that should
>>>>>>>>>>>>>> not be possible, I will look into splitting the repository, first only for
>>>>>>>>>>>>>> the libraries. I'll share my results with the community once I'm done with
>>>>>>>>>>>>>> the investigation.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>> Till
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Fri, Feb 24, 2017 at 3:50 PM, Robert Metzger <[hidden email]> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> @Jin Mingjian: You can not use the paid travis version for open source
>>>>>>>>>>>>>>> projects. It only works for private repositories (at least back then when
>>>>>>>>>>>>>>> we've asked them about that).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> @Stephan: I don't think that incremental builds will be available with
>>>>>>>>>>>>>>> Maven anytime soon.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I agree that we need to fix the build time issue on Travis. I've recently
>>>>>>>>>>>>>>> pushed a commit to use now three instead of two test groups.
>>>>>>>>>>>>>>> But I don't think that this is a feasible long-term solution.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> If this discussion is only about reducing the build and test time,
>>>>>>>>>>>>>>> introducing build profiles for different components as Aljoscha suggested
>>>>>>>>>>>>>>> would solve the problem Till mentioned.
>>>>>>>>>>>>>>> Also, if we decide that travis is not a good tool anymore for the testing,
>>>>>>>>>>>>>>> I guess we can find a different solution. There are now competitors to
>>>>>>>>>>>>>>> Travis that might be willing to offer a paid plan for an open source
>>>>>>>>>>>>>>> project, or we set up our own infra on a server sponsored by one of the
>>>>>>>>>>>>>>> contributing companies.
>>>>>>>>>>>>>>> If we want to solve "community issues" with the change as well, then I
>>>>>>>>>>>>>>> think it's worth the effort of splitting up Flink into different
>>>>>>>>>>>>>>> repositories.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Splitting up repositories is not a trivial task in my opinion. As others
>>>>>>>>>>>>>>> have mentioned before, we need to consider the following things:
>>>>>>>>>>>>>>> - How are we going to build the documentation? Ideally every repo should
>>>>>>>>>>>>>>> contain its docs, so we would need to pull them together when building the
>>>>>>>>>>>>>>> main docs.
>>>>>>>>>>>>>>> - How do we organize the dependencies? If we have a library repository
>>>>>>>>>>>>>>> depend on snapshot Flink versions, we need to make sure that the snapshot
>>>>>>>>>>>>>>> deployment always works. This also means that people working on a library
>>>>>>>>>>>>>>> repository will pull from snapshot OR need to build first locally.
>>>>>>>>>>>>>>> - We need to update the release scripts
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> If we commit to do these changes, we need to assign at least one committer
>>>>>>>>>>>>>>> (yes, in this case we need somebody who can commit, for example for
>>>>>>>>>>>>>>> updating the buildbot stuff) who volunteers to do the change.
>>>>>>>>>>>>>>> I've done a lot of infrastructure work in the past, but I'm currently
>>>>>>>>>>>>>>> pretty booked with many other things, so I don't realistically see myself
>>>>>>>>>>>>>>> doing that. Max who used to work on these things is taking some time off.
>>>>>>>>>>>>>>> I think we need, best case 3 days for the change, worst case 5 days. The
>>>>>>>>>>>>>>> problem is that there are no "unit tests" for the infra stuff, so many
>>>>>>>>>>>>>>> things are "trial and error" (like Apache's buildbot, our release scripts,
>>>>>>>>>>>>>>> the doc scripts, maven stuff, nightly builds).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Thu, Feb 23, 2017 at 1:33 PM, Stephan Ewen <[hidden email]> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> If we can get incremental builds to work, that would actually be the
>>>>>>>>>>>>>>>> preferred solution in my opinion.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Many companies have invested heavily in making a "single repository" code
>>>>>>>>>>>>>>>> base work, because it has the advantage of not having to update/publish
>>>>>>>>>>>>>>>> several repositories first.
>>>>>>>>>>>>>>>> However, the strong prerequisite for that is an incremental build system
>>>>>>>>>>>>>>>> that builds only (fine grained) what it has to build. I am not sure how we
>>>>>>>>>>>>>>>> could make that work with Maven and Travis...
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Wed, Feb 22, 2017 at 10:42 PM, Greg Hogan <[hidden email]> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> An additional option for reducing time to build and test is parallel
>>>>>>>>>>>>>>>>> execution. This would help users more than on TravisCI
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>

Re: [DISCUSS] Project build time and possible restructuring

Robert Metzger
Good news :)
A few weeks ago, I got an email from travis asking for feedback. I filled
out the form and said that the 50-minute build time limit is a
showstopper for us.
And now, a few weeks later they got back to me and told me that they have
increased the build time limit for "apache/flink" to 120 minutes. Also, we can
set the settings to use the "sudo enabled infrastructure", with 7.5 GB of
main memory guaranteed.

I'll do a push to a separate branch to see how well it works :)
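
To get a rough feeling for what a higher limit could mean for the number of Travis jobs, here is a minimal Python sketch that packs test groups into as few jobs as possible under a per-job time limit, using first-fit decreasing. The group names and durations are made-up placeholders, not measurements from the Flink builds, and per-job overhead such as compilation is ignored; only the 50 and 120 minute limits come from the message above.

```python
from typing import Dict, List


def pack_jobs(durations: Dict[str, int], limit_minutes: int) -> List[List[str]]:
    """Assign test groups to jobs with first-fit decreasing bin packing."""
    jobs: List[List[str]] = []
    remaining: List[int] = []  # minutes still free in each job
    # Longest groups first; ties are broken by name to keep the result stable.
    for name, minutes in sorted(durations.items(), key=lambda kv: (-kv[1], kv[0])):
        if minutes > limit_minutes:
            raise ValueError(f"{name} alone exceeds the {limit_minutes} minute limit")
        for i, left in enumerate(remaining):
            if minutes <= left:
                jobs[i].append(name)
                remaining[i] -= minutes
                break
        else:
            # No existing job has room, so open a new one.
            jobs.append([name])
            remaining.append(limit_minutes - minutes)
    return jobs


# Placeholder durations in minutes (assumptions, not measured values).
EXAMPLE_GROUPS = {"core": 40, "connectors": 35, "libraries": 20, "cluster": 25}
```

With these placeholder numbers, `pack_jobs(EXAMPLE_GROUPS, 50)` needs three jobs, while `pack_jobs(EXAMPLE_GROUPS, 120)` fits everything into one. First-fit decreasing is a heuristic, so the result is not guaranteed to be minimal in general.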

On Tue, Mar 28, 2017 at 4:36 PM, Robert Metzger <[hidden email]> wrote:

> I think your selection of modules is okay.
> Moving out storm and the scala shell would be nice as well. But storm is
> not really maintained, so maybe we should consider moving it out of the
> Flink repo entirely.
> And the scala shell is not a library, but it also doesn't really belong
> in the main repo.
>
> Regarding the feature freeze: We either do it with a lot of time in
> advance to avoid any delays for the release, OR we do it right after the
> release branch has been forked off.
>
>
>
> On Tue, Mar 21, 2017 at 1:09 PM, Timo Walther <[hidden email]> wrote:
>
>> So what do we want to move to the libraries repository?
>>
>> I would propose to move these modules first:
>>
>> flink-cep-scala
>> flink-cep
>> flink-gelly-examples
>> flink-gelly-scala
>> flink-gelly
>> flink-ml
>>
>> All other modules (e.g. in flink-contrib) are rather connectors. I think
>> it would be better to move those into a connectors repository later.
>>
>> If we are not in a rush, we could do the moving after the feature freeze.
>> By then, most of the PRs will have been merged.
>>
>> Timo
>>
>>
>> On 20/03/17 at 15:00, Greg Hogan wrote:
>>
>>> We can add cluster tests using the distribution jar, and will need to do
>>> so to remove Flink’s dependency on Hadoop. The YARN and Mesos tests would
>>> still run nightly and running cluster tests should be much faster. As
>>> troublesome as TravisCI has been, a major driver for this change has been
>>> local build time.
>>>
>>> I agree with splitting off one repo at a time, but we’ll first need to
>>> reorganize the core repo if using git submodules as flink-python and
>>> flink-table would need to first be moved. So I think planning this out
>>> first is a healthy idea, with the understanding that the plan will be
>>> reevaluated.
>>>
>>> Any changes to the project structure need a scheduled period, perhaps a
>>> week, for existing pull requests to be reviewed and accepted or closed and
>>> later migrated.
>>>
>>>
>>> On Mar 20, 2017, at 6:27 AM, Stephan Ewen <[hidden email]> wrote:
>>>>
>>>> @Greg
>>>>
>>>> I am personally in favor of splitting "connectors" and "contrib" out as
>>>> well. I know that @rmetzger has some reservations about the connectors, but
>>>> we may be able to convince him.
>>>>
>>>> For the cluster tests (yarn / mesos) - in the past there were many cases
>>>> where these tests caught cases that other tests did not, because they are
>>>> the only tests that actually use the "flink-dist.jar" and thus discover
>>>> many dependency and configuration issues. For that reason, my feeling would
>>>> be that they are valuable in the core repository.
>>>>
>>>> I would actually suggest doing only the library split initially, to see
>>>> what the challenges are in setting up the multi-repo build and release
>>>> tooling. Once we have gathered experience there, we can probably easily see
>>>> what else we can split out.
>>>>
>>>> Stephan
>>>>
>>>>
>>>> On Fri, Mar 17, 2017 at 8:37 PM, Greg Hogan <[hidden email]> wrote:
>>>>
>>>> I’d like to use this refactoring opportunity to unsplit the Travis
>>>>> tests.
>>>>> With 51 builds queued up for the weekend (some of which may fail or
>>>>> have
>>>>> been force pushed) we are at the limit of the number of contributions
>>>>> we
>>>>> can process. Fixing this requires 1) splitting the project, 2)
>>>>> investigating speedups for long-running tests, and 3) staying
>>>>> cognizant of
>>>>> test performance when accepting new code.
>>>>>
>>>>> I’d like to add one to Stephan’s list of module groups. I like that the
>>>>> modules are generic (“libraries”) so that no one module is alone and
>>>>> independent.
>>>>>
>>>>> Flink has three “libraries”: cep, ml, and gelly.
>>>>>
>>>>> “connectors” is a hotspot due to the long-running Kafka tests (and
>>>>> connectors for three Kafka versions).
>>>>>
>>>>> Both flink-storm and flink-python have a modest number of
>>>>> tests
>>>>> and could live with the miscellaneous modules in “contrib”.
>>>>>
>>>>> The YARN tests are long-running and problematic (I am unable to
>>>>> successfully run these locally). A “cluster” module could host
>>>>> flink-mesos,
>>>>> flink-yarn, and flink-yarn-tests.
>>>>>
>>>>> That gets us close to running all tests in a single Travis build.
>>>>>   https://travis-ci.org/greghogan/flink/builds/212122590
>>>>>
>>>>> I also tested (https://github.com/greghogan/flink/commits/core_build)
>>>>> with a maven parallelism of 2 and 4, with the latter a 6.4% drop in
>>>>> build time.
>>>>>   https://travis-ci.org/greghogan/flink/builds/212137659
>>>>>   https://travis-ci.org/greghogan/flink/builds/212154470
>>>>>
>>>>> We can run Travis CI builds nightly to guard against breaking changes.
>>>>>
>>>>> I also wanted to get an idea of how disruptive it would be to
>>>>> developers
>>>>> to divide the project into multiple git repos. I wrote a simple python
>>>>> script and configured it with the module partitions listed above. The
>>>>> usage
>>>>> string from the top of the file lists commits with files from multiple
>>>>> partitions as well as the modified files.
>>>>>   https://gist.github.com/greghogan/f38a8efe6b6dd5a162a6b43335ac4897
>>>>>
>>>>> Accounting for the merging of the batch and streaming connector
>>>>> modules,
>>>>> and assuming that the project structure has not changed much over the
>>>>> past
>>>>> 15 months, for the following date ranges the listed number of commits
>>>>> would
>>>>> have been split across repositories.
>>>>>
>>>>> since "2017-01-01"
>>>>> 36 of 571 commits were mixed
>>>>>
>>>>> since "2016-07-01"
>>>>> 155 of 1607 commits were mixed
>>>>>
>>>>> since "2016-01-01"
>>>>> 272 of 2561 commits were mixed
>>>>>
>>>>> Greg
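
For readers who want to reproduce the mixed-commit count, here is a minimal sketch of the classification step. It is not the script from the gist above; the path prefixes, the partition names, and the function names `partition_of` and `mixed_commits` are assumptions made for illustration. It takes a mapping from commit id to changed paths (which could be collected with `git log --name-only`) and reports the commits that touch more than one partition.

```python
from typing import Dict, List, Set

# Hypothetical layout: path prefix -> partition name. These prefixes are
# assumptions for illustration, not taken from the original script.
PARTITIONS: Dict[str, str] = {
    "flink-libraries/flink-cep": "libraries",
    "flink-libraries/flink-gelly": "libraries",
    "flink-libraries/flink-ml": "libraries",
    "flink-connectors/": "connectors",
    "flink-contrib/": "contrib",
    "flink-mesos/": "cluster",
    "flink-yarn": "cluster",  # also matches flink-yarn-tests
}
DEFAULT_PARTITION = "core"


def partition_of(path: str) -> str:
    """Return the partition for a path; the longest matching prefix wins."""
    best = ""
    for prefix in PARTITIONS:
        if path.startswith(prefix) and len(prefix) > len(best):
            best = prefix
    return PARTITIONS.get(best, DEFAULT_PARTITION)


def mixed_commits(commits: Dict[str, List[str]]) -> Dict[str, Set[str]]:
    """Return the commits whose changed files span more than one partition."""
    mixed: Dict[str, Set[str]] = {}
    for sha, paths in commits.items():
        partitions = {partition_of(p) for p in paths}
        if len(partitions) > 1:
            mixed[sha] = partitions
    return mixed


if __name__ == "__main__":
    # Made-up commits, only to show the call shape.
    sample = {
        "a1": ["flink-core/src/Foo.java",
               "flink-libraries/flink-gelly/src/Bar.java"],
        "b2": ["flink-yarn-tests/src/T.java", "flink-mesos/pom.xml"],
        "c3": ["docs/index.md"],
    }
    result = mixed_commits(sample)
    for sha in sorted(result):
        print(sha, sorted(result[sha]))
    print(f"{len(result)} of {len(sample)} commits were mixed")
```

For scale, the counts reported above work out to roughly 6.3% (36 of 571), 9.6% (155 of 1607), and 10.6% (272 of 2561) of commits.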
>>>>>
>>>>>
>>>>> On Mar 15, 2017, at 1:13 PM, Stephan Ewen <[hidden email]> wrote:
>>>>>>
>>>>>> @Robert - I think once we know that a separate git repo works well,
>>>>>> and
>>>>>> that it actually solves problems, I see no reason to not create a
>>>>>> connectors repository later. The infrastructure changes should be
>>>>>>
>>>>> identical
>>>>>
>>>>>> for two or more repositories.
>>>>>>
>>>>>> On Wed, Mar 15, 2017 at 5:22 PM, Till Rohrmann <[hidden email]>
>>>>>>
>>>>> wrote:
>>>>>
>>>>>> I think it should not be at least the flink-dist but exactly the
>>>>>>>
>>>>>> remaining
>>>>>
>>>>>> flink-dist module. Otherwise we do redundant work.
>>>>>>>
>>>>>>> On Wed, Mar 15, 2017 at 5:03 PM, Robert Metzger <[hidden email]
>>>>>>> >
>>>>>>> wrote:
>>>>>>>
>>>>>>> "flink-core" means the main repository, not the "flink-core" module.
>>>>>>>>
>>>>>>>> When doing a release, we need to build the flink main code first,
>>>>>>>>
>>>>>>> because
>>>>>
>>>>>> the flink-libraries depend on that.
>>>>>>>> Once the "flink-libraries" are built, we need to run the main build
>>>>>>>>
>>>>>>> again
>>>>>
>>>>>> (at least the flink-dist module), so that it is pulling the artifacts
>>>>>>>>
>>>>>>> from
>>>>>>>
>>>>>>>> the flink-libraries to put them into the opt/ folder of the final
>>>>>>>>
>>>>>>> artifact.
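
The build order under discussion (main build first, then flink-libraries, then only flink-dist again, as Till suggests) can be sketched as a small dependency graph. This is an illustration only: the unit name "flink-main-without-dist" and the function `build_order` are hypothetical, and the real release script may be organised quite differently.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical build units, each mapped to the units it depends on.
# Treating flink-dist as its own unit means the rest of the main
# repository is built only once.
BUILD_DEPS = {
    "flink-main-without-dist": set(),
    "flink-libraries": {"flink-main-without-dist"},
    "flink-dist": {"flink-main-without-dist", "flink-libraries"},
}


def build_order(deps):
    """Return the units in an order that respects every dependency."""
    return list(TopologicalSorter(deps).static_order())


if __name__ == "__main__":
    for step, unit in enumerate(build_order(BUILD_DEPS), start=1):
        print(step, unit)
```

With flink-dist modelled separately, each unit appears exactly once in the order, which is the redundancy Till is asking to avoid.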
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Mar 15, 2017 at 4:44 PM, Till Rohrmann <
>>>>>>>> [hidden email]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> I'm ok with point 3.
>>>>>>>>>
>>>>>>>>> Concerning point 8: Why do we have to build flink-core twice after
>>>>>>>>>
>>>>>>>> having
>>>>>>>
>>>>>>>> it built as a dependency for flink-libraries? This seems wrong to
>>>>>>>>> me.
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>> Till
>>>>>>>>>
>>>>>>>>> On Wed, Mar 15, 2017 at 4:23 PM, Robert Metzger <
>>>>>>>>> [hidden email]>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> Thank you. Running on AWS is a good idea!
>>>>>>>>>> Let me know if you (or anybody else) wants to help me with the
>>>>>>>>>> infrastructure work! Any help is much appreciated (as I've said
>>>>>>>>>>
>>>>>>>>> before, I
>>>>>>>>
>>>>>>>>> don't really have time for doing this, but it has to be done :) )
>>>>>>>>>>
>>>>>>>>>> I'm against creating two new repositories. I fear that this
>>>>>>>>>>
>>>>>>>>> introduces
>>>>>>>
>>>>>>>> too
>>>>>>>>>
>>>>>>>>>> much complexity and too many repositories.
>>>>>>>>>> "flink" and "flink-libraries" are hopefully enough to get the
>>>>>>>>>> build
>>>>>>>>>>
>>>>>>>>> time
>>>>>>>>
>>>>>>>>> significantly down.
>>>>>>>>>> We can also consider putting the connectors into the
>>>>>>>>>>
>>>>>>>>> "flink-libraries"
>>>>>>>
>>>>>>>> repo
>>>>>>>>>
>>>>>>>>>> if we need to further reduce the build time.
>>>>>>>>>>
>>>>>>>>>> We should probably move "flink-table" out of "flink-libraries" if
>>>>>>>>>> we
>>>>>>>>>>
>>>>>>>>> want
>>>>>>>>
>>>>>>>>> to keep "flink-table" in the main repo. (This would eliminate the
>>>>>>>>>> "flink-libraries" module from main.)
>>>>>>>>>>
>>>>>>>>>> Also, I agree that "flink-statebackend-rocksdb" is not correctly
>>>>>>>>>>
>>>>>>>>> placed
>>>>>>>
>>>>>>>> in
>>>>>>>>>
>>>>>>>>>> contrib anymore.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Wed, Mar 15, 2017 at 4:07 PM, Greg Hogan <[hidden email]>
>>>>>>>>>>
>>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Robert, appreciate your kickstarting this task.
>>>>>>>>>>>
>>>>>>>>>>> We should compare the verification time with and without the
>>>>>>>>>>> listed
>>>>>>>>>>> modules. I’ll try to run this by tomorrow on AWS and on Travis.
>>>>>>>>>>>
>>>>>>>>>>> Should we maintain separate repos for flink-contrib and
>>>>>>>>>>>
>>>>>>>>>> flink-libraries?
>>>>>>>>>
>>>>>>>>>> Are you intending that we move flink-table out of flink-libraries
>>>>>>>>>>>
>>>>>>>>>> (and
>>>>>>>>
>>>>>>>>> perhaps flink-statebackend-rocksdb out of flink-contrib)?
>>>>>>>>>>>
>>>>>>>>>>> Greg
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Mar 15, 2017, at 9:55 AM, Robert Metzger <[hidden email]
>>>>>>>>>>>>
>>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Thank you for looking into this Till.
>>>>>>>>>>>>
>>>>>>>>>>>> I think we then have to split the repositories.
>>>>>>>>>>>> My main motivation for doing this is that it seems to be the
>>>>>>>>>>>> only
>>>>>>>>>>>>
>>>>>>>>>>> feasible
>>>>>>>>>>>
>>>>>>>>>>>> way of scaling the community to allow more committers working on
>>>>>>>>>>>>
>>>>>>>>>>> the
>>>>>>>>
>>>>>>>>> libraries.
>>>>>>>>>>>>
>>>>>>>>>>>> I'll take care of getting things started.
>>>>>>>>>>>>
>>>>>>>>>>>> As the next steps I propose to:
>>>>>>>>>>>> 1. Ask INFRA to rename https://git-wip-us.apache.org/
>>>>>>>>>>>>
>>>>>>>>>>> repos/asf?p=flink-
>>>>>>>>>>
>>>>>>>>>>> connectors.git;a=summary to "flink-libraries"
>>>>>>>>>>>> 2. Ask INFRA to set up GitHub and travis integration for
>>>>>>>>>>>>
>>>>>>>>>>> "flink-libraries"
>>>>>>>>>>>
>>>>>>>>>>>> 3. Put the code of "flink-ml", "flink-gelly", "flink-python",
>>>>>>>>>>>>
>>>>>>>>>>> "flink-cep",
>>>>>>>>>>>
>>>>>>>>>>>> "flink-scala-shell", "flink-storm" into the new repository. (I
>>>>>>>>>>>>
>>>>>>>>>>> decided
>>>>>>>>>
>>>>>>>>>> against moving flink-contrib there, because rocksdb is in the
>>>>>>>>>>>>
>>>>>>>>>>> contrib
>>>>>>>>
>>>>>>>>> module, for flink-table, I'm undecided, but I kept it in the main
>>>>>>>>>>>>
>>>>>>>>>>> repo
>>>>>>>>>
>>>>>>>>>> because it's probably going to interact more with the core code in
>>>>>>>>>>>>
>>>>>>>>>>> the
>>>>>>>>
>>>>>>>>> future)
>>>>>>>>>>>> I try to preserve the history of those modules when splitting
>>>>>>>>>>>>
>>>>>>>>>>> them
>>>>>>>
>>>>>>>> into
>>>>>>>>>
>>>>>>>>>> the
>>>>>>>>>>>
>>>>>>>>>>>> new repo
>>>>>>>>>>>> 4. I'll close all pull requests against those modules in the
>>>>>>>>>>>> main
>>>>>>>>>>>>
>>>>>>>>>>> repo.
>>>>>>>>>
>>>>>>>>>> 5. I'll set up a minimal documentation page for the library
>>>>>>>>>>>>
>>>>>>>>>>> repository,
>>>>>>>>>
>>>>>>>>>> similar to the main documentation.
>>>>>>>>>>>> 6. I'll update the documentation build process to build both
>>>>>>>>>>>>
>>>>>>>>>>> documentations
>>>>>>>>>>>
>>>>>>>>>>>> & link them to each other
>>>>>>>>>>>> 7. I'll update the nightly deployment process to include both
>>>>>>>>>>>>
>>>>>>>>>>> repositories
>>>>>>>>>>>
>>>>>>>>>>>> 8. I'll update the release script to create the Flink release
>>>>>>>>>>>> out
>>>>>>>>>>>>
>>>>>>>>>>> of
>>>>>>>>
>>>>>>>>> both
>>>>>>>>>>
>>>>>>>>>>> repositories. In order to put the libraries into the opt/ dir of
>>>>>>>>>>>>
>>>>>>>>>>> the
>>>>>>>>
>>>>>>>>> release, I'll need to change the build of "flink-dist" so that it
>>>>>>>>>>>>
>>>>>>>>>>> first
>>>>>>>>>
>>>>>>>>>> builds flink core, then the libraries and then the core again
>>>>>>>>>>>>
>>>>>>>>>>> with
>>>>>>>
>>>>>>>> the
>>>>>>>>>
>>>>>>>>>> libraries as an additional dependency.
>>>>>>>>>>>>
>>>>>>>>>>>> The main question for the community is: do you agree with point
>>>>>>>>>>>>
>>>>>>>>>>> 3 ?
>>>>>>>
>>>>>>>> Would
>>>>>>>>>>
>>>>>>>>>>> you like to include more or less?
>>>>>>>>>>>>
>>>>>>>>>>>> I'll start with 1. and 2. tomorrow morning.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Mar 15, 2017 at 1:48 PM, Till Rohrmann <
>>>>>>>>>>>>
>>>>>>>>>>> [hidden email]
>>>>>>>>
>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> In theory we could have a merging bot which solves the problem
>>>>>>>>>>>>>
>>>>>>>>>>>> of
>>>>>>>
>>>>>>>> the
>>>>>>>>>
>>>>>>>>>> "commit window". Once the PR passes all tests and has enough
>>>>>>>>>>>>>
>>>>>>>>>>>> +1s,
>>>>>>>
>>>>>>>> the
>>>>>>>>>
>>>>>>>>>> bot
>>>>>>>>>>>
>>>>>>>>>>>> could do the merging and, thus, it effectively linearizes the
>>>>>>>>>>>>>
>>>>>>>>>>>> merge
>>>>>>>>
>>>>>>>>> process.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I think the second point is actually a disadvantage because
>>>>>>>>>>>>>
>>>>>>>>>>>> there
>>>>>>>
>>>>>>>> is
>>>>>>>>
>>>>>>>>> not
>>>>>>>>>>
>>>>>>>>>>> such an immediate incentive/pressure to fix the broken module if
>>>>>>>>>>>>>
>>>>>>>>>>>> it
>>>>>>>>
>>>>>>>>> lives
>>>>>>>>>>>
>>>>>>>>>>>> in a separate repository. Furthermore, breaking API changes in
>>>>>>>>>>>>>
>>>>>>>>>>>> the
>>>>>>>
>>>>>>>> core
>>>>>>>>>>
>>>>>>>>>>> will most likely go unnoticed for some time in other modules
>>>>>>>>>>>>>
>>>>>>>>>>>> which
>>>>>>>
>>>>>>>> are
>>>>>>>>>
>>>>>>>>>> not
>>>>>>>>>>>
>>>>>>>>>>>> developed so actively. In the worst case these things will only
>>>>>>>>>>>>>
>>>>>>>>>>>> be
>>>>>>>
>>>>>>>> noticed
>>>>>>>>>>>
>>>>>>>>>>>> when we try to make a release.
>>>>>>>>>>>>>
>>>>>>>>>>>>> But I also agree that we are not Google and we don't have the
>>>>>>>>>>>>>
>>>>>>>>>>>> capacities to
>>>>>>>>>>>
>>>>>>>>>>>> maintain such a smooth build process that we can keep all the
>>>>>>>>>>>>>
>>>>>>>>>>>> code
>>>>>>>>
>>>>>>>>> in
>>>>>>>>>>
>>>>>>>>>>> a
>>>>>>>>>>>
>>>>>>>>>>>> single repository.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I looked a bit into Gradle and as far as I can tell it offers
>>>>>>>>>>>>>
>>>>>>>>>>>> some
>>>>>>>
>>>>>>>> nice
>>>>>>>>>>
>>>>>>>>>>> features wrt incrementally building projects. This would be
>>>>>>>>>>>>>
>>>>>>>>>>>> beneficial
>>>>>>>>>
>>>>>>>>>> for
>>>>>>>>>>>
>>>>>>>>>>>> local development but it would not solve our build time problems
>>>>>>>>>>>>>
>>>>>>>>>>>> on
>>>>>>>>
>>>>>>>>> Travis.
>>>>>>>>>>>
>>>>>>>>>>>> Gradle intends to introduce a task result cache which allows to
>>>>>>>>>>>>>
>>>>>>>>>>>> reuse
>>>>>>>>>
>>>>>>>>>> results across builds. This could help when building on Travis,
>>>>>>>>>>>>>
>>>>>>>>>>>> however, it
>>>>>>>>>>>
>>>>>>>>>>>> is not yet fully implemented. Moreover, migrating from Maven to
>>>>>>>>>>>>>
>>>>>>>>>>>> Gradle
>>>>>>>>>
>>>>>>>>>> won't come for free (there's simply no free lunch out there) and
>>>>>>>>>>>>>
>>>>>>>>>>>> we
>>>>>>>>
>>>>>>>>> might
>>>>>>>>>>>
>>>>>>>>>>>> risk to introduce new bugs. Therefore, I would vote to split the
>>>>>>>>>>>>>
>>>>>>>>>>>> repository
>>>>>>>>>>>
>>>>>>>>>>>> in order to mitigate our current problems with Travis and the
>>>>>>>>>>>>>
>>>>>>>>>>>> build
>>>>>>>>
>>>>>>>>> time in
>>>>>>>>>>>
>>>>>>>>>>>> general. Whether to use a different build system or not can then
>>>>>>>>>>>>>
>>>>>>>>>>>> be
>>>>>>>>
>>>>>>>>> discussed as an orthogonal question.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>> Till
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Tue, Mar 14, 2017 at 8:05 PM, Stephan Ewen <
>>>>>>>>>>>>> [hidden email]
>>>>>>>>>>>>>
>>>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Some other thoughts on how repository split would help. I am
>>>>>>>>>>>>>>
>>>>>>>>>>>>> not
>>>>>>>
>>>>>>>> sure
>>>>>>>>>
>>>>>>>>>> for
>>>>>>>>>>>
>>>>>>>>>>>> all of them, so please comment:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> - There is less competition for a "commit window". It happens
>>>>>>>>>>>>>>
>>>>>>>>>>>>> a
>>>>>>>
>>>>>>>> lot
>>>>>>>>>
>>>>>>>>>> already that you run all tests and want to commit, but there
>>>>>>>>>>>>>>
>>>>>>>>>>>>> was
>>>>>>>
>>>>>>>> a
>>>>>>>>
>>>>>>>>> commit
>>>>>>>>>>>
>>>>>>>>>>>> in the meantime. You rebase, need to re-test, again commit in
>>>>>>>>>>>>>>
>>>>>>>>>>>>> the
>>>>>>>
>>>>>>>> meantime.
>>>>>>>>>>>>>
>>>>>>>>>>>>>>   For a "linear" commit history, this may become a bottleneck
>>>>>>>>>>>>>>
>>>>>>>>>>>>> eventually
>>>>>>>>>>>>>
>>>>>>>>>>>>>> as well.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> - There is less risk of broken master. If one
>>>>>>>>>>>>>>
>>>>>>>>>>>>> repository/modules
>>>>>>>
>>>>>>>> breaks
>>>>>>>>>>>
>>>>>>>>>>>> its master, the others can still continue.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Stephan
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Fri, Mar 10, 2017 at 12:20 PM, Till Rohrmann <
>>>>>>>>>>>>>>
>>>>>>>>>>>>> [hidden email]>
>>>>>>>>>>
>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks for all your input. In order to wrap the discussion up
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I'd
>>>>>>>>
>>>>>>>>> like
>>>>>>>>>>
>>>>>>>>>>> to
>>>>>>>>>>>>>
>>>>>>>>>>>>>> summarize the mentioned points:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The problem of increasing build times and complexity of the
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> project
>>>>>>>>>
>>>>>>>>>> has
>>>>>>>>>>>
>>>>>>>>>>>> been acknowledged. Ideally we would have everything in one
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> repository
>>>>>>>>>>
>>>>>>>>>>> using
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> an incremental build tool. Since Maven does not properly
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> support
>>>>>>>
>>>>>>>> this
>>>>>>>>>>
>>>>>>>>>>> we
>>>>>>>>>>>>>
>>>>>>>>>>>>>> would have to switch our build tool to something like Gradle,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> for
>>>>>>>>
>>>>>>>>> example.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Another option is introducing build profiles for different
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> sets
>>>>>>>
>>>>>>>> of
>>>>>>>>
>>>>>>>>> modules
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> as well as separating integration and unit tests. The third
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> alternative
>>>>>>>>>>>
>>>>>>>>>>>> would be creating sub-projects with their own repositories. I
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> actually
>>>>>>>>>>
>>>>>>>>>>> think that these two proposals are not necessarily exclusive
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> and
>>>>>>>
>>>>>>>> it
>>>>>>>>
>>>>>>>>> would
>>>>>>>>>>>>>
>>>>>>>>>>>>>> also make sense to have a separation between unit and
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> integration
>>>>>>>>
>>>>>>>>> tests
>>>>>>>>>>>
>>>>>>>>>>>> if
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> we split the repository.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The overall consensus seems to be that we don't want to split
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> the
>>>>>>>>
>>>>>>>>> community
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> and want to keep everything under the same umbrella. I think
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> this
>>>>>>>>
>>>>>>>>> is
>>>>>>>>>
>>>>>>>>>> the
>>>>>>>>>>>>>
>>>>>>>>>>>>>> right way to go, because otherwise some parts of the project
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> could
>>>>>>>>
>>>>>>>>> become
>>>>>>>>>>>>>
>>>>>>>>>>>>>> second class citizens. Given that and that we continue using
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Maven,
>>>>>>>>>
>>>>>>>>>> I
>>>>>>>>>>
>>>>>>>>>>> still
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> think that creating sub-projects for the libraries, for
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> example,
>>>>>>>
>>>>>>>> could
>>>>>>>>>>
>>>>>>>>>>> be
>>>>>>>>>>>>>
>>>>>>>>>>>>>> beneficial. A split could reduce the project's complexity and
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> make
>>>>>>>>
>>>>>>>>> it
>>>>>>>>>>
>>>>>>>>>>> potentially easier for libraries to get actively developed.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The
>>>>>>>
>>>>>>>> main
>>>>>>>>>
>>>>>>>>>> concern is setting up the build infrastructure to aggregate
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> docs
>>>>>>>
>>>>>>>> from
>>>>>>>>>>
>>>>>>>>>>> multiple repositories and making them publicly available.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Since I started this thread and I would really like to see
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Flink's
>>>>>>>>
>>>>>>>>> ML
>>>>>>>>>>
>>>>>>>>>>> library being revived again, I'd volunteer investigating first
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> whether
>>>>>>>>>>
>>>>>>>>>>> it
>>>>>>>>>>>>>
>>>>>>>>>>>>>> is doable establishing a proper incremental build for Flink.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> If
>>>>>>>
>>>>>>>> that
>>>>>>>>>
>>>>>>>>>> should
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> not be possible, I will look into splitting the repository,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> first
>>>>>>>>
>>>>>>>>> only
>>>>>>>>>>
>>>>>>>>>>> for
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> the libraries. I'll share my results with the community once
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I'm
>>>>>>>
>>>>>>>> done
>>>>>>>>>>
>>>>>>>>>>> with
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> the investigation.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>> Till
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Fri, Feb 24, 2017 at 3:50 PM, Robert Metzger <
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> [hidden email]>
>>>>>>>>>>
>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> @Jin Mingjian: You can not use the paid travis version for
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> open
>>>>>>>
>>>>>>>> source
>>>>>>>>>>>>>
>>>>>>>>>>>>>> projects. It only works for private repositories (at least
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> back
>>>>>>>
>>>>>>>> then
>>>>>>>>>>
>>>>>>>>>>> when
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> we've asked them about that).
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> @Stephan: I don't think that incremental builds will be
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> available
>>>>>>>>
>>>>>>>>> with
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Maven anytime soon.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I agree that we need to fix the build time issue on Travis.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I've
>>>>>>>>
>>>>>>>>> recently
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> pushed a commit to use now three instead of two test groups.
>>>>>>>>>>>>>>>> But I don't think that this is a feasible long-term solution.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> If this discussion is only about reducing the build and test
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> time,
>>>>>>>>>
>>>>>>>>>> introducing build profiles for different components as
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Aljoscha
>>>>>>>
>>>>>>>> suggested
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> would solve the problem Till mentioned.
>>>>>>>>>>>>>>>> Also, if we decide that travis is not a good tool anymore
>>>>>>>>>>>>>>>> for
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> the
>>>>>>>>
>>>>>>>>> testing,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I guess we can find a different solution. There are now
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> competitors
>>>>>>>>>
>>>>>>>>>> to
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Travis that might be willing to offer a paid plan for an open
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> source
>>>>>>>>>>
>>>>>>>>>>> project, or we set up our own infra on a server sponsored by
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> one
>>>>>>>>
>>>>>>>>> of
>>>>>>>>>
>>>>>>>>>> the
>>>>>>>>>>>>>
>>>>>>>>>>>>>> contributing companies.
>>>>>>>>>>>>>>>> If we want to solve "community issues" with the change as
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> well,
>>>>>>>
>>>>>>>> then
>>>>>>>>>>
>>>>>>>>>>> I
>>>>>>>>>>>>>
>>>>>>>>>>>>>> think it's worth the effort of splitting up Flink into
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> different
>>>>>>>
>>>>>>>> repositories.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Splitting up repositories is not a trivial task in my
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> opinion.
>>>>>>>
>>>>>>>> As
>>>>>>>>
>>>>>>>>> others
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> have mentioned before, we need to consider the following
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> things:
>>>>>>>>
>>>>>>>>> - How are we going to build the documentation? Ideally every
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> repo
>>>>>>>>
>>>>>>>>> should
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> contain its docs, so we would need to pull them together when
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> building
>>>>>>>>>>>>>
>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> main docs.
>>>>>>>>>>>>>>>> - How do we organize the dependencies? If we have a library
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> repository
>>>>>>>>
>>>>>>>>> depend
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> on
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> snapshot Flink versions, we need to make sure that the
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> snapshot
>>>>>>>
>>>>>>>> deployment
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> always works. This also means that people working on a
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> library
>>>>>>>
>>>>>>>> repository
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> will pull from snapshot OR need to build first locally.
>>>>>>>>>>>>>>>> - We need to update the release scripts
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> If we commit to do these changes, we need to assign at least
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> one
>>>>>>>>
>>>>>>>>> committer
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> (yes, in this case we need somebody who can commit, for
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> example
>>>>>>>
>>>>>>>> for
>>>>>>>>>
>>>>>>>>>> updating the buildbot stuff) who volunteers to do the change.
>>>>>>>>>>>>>>>> I've done a lot of infrastructure work in the past, but I'm
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> currently
>>>>>>>>>>
>>>>>>>>>>> pretty booked with many other things, so I don't
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> realistically
>>>>>>>
>>>>>>>> see
>>>>>>>>>
>>>>>>>>>> myself
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> doing that. Max who used to work on these things is taking
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> some
>>>>>>>
>>>>>>>> time
>>>>>>>>>>
>>>>>>>>>>> off.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I think we need, best case 3 days for the change, worst case
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 5
>>>>>>>
>>>>>>>> days.
>>>>>>>>>>
>>>>>>>>>>> The
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> problem is that there are no "unit tests" for the infra
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> stuff,
>>>>>>>
>>>>>>>> so
>>>>>>>>
>>>>>>>>> many
>>>>>>>>>>>>>
>>>>>>>>>>>>>> things are "trial and error" (like Apache's buildbot, our
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> release
>>>>>>>>
>>>>>>>>> scripts,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> the doc scripts, maven stuff, nightly builds).
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Thu, Feb 23, 2017 at 1:33 PM, Stephan Ewen <
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> [hidden email]>
>>>>>>>>
>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> If we can get incremental builds to work, that would
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> actually
>>>>>>>>
>>>>>>>>> be
>>>>>>>>>
>>>>>>>>>> the
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> preferred solution in my opinion.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Many companies have invested heavily in making a "single
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> repository"
>>>>>>>>>>>>>
>>>>>>>>>>>>>> code
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> base work, because it has the advantage of not having to
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> update/publish
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> several repositories first.
>>>>>>>>>>>>>>>>> However, the strong prerequisite for that is an incremental
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> build
>>>>>>>>>
>>>>>>>>>> system
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> that builds only (fine grained) what it has to build. I am
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> not
>>>>>>>
>>>>>>>> sure
>>>>>>>>>>
>>>>>>>>>>> how
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> we
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> could make that work
>>>>>>>>>>>>>>>>> with Maven and Travis...
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Wed, Feb 22, 2017 at 10:42 PM, Greg Hogan <
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> [hidden email]>
>>>>>>>>>
>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> An additional option for reducing time to build and test is
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> parallel
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> execution. This would help users more than on TravisCI
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>

Re: [DISCUSS] Project build time and possible restructuring

Greg Hogan
Thanks for pursuing this, Robert.

I appreciate their receptiveness to increasing the time and memory limits but we’ll still be bound by the old limits for our personal repos. Does this change any of the proposed actions for splitting the repo?

Has anyone looked into why we see many jobs time out right at 50 minutes? Passing jobs take well under 50 minutes and the 5-minute watchdog timeout is not being triggered. Just pulling up a recent build: https://travis-ci.org/apache/flink/builds/217034084

Greg


> On Mar 31, 2017, at 9:12 AM, Robert Metzger <[hidden email]> wrote:
>
> Good news :)
> A few weeks ago, I got an email from Travis asking for feedback. I filled
> out the form and said that the 50-minute build time limit is a
> showstopper for us.
> And now, a few weeks later, they got back to me and told me that they have
> increased the build time limit for "apache/flink" to 120 minutes. Also, we
> can configure the build to use the "sudo enabled infrastructure", with
> 7.5 GB of main memory guaranteed.
>
> I'll do a push to a separate branch to see how well it works :)
>
> On Tue, Mar 28, 2017 at 4:36 PM, Robert Metzger <[hidden email]> wrote:
>
>> I think your selection of modules is okay.
>> Moving out storm and the scala shell would be nice as well. But storm is
>> not really maintained, so maybe we should consider moving it out of the
>> Flink repo entirely.
>> And the scala shell is not a library, but it also doesn't really belong
>> in the main repo.
>>
>> Regarding the feature freeze: We either do it with a lot of time in
>> advance to avoid any delays for the release, OR we do it right after the
>> release branch has been forked off.
>>
>>
>>
>> On Tue, Mar 21, 2017 at 1:09 PM, Timo Walther <[hidden email]> wrote:
>>
>>> So what do we want to move to the libraries repository?
>>>
>>> I would propose to move these modules first:
>>>
>>> flink-cep-scala
>>> flink-cep
>>> flink-gelly-examples
>>> flink-gelly-scala
>>> flink-gelly
>>> flink-ml
>>>
>>> All other modules (e.g. in flink-contrib) are rather connectors. I think
>>> it would be better to move those in a connectors repository later.
>>>
>>> If we are not in a rush, we could do the moving after the feature-freeze.
>>> This is the time when most of the PRs will have been merged.
>>>
>>> Timo
>>>
>>>
>>> On 20/03/17 at 15:00, Greg Hogan wrote:
>>>
>>>> We can add cluster tests using the distribution jar, and will need to do
>>>> so to remove Flink’s dependency on Hadoop. The YARN and Mesos tests would
>>>> still run nightly and running cluster tests should be much faster. As
>>>> troublesome as TravisCI has been, a major driver for this change has been
>>>> local build time.
>>>>
>>>> I agree with splitting off one repo at a time, but we’ll first need to
>>>> reorganize the core repo if using git submodules as flink-python and
>>>> flink-table would need to first be moved. So I think planning this out
>>>> first is a healthy idea, with the understanding that the plan will be
>>>> reevaluated.
>>>>
>>>> Any changes to the project structure need a scheduled period, perhaps a
>>>> week, for existing pull requests to be reviewed and accepted or closed and
>>>> later migrated.
>>>>
>>>>
>>>> On Mar 20, 2017, at 6:27 AM, Stephan Ewen <[hidden email]> wrote:
>>>>>
>>>>> @Greg
>>>>>
>>>>> I am personally in favor of splitting "connectors" and "contrib" out as
>>>>> well. I know that @rmetzger has some reservations about the connectors,
>>>>> but
>>>>> we may be able to convince him.
>>>>>
>>>>> For the cluster tests (yarn / mesos) - in the past there were many cases
>>>>> where these tests caught cases that other tests did not, because they
>>>>> are
>>>>> the only tests that actually use the "flink-dist.jar" and thus discover
>>>>> many dependency and configuration issues. For that reason, my feeling
>>>>> would
>>>>> be that they are valuable in the core repository.
>>>>>
>>>>> I would actually suggest to do only the library split initially, to see
>>>>> what the challenges are in setting up the multi-repo build and release
>>>>> tooling. Once we gathered experience there, we can probably easily see
>>>>> what
>>>>> else we can split out.
>>>>>
>>>>> Stephan
>>>>>
>>>>>
>>>>> On Fri, Mar 17, 2017 at 8:37 PM, Greg Hogan <[hidden email]> wrote:
>>>>>
>>>>> I’d like to use this refactoring opportunity to unsplit the Travis
>>>>>> tests.
>>>>>> With 51 builds queued up for the weekend (some of which may fail or
>>>>>> have
>>>>>> been force pushed) we are at the limit of the number of contributions
>>>>>> we
>>>>>> can process. Fixing this requires 1) splitting the project, 2)
>>>>>> investigating speedups for long-running tests, and 3) staying
>>>>>> cognizant of
>>>>>> test performance when accepting new code.
>>>>>>
>>>>>> I’d like to add one to Stephan’s list of module groups. I like that the
>>>>>> modules are generic (“libraries”) so that no one module is alone and
>>>>>> independent.
>>>>>>
>>>>>> Flink has three “libraries”: cep, ml, and gelly.
>>>>>>
>>>>>> “connectors” is a hotspot due to the long-running Kafka tests (and
>>>>>> connectors for three Kafka versions).
>>>>>>
>>>>>> Both flink-storm and flink-python have a modest number of
>>>>>> tests
>>>>>> and could live with the miscellaneous modules in “contrib”.
>>>>>>
>>>>>> The YARN tests are long-running and problematic (I am unable to
>>>>>> successfully run these locally). A “cluster” module could host
>>>>>> flink-mesos,
>>>>>> flink-yarn, and flink-yarn-tests.
>>>>>>
>>>>>> That gets us close to running all tests in a single Travis build.
>>>>>>  https://travis-ci.org/greghogan/flink/builds/212122590
>>>>>>
>>>>>> I also tested (https://github.com/greghogan/flink/commits/core_build)
>>>>>> with a maven parallelism of 2 and 4, with the latter a 6.4% drop in
>>>>>> build time.
>>>>>>  https://travis-ci.org/greghogan/flink/builds/212137659
>>>>>>  https://travis-ci.org/greghogan/flink/builds/212154470
>>>>>>
>>>>>> We can run Travis CI builds nightly to guard against breaking changes.
>>>>>>
>>>>>> I also wanted to get an idea of how disruptive it would be to
>>>>>> developers
>>>>>> to divide the project into multiple git repos. I wrote a simple python
>>>>>> script and configured it with the module partitions listed above. The
>>>>>> usage
>>>>>> string from the top of the file lists commits with files from multiple
>>>>>> partitions as well as the modified files.
>>>>>>  https://gist.github.com/greghogan/f38a8efe6b6dd5a162a6b43335ac4897 <
>>>>>> https://gist.github.com/greghogan/f38a8efe6b6dd5a162a6b43335ac4897>
>>>>>>
>>>>>> Accounting for the merging of the batch and streaming connector
>>>>>> modules,
>>>>>> and assuming that the project structure has not changed much over the
>>>>>> past
>>>>>> 15 months, for the following date ranges the listed number of commits
>>>>>> would
>>>>>> have been split across repositories.
>>>>>>
>>>>>> since "2017-01-01"
>>>>>> 36 of 571 commits were mixed
>>>>>>
>>>>>> since "2016-07-01"
>>>>>> 155 of 1607 commits were mixed
>>>>>>
>>>>>> since "2016-01-01"
>>>>>> 272 of 2561 commits were mixed
>>>>>>
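A minimal sketch of the kind of count described above, not the script from the gist: it assumes each commit is given as a list of changed file paths, and the partition table below is a hypothetical stand-in for the module groups discussed in this thread.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical partition table: path prefix -> proposed repository group.
# Anything that matches no prefix is treated as part of the core repo.
PARTITIONS: Dict[str, str] = {
    "flink-libraries/": "libraries",
    "flink-connectors/": "connectors",
    "flink-contrib/": "contrib",
    "flink-yarn": "cluster",      # also matches flink-yarn-tests/
    "flink-mesos/": "cluster",
}


def partition_of(path: str, default: str = "core") -> str:
    """Return the group a file path belongs to."""
    for prefix, group in PARTITIONS.items():
        if path.startswith(prefix):
            return group
    return default


def is_mixed(files: Iterable[str]) -> bool:
    """A commit is mixed if its files fall into more than one group."""
    return len({partition_of(f) for f in files}) > 1


def count_mixed(commits: Iterable[List[str]]) -> Tuple[int, int]:
    """Return (number of mixed commits, total number of commits)."""
    commits = list(commits)
    mixed = sum(1 for files in commits if is_mixed(files))
    return mixed, len(commits)
```

For scale, the figures quoted above work out to roughly 6.3% (36 of 571), 9.6% (155 of 1607) and 10.6% (272 of 2561) of commits touching more than one partition.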
>>>>>> Greg
>>>>>>
>>>>>>
>>>>>> On Mar 15, 2017, at 1:13 PM, Stephan Ewen <[hidden email]> wrote:
>>>>>>>
>>>>>>> @Robert - I think once we know that a separate git repo works well,
>>>>>>> and
>>>>>>> that it actually solves problems, I see no reason to not create a
>>>>>>> connectors repository later. The infrastructure changes should be
>>>>>>>
>>>>>> identical
>>>>>>
>>>>>>> for two or more repositories.
>>>>>>>
>>>>>>> On Wed, Mar 15, 2017 at 5:22 PM, Till Rohrmann <[hidden email]>
>>>>>>>
>>>>>> wrote:
>>>>>>
>>>>>>> I think it should not be at least the flink-dist but exactly the
>>>>>>>>
>>>>>>> remaining
>>>>>>
>>>>>>> flink-dist module. Otherwise we do redundant work.
>>>>>>>>
>>>>>>>> On Wed, Mar 15, 2017 at 5:03 PM, Robert Metzger <[hidden email]
>>>>>>>>>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> "flink-core" means the main repository, not the "flink-core" module.
>>>>>>>>>
>>>>>>>>> When doing a release, we need to build the flink main code first,
>>>>>>>>> because the flink-libraries depend on that.
>>>>>>>>> Once the "flink-libraries" are built, we need to run the main build
>>>>>>>>> again (at least the flink-dist module), so that it pulls the artifacts
>>>>>>>>> from the flink-libraries to put them into the opt/ folder of the final
>>>>>>>>> artifact.
>>>>>>>>
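The build order described here, with Till's refinement that the second pass over the main repo should cover only flink-dist, could be sketched as follows. This is an illustration under assumptions, not the actual release script: the repository directory names and the exact Maven invocations are hypothetical, although `-pl` (build only the listed modules) and `-DskipTests` are standard Maven options.

```python
from typing import List, Tuple


def release_build_steps(
    main_repo: str = "flink",                # assumed checkout directory
    libraries_repo: str = "flink-libraries",  # assumed checkout directory
    dist_module: str = "flink-dist",
) -> List[Tuple[str, str]]:
    """Return (working directory, command) pairs in the order they must run."""
    return [
        # 1. Build and install the main repository so the libraries can
        #    resolve it as a dependency.
        (main_repo, "mvn clean install -DskipTests"),
        # 2. Build and install the libraries against those artifacts.
        (libraries_repo, "mvn clean install -DskipTests"),
        # 3. Rebuild only the distribution module so it can pick up the
        #    library artifacts for the opt/ folder.
        (main_repo, f"mvn install -DskipTests -pl {dist_module}"),
    ]
```

Restricting the third step to the distribution module is what avoids the redundant work Till points out.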
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, Mar 15, 2017 at 4:44 PM, Till Rohrmann <
>>>>>>>>> [hidden email]>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> I'm ok with point 3.
>>>>>>>>>>
>>>>>>>>>> Concerning point 8: Why do we have to build flink-core twice after
>>>>>>>>>>
>>>>>>>>> having
>>>>>>>>
>>>>>>>>> it built as a dependency for flink-libraries? This seems wrong to
>>>>>>>>>> me.
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>> Till
>>>>>>>>>>
>>>>>>>>>> On Wed, Mar 15, 2017 at 4:23 PM, Robert Metzger <
>>>>>>>>>> [hidden email]>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>> Thank you. Running on AWS is a good idea!
>>>>>>>>>>> Let me know if you (or anybody else) wants to help me with the
>>>>>>>>>>> infrastructure work! Any help is much appreciated (as I've said
>>>>>>>>>>>
>>>>>>>>>> before, I
>>>>>>>>>
>>>>>>>>>> don't really have time for doing this, but it has to be done :) )
>>>>>>>>>>>
>>>>>>>>>>> I'm against creating two new repositories. I fear that this
>>>>>>>>>>>
>>>>>>>>>> introduces
>>>>>>>>
>>>>>>>>> too
>>>>>>>>>>
>>>>>>>>>>> much complexity and too many repositories.
>>>>>>>>>>> "flink" and "flink-libraries" are hopefully enough to get the
>>>>>>>>>>> build
>>>>>>>>>>>
>>>>>>>>>> time
>>>>>>>>>
>>>>>>>>>> significantly down.
>>>>>>>>>>> We can also consider putting the connectors into the
>>>>>>>>>>>
>>>>>>>>>> "flink-libraries"
>>>>>>>>
>>>>>>>>> repo
>>>>>>>>>>
>>>>>>>>>>> if we need to further reduce the build time.
>>>>>>>>>>>
>>>>>>>>>>> We should probably move "flink-table" out of "flink-libraries" if
>>>>>>>>>>> we
>>>>>>>>>>>
>>>>>>>>>> want
>>>>>>>>>
>>>>>>>>>> to keep "flink-table" in the main repo. (This would eliminate the
>>>>>>>>>>> "flink-libraries" module from main.)
>>>>>>>>>>>
>>>>>>>>>>> Also, I agree that "flink-statebackend-rocksdb" is not correctly
>>>>>>>>>>>
>>>>>>>>>> placed
>>>>>>>>
>>>>>>>>> in
>>>>>>>>>>
>>>>>>>>>>> contrib anymore.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Mar 15, 2017 at 4:07 PM, Greg Hogan <[hidden email]>
>>>>>>>>>>>
>>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Robert, appreciate your kickstarting this task.
>>>>>>>>>>>>
>>>>>>>>>>>> We should compare the verification time with and without the
>>>>>>>>>>>> listed
>>>>>>>>>>>> modules. I’ll try to run this by tomorrow on AWS and on Travis.
>>>>>>>>>>>>
>>>>>>>>>>>> Should we maintain separate repos for flink-contrib and
>>>>>>>>>>>>
>>>>>>>>>>> flink-libraries?
>>>>>>>>>>
>>>>>>>>>>> Are you intending that we move flink-table out of flink-libraries
>>>>>>>>>>>>
>>>>>>>>>>> (and
>>>>>>>>>
>>>>>>>>>> perhaps flink-statebackend-rocksdb out of flink-contrib)?
>>>>>>>>>>>>
>>>>>>>>>>>> Greg
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Mar 15, 2017, at 9:55 AM, Robert Metzger <[hidden email]
>>>>>>>>>>>>>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Thank you for looking into this Till.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I think we then have to split the repositories.
>>>>>>>>>>>>> My main motivation for doing this is that it seems to be the
>>>>>>>>>>>>> only
>>>>>>>>>>>>>
>>>>>>>>>>>> feasible
>>>>>>>>>>>>
>>>>>>>>>>>>> way of scaling the community to allow more committers working on
>>>>>>>>>>>>>
>>>>>>>>>>>> the
>>>>>>>>>
>>>>>>>>>> libraries.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'll take care of getting things started.
>>>>>>>>>>>>>
>>>>>>>>>>>>> As the next steps I propose to:
>>>>>>>>>>>>> 1. Ask INFRA to rename
>>>>>>>>>>>>> https://git-wip-us.apache.org/repos/asf?p=flink-connectors.git;a=summary
>>>>>>>>>>>>> to "flink-libraries"
>>>>>>>>>>>>> 2. Ask INFRA to set up GitHub and travis integration for
>>>>>>>>>>>>> "flink-libraries"
>>>>>>>>>>>>> 3. Put the code of "flink-ml", "flink-gelly", "flink-python",
>>>>>>>>>>>>> "flink-cep", "flink-scala-shell", "flink-storm" into the new
>>>>>>>>>>>>> repository. (I decided against moving flink-contrib there, because
>>>>>>>>>>>>> rocksdb is in the contrib module; for flink-table, I'm undecided,
>>>>>>>>>>>>> but I kept it in the main repo because it's probably going to
>>>>>>>>>>>>> interact more with the core code in the future.)
>>>>>>>>>>>>> I'll try to preserve the history of those modules when splitting
>>>>>>>>>>>>> them into the new repo
>>>>>>>>>>>>> 4. I'll close all pull requests against those modules in the main
>>>>>>>>>>>>> repo.
>>>>>>>>>>>>> 5. I'll set up a minimal documentation page for the library
>>>>>>>>>>>>> repository, similar to the main documentation.
>>>>>>>>>>>>> 6. I'll update the documentation build process to build both
>>>>>>>>>>>>> documentations & link them to each other
>>>>>>>>>>>>> 7. I'll update the nightly deployment process to include both
>>>>>>>>>>>>> repositories
>>>>>>>>>>>>> 8. I'll update the release script to create the Flink release out
>>>>>>>>>>>>> of both repositories. In order to put the libraries into the opt/
>>>>>>>>>>>>> dir of the release, I'll need to change the build of "flink-dist"
>>>>>>>>>>>>> so that it first builds flink core, then the libraries and then
>>>>>>>>>>>>> the core again with the libraries as an additional dependency.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The main question for the community is: do you agree with point 3?
>>>>>>>>>>>>> Would you like to include more or less?
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'll start with 1. and 2. tomorrow morning.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Mar 15, 2017 at 1:48 PM, Till Rohrmann <
>>>>>>>>>>>>>
>>>>>>>>>>>> [hidden email]
>>>>>>>>>
>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> In theory we could have a merging bot which solves the problem
>>>>>>>>>>>>>>
>>>>>>>>>>>>> of
>>>>>>>>
>>>>>>>>> the
>>>>>>>>>>
>>>>>>>>>>> "commit window". Once the PR passes all tests and has enough
>>>>>>>>>>>>>>
>>>>>>>>>>>>> +1s,
>>>>>>>>
>>>>>>>>> the
>>>>>>>>>>
>>>>>>>>>>> bot
>>>>>>>>>>>>
>>>>>>>>>>>>> could do the merging and, thus, it effectively linearizes the
>>>>>>>>>>>>>>
>>>>>>>>>>>>> merge
>>>>>>>>>
>>>>>>>>>> process.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I think the second point is actually a disadvantage because
>>>>>>>>>>>>>>
>>>>>>>>>>>>> there
>>>>>>>>
>>>>>>>>> is
>>>>>>>>>
>>>>>>>>>> not
>>>>>>>>>>>
>>>>>>>>>>>> such an immediate incentive/pressure to fix the broken module if
>>>>>>>>>>>>>>
>>>>>>>>>>>>> it
>>>>>>>>>
>>>>>>>>>> lives
>>>>>>>>>>>>
>>>>>>>>>>>>> in a separate repository. Furthermore, breaking API changes in
>>>>>>>>>>>>>>
>>>>>>>>>>>>> the
>>>>>>>>
>>>>>>>>> core
>>>>>>>>>>>
>>>>>>>>>>>> will most likely go unnoticed for some time in other modules
>>>>>>>>>>>>>>
>>>>>>>>>>>>> which
>>>>>>>>
>>>>>>>>> are
>>>>>>>>>>
>>>>>>>>>>> not
>>>>>>>>>>>>
>>>>>>>>>>>>> developed so actively. In the worst case these things will only
>>>>>>>>>>>>>>
>>>>>>>>>>>>> be
>>>>>>>>
>>>>>>>>> noticed
>>>>>>>>>>>>
>>>>>>>>>>>>> when we try to make a release.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> But I also agree that we are not Google and we don't have the
>>>>>>>>>>>>>>
>>>>>>>>>>>>> capacities to
>>>>>>>>>>>>
>>>>>>>>>>>>> maintain such a smooth build process that we can keep all the
>>>>>>>>>>>>>>
>>>>>>>>>>>>> code
>>>>>>>>>
>>>>>>>>>> in
>>>>>>>>>>>
>>>>>>>>>>>> a
>>>>>>>>>>>>
>>>>>>>>>>>>> single repository.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I looked a bit into Gradle and as far as I can tell it offers
>>>>>>>>>>>>>>
>>>>>>>>>>>>> some
>>>>>>>>
>>>>>>>>> nice
>>>>>>>>>>>
>>>>>>>>>>>> features wrt incrementally building projects. This would be
>>>>>>>>>>>>>>
>>>>>>>>>>>>> beneficial
>>>>>>>>>>
>>>>>>>>>>> for
>>>>>>>>>>>>
>>>>>>>>>>>>> local development but it would not solve our build time problems
>>>>>>>>>>>>>>
>>>>>>>>>>>>> on
>>>>>>>>>
>>>>>>>>>> Travis.
>>>>>>>>>>>>
>>>>>>>>>>>>> Gradle intends to introduce a task result cache which allows to
>>>>>>>>>>>>>>
>>>>>>>>>>>>> reuse
>>>>>>>>>>
>>>>>>>>>>> results across builds. This could help when building on Travis,
>>>>>>>>>>>>>>
>>>>>>>>>>>>> however, it
>>>>>>>>>>>>
>>>>>>>>>>>>> is not yet fully implemented. Moreover, migrating from Maven to
>>>>>>>>>>>>>>
>>>>>>>>>>>>> Gradle
>>>>>>>>>>
>>>>>>>>>>> won't come for free (there's simply no free lunch out there) and
>>>>>>>>>>>>>>
>>>>>>>>>>>>> we
>>>>>>>>>
>>>>>>>>>> might
>>>>>>>>>>>>
>>>>>>>>>>>>> risk to introduce new bugs. Therefore, I would vote to split the
>>>>>>>>>>>>>>
>>>>>>>>>>>>> repository
>>>>>>>>>>>>
>>>>>>>>>>>>> in order to mitigate our current problems with Travis and the
>>>>>>>>>>>>>>
>>>>>>>>>>>>> build
>>>>>>>>>
>>>>>>>>>> time in
>>>>>>>>>>>>
>>>>>>>>>>>>> general. Whether to use a different build system or not can then
>>>>>>>>>>>>>>
>>>>>>>>>>>>> be
>>>>>>>>>
>>>>>>>>>> discussed as an orthogonal question.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>> Till
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Tue, Mar 14, 2017 at 8:05 PM, Stephan Ewen <
>>>>>>>>>>>>>> [hidden email]
>>>>>>>>>>>>>>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Some other thoughts on how repository split would help. I am
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> not
>>>>>>>>
>>>>>>>>> sure
>>>>>>>>>>
>>>>>>>>>>> for
>>>>>>>>>>>>
>>>>>>>>>>>>> all of them, so please comment:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> - There is less competition for a "commit window". It happens
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> a
>>>>>>>>
>>>>>>>>> lot
>>>>>>>>>>
>>>>>>>>>>> already that you run all tests and want to commit, but there
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> was
>>>>>>>>
>>>>>>>>> a
>>>>>>>>>
>>>>>>>>>> commit
>>>>>>>>>>>>
>>>>>>>>>>>>> in the meantime. You rebase, need to re-test, again commit in
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> the
>>>>>>>>
>>>>>>>>> meantime.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>  For a "linear" commit history, this may become a bottleneck
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> eventually
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> as well.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> - There is less risk of broken master. If one
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> repository/modules
>>>>>>>>
>>>>>>>>> breaks
>>>>>>>>>>>>
>>>>>>>>>>>>> its master, the others can still continue.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Stephan
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Fri, Mar 10, 2017 at 12:20 PM, Till Rohrmann <
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> [hidden email]>
>>>>>>>>>>>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks for all your input. In order to wrap the discussion up
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I'd
>>>>>>>>>
>>>>>>>>>> like
>>>>>>>>>>>
>>>>>>>>>>>> to
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> summarize the mentioned points:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The problem of increasing build times and complexity of the
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> project
>>>>>>>>>>
>>>>>>>>>>> has
>>>>>>>>>>>>
>>>>>>>>>>>>> been acknowledged. Ideally we would have everything in one
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> repository
>>>>>>>>>>>
>>>>>>>>>>>> using
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> an incremental build tool. Since Maven does not properly
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> support
>>>>>>>>
>>>>>>>>> this
>>>>>>>>>>>
>>>>>>>>>>>> we
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> would have to switch our build tool to something like Gradle,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> for
>>>>>>>>>
>>>>>>>>>> example.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Another option is introducing build profiles for different
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> sets
>>>>>>>>
>>>>>>>>> of
>>>>>>>>>
>>>>>>>>>> modules
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> as well as separating integration and unit tests. The third
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> alternative
>>>>>>>>>>>>
>>>>>>>>>>>>> would be creating sub-projects with their own repositories. I
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> actually
>>>>>>>>>>>
>>>>>>>>>>>> think that these two proposals are not necessarily exclusive
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> and
>>>>>>>>
>>>>>>>>> it
>>>>>>>>>
>>>>>>>>>> would
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> also make sense to have a separation between unit and
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> integration
>>>>>>>>>
>>>>>>>>>> tests
>>>>>>>>>>>>
>>>>>>>>>>>>> if
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> we split the repository.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The overall consensus seems to be that we don't want to split
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> the
>>>>>>>>>
>>>>>>>>>> community
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> and want to keep everything under the same umbrella. I think
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> this
>>>>>>>>>
>>>>>>>>>> is
>>>>>>>>>>
>>>>>>>>>>> the
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> right way to go, because otherwise some parts of the project
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> could
>>>>>>>>>
>>>>>>>>>> become
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> second class citizens. Given that and that we continue using
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Maven,
>>>>>>>>>>
>>>>>>>>>>> I
>>>>>>>>>>>
>>>>>>>>>>>> still
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> think that creating sub-projects for the libraries, for
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> example,
>>>>>>>>
>>>>>>>>> could
>>>>>>>>>>>
>>>>>>>>>>>> be
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> beneficial. A split could reduce the project's complexity and
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> make
>>>>>>>>>
>>>>>>>>>> it
>>>>>>>>>>>
>>>>>>>>>>>> potentially easier for libraries to get actively developed.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The
>>>>>>>>
>>>>>>>>> main
>>>>>>>>>>
>>>>>>>>>>> concern is setting up the build infrastructure to aggregate
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> docs
>>>>>>>>
>>>>>>>>> from
>>>>>>>>>>>
>>>>>>>>>>>> multiple repositories and making them publicly available.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Since I started this thread and I would really like to see
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Flink's
>>>>>>>>>
>>>>>>>>>> ML
>>>>>>>>>>>
>>>>>>>>>>>> library being revived again, I'd volunteer investigating first
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> whether
>>>>>>>>>>>
>>>>>>>>>>>> it
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> is doable establishing a proper incremental build for Flink.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> If
>>>>>>>>
>>>>>>>>> that
>>>>>>>>>>
>>>>>>>>>>> should
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> not be possible, I will look into splitting the repository,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> first
>>>>>>>>>
>>>>>>>>>> only
>>>>>>>>>>>
>>>>>>>>>>>> for
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> the libraries. I'll share my results with the community once
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I'm
>>>>>>>>
>>>>>>>>> done
>>>>>>>>>>>
>>>>>>>>>>>> with
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> the investigation.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>> Till
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Fri, Feb 24, 2017 at 3:50 PM, Robert Metzger <
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> [hidden email]>
>>>>>>>>>>>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> @Jin Mingjian: You can not use the paid travis version for
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> open
>>>>>>>>
>>>>>>>>> source
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> projects. It only works for private repositories (at least
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> back
>>>>>>>>
>>>>>>>>> then
>>>>>>>>>>>
>>>>>>>>>>>> when
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> we've asked them about that).
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> @Stephan: I don't think that incremental builds will be
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> available
>>>>>>>>>
>>>>>>>>>> with
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Maven anytime soon.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I agree that we need to fix the build time issue on Travis.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I've
>>>>>>>>>
>>>>>>>>>> recently
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> pushed a commit to use now three instead of two test groups.
>>>>>>>>>>>>>>>>> But I don't think that this is feasible long-term solution.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> If this discussion is only about reducing the build and test
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> time,
>>>>>>>>>>
>>>>>>>>>>> introducing build profiles for different components as
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Aljoscha
>>>>>>>>
>>>>>>>>> suggested
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> would solve the problem Till mentioned.
>>>>>>>>>>>>>>>>> Also, if we decide that travis is not a good tool anymore
>>>>>>>>>>>>>>>>> for
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> the
>>>>>>>>>
>>>>>>>>>> testing,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I guess we can find a different solution. There are now
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> competitors
>>>>>>>>>>
>>>>>>>>>>> to
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Travis that might be willing to offer a paid plan for an open
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> source
>>>>>>>>>>>
>>>>>>>>>>>> project, or we set up our own infra on a server sponsored by
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> one
>>>>>>>>>
>>>>>>>>>> of
>>>>>>>>>>
>>>>>>>>>>> the
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> contributing companies.
>>>>>>>>>>>>>>>>> If we want to solve "community issues" with the change as
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> well,
>>>>>>>>
>>>>>>>>> then
>>>>>>>>>>>
>>>>>>>>>>>> I
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> think it's worth the effort of splitting up Flink into
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> different
>>>>>>>>
>>>>>>>>> repositories.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Splitting up repositories is not a trivial task in my
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> opinion.
>>>>>>>>
>>>>>>>>> As
>>>>>>>>>
>>>>>>>>>> others
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> have mentioned before, we need to consider the following
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> things:
>>>>>>>>>
>>>>>>>>>> - How are we going to build the documentation? Ideally every
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> repo
>>>>>>>>>
>>>>>>>>>> should
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> contain its docs, so we would need to pull them together when
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> building
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> main docs.
>>>>>>>>>>>>>>>>> - How do we organize the dependencies? If we have library
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> repository
>>>>>>>>>
>>>>>>>>>> depend
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> on
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> snapshot Flink versions, we need to make sure that the
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> snapshot
>>>>>>>>
>>>>>>>>> deployment
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> always works. This also means that people working on a
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> library
>>>>>>>>
>>>>>>>>> repository
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> will pull from snapshot OR need to build first locally.
>>>>>>>>>>>>>>>>> - We need to update the release scripts
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> If we commit to do these changes, we need to assign at least
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> one
>>>>>>>>>
>>>>>>>>>> committer
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> (yes, in this case we need somebody who can commit, for
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> example
>>>>>>>>
>>>>>>>>> for
>>>>>>>>>>
>>>>>>>>>>> updating the buildbot stuff) who volunteers to do the change.
>>>>>>>>>>>>>>>>> I've done a lot of infrastructure work in the past, but I'm
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> currently
>>>>>>>>>>>
>>>>>>>>>>>> pretty booked with many other things, so I don't
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> realistically
>>>>>>>>
>>>>>>>>> see
>>>>>>>>>>
>>>>>>>>>>> myself
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> doing that. Max who used to work on these things is taking
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> some
>>>>>>>>
>>>>>>>>> time
>>>>>>>>>>>
>>>>>>>>>>>> off.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I think we need, best case 3 days for the change, worst case
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 5
>>>>>>>>
>>>>>>>>> days.
>>>>>>>>>>>
>>>>>>>>>>>> The
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> problem is that there are no "unit tests" for the infra
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> stuff,
>>>>>>>>
>>>>>>>>> so
>>>>>>>>>
>>>>>>>>>> many
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> things are "trial and error" (like Apache's buildbot, our
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> release
>>>>>>>>>
>>>>>>>>>> scripts,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> the doc scripts, maven stuff, nightly builds).
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Thu, Feb 23, 2017 at 1:33 PM, Stephan Ewen <
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> [hidden email]>
>>>>>>>>>
>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> If we can get incremental builds to work, that would
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> actually
>>>>>>>>>
>>>>>>>>>> be
>>>>>>>>>>
>>>>>>>>>>> the
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> preferred solution in my opinion.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Many companies have invested heavily in making a "single
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> repository"
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> code
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> base work, because it has the advantage of not having to
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> update/publish
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> several repositories first.
>>>>>>>>>>>>>>>>>> However, the strong prerequisite for that is an incremental
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> build
>>>>>>>>>>
>>>>>>>>>>> system
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> that builds only (fine grained) what it has to build. I am
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> not
>>>>>>>>
>>>>>>>>> sure
>>>>>>>>>>>
>>>>>>>>>>>> how
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> we
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> could make that work
>>>>>>>>>>>>>>>>>> with Maven and Travis...
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Wed, Feb 22, 2017 at 10:42 PM, Greg Hogan <
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> [hidden email]>
>>>>>>>>>>
>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> An additional option for reducing time to build and test is
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> parallel
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> execution. This would help users more than on TravisCI
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>


Re: [DISCUSS] Project build time and possible restructuring

Robert Metzger
Hi Greg,
No, I still think we should split the repos. But it makes the whole thing a
bit easier, because we don't need to introduce Jenkins at the same time.
If we do the change, I'm going to ask Travis to extend the limit for all
forks of the Flink repo (I hope that's possible).

Here is what it looks like with the longer worker time:
https://travis-ci.org/apache/flink/builds/217199621
Looks like the YARN tests are failing consistently on that infrastructure.


I think the watchdog only kicks in if the job hasn't produced output for 5
minutes.
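To make the difference between the two limits concrete, here is a rough sketch of how they interact. It is only a model under assumptions: the 50-minute job limit and the 5-minute no-output watchdog are the figures mentioned in this thread, and the function and parameter names are made up for illustration.

```python
from typing import Iterable


def classify_job(
    output_minutes: Iterable[float],  # times (minutes since start) at which the job printed output
    runtime_min: float,               # time the job would need to finish on its own
    job_limit_min: float = 50.0,      # overall limit per job
    watchdog_min: float = 5.0,        # maximum allowed silence between outputs
) -> str:
    """Return 'passed', 'watchdog' or 'job_limit' for a modelled job."""
    last = 0.0
    events = sorted(t for t in output_minutes if t <= runtime_min) + [runtime_min]
    for t in events:
        if t - last > watchdog_min:
            # The job went silent; the watchdog fires unless the overall
            # limit is reached first.
            kill_at = last + watchdog_min
            return "watchdog" if kill_at < job_limit_min else "job_limit"
        last = t
    return "job_limit" if runtime_min > job_limit_min else "passed"
```

A job that keeps printing once a minute but needs 60 minutes is cut off by the overall limit and never trips the watchdog; with the limit raised to 120 minutes the same job passes.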



On Fri, Mar 31, 2017 at 4:41 PM, Greg Hogan <[hidden email]> wrote:

> Thanks for pursuing this Robert.
>
> I appreciate their receptiveness to increasing the time and memory limits
> but we’ll still be bound by the old limits for our personal repos. Does
> this change any of the proposed actions for splitting the repo?
>
> Has anyone looked into why we see many jobs time out right at 50 minutes?
> Passing jobs take well under 50 minutes and the 5 minute watchdog timeout is
> not being triggered. Just pulling up a recent build:
> https://travis-ci.org/apache/flink/builds/217034084
>
> Greg
>
>
> > On Mar 31, 2017, at 9:12 AM, Robert Metzger <[hidden email]> wrote:
> >
> > Good news :)
> > A few weeks ago, I got an email from travis asking for feedback. I filled
> > out the form and said that the 50-minute build time limit is a
> > showstopper for us.
> > And now, a few weeks later they got back to me and told me that they have
> > increased the build time for "apache/flink" to 120 minutes. Also, we can
> > set the settings to use the "sudo enabled infrastructure", with 7.5 GB of
> > main memory guaranteed.
> >
> > I'll do a push to a separate branch to see how well it works :)
> >
> > On Tue, Mar 28, 2017 at 4:36 PM, Robert Metzger <[hidden email]>
> wrote:
> >
> >> I think your selection of modules is okay.
> >> Moving out storm and the scala shell would be nice as well. But storm is
> >> not really maintained, so maybe we should consider moving it out of the
> >> Flink repo entirely.
> >> And the scala shell is not a library, but it also doesn't really belong
> >> in the main repo.
> >>
> >> Regarding the feature freeze: We either do it with a lot of time in
> >> advance to avoid any delays for the release, OR we do it right after the
> >> release branch has been forked off.
> >>
> >>
> >>
> >> On Tue, Mar 21, 2017 at 1:09 PM, Timo Walther <[hidden email]>
> wrote:
> >>
> >>> So what do we want to move to the libraries repository?
> >>>
> >>> I would propose to move these modules first:
> >>>
> >>> flink-cep-scala
> >>> flink-cep
> >>> flink-gelly-examples
> >>> flink-gelly-scala
> >>> flink-gelly
> >>> flink-ml
> >>>
> >>> All other modules (e.g. in flink-contrib) are rather connectors. I
> think
> >>> it would be better to move those in a connectors repository later.
> >>>
> >>> If we are not in a rush, we could do the moving after the
> feature-freeze.
> >>> This is the time when most of the PRs will have been merged.
> >>>
> >>> Timo
> >>>
> >>>
> >>> On 20/03/17 at 15:00, Greg Hogan wrote:
> >>>
> >>>> We can add cluster tests using the distribution jar, and will need to
> do
> >>>> so to remove Flink’s dependency on Hadoop. The YARN and Mesos tests
> would
> >>>> still run nightly and running cluster tests should be much faster. As
> >>>> troublesome as TravisCI has been, a major driver for this change has
> been
> >>>> local build time.
> >>>>
> >>>> I agree with splitting off one repo at a time, but we’ll first need to
> >>>> reorganize the core repo if using git submodules as flink-python and
> >>>> flink-table would need to first be moved. So I think planning this out
> >>>> first is a healthy idea, with the understanding that the plan will be
> >>>> reevaluated.
> >>>>
> >>>> Any changes to the project structure need a scheduled period, perhaps
> a
> >>>> week, for existing pull requests to be reviewed and accepted or
> closed and
> >>>> later migrated.
> >>>>
> >>>>
> >>>> On Mar 20, 2017, at 6:27 AM, Stephan Ewen <[hidden email]> wrote:
> >>>>>
> >>>>> @Greg
> >>>>>
> >>>>> I am personally in favor of splitting "connectors" and "contrib" out
> as
> >>>>> well. I know that @rmetzger has some reservations about the
> connectors,
> >>>>> but
> >>>>> we may be able to convince him.
> >>>>>
> >>>>> For the cluster tests (yarn / mesos) - in the past there were many
> cases
> >>>>> where these tests caught cases that other tests did not, because they
> >>>>> are
> >>>>> the only tests that actually use the "flink-dist.jar" and thus
> discover
> >>>>> many dependency and configuration issues. For that reason, my feeling
> >>>>> would
> >>>>> be that they are valuable in the core repository.
> >>>>>
> >>>>> I would actually suggest to do only the library split initially, to
> see
> >>>>> what the challenges are in setting up the multi-repo build and
> release
> >>>>> tooling. Once we gathered experience there, we can probably easily
> see
> >>>>> what
> >>>>> else we can split out.
> >>>>>
> >>>>> Stephan
> >>>>>
> >>>>>
> >>>>> On Fri, Mar 17, 2017 at 8:37 PM, Greg Hogan <[hidden email]>
> wrote:
> >>>>>
> >>>>> I’d like to use this refactoring opportunity to unsplit the Travis
> >>>>>> tests.
> >>>>>> With 51 builds queued up for the weekend (some of which may fail or
> >>>>>> have
> >>>>>> been force pushed) we are at the limit of the number of
> contributions
> >>>>>> we
> >>>>>> can process. Fixing this requires 1) splitting the project, 2)
> >>>>>> investigating speedups for long-running tests, and 3) staying
> >>>>>> cognizant of
> >>>>>> test performance when accepting new code.
> >>>>>>
> >>>>>> I’d like to add one to Stephan’s list of module groups. I like that
> the
> >>>>>> modules are generic (“libraries”) so that no one module is alone and
> >>>>>> independent.
> >>>>>>
> >>>>>> Flink has three “libraries”: cep, ml, and gelly.
> >>>>>>
> >>>>>> “connectors” is a hotspot due to the long-running Kafka tests (and
> >>>>>> connectors for three Kafka versions).
> >>>>>>
> >>>>>> Both flink-storm and flink-python have a modest number of
> >>>>>> tests
> >>>>>> and could live with the miscellaneous modules in “contrib”.
> >>>>>>
> >>>>>> The YARN tests are long-running and problematic (I am unable to
> >>>>>> successfully run these locally). A “cluster” module could host
> >>>>>> flink-mesos,
> >>>>>> flink-yarn, and flink-yarn-tests.
> >>>>>>
> >>>>>> That gets us close to running all tests in a single Travis build.
> >>>>>>  https://travis-ci.org/greghogan/flink/builds/212122590 <
> >>>>>> https://travis-ci.org/greghogan/flink/builds/212122590>
> >>>>>>
> >>>>>> I also tested (https://github.com/greghogan/flink/commits/core_build)
> >>>>>> with a maven parallelism of 2 and 4, with the latter a 6.4% drop in
> >>>>>> build time.
> >>>>>>  https://travis-ci.org/greghogan/flink/builds/212137659 <
> >>>>>> https://travis-ci.org/greghogan/flink/builds/212137659>
> >>>>>>  https://travis-ci.org/greghogan/flink/builds/212154470 <
> >>>>>> https://travis-ci.org/greghogan/flink/builds/212154470>
> >>>>>>
> >>>>>> We can run Travis CI builds nightly to guard against breaking
> changes.
> >>>>>>
> >>>>>> I also wanted to get an idea of how disruptive it would be to
> >>>>>> developers
> >>>>>> to divide the project into multiple git repos. I wrote a simple
> python
> >>>>>> script and configured it with the module partitions listed above.
> The
> >>>>>> usage
> >>>>>> string from the top of the file lists commits with files from
> multiple
> >>>>>> partitions as well as the modified files.
> >>>>>>  https://gist.github.com/greghogan/f38a8efe6b6dd5a162a6b43335ac4897
> <
> >>>>>> https://gist.github.com/greghogan/f38a8efe6b6dd5a162a6b43335ac4897>
> >>>>>>
> >>>>>> Accounting for the merging of the batch and streaming connector
> >>>>>> modules,
> >>>>>> and assuming that the project structure has not changed much over
> the
> >>>>>> past
> >>>>>> 15 months, for the following date ranges the listed number of
> commits
> >>>>>> would
> >>>>>> have been split across repositories.
> >>>>>>
> >>>>>> since "2017-01-01"
> >>>>>> 36 of 571 commits were mixed
> >>>>>>
> >>>>>> since "2016-07-01"
> >>>>>> 155 of 1607 commits were mixed
> >>>>>>
> >>>>>> since "2016-01-01"
> >>>>>> 272 of 2561 commits were mixed
> >>>>>>
> >>>>>> Greg
> >>>>>>
> >>>>>>
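The partition analysis Greg describes above can be sketched roughly as follows. This is a minimal illustration, not the script from the linked gist: the partition names follow the groups named in his message, but the path prefixes, the "core" default, and the helper names are assumptions made for the example.

```python
from typing import Dict, Iterable, List

# Hypothetical path prefixes per partition; the configuration of the real
# script lives in the gist linked in the message and may differ.
PARTITIONS: Dict[str, List[str]] = {
    "libraries": ["flink-libraries/"],
    "connectors": ["flink-connectors/"],
    "contrib": ["flink-contrib/"],
    "cluster": ["flink-mesos/", "flink-yarn/", "flink-yarn-tests/"],
}
DEFAULT_PARTITION = "core"  # assumed name for everything else


def partition_of(path: str) -> str:
    """Return the partition whose prefix matches the path, else the default."""
    for name, prefixes in PARTITIONS.items():
        if any(path.startswith(prefix) for prefix in prefixes):
            return name
    return DEFAULT_PARTITION


def is_mixed(files: Iterable[str]) -> bool:
    """A commit is 'mixed' if its files fall into more than one partition."""
    return len({partition_of(f) for f in files}) > 1


def mixed_share(mixed: int, total: int) -> float:
    """Percentage of commits that would have been split across repositories."""
    return round(100.0 * mixed / total, 1)
```

The file list per commit could be taken from `git log --name-only`. Applied to the counts quoted in the message, `mixed_share` gives about 6.3% (36 of 571), 9.6% (155 of 1607), and 10.6% (272 of 2561).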
> >>>>>> On Mar 15, 2017, at 1:13 PM, Stephan Ewen <[hidden email]> wrote:
> >>>>>>>
> >>>>>>> @Robert - I think once we know that a separate git repo works well,
> >>>>>>> and
> >>>>>>> that it actually solves problems, I see no reason to not create a
> >>>>>>> connectors repository later. The infrastructure changes should be
> >>>>>>>
> >>>>>> identical
> >>>>>>
> >>>>>>> for two or more repositories.
> >>>>>>>
> >>>>>>> On Wed, Mar 15, 2017 at 5:22 PM, Till Rohrmann <
> [hidden email]>
> >>>>>>>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> I think it should not be at least the flink-dist but exactly the
> >>>>>>>>
> >>>>>>> remaining
> >>>>>>
> >>>>>>> flink-dist module. Otherwise we do redundant work.
> >>>>>>>>
> >>>>>>>> On Wed, Mar 15, 2017 at 5:03 PM, Robert Metzger <
> [hidden email]
> >>>>>>>>>
> >>>>>>>> wrote:
> >>>>>>>>
> >>>>>>>> "flink-core" means the main repository, not the "flink-core"
> module.
> >>>>>>>>>
> >>>>>>>>> When doing a release, we need to build the flink main code first,
> >>>>>>>>>
> >>>>>>>> because
> >>>>>>
> >>>>>>> the flink-libraries depend on that.
> >>>>>>>>> Once the "flink-libraries" are built, we need to run the main
> build
> >>>>>>>>>
> >>>>>>>> again
> >>>>>>
> >>>>>>> (at least the flink-dist module), so that it is pulling the
> artifacts
> >>>>>>>>>
> >>>>>>>> from
> >>>>>>>>
> >>>>>>>>> the flink-libraries to put them into the opt/ folder of the final
> >>>>>>>>>
> >>>>>>>> artifact.
> >>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On Wed, Mar 15, 2017 at 4:44 PM, Till Rohrmann <
> >>>>>>>>> [hidden email]>
> >>>>>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>> I'm ok with point 3.
> >>>>>>>>>>
> >>>>>>>>>> Concerning point 8: Why do we have to build flink-core twice
> after
> >>>>>>>>>>
> >>>>>>>>> having
> >>>>>>>>
> >>>>>>>>> it built as a dependency for flink-libraries? This seems wrong to
> >>>>>>>>>> me.
> >>>>>>>>>>
> >>>>>>>>>> Cheers,
> >>>>>>>>>> Till
> >>>>>>>>>>
> >>>>>>>>>> On Wed, Mar 15, 2017 at 4:23 PM, Robert Metzger <
> >>>>>>>>>> [hidden email]>
> >>>>>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>> Thank you. Running on AWS is a good idea!
> >>>>>>>>>>> Let me know if you (or anybody else) wants to help me with the
> >>>>>>>>>>> infrastructure work! Any help is much appreciated (as I've said
> >>>>>>>>>>>
> >>>>>>>>>> before, I
> >>>>>>>>>
> >>>>>>>>>> don't really have time for doing this, but it has to be done :)
> )
> >>>>>>>>>>>
> >>>>>>>>>>> I'm against creating two new repositories. I fear that this
> >>>>>>>>>>>
> >>>>>>>>>> introduces
> >>>>>>>>
> >>>>>>>>> too
> >>>>>>>>>>
> >>>>>>>>>>> much complexity and too many repositories.
> >>>>>>>>>>> "flink" and "flink-libraries" are hopefully enough to get the
> >>>>>>>>>>> build
> >>>>>>>>>>>
> >>>>>>>>>> time
> >>>>>>>>>
> >>>>>>>>>> significantly down.
> >>>>>>>>>>> We can also consider putting the connectors into the
> >>>>>>>>>>>
> >>>>>>>>>> "flink-libraries"
> >>>>>>>>
> >>>>>>>>> repo
> >>>>>>>>>>
> >>>>>>>>>>> if we need to further reduce the build time.
> >>>>>>>>>>>
> >>>>>>>>>>> We should probably move "flink-table" out of "flink-libraries"
> if
> >>>>>>>>>>> we
> >>>>>>>>>>>
> >>>>>>>>>> want
> >>>>>>>>>
> >>>>>>>>>> to keep "flink-table" in the main repo. (This would eliminate
> the
> >>>>>>>>>>> "flink-libraries" module from main.)
> >>>>>>>>>>>
> >>>>>>>>>>> Also, I agree that "flink-statebackend-rocksdb" is not
> correctly
> >>>>>>>>>>>
> >>>>>>>>>> placed
> >>>>>>>>
> >>>>>>>>> in
> >>>>>>>>>>
> >>>>>>>>>>> contrib anymore.
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> On Wed, Mar 15, 2017 at 4:07 PM, Greg Hogan <
> [hidden email]>
> >>>>>>>>>>>
> >>>>>>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>>> Robert, appreciate your kickstarting this task.
> >>>>>>>>>>>>
> >>>>>>>>>>>> We should compare the verification time with and without the
> >>>>>>>>>>>> listed
> >>>>>>>>>>>> modules. I’ll try to run this by tomorrow on AWS and on
> Travis.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Should we maintain separate repos for flink-contrib and
> >>>>>>>>>>>>
> >>>>>>>>>>> flink-libraries?
> >>>>>>>>>>
> >>>>>>>>>>> Are you intending that we move flink-table out of
> flink-libraries
> >>>>>>>>>>>>
> >>>>>>>>>>> (and
> >>>>>>>>>
> >>>>>>>>>> perhaps flink-statebackend-rocksdb out of flink-contrib)?
> >>>>>>>>>>>>
> >>>>>>>>>>>> Greg
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Mar 15, 2017, at 9:55 AM, Robert Metzger <
> [hidden email]
> >>>>>>>>>>>>>
> >>>>>>>>>>>> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> Thank you for looking into this Till.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I think we then have to split the repositories.
> >>>>>>>>>>>>> My main motivation for doing this is that it seems to be the
> >>>>>>>>>>>>> only
> >>>>>>>>>>>>>
> >>>>>>>>>>>> feasible
> >>>>>>>>>>>>
> >>>>>>>>>>>>> way of scaling the community to allow more committers
> working on
> >>>>>>>>>>>>>
> >>>>>>>>>>>> the
> >>>>>>>>>
> >>>>>>>>>> libraries.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I'll take care of getting things started.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> As the next steps I propose to:
> >>>>>>>>>>>>> 1. Ask INFRA to rename https://git-wip-us.apache.org/
> >>>>>>>>>>>>>
> >>>>>>>>>>>> repos/asf?p=flink-
> >>>>>>>>>>>
> >>>>>>>>>>>> connectors.git;a=summary to "flink-libraries"
> >>>>>>>>>>>>> 2. Ask INFRA to set up GitHub and travis integration for
> >>>>>>>>>>>>>
> >>>>>>>>>>>> "flink-libraries"
> >>>>>>>>>>>>
> >>>>>>>>>>>>> 3. Put the code of "flink-ml", "flink-gelly", "flink-python",
> >>>>>>>>>>>>>
> >>>>>>>>>>>> "flink-cep",
> >>>>>>>>>>>>
> >>>>>>>>>>>>> "flink-scala-shell", "flink-storm" into the new repository.
> (I
> >>>>>>>>>>>>>
> >>>>>>>>>>>> decided
> >>>>>>>>>>
> >>>>>>>>>>> against moving flink-contrib there, because rocksdb is in the
> >>>>>>>>>>>>>
> >>>>>>>>>>>> contrib
> >>>>>>>>>
> >>>>>>>>>> module, for flink-table, I'm undecided, but I kept it in the
> main
> >>>>>>>>>>>>>
> >>>>>>>>>>>> repo
> >>>>>>>>>>
> >>>>>>>>>>> because it's probably going to interact more with the core code
> in
> >>>>>>>>>>>>>
> >>>>>>>>>>>> the
> >>>>>>>>>
> >>>>>>>>>> future)
> >>>>>>>>>>>>> I try to preserve the history of those modules when splitting
> >>>>>>>>>>>>>
> >>>>>>>>>>>> them
> >>>>>>>>
> >>>>>>>>> into
> >>>>>>>>>>
> >>>>>>>>>>> the
> >>>>>>>>>>>>
> >>>>>>>>>>>>> new repo
> >>>>>>>>>>>>> 4. I'll close all pull requests against those modules in the
> >>>>>>>>>>>>> main
> >>>>>>>>>>>>>
> >>>>>>>>>>>> repo.
> >>>>>>>>>>
> >>>>>>>>>>> 5. I'll set up a minimal documentation page for the library
> >>>>>>>>>>>>>
> >>>>>>>>>>>> repository,
> >>>>>>>>>>
> >>>>>>>>>>> similar to the main documentation.
> >>>>>>>>>>>>> 6. I'll update the documentation build process to build both
> >>>>>>>>>>>>>
> >>>>>>>>>>>> documentations
> >>>>>>>>>>>>
> >>>>>>>>>>>>> & link them to each other
> >>>>>>>>>>>>> 7. I'll update the nightly deployment process to include both
> >>>>>>>>>>>>>
> >>>>>>>>>>>> repositories
> >>>>>>>>>>>>
> >>>>>>>>>>>>> 8. I'll update the release script to create the Flink release
> >>>>>>>>>>>>> out
> >>>>>>>>>>>>>
> >>>>>>>>>>>> of
> >>>>>>>>>
> >>>>>>>>>> both
> >>>>>>>>>>>
> >>>>>>>>>>>> repositories. In order to put the libraries into the opt/ dir
> of
> >>>>>>>>>>>>>
> >>>>>>>>>>>> the
> >>>>>>>>>
> >>>>>>>>>> release, I'll need to change the build of "flink-dist" so that
> it
> >>>>>>>>>>>>>
> >>>>>>>>>>>> first
> >>>>>>>>>>
> >>>>>>>>>>> builds flink core, then the libraries and then the core again
> >>>>>>>>>>>>>
> >>>>>>>>>>>> with
> >>>>>>>>
> >>>>>>>>> the
> >>>>>>>>>>
> >>>>>>>>>>> libraries as an additional dependency.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> The main question for the community is: do you agree with
> point
> >>>>>>>>>>>>>
> >>>>>>>>>>>> 3 ?
> >>>>>>>>
> >>>>>>>>> Would
> >>>>>>>>>>>
> >>>>>>>>>>>> you like to include more or less?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I'll start with 1. and 2. tomorrow morning.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Wed, Mar 15, 2017 at 1:48 PM, Till Rohrmann <
> >>>>>>>>>>>>>
> >>>>>>>>>>>> [hidden email]
> >>>>>>>>>
> >>>>>>>>>> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> In theory we could have a merging bot which solves the
> problem
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>> of
> >>>>>>>>
> >>>>>>>>> the
> >>>>>>>>>>
> >>>>>>>>>>> "commit window". Once the PR passes all tests and has enough
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>> +1s,
> >>>>>>>>
> >>>>>>>>> the
> >>>>>>>>>>
> >>>>>>>>>>> bot
> >>>>>>>>>>>>
> >>>>>>>>>>>>> could do the merging and, thus, it effectively linearizes the
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>> merge
> >>>>>>>>>
> >>>>>>>>>> process.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I think the second point is actually a disadvantage because
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>> there
> >>>>>>>>
> >>>>>>>>> is
> >>>>>>>>>
> >>>>>>>>>> not
> >>>>>>>>>>>
> >>>>>>>>>>>> such an immediate incentive/pressure to fix the broken module
> if
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>> it
> >>>>>>>>>
> >>>>>>>>>> lives
> >>>>>>>>>>>>
> >>>>>>>>>>>>> in a separate repository. Furthermore, breaking API changes
> in
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>> the
> >>>>>>>>
> >>>>>>>>> core
> >>>>>>>>>>>
> >>>>>>>>>>>> will most likely go unnoticed for some time in other modules
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>> which
> >>>>>>>>
> >>>>>>>>> are
> >>>>>>>>>>
> >>>>>>>>>>> not
> >>>>>>>>>>>>
> >>>>>>>>>>>>> developed so actively. In the worst case these things will
> only
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>> be
> >>>>>>>>
> >>>>>>>>> noticed
> >>>>>>>>>>>>
> >>>>>>>>>>>>> when we try to make a release.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> But I also agree that we are not Google and we don't have
> the
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>> capacities to
> >>>>>>>>>>>>
> >>>>>>>>>>>>> maintain such a smooth build process that we can keep all
> the
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>> code
> >>>>>>>>>
> >>>>>>>>>> in
> >>>>>>>>>>>
> >>>>>>>>>>>> a
> >>>>>>>>>>>>
> >>>>>>>>>>>>> single repository.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I looked a bit into Gradle and as far as I can tell it
> offers
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>> some
> >>>>>>>>
> >>>>>>>>> nice
> >>>>>>>>>>>
> >>>>>>>>>>>> features wrt incrementally building projects. This would be
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>> beneficial
> >>>>>>>>>>
> >>>>>>>>>>> for
> >>>>>>>>>>>>
> >>>>>>>>>>>>> local development but it would not solve our build time
> problems
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>> on
> >>>>>>>>>
> >>>>>>>>>> Travis.
> >>>>>>>>>>>>
> >>>>>>>>>>>>> Gradle intends to introduce a task result cache which allows
> to
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>> reuse
> >>>>>>>>>>
> >>>>>>>>>>> results across builds. This could help when building on Travis,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>> however, it
> >>>>>>>>>>>>
> >>>>>>>>>>>>> is not yet fully implemented. Moreover, migrating from Maven
> to
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>> Gradle
> >>>>>>>>>>
> >>>>>>>>>>> won't come for free (there's simply no free lunch out there)
> and
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>> we
> >>>>>>>>>
> >>>>>>>>>> might
> >>>>>>>>>>>>
> >>>>>>>>>>>>> risk to introduce new bugs. Therefore, I would vote to split
> the
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>> repository
> >>>>>>>>>>>>
> >>>>>>>>>>>>> in order to mitigate our current problems with Travis and the
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>> build
> >>>>>>>>>
> >>>>>>>>>> time in
> >>>>>>>>>>>>
> >>>>>>>>>>>>> general. Whether to use a different build system or not can
> then
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>> be
> >>>>>>>>>
> >>>>>>>>>> discussed as an orthogonal question.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Cheers,
> >>>>>>>>>>>>>> Till
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On Tue, Mar 14, 2017 at 8:05 PM, Stephan Ewen <
> >>>>>>>>>>>>>> [hidden email]
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> Some other thoughts on how repository split would help. I am
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>> not
> >>>>>>>>
> >>>>>>>>> sure
> >>>>>>>>>>
> >>>>>>>>>>> for
> >>>>>>>>>>>>
> >>>>>>>>>>>>> all of them, so please comment:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> - There is less competition for a "commit window". It
> happens
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>> a
> >>>>>>>>
> >>>>>>>>> lot
> >>>>>>>>>>
> >>>>>>>>>>> already that you run all tests and want to commit, but there
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>> was
> >>>>>>>>
> >>>>>>>>> a
> >>>>>>>>>
> >>>>>>>>>> commit
> >>>>>>>>>>>>
> >>>>>>>>>>>>> in the meantime. You rebase, need to re-test, again commit in
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>> the
> >>>>>>>>
> >>>>>>>>> meantime.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>  For a "linear" commit history, this may become a
> bottleneck
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>> eventually
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> as well.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> - There is less risk of broken master. If one
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>> repository/modules
> >>>>>>>>
> >>>>>>>>> breaks
> >>>>>>>>>>>>
> >>>>>>>>>>>>> its master, the others can still continue.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Stephan
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On Fri, Mar 10, 2017 at 12:20 PM, Till Rohrmann <
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>> [hidden email]>
> >>>>>>>>>>>
> >>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Thanks for all your input. In order to wrap the discussion
> up
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I'd
> >>>>>>>>>
> >>>>>>>>>> like
> >>>>>>>>>>>
> >>>>>>>>>>>> to
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> summarize the mentioned points:
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> The problem of increasing build times and complexity of
> the
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> project
> >>>>>>>>>>
> >>>>>>>>>>> has
> >>>>>>>>>>>>
> >>>>>>>>>>>>> been acknowledged. Ideally we would have everything in one
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> repository
> >>>>>>>>>>>
> >>>>>>>>>>>> using
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> an incremental build tool. Since Maven does not properly
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> support
> >>>>>>>>
> >>>>>>>>> this
> >>>>>>>>>>>
> >>>>>>>>>>>> we
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> would have to switch our build tool to something like
> Gradle,
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> for
> >>>>>>>>>
> >>>>>>>>>> example.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Another option is introducing build profiles for different
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> sets
> >>>>>>>>
> >>>>>>>>> of
> >>>>>>>>>
> >>>>>>>>>> modules
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> as well as separating integration and unit tests. The
> third
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> alternative
> >>>>>>>>>>>>
> >>>>>>>>>>>>> would be creating sub-projects with their own repositories. I
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> actually
> >>>>>>>>>>>
> >>>>>>>>>>>> think that these two proposals are not necessarily exclusive
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> and
> >>>>>>>>
> >>>>>>>>> it
> >>>>>>>>>
> >>>>>>>>>> would
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> also make sense to have a separation between unit and
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> integration
> >>>>>>>>>
> >>>>>>>>>> tests
> >>>>>>>>>>>>
> >>>>>>>>>>>>> if
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> we split the repository.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> The overall consensus seems to be that we don't want to
> split
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> the
> >>>>>>>>>
> >>>>>>>>>> community
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> and want to keep everything under the same umbrella. I
> think
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> this
> >>>>>>>>>
> >>>>>>>>>> is
> >>>>>>>>>>
> >>>>>>>>>>> the
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> right way to go, because otherwise some parts of the
> project
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> could
> >>>>>>>>>
> >>>>>>>>>> become
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> second class citizens. Given that and that we continue
> using
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Maven,
> >>>>>>>>>>
> >>>>>>>>>>> I
> >>>>>>>>>>>
> >>>>>>>>>>>> still
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> think that creating sub-projects for the libraries, for
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> example,
> >>>>>>>>
> >>>>>>>>> could
> >>>>>>>>>>>
> >>>>>>>>>>>> be
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> beneficial. A split could reduce the project's complexity
> and
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> make
> >>>>>>>>>
> >>>>>>>>>> it
> >>>>>>>>>>>
> >>>>>>>>>>>> potentially easier for libraries to get actively developed.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> The
> >>>>>>>>
> >>>>>>>>> main
> >>>>>>>>>>
> >>>>>>>>>>> concern is setting up the build infrastructure to aggregate
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> docs
> >>>>>>>>
> >>>>>>>>> from
> >>>>>>>>>>>
> >>>>>>>>>>>> multiple repositories and making them publicly available.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Since I started this thread and I would really like to see
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Flink's
> >>>>>>>>>
> >>>>>>>>>> ML
> >>>>>>>>>>>
> >>>>>>>>>>>> library being revived again, I'd volunteer investigating first
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> whether
> >>>>>>>>>>>
> >>>>>>>>>>>> it
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> is doable establishing a proper incremental build for
> Flink.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> If
> >>>>>>>>
> >>>>>>>>> that
> >>>>>>>>>>
> >>>>>>>>>>> should
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> not be possible, I will look into splitting the
> repository,
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> first
> >>>>>>>>>
> >>>>>>>>>> only
> >>>>>>>>>>>
> >>>>>>>>>>>> for
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> the libraries. I'll share my results with the community
> once
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I'm
> >>>>>>>>
> >>>>>>>>> done
> >>>>>>>>>>>
> >>>>>>>>>>>> with
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> the investigation.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Cheers,
> >>>>>>>>>>>>>>>> Till
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> On Fri, Feb 24, 2017 at 3:50 PM, Robert Metzger <
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> [hidden email]>
> >>>>>>>>>>>
> >>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> @Jin Mingjian: You can not use the paid travis version for
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> open
> >>>>>>>>
> >>>>>>>>> source
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> projects. It only works for private repositories (at least
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> back
> >>>>>>>>
> >>>>>>>>> then
> >>>>>>>>>>>
> >>>>>>>>>>>> when
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> we've asked them about that).
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> @Stephan: I don't think that incremental builds will be
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> available
> >>>>>>>>>
> >>>>>>>>>> with
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Maven anytime soon.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> I agree that we need to fix the build time issue on
> Travis.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> I've
> >>>>>>>>>
> >>>>>>>>>> recently
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> pushed a commit to use now three instead of two test
> groups.
> >>>>>>>>>>>>>>>>> But I don't think that this is a feasible long-term
> solution.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> If this discussion is only about reducing the build and
> test
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> time,
> >>>>>>>>>>
> >>>>>>>>>>> introducing build profiles for different components as
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Aljoscha
> >>>>>>>>
> >>>>>>>>> suggested
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> would solve the problem Till mentioned.
> >>>>>>>>>>>>>>>>> Also, if we decide that travis is not a good tool anymore
> >>>>>>>>>>>>>>>>> for
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> the
> >>>>>>>>>
> >>>>>>>>>> testing,
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> I guess we can find a different solution. There are now
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> competitors
> >>>>>>>>>>
> >>>>>>>>>>> to
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Travis that might be willing to offer a paid plan for an
> open
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> source
> >>>>>>>>>>>
> >>>>>>>>>>>> project, or we set up our own infra on a server sponsored by
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> one
> >>>>>>>>>
> >>>>>>>>>> of
> >>>>>>>>>>
> >>>>>>>>>>> the
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> contributing companies.
> >>>>>>>>>>>>>>>>> If we want to solve "community issues" with the change as
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> well,
> >>>>>>>>
> >>>>>>>>> then
> >>>>>>>>>>>
> >>>>>>>>>>>> I
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> think it's worth the effort of splitting up Flink into
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> different
> >>>>>>>>
> >>>>>>>>> repositories.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Splitting up repositories is not a trivial task in my
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> opinion.
> >>>>>>>>
> >>>>>>>>> As
> >>>>>>>>>
> >>>>>>>>>> others
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> have mentioned before, we need to consider the following
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> things:
> >>>>>>>>>
> >>>>>>>>>> - How are we going to build the documentation? Ideally every
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> repo
> >>>>>>>>>
> >>>>>>>>>> should
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> contain its docs, so we would need to pull them together
> when
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> building
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> main docs.
> >>>>>>>>>>>>>>>>> - How do we organize the dependencies? If we have library
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> repository
> >>>>>>>>>
> >>>>>>>>>> depend
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> on
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> snapshot Flink versions, we need to make sure that the
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> snapshot
> >>>>>>>>
> >>>>>>>>> deployment
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> always works. This also means that people working on a
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> library
> >>>>>>>>
> >>>>>>>>> repository
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> will pull from snapshot OR need to build first locally.
> >>>>>>>>>>>>>>>>> - We need to update the release scripts
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> If we commit to do these changes, we need to assign at
> least
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> one
> >>>>>>>>>
> >>>>>>>>>> committer
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> (yes, in this case we need somebody who can commit, for
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> example
> >>>>>>>>
> >>>>>>>>> for
> >>>>>>>>>>
> >>>>>>>>>>> updating the buildbot stuff) who volunteers to do the change.
> >>>>>>>>>>>>>>>>> I've done a lot of infrastructure work in the past, but
> I'm
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> currently
> >>>>>>>>>>>
> >>>>>>>>>>>> pretty booked with many other things, so I don't
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> realistically
> >>>>>>>>
> >>>>>>>>> see
> >>>>>>>>>>
> >>>>>>>>>>> myself
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> doing that. Max who used to work on these things is taking
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> some
> >>>>>>>>
> >>>>>>>>> time
> >>>>>>>>>>>
> >>>>>>>>>>>> off.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> I think we need, best case 3 days for the change, worst
> case
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> 5
> >>>>>>>>
> >>>>>>>>> days.
> >>>>>>>>>>>
> >>>>>>>>>>>> The
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> problem is that there are no "unit tests" for the infra
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> stuff,
> >>>>>>>>
> >>>>>>>>> so
> >>>>>>>>>
> >>>>>>>>>> many
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> things are "trial and error" (like Apache's buildbot, our
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> release
> >>>>>>>>>
> >>>>>>>>>> scripts,
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> the doc scripts, maven stuff, nightly builds).
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> On Thu, Feb 23, 2017 at 1:33 PM, Stephan Ewen <
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> [hidden email]>
> >>>>>>>>>
> >>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> If we can get incremental builds to work, that would
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> actually
> >>>>>>>>>
> >>>>>>>>>> be
> >>>>>>>>>>
> >>>>>>>>>>> the
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> preferred solution in my opinion.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Many companies have invested heavily in making a "single
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> repository"
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> code
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> base work, because it has the advantage of not having to
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> update/publish
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> several repositories first.
> >>>>>>>>>>>>>>>>>> However, the strong prerequisite for that is an
> incremental
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> build
> >>>>>>>>>>
> >>>>>>>>>>> system
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> that builds only (fine grained) what it has to build. I
> am
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> not
> >>>>>>>>
> >>>>>>>>> sure
> >>>>>>>>>>>
> >>>>>>>>>>>> how
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> we
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> could make that work
> >>>>>>>>>>>>>>>>>> with Maven and Travis...
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> On Wed, Feb 22, 2017 at 10:42 PM, Greg Hogan <
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> [hidden email]>
> >>>>>>>>>>
> >>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> An additional option for reducing time to build and test
> is
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> parallel
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> execution. This would help users more than on TravisCI
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>
>
>