[ANNOUNCE] Contributing Alibaba's Blink

Re: [ANNOUNCE] Contributing Alibaba's Blink

Becket Qin
Thanks Stephan,

The plan makes sense to me.

Regarding the docs, it seems better to have a separate versioned website
because there are a lot of changes spread all over the place. We can add a
banner to remind users that they are looking at the Blink docs, which are
temporary and will eventually be merged into Flink master. (The banner would
be pretty similar to what users see when they visit the docs of old Flink
versions [1].)

Thanks,

Jiangjie (Becket) Qin

[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.5/dev/libs/ml/quickstart.html

On Thu, Jan 24, 2019 at 6:21 AM Shaoxuan Wang <[hidden email]> wrote:

> Thanks Stephan,
> The entire plan looks good to me. WRT the "Docs for Blink", a subsection
> should be good enough if we just introduce the outlines of what Blink has
> changed. However, we have written detailed introductions to Blink based on
> the structure of the current Flink release documentation (those introductions
> are distributed across the subsections). Does it make sense to create the Blink
> documentation as a separate one, under the documentation section, say blink-1.5
> (temporary, not a release)?
>
> Regards,
> Shaoxuan
>
>
> On Wed, Jan 23, 2019 at 10:15 PM Stephan Ewen <[hidden email]> wrote:
>
> > Nice to see this lively discussion.
> >
> > *--- Branch Versus Repository ---*
> >
> > Looks like this is converging towards pushing a branch.
> > How about naming the branch simply "blink-1.5"? That would be in line with
> > the 1.5 version branch of Flink, which is simply called "release-1.5".
> >
> > *--- SGA --- *
> >
> > The SGA (Software Grant Agreement) should be either filed already or in the
> > process of filing.
> >
> > *--- Offering Jars for Blink ---*
> >
> > As Chesnay and Timo mentioned, we cannot easily offer a "Release" of Blink
> > (source or binary), because that would require a thorough checking of
> > licenses and creating/bundling license files. That is a lot of work, as we
> > recently experienced again in the Flink master.
> >
> > What we can do is upload compiled jar files and link to them somewhere in
> > the Blink docs. We need to add a disclaimer that these are convenience jars,
> > and not an official Apache release. I hope that would work for the users
> > that are curious to try things out.
> >
> > *--- Docs for Blink --- *
> >
> > Do we need a versioned website here? If not, can we simply make this a
> > subsection of the current Flink snapshot docs?
> > Next to "Flink Development" and "Internals", we could have a section on
> > "Blink branch".
> > I think it is crucial, though, to make it clear that this is temporary and
> > will eventually be subsumed by the main release, just so that users do not
> > get confused.
> >
> > Best,
> > Stephan
> >
> >
> > On Wed, Jan 23, 2019 at 12:23 PM Becket Qin <[hidden email]> wrote:
> >
> > > Really excited to see Blink joining the Flink community!
> > >
> > > My two cents regarding repo vs. branch: I am +1 for a branch in Flink.
> > > Among many things, what's most important at this point is probably to make
> > > the Blink code available to the developers so people can discuss the merge
> > > strategy. Creating a branch is probably one of the fastest ways to do
> > > that. We can always create a separate repo later if necessary.
> > >
> > > WRT the doc and jar distribution, it is true that we are going to have
> > > some major refactoring of the code. But I can imagine some curious users
> > > may still want to try out something in Blink, and it would be good if we can
> > > do them a favor. Legally, my hunch is that it is probably OK for someone
> > > to just build the jars and docs and host them somewhere for convenience. But
> > > it should be clear that this is just for convenience and not an
> > > official release from Apache (unless we would like to make it official).
> > >
> > > Thanks,
> > >
> > > Jiangjie (Becket) Qin
> > >
> > > On Wed, Jan 23, 2019 at 6:48 PM Chesnay Schepler <[hidden email]>
> > > wrote:
> > >
> > >> From the ASF side, jar files do not require a vote/release process; this
> > >> is at the discretion of the PMC.
> > >>
> > >> However, I have my doubts whether at this time we could even create a
> > >> source release of Blink, given that we'd have to vet the code base first.
> > >>
> > >> Even without source release we could still distribute jars, but would
> > >> not be allowed to advertise them to users as they do not constitute an
> > >> official release.
> > >>
> > >> On 23.01.2019 11:41, Timo Walther wrote:
> > >> > As far as I know, we will not provide any binaries but only the
> > >> > source code. JAR files on Apache servers would need an official
> > >> > voting/release process. Interested users can build Blink themselves
> > >> > using `mvn clean package`.
> > >> >
> > >> > @Stephan: Please correct me if I'm wrong.
> > >> >
> > >> > Regards,
> > >> > Timo
> > >> >
> > >> > On 23.01.19 at 11:16, Kurt Young wrote:
> > >> >> Hi Timo,
> > >> >>
> > >> >> What about the jar files? Will Blink's jars be uploaded to the Apache
> > >> >> repository? If not, I think it will be very inconvenient for users who
> > >> >> want to try Blink and consult the documentation when they need help.
> > >> >>
> > >> >> Best,
> > >> >> Kurt
> > >> >>
> > >> >>
> > >> >> On Wed, Jan 23, 2019 at 6:09 PM Timo Walther <[hidden email]> wrote:
> > >> >>
> > >> >>> Hi Kurt,
> > >> >>>
> > >> >>> I would not make Blink's documentation visible to users or search
> > >> >>> engines via a website. Otherwise this would communicate that Blink is an
> > >> >>> official release. I would suggest putting the Blink docs into `/docs`,
> > >> >>> and people can build them with `./docs/build.sh -pi` if they are interested.
> > >> >>> I would not invest time into setting up a docs infrastructure.
> > >> >>>
> > >> >>> Regards,
> > >> >>> Timo
> > >> >>>
> > >> >>> On 23.01.19 at 08:56, Kurt Young wrote:
> > >> >>>> Thanks @Stephan for this exciting announcement!
> > >> >>>>
> > >> >>>> From my point of view, I would prefer to use a branch. It makes the message
> > >> >>>> "Blink is part of Flink" more straightforward and clear.
> > >> >>>>
> > >> >>>> Apart from the location of the Blink code, there are some other questions,
> > >> >>>> such as what version we should use and where we put Blink's documentation.
> > >> >>>> Currently, we use "1.5.1-blink-r0" as Blink's version, since Blink was
> > >> >>>> forked from Flink 1.5.1. We also added some docs to Blink, just as Flink
> > >> >>>> did. Can Blink use a website like
> > >> >>>> "https://ci.apache.org/projects/flink/flink-docs-release-1.7/" to host all
> > >> >>>> of its docs, changed to something like
> > >> >>>> https://ci.apache.org/projects/flink/flink-docs-blink-r0/ ?
> > >> >>>>
> > >> >>>> Best,
> > >> >>>> Kurt
> > >> >>>>
> > >> >>>>
> > >> >>>> On Wed, Jan 23, 2019 at 10:55 AM Hequn Cheng <[hidden email]> wrote:
> > >> >>>>> Hi all,
> > >> >>>>>
> > >> >>>>> @Stephan Thanks a lot for driving these efforts. I think a lot of people
> > >> >>>>> are already waiting for this.
> > >> >>>>> +1 for opening the Blink source code.
> > >> >>>>> Either a separate repository or a special branch is OK for me.
> > >> >>>>> Hopefully, this will not last too long.
> > >> >>>>>
> > >> >>>>> Best, Hequn
> > >> >>>>>
> > >> >>>>>
> > >> >>>>> On Tue, Jan 22, 2019 at 11:35 PM Jark Wu <[hidden email]> wrote:
> > >> >>>>>
> > >> >>>>>> Great news! Looking forward to the new wave of developments.
> > >> >>>>>>
> > >> >>>>>> If Blink needs to be continuously updated, with its own bug fixes and
> > >> >>>>>> releases, maybe a separate repository is a better idea.
> > >> >>>>>>
> > >> >>>>>> Best,
> > >> >>>>>> Jark
> > >> >>>>>>
> > >> >>>>>> On Tue, 22 Jan 2019 at 18:29, Dominik Wosiński <[hidden email]> wrote:
> > >> >>>>>>> Hey!
> > >> >>>>>>> I also think that creating a separate branch for Blink in the Flink repo
> > >> >>>>>>> is a better idea than creating a fork, as IMHO it will allow merging
> > >> >>>>>>> changes more easily.
> > >> >>>>>>>
> > >> >>>>>>> Best Regards,
> > >> >>>>>>> Dom.
> > >> >>>>>>>
> > >> >>>>>>> On Tue, 22 Jan 2019 at 10:09, Ufuk Celebi <[hidden email]> wrote:
> > >> >>>>>>>
> > >> >>>>>>>> Hey Stephan and others,
> > >> >>>>>>>>
> > >> >>>>>>>> thanks for the summary. I'm very excited about the outlined
> > >> >>>>>>>> improvements. :-)
> > >> >>>>>>>>
> > >> >>>>>>>> Separate branch vs. fork: I'm fine with either of the suggestions.
> > >> >>>>>>>> Depending on the expected strategy for merging the changes, the expected
> > >> >>>>>>>> number of additional changes, etc., either one or the other approach
> > >> >>>>>>>> might be better suited.
> > >> >>>>>>>>
> > >> >>>>>>>> – Ufuk
> > >> >>>>>>>>
> > >> >>>>>>>> On Tue, Jan 22, 2019 at 9:20 AM Kurt Young <[hidden email]> wrote:
> > >> >>>>>>>>> Hi Driesprong,
> > >> >>>>>>>>>
> > >> >>>>>>>>> Glad to hear that you're interested in Blink's code. Actually, Blink
> > >> >>>>>>>>> only has one branch by itself, so either a separate repo or a branch
> > >> >>>>>>>>> in Flink works for sharing Blink's code.
> > >> >>>>>>>>>
> > >> >>>>>>>>> Best,
> > >> >>>>>>>>> Kurt
> > >> >>>>>>>>>
> > >> >>>>>>>>>
> > >> >>>>>>>>> On Tue, Jan 22, 2019 at 2:30 PM Driesprong, Fokko <[hidden email]> wrote:
> > >> >>>>>>>>>
> > >> >>>>>>>>>> Great news Stephan!
> > >> >>>>>>>>>>
> > >> >>>>>>>>>> Why not make the code available by having a fork of Flink on Alibaba's
> > >> >>>>>>>>>> GitHub account? This will allow us to do easy diffs in the GitHub UI and
> > >> >>>>>>>>>> create PRs of cherry-picked commits if needed. I can imagine that the
> > >> >>>>>>>>>> Blink codebase has a lot of branches by itself, so just pushing a couple of
> > >> >>>>>>>>>> branches to the main Flink repo is not ideal. Looking forward to it!
> > >> >>>>>>>>>> Cheers, Fokko
> > >> >>>>>>>>>>
> > >> >>>>>>>>>>
> > >> >>>>>>>>>>
> > >> >>>>>>>>>>
> > >> >>>>>>>>>>
> > >> >>>>>>>>>> On Tue, 22 Jan 2019 at 03:48, Shaoxuan Wang <[hidden email]> wrote:
> > >> >>>>>>>>>>> Big +1 to contributing the Blink codebase directly into the Apache Flink
> > >> >>>>>>>>>>> project.
> > >> >>>>>>>>>>> Looking forward to the new journey.
> > >> >>>>>>>>>>>
> > >> >>>>>>>>>>> Regards,
> > >> >>>>>>>>>>> Shaoxuan
> > >> >>>>>>>>>>>
> > >> >>>>>>>>>>> On Tue, Jan 22, 2019 at 3:52 AM Xiaowei Jiang <[hidden email]> wrote:
> > >> >>>>>>>>>>>> Thanks Stephan! We are hoping to make the process as non-disruptive as
> > >> >>>>>>>>>>>> possible to the Flink community. Making the Blink codebase public is the
> > >> >>>>>>>>>>>> first step that hopefully facilitates further discussions.
> > >> >>>>>>>>>>>> Xiaowei
> > >> >>>>>>>>>>>>
> > >> >>>>>>>>>>>> On Monday, January 21, 2019, 11:46:28 AM PST, Stephan Ewen <[hidden email]> wrote:
> > >> >>>>>>>>>>>>
> > >> >>>>>>>>>>>> Dear Flink Community!
> > >> >>>>>>>>>>>>
> > >> >>>>>>>>>>>> Some of you may have heard it already from announcements or from a Flink
> > >> >>>>>>>>>>>> Forward talk: Alibaba has decided to open source its in-house improvements
> > >> >>>>>>>>>>>> to Flink, called Blink! First of all, big thanks to the team that developed
> > >> >>>>>>>>>>>> these improvements and made this contribution possible!
> > >> >>>>>>>>>>>>
> > >> >>>>>>>>>>>> Blink has some very exciting enhancements, most prominently on the Table
> > >> >>>>>>>>>>>> API/SQL side and the unified execution of these programs. For batch
> > >> >>>>>>>>>>>> (bounded) data, the SQL execution has full TPC-DS coverage (which is a big
> > >> >>>>>>>>>>>> deal), and the execution is more than 10x faster than the current SQL
> > >> >>>>>>>>>>>> runtime in Flink. Blink has also added support for catalogs, improved the
> > >> >>>>>>>>>>>> failover speed of batch queries and the resource management. It also makes
> > >> >>>>>>>>>>>> some good steps in the direction of more deeply unifying the batch and
> > >> >>>>>>>>>>>> streaming execution.
> > >> >>>>>>>>>>>>
> > >> >>>>>>>>>>>> The proposal is to merge Blink's enhancements into Flink, to give Flink's
> > >> >>>>>>>>>>>> SQL/Table API and execution a big boost in usability and performance.
> > >> >>>>>>>>>>>>
> > >> >>>>>>>>>>>> Just to avoid any confusion: This is not a suggested change of focus to
> > >> >>>>>>>>>>>> batch processing, nor would this break with any of the streaming
> > >> >>>>>>>>>>>> architecture and vision of Flink. This contribution follows very much the
> > >> >>>>>>>>>>>> principle of "batch is a special case of streaming". As a special case,
> > >> >>>>>>>>>>>> batch makes special optimizations possible. In its current state, Flink does
> > >> >>>>>>>>>>>> not exploit many of these optimizations. This contribution adds exactly
> > >> >>>>>>>>>>>> these optimizations and makes the streaming model of Flink applicable to
> > >> >>>>>>>>>>>> harder batch use cases.
> > >> >>>>>>>>>>>>
> > >> >>>>>>>>>>>> Assuming that the community is excited about this as well, and in favor of
> > >> >>>>>>>>>>>> these enhancements to Flink's capabilities, below are some thoughts on how
> > >> >>>>>>>>>>>> this contribution and integration could work.
> > >> >>>>>>>>>>>>
> > >> >>>>>>>>>>>> --- Making the code available ---
> > >> >>>>>>>>>>>>
> > >> >>>>>>>>>>>> At the moment, the Blink code is in the form of a big Flink fork (rather
> > >> >>>>>>>>>>>> than isolated patches on top of Flink), so the integration is unfortunately
> > >> >>>>>>>>>>>> not as easy as merging a few patches or pull requests.
> > >> >>>>>>>>>>>>
> > >> >>>>>>>>>>>> To support a non-disruptive merge of such a big contribution, I believe it
> > >> >>>>>>>>>>>> makes sense to make the code of the fork available in the Flink project
> > >> >>>>>>>>>>>> first. From there on, we can start to work on the details for merging the
> > >> >>>>>>>>>>>> enhancements, including the refactoring of the necessary parts in the Flink
> > >> >>>>>>>>>>>> master and the Blink code to make a merge possible without repeatedly
> > >> >>>>>>>>>>>> breaking compatibility.
> > >> >>>>>>>>>>>>
> > >> >>>>>>>>>>>> The first question is where do we put the code of the Blink fork during the
> > >> >>>>>>>>>>>> merging procedure? My first thought was to temporarily add a repository
> > >> >>>>>>>>>>>> (like "flink-blink-staging"), but we could also put it into a special
> > >> >>>>>>>>>>>> branch in the main Flink repository.
> > >> >>>>>>>>>>>>
> > >> >>>>>>>>>>>> I will start a separate thread about discussing a possible strategy to
> > >> >>>>>>>>>>>> handle and merge such a big contribution.
> > >> >>>>>>>>>>>>
> > >> >>>>>>>>>>>> Best,
> > >> >>>>>>>>>>>> Stephan
> > >> >>>>>>>>>>>>
> > >> >>>
> > >> >
> > >> >
> > >>
> > >>
> >
>

Re: [ANNOUNCE] Contributing Alibaba's Blink

Timo Walther-2
Regarding the content of a `blink-1.5` branch, is it possible to rebase
the big Blink commit on top of the current master or the last Flink release?

I don't mean a full rebase here, but just forking the branch from
current Flink, putting the Blink content into the repository, and
committing it. This would make it possible to see in a diff which classes
and lines have changed and which are still the same. I guess this would be
much more helpful than a branch with one big commit that has no common origin.
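
A minimal sketch of what this could look like, run in a throwaway sandbox
with tiny stand-in trees rather than the real code bases. Only the branch
name `blink-1.5` comes from this thread; the file names, the placeholder
committer identity, and the commit messages are hypothetical.

```shell
#!/bin/sh
set -eu

# Throwaway sandbox; both trees below are tiny stand-ins, not real Flink/Blink code.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Stand-in for the Flink repository with a single baseline commit.
git init -q "$work/flink"
cd "$work/flink"
git config user.email "dev@example.invalid"   # placeholder identity
git config user.name "Example Dev"
printf 'shared\n' > Common.java
printf 'old planner\n' > Planner.java
git add -A
git commit -q -m "Flink baseline"
base=$(git rev-parse HEAD)

# Stand-in for the Blink source drop: a plain directory without history.
mkdir "$work/blink"
printf 'shared\n' > "$work/blink/Common.java"
printf 'new planner\n' > "$work/blink/Planner.java"
printf 'catalog\n' > "$work/blink/Catalog.java"

# Fork the branch from the Flink commit, swap in the Blink tree, commit once.
git checkout -q -b blink-1.5 "$base"
git rm -r -q .
cp -R "$work/blink/." .
git add -A
git commit -q -m "Import Blink sources on top of Flink"

# The branch shares history with the baseline, so an ordinary diff works.
git diff --name-status "$base" blink-1.5
```

In this toy setup the final diff lists `Catalog.java` as added and
`Planner.java` as modified, while the unchanged `Common.java` does not show
up. For the real branch, the starting point would be the current master or
the last release, and the copied tree would be the Blink code.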

Thanks,
Timo

>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>> Stephan
>>>>>>>>>>>>>>>>>
>>>>>>
>>>>>


Re: [ANNOUNCE] Contributing Alibaba's Blink

Kurt Young
Sure, I will do the rebase before pushing the branch.

On Thu, Jan 24, 2019 at 18:20, Timo Walther <[hidden email]> wrote:

> Regarding the content of a `blink-1.5` branch, is it possible to rebase
> the big Blink commit on top of the current master or the last Flink
> release?
>
> I don't mean a full rebase here, but just forking the branch from
> current Flink, and putting the Blink content into the repository, and
> commit it. This would enable to see a diff which classes and lines have
> changed and which are still the same. I guess this would be very helpful
> instead of a branch with a big commit that has no common origin.
>
> Thanks,
> Timo
>
> On 24.01.19 at 02:54, Becket Qin wrote:
> > Thanks Stephan,
> >
> > The plan makes sense to me.
> >
> > Regarding the docs, it seems better to have a separate versioned website
> > because there are a lot of changes spread over the places. We can add the
> > banner to remind users that they are looking at the blink docs, which is
> > temporary and will eventually be merged into Flink master. (The banner is
> > pretty similar to what user will see when they visit docs of old flink
> > versions
> > <
> https://ci.apache.org/projects/flink/flink-docs-release-1.5/dev/libs/ml/quickstart.html
> >
> > [1]).
> >
> > Thanks,
> >
> > Jiangjie (Becket) Qin
> >
> > [1]
> >
> https://ci.apache.org/projects/flink/flink-docs-release-1.5/dev/libs/ml/quickstart.html
> >
> > On Thu, Jan 24, 2019 at 6:21 AM Shaoxuan Wang <[hidden email]>
> wrote:
> >
> >> Thanks Stephan,
> >> The entire plan looks good to me. WRT the "Docs for Flink", a subsection
> >> should be good enough if we just introduce the outlines of what blink
> has
> >> changed. However, we have made detailed introductions to blink based on
> the
> >> framework of current release document of Flink (those introductions are
> >> distributed in each subsections). Does it make sense to create a blink
> >> document as a separate one, under the documentation section, say
> blink-1.5
> >> (temporary, not a release).
> >>
> >> Regards,
> >> Shaoxuan
> >>
> >>
> >> On Wed, Jan 23, 2019 at 10:15 PM Stephan Ewen <[hidden email]> wrote:
> >>
> >>> Nice to see this lively discussion.
> >>>
> >>> *--- Branch Versus Repository ---*
> >>>
> >>> Looks like this is converging towards pushing a branch.
> >>> How about naming the branch simply "blink-1.5" ? That would be in line
> >> with
> >>> the 1.5 version branch of Flink, which is simply called "release-1.5" ?
> >>>
> >>> *--- SGA --- *
> >>>
> >>> The SGA (Software Grant Agreement) should be either filed already or in
> >> the
> >>> process of filing.
> >>>
> >>> *--- Offering Jars for Blink ---*
> >>>
> >>> As Chesnay and Timo mentioned, we cannot easily offer a "Release" of
> >> Blink
> >>> (source or binary), because that would require a thorough
> >>> checking of licenses and creating/ bundling license files. That is a
> lot
> >> of
> >>> work, as we recently experienced again in the Flink master.
> >>>
> >>> What we can do is upload compiled jar files and link to them somewhere
> in
> >>> the blink docs. We need to add a disclaimer that these are
> >>> convenience jars, and not an official Apache release. I hope that would
> >>> work for the users that are curious to try things out.
> >>>
> >>> *--- Docs for Blink --- *
> >>>
> >>> Do we need a versioned website here? If not, can we simply make this a
> >>> subsection of the current Flink snapshot docs?
> >>> Next to "Flink Development" and "Internals", we could have a section on
> >>> "Blink branch".
> >>> I think it is crucial, though, to make it clear that this is temporary
> >> and
> >>> will eventually be subsumed by the main release, just
> >>> so that users do not get confused.
> >>>
> >>> Best,
> >>> Stephan
> >>>
> >>>
> >>> On Wed, Jan 23, 2019 at 12:23 PM Becket Qin <[hidden email]>
> >> wrote:
> >>>> Really excited to see Blink joining the Flink community!
> >>>>
> >>>> My two cents regarding repo v.s. branch, I am +1 for a branch in
> Flink.
> >>>> Among many things, what's most important at this point is probably to
> >>> make
> >>>> Blink code available to the developers so people can discuss the merge
> >>>> strategy. Creating a branch is probably the one of the fastest way to
> >> do
> >>>> that. We can always create separate repo later if necessary.
> >>>>
> >>>> WRT the doc and jar distribution, It is true that we are going to have
> >>>> some major refactoring to the code. But I can imagine some curious
> >> users
> >>>> may still want to try out something in Blink and it would be good if
> we
> >>> can
> >>>> do them a favor. Legal wise, my hunch is that it is probably OK for
> >>> someone
> >>>> to just build the jars and docs, host it somewhere for convenience.
> But
> >>> it
> >>>> should be clear that this is just for convenience purpose instead of
> an
> >>>> official release from Apache (unless we would like to make it
> >> official).
> >>>> Thanks,
> >>>>
> >>>> Jiangjie (Becket) Qin
> >>>>
> >>>> On Wed, Jan 23, 2019 at 6:48 PM Chesnay Schepler <[hidden email]>
> >>>> wrote:
> >>>>
> >>>>>   From the ASF side Jar files do not require a vote/release process,
> >> this
> >>>>> is at the discretion of the PMC.
> >>>>>
> >>>>> However, I have my doubts whether at this time we could even create a
> >>>>> source release of Blink given that we'd have to vet the code-base
> >> first.
> >>>>> Even without source release we could still distribute jars, but would
> >>>>> not be allowed to advertise them to users as they do not constitute
> an
> >>>>> official release.
> >>>>>
> >>>>> On 23.01.2019 11:41, Timo Walther wrote:
> >>>>>> As far as I know it, we will not provide any binaries but only the
> >>>>>> source code. JAR files on Apache servers would need an official
> >>>>>> voting/release process. Interested users can build Blink themselves
> >>>>>> using `mvn clean package`.
> >>>>>>
> >>>>>> @Stephan: Please correct me if I'm wrong.
> >>>>>>
> >>>>>> Regards,
> >>>>>> Timo
> >>>>>>
> >>>>>> On 23.01.19 at 11:16, Kurt Young wrote:
> >>>>>>> Hi Timo,
> >>>>>>>
> >>>>>>> What about the jar files, will blink's jar be uploaded to apache
> >>>>>>> repository? If not, i think it will be very inconvenient for users
> >>> who
> >>>>>>> wants to try blink and view the documents if they need some help
> >> from
> >>>>>>> doc.
> >>>>>>>
> >>>>>>> Best,
> >>>>>>> Kurt
> >>>>>>>
> >>>>>>>
> >>>>>>> On Wed, Jan 23, 2019 at 6:09 PM Timo Walther <[hidden email]>
> >>>>> wrote:
> >>>>>>>> Hi Kurt,
> >>>>>>>>
> >>>>>>>> I would not make the Blink's documentation visible to users or
> >>> search
> >>>>>>>> engines via a website. Otherwise this would communicate that Blink
> >>>>>>>> is an
> >>>>>>>> official release. I would suggest to put the Blink docs into
> >> `/docs`
> >>>>>>>> and
> >>>>>>>> people can build it with `./docs/build.sh -pi` if they are
> >>>>> interested.
> >>>>>>>> I would not invest time into setting up a docs infrastructure.
> >>>>>>>>
> >>>>>>>> Regards,
> >>>>>>>> Timo
> >>>>>>>>
> >>>>>>>> On 23.01.19 at 08:56, Kurt Young wrote:
> >>>>>>>>> Thanks @Stephan for this exciting announcement!
> >>>>>>>>>
> >>>>>>>>> From my point of view, I would prefer to use a branch. It makes
> >> the
> >>>>>>>> message
> >>>>>>>>> "Blink is part of Flink" more straightforward and clear.
> >>>>>>>>>
> >>>>>>>>> Except for the location of blink codes, there are some other
> >>>>> questions
> >>>>>>>> like
> >>>>>>>>> what version we should use, and where do we put blink's
> >>>>> documents.
> >>>>>>>>> Currently, we choose to use "1.5.1-blink-r0" as blink's version
> >>> since
> >>>>>>>> blink
> >>>>>>>>> forked from Flink's 1.5.1. We also added some docs to blink just
> >> as
> >>>>>>>>> Flink
> >>>>>>>>> did. Can blink use a website like
> >>>>>>>>> "https://ci.apache.org/projects/flink/flink-docs-release-1.7/"
> >> to
> >>>>> put
> >>>>>>>> all
> >>>>>>>>> blink's docs, change it to something like
> >>>>>>>>> https://ci.apache.org/projects/flink/flink-docs-blink-r0/ ?
> >>>>>>>>>
> >>>>>>>>> Best,
> >>>>>>>>> Kurt
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On Wed, Jan 23, 2019 at 10:55 AM Hequn Cheng <
> >> [hidden email]
> >>>>>>>> wrote:
> >>>>>>>>>> Hi all,
> >>>>>>>>>>
> >>>>>>>>>> @Stephan  Thanks a lot for driving these efforts. I think a lot
> >> of
> >>>>>>>> people
> >>>>>>>>>> is already waiting for this.
> >>>>>>>>>> +1 for opening the blink source code.
> >>>>>>>>>> Both a separate repository or a special branch is ok for me.
> >>>>>>>>>> Hopefully,
> >>>>>>>>>> this will not last too long.
> >>>>>>>>>>
> >>>>>>>>>> Best, Hequn
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On Tue, Jan 22, 2019 at 11:35 PM Jark Wu <[hidden email]>
> >>> wrote:
> >>>>>>>>>>> Great news! Looking forward to the new wave of developments.
> >>>>>>>>>>>
> >>>>>>>>>>> If Blink needs to be continuously updated, fix bugs, release
> >>>>>>>>>>> versions,
> >>>>>>>>>>> maybe a separate repository is a better idea.
> >>>>>>>>>>>
> >>>>>>>>>>> Best,
> >>>>>>>>>>> Jark
> >>>>>>>>>>>
> >>>>>>>>>>> On Tue, 22 Jan 2019 at 18:29, Dominik Wosiński <
> >> [hidden email]
> >>>>>>>> wrote:
> >>>>>>>>>>>> Hey!
> >>>>>>>>>>>> I also think that creating the separate branch for Blink in
> >>>>>>>>>>>> Flink repo
> >>>>>>>>>>> is a
> >>>>>>>>>>>> better idea than creating the fork as IMHO it will allow
> >> merging
> >>>>>>>>>> changes
> >>>>>>>>>>>> more easily.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Best Regards,
> >>>>>>>>>>>> Dom.
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Tue, 22 Jan 2019 at 10:09, Ufuk Celebi <[hidden email]>
> >>> wrote:
> >>>>>>>>>>>>> Hey Stephan and others,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> thanks for the summary. I'm very excited about the outlined
> >>>>>>>>>>> improvements.
> >>>>>>>>>>>>> :-)
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Separate branch vs. fork: I'm fine with either of the
> >>>>> suggestions.
> >>>>>>>>>>>>> Depending on the expected strategy for merging the changes,
> >>>>>>>>>>>>> expected
> >>>>>>>>>>>>> number of additional changes, etc., either one or the other
> >>>>>>>>>>>>> approach
> >>>>>>>>>>>>> might be better suited.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> – Ufuk
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Tue, Jan 22, 2019 at 9:20 AM Kurt Young <[hidden email]
> >>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>> Hi Driesprong,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Glad to hear that you're interested with blink's codes.
> >>>>> Actually,
> >>>>>>>>>>> blink
> >>>>>>>>>>>>>> only has one branch by itself, so either a separated repo
> >> or a
> >>>>>>>>>>> flink's
> >>>>>>>>>>>>>> branch works for blink's code share.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>> Kurt
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On Tue, Jan 22, 2019 at 2:30 PM Driesprong, Fokko
> >>>>>>>>>>> <[hidden email]
> >>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Great news Stephan!
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Why not make the code available by having a fork of Flink
> >> on
> >>>>>>>>>>>> Alibaba's
> >>>>>>>>>>>>>>> Github account. This will allow us to do easy diff's in the
> >>>>>>>>>> Github
> >>>>>>>>>>> UI
> >>>>>>>>>>>>> and
> >>>>>>>>>>>>>>> create PR's of cherry-picked commits if needed. I can
> >> imagine
> >>>>>>>>>> that
> >>>>>>>>>>>> the
> >>>>>>>>>>>>>>> Blink codebase has a lot of branches by itself, so just
> >>>>>>>>>>>>>>> pushing a
> >>>>>>>>>>>>> couple of
> >>>>>>>>>>>>>>> branches to the main Flink repo is not ideal. Looking
> >> forward
> >>>>> to
> >>>>>>>>>>> it!
> >>>>>>>>>>>>>>> Cheers, Fokko
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On Tue, 22 Jan 2019 at 03:48, Shaoxuan Wang <
> >>>>>>>>>>>> [hidden email]
> >>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>> big +1 to contribute Blink codebase directly into the
> >> Apache
> >>>>>>>>>>> Flink
> >>>>>>>>>>>>>>> project.
> >>>>>>>>>>>>>>>> Looking forward to the new journey.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Regards,
> >>>>>>>>>>>>>>>> Shaoxuan
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> On Tue, Jan 22, 2019 at 3:52 AM Xiaowei Jiang <
> >>>>>>>>>>> [hidden email]>
> >>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>     Thanks Stephan! We are hoping to make the process as
> >>>>>>>>>>>>> non-disruptive as
> >>>>>>>>>>>>>>>>> possible to the Flink community. Making the Blink
> >> codebase
> >>>>>>>>>>> public
> >>>>>>>>>>>>> is
> >>>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>> first step that hopefully facilitates further
> >> discussions.
> >>>>>>>>>>>>>>>>> Xiaowei
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>        On Monday, January 21, 2019, 11:46:28 AM PST,
> >> Stephan
> >>>>>>>>>> Ewen
> >>>>>>>>>>> <
> >>>>>>>>>>>>>>>>> [hidden email]> wrote:
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>     Dear Flink Community!
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Some of you may have heard it already from announcements
> >> or
> >>>>>>>>>>> from
> >>>>>>>>>>>> a
> >>>>>>>>>>>>>>> Flink
> >>>>>>>>>>>>>>>>> Forward talk:
> >>>>>>>>>>>>>>>>> Alibaba has decided to open source its in-house
> >>> improvements
> >>>>>>>>>> to
> >>>>>>>>>>>>> Flink,
> >>>>>>>>>>>>>>>>> called Blink!
> >>>>>>>>>>>>>>>>> First of all, big thanks to team that developed these
> >>>>>>>>>>>> improvements
> >>>>>>>>>>>>> and
> >>>>>>>>>>>>>>>> made
> >>>>>>>>>>>>>>>>> this
> >>>>>>>>>>>>>>>>> contribution possible!
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Blink has some very exciting enhancements, most
> >> prominently
> >>>>>>>>>> on
> >>>>>>>>>>>> the
> >>>>>>>>>>>>>>> Table
> >>>>>>>>>>>>>>>>> API/SQL side
> >>>>>>>>>>>>>>>>> and the unified execution of these programs. For batch
> >>>>>>>>>>> (bounded)
> >>>>>>>>>>>>> data,
> >>>>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>> SQL execution
> >>>>>>>>>>>>>>>>> has full TPC-DS coverage (which is a big deal), and the
> >>>>>>>>>>> execution
> >>>>>>>>>>>>> is
> >>>>>>>>>>>>>>> more
> >>>>>>>>>>>>>>>>> than 10x faster
> >>>>>>>>>>>>>>>>> than the current SQL runtime in Flink. Blink has also
> >> added
> >>>>>>>>>>>>> support for
> >>>>>>>>>>>>>>>>> catalogs,
> >>>>>>>>>>>>>>>>> improved the failover speed of batch queries and the
> >>> resource
> >>>>>>>>>>>>>>> management.
> >>>>>>>>>>>>>>>>> It also
> >>>>>>>>>>>>>>>>> makes some good steps in the direction of more deeply
> >>>>>>>>>> unifying
> >>>>>>>>>>>> the
> >>>>>>>>>>>>>>> batch
> >>>>>>>>>>>>>>>>> and streaming
> >>>>>>>>>>>>>>>>> execution.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> The proposal is to merge Blink's enhancements into Flink,
> >>> to
> >>>>>>>>>>> give
> >>>>>>>>>>>>>>> Flink's
> >>>>>>>>>>>>>>>>> SQL/Table API and
> >>>>>>>>>>>>>>>>> execution a big boost in usability and performance.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Just to avoid any confusion: This is not a suggested
> >> change
> >>>>>>>>>> of
> >>>>>>>>>>>>> focus to
> >>>>>>>>>>>>>>>>> batch processing,
> >>>>>>>>>>>>>>>>> nor would this break with any of the streaming
> >> architecture
> >>>>>>>>>> and
> >>>>>>>>>>>>> vision
> >>>>>>>>>>>>>>> of
> >>>>>>>>>>>>>>>>> Flink.
> >>>>>>>>>>>>>>>>> This contribution follows very much the principle of
> >> "batch
> >>>>>>>>>> is
> >>>>>>>>>>> a
> >>>>>>>>>>>>>>> special
> >>>>>>>>>>>>>>>>> case of streaming".
> >>>>>>>>>>>>>>>>> As a special case, batch makes special optimizations
> >>>>>>>>>> possible.
> >>>>>>>>>>> In
> >>>>>>>>>>>>> its
> >>>>>>>>>>>>>>>>> current state,
> >>>>>>>>>>>>>>>>> Flink does not exploit many of these optimizations. This
> >>>>>>>>>>>>> contribution
> >>>>>>>>>>>>>>>> adds
> >>>>>>>>>>>>>>>>> exactly these
> >>>>>>>>>>>>>>>>> optimizations and makes the streaming model of Flink
> >>>>>>>>>> applicable
> >>>>>>>>>>>> to
> >>>>>>>>>>>>>>> harder
> >>>>>>>>>>>>>>>>> batch use cases.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Assuming that the community is excited about this as
> >> well,
> >>>>>>>>>> and
> >>>>>>>>>>> in
> >>>>>>>>>>>>> favor
> >>>>>>>>>>>>>>>> of
> >>>>>>>>>>>>>>>>> these enhancements
> >>>>>>>>>>>>>>>>> to Flink's capabilities, below are some thoughts on how
> >>> this
> >>>>>>>>>>>>>>> contribution
> >>>>>>>>>>>>>>>>> and integration
> >>>>>>>>>>>>>>>>> could work.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> --- Making the code available ---
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> At the moment, the Blink code is in the form of a big
> >> Flink
> >>>>>>>>>>> fork
> >>>>>>>>>>>>>>> (rather
> >>>>>>>>>>>>>>>>> than isolated
> >>>>>>>>>>>>>>>>> patches on top of Flink), so the integration is
> >>> unfortunately
> >>>>>>>>>>> not
> >>>>>>>>>>>>> as
> >>>>>>>>>>>>>>> easy
> >>>>>>>>>>>>>>>>> as merging a
> >>>>>>>>>>>>>>>>> few patches or pull requests.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> To support a non-disruptive merge of such a big
> >>>>>>>>>> contribution, I
> >>>>>>>>>>>>> believe
> >>>>>>>>>>>>>>>> it
> >>>>>>>>>>>>>>>>> make sense to make
> >>>>>>>>>>>>>>>>> the code of the fork available in the Flink project
> >> first.
> >>>>>>>>>>>>>>>>>    From there on, we can start to work on the details for
> >>>>>>>>>> merging
> >>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>> enhancements, including
> >>>>>>>>>>>>>>>>> the refactoring of the necessary parts in the Flink
> >> master
> >>>>>>>>>> and
> >>>>>>>>>>>> the
> >>>>>>>>>>>>>>> Blink
> >>>>>>>>>>>>>>>>> code to make a
> >>>>>>>>>>>>>>>>> merge possible without repeatedly breaking compatibility.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> The first question is where do we put the code of the
> >> Blink
> >>>>>>>>>>> fork
> >>>>>>>>>>>>> during
> >>>>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>> merging procedure?
> >>>>>>>>>>>>>>>>> My first thought was to temporarily add a repository
> >> (like
> >>>>>>>>>>>>>>>>> "flink-blink-staging"), but we could
> >>>>>>>>>>>>>>>>> also put it into a special branch in the main Flink
> >>>>>>>>>> repository.
> >>>>>>>>>>>>>>>>> I will start a separate thread about discussing a
> >> possible
> >>>>>>>>>>>>> strategy to
> >>>>>>>>>>>>>>>>> handle and merge
> >>>>>>>>>>>>>>>>> such a big contribution.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>>> Stephan
> >>>>>>>>>>>>>>>>>
> >>>>>>
> >>>>>
>
> --
Best,
Kurt

Re: [ANNOUNCE] Contributing Alibaba's Blink

bowen.li
Exciting to see this happening!

Wrt the docs, have we done a diff that shows how much Flink's and Blink's
documentation (flink/docs) differ? For example, how many pages differ, and
what percentage of each page is different? How many new pages (for new
features) does Blink have? If we have such a summary or visualization, it
may give us a better idea of which approach we should go with.
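
A summary of that kind could be produced with a short script. The following
is a minimal sketch, not an existing tool: it assumes both documentation
trees are checked out locally and that the pages are Markdown files. The
function name `compare_docs`, the `*.md` pattern, and the example paths are
hypothetical, and the "percent different" figure is only one possible
line-based measure, taken from Python's `difflib`.

```python
import difflib
from pathlib import Path


def compare_docs(flink_docs, blink_docs, pattern="*.md"):
    """Summarize page-level differences between two documentation trees.

    Assumption: both arguments are local directories holding the docs of
    each project; nothing here is specific to Flink or Blink.
    """
    flink_docs, blink_docs = Path(flink_docs), Path(blink_docs)
    # Index pages by their path relative to the docs root.
    flink_pages = {p.relative_to(flink_docs): p for p in flink_docs.rglob(pattern)}
    blink_pages = {p.relative_to(blink_docs): p for p in blink_docs.rglob(pattern)}

    shared = sorted(flink_pages.keys() & blink_pages.keys())
    new_pages = sorted(blink_pages.keys() - flink_pages.keys())
    removed_pages = sorted(flink_pages.keys() - blink_pages.keys())

    changed = {}
    for rel in shared:
        old = flink_pages[rel].read_text(encoding="utf-8", errors="replace").splitlines()
        new = blink_pages[rel].read_text(encoding="utf-8", errors="replace").splitlines()
        # ratio() is 1.0 for identical line sequences, lower as they diverge.
        ratio = difflib.SequenceMatcher(None, old, new, autojunk=False).ratio()
        if ratio < 1.0:
            changed[rel.as_posix()] = round((1.0 - ratio) * 100, 1)

    return {
        "shared": len(shared),
        "new": [p.as_posix() for p in new_pages],
        "removed": [p.as_posix() for p in removed_pages],
        "changed": changed,  # page -> approximate percent of lines that differ
    }


if __name__ == "__main__":
    # Hypothetical checkout locations; adjust to wherever the trees live.
    summary = compare_docs("flink/docs", "blink/docs")
    print(f"{summary['shared']} shared pages, {len(summary['changed'])} changed, "
          f"{len(summary['new'])} new, {len(summary['removed'])} removed")
    for page, pct in sorted(summary["changed"].items(), key=lambda kv: -kv[1]):
        print(f"{pct:5.1f}%  {page}")
```

Sorting the changed pages by percentage gives a rough picture of whether the
differences are concentrated in a few pages or spread across the whole tree.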

Another perspective is that, though the main feature differences between
Flink and Blink that the community is interested in are SQL/Table API and
Batch, Blink's code changes seem to be much more extensive and touch more
modules and behaviors. As a user, I'd love to have a more consistent
experience of understanding and trying Blink, and a separate versioned
website works best in such a case.

Thanks,
Bowen


On Thu, Jan 24, 2019 at 4:22 AM Kurt Young <[hidden email]> wrote:

> Sure, I will do the rebase before pushing the branch.
>
> On Thu, Jan 24, 2019 at 18:20, Timo Walther <[hidden email]> wrote:
>
> > Regarding the content of a `blink-1.5` branch, is it possible to rebase
> > the big Blink commit on top of the current master or the last Flink
> > release?
> >
> > I don't mean a full rebase here, but just forking the branch from
> > current Flink, and putting the Blink content into the repository, and
> > commit it. This would enable to see a diff which classes and lines have
> > changed and which are still the same. I guess this would be very helpful
> > instead of a branch with a big commit that has no common origin.
> >
> > Thanks,
> > Timo
> >
> > On 24.01.19 at 02:54, Becket Qin wrote:
> > > Thanks Stephan,
> > >
> > > The plan makes sense to me.
> > >
> > > Regarding the docs, it seems better to have a separate versioned
> website
> > > because there are a lot of changes spread over the places. We can add
> the
> > > banner to remind users that they are looking at the blink docs, which
> is
> > > temporary and will eventually be merged into Flink master. (The banner
> is
> > > pretty similar to what user will see when they visit docs of old flink
> > > versions
> > > <
> >
> https://ci.apache.org/projects/flink/flink-docs-release-1.5/dev/libs/ml/quickstart.html
> > >
> > > [1]).
> > >
> > > Thanks,
> > >
> > > Jiangjie (Becket) Qin
> > >
> > > [1]
> > >
> >
> https://ci.apache.org/projects/flink/flink-docs-release-1.5/dev/libs/ml/quickstart.html
> > >
> > > On Thu, Jan 24, 2019 at 6:21 AM Shaoxuan Wang <[hidden email]>
> > wrote:
> > >
> > >> Thanks Stephan,
> > >> The entire plan looks good to me. WRT the "Docs for Flink", a
> subsection
> > >> should be good enough if we just introduce the outlines of what blink
> > has
> > >> changed. However, we have made detailed introductions to blink based
> on
> > the
> > >> framework of current release document of Flink (those introductions
> are
> > >> distributed in each subsections). Does it make sense to create a blink
> > >> document as a separate one, under the documentation section, say
> > blink-1.5
> > >> (temporary, not a release).
> > >>
> > >> Regards,
> > >> Shaoxuan
> > >>
> > >>
> > >> On Wed, Jan 23, 2019 at 10:15 PM Stephan Ewen <[hidden email]>
> wrote:
> > >>
> > >>> Nice to see this lively discussion.
> > >>>
> > >>> *--- Branch Versus Repository ---*
> > >>>
> > >>> Looks like this is converging towards pushing a branch.
> > >>> How about naming the branch simply "blink-1.5" ? That would be in
> line
> > >> with
> > >>> the 1.5 version branch of Flink, which is simply called
> "release-1.5" ?
> > >>>
> > >>> *--- SGA --- *
> > >>>
> > >>> The SGA (Software Grant Agreement) should be either filed already or
> in
> > >> the
> > >>> process of filing.
> > >>>
> > >>> *--- Offering Jars for Blink ---*
> > >>>
> > >>> As Chesnay and Timo mentioned, we cannot easily offer a "Release" of
> > >> Blink
> > >>> (source or binary), because that would require a thorough
> > >>> checking of licenses and creating/ bundling license files. That is a
> > lot
> > >> of
> > >>> work, as we recently experienced again in the Flink master.
> > >>>
> > >>> What we can do is upload compiled jar files and link to them
> somewhere
> > in
> > >>> the blink docs. We need to add a disclaimer that these are
> > >>> convenience jars, and not an official Apache release. I hope that
> would
> > >>> work for the users that are curious to try things out.
> > >>>
> > >>> *--- Docs for Blink --- *
> > >>>
> > >>> Do we need a versioned website here? If not, can we simply make this
> a
> > >>> subsection of the current Flink snapshot docs?
> > >>> Next to "Flink Development" and "Internals", we could have a section
> on
> > >>> "Blink branch".
> > >>> I think it is crucial, though, to make it clear that this is
> temporary
> > >> and
> > >>> will eventually be subsumed by the main release, just
> > >>> so that users do not get confused.
> > >>>
> > >>> Best,
> > >>> Stephan
> > >>>
> > >>>
> > >>> On Wed, Jan 23, 2019 at 12:23 PM Becket Qin <[hidden email]>
> > >> wrote:
> > >>>> Really excited to see Blink joining the Flink community!
> > >>>>
> > >>>> My two cents regarding repo v.s. branch, I am +1 for a branch in
> > Flink.
> > >>>> Among many things, what's most important at this point is probably
> to
> > >>> make
> > >>>> Blink code available to the developers so people can discuss the
> merge
> > >>>> strategy. Creating a branch is probably the one of the fastest way
> to
> > >> do
> > >>>> that. We can always create separate repo later if necessary.
> > >>>>
> > >>>> WRT the doc and jar distribution, It is true that we are going to
> have
> > >>>> some major refactoring to the code. But I can imagine some curious
> > >> users
> > >>>> may still want to try out something in Blink and it would be good if
> > we
> > >>> can
> > >>>> do them a favor. Legal wise, my hunch is that it is probably OK for
> > >>> someone
> > >>>> to just build the jars and docs, host it somewhere for convenience.
> > But
> > >>> it
> > >>>> should be clear that this is just for convenience purpose instead of
> > an
> > >>>> official release from Apache (unless we would like to make it
> > >> official).
> > >>>> Thanks,
> > >>>>
> > >>>> Jiangjie (Becket) Qin
> > >>>>
> > >>>> On Wed, Jan 23, 2019 at 6:48 PM Chesnay Schepler <
> [hidden email]>
> > >>>> wrote:
> > >>>>
> > >>>>>   From the ASF side Jar files do not require a vote/release process,
> > >> this
> > >>>>> is at the discretion of the PMC.
> > >>>>>
> > >>>>> However, I have my doubts whether at this time we could even
> create a
> > >>>>> source release of Blink given that we'd have to vet the code-base
> > >> first.
> > >>>>> Even without source release we could still distribute jars, but
> would
> > >>>>> not be allowed to advertise them to users as they do not constitute
> > an
> > >>>>> official release.
> > >>>>>
> > >>>>> On 23.01.2019 11:41, Timo Walther wrote:
> > >>>>>> As far as I know it, we will not provide any binaries but only the
> > >>>>>> source code. JAR files on Apache servers would need an official
> > >>>>>> voting/release process. Interested users can build Blink
> themselves
> > >>>>>> using `mvn clean package`.
> > >>>>>>
> > >>>>>> @Stephan: Please correct me if I'm wrong.
> > >>>>>>
> > >>>>>> Regards,
> > >>>>>> Timo
> > >>>>>>
> > >>>>>> On 23.01.19 at 11:16, Kurt Young wrote:
> > >>>>>>> Hi Timo,
> > >>>>>>>
> > >>>>>>> What about the jar files, will blink's jar be uploaded to apache
> > >>>>>>> repository? If not, i think it will be very inconvenient for
> users
> > >>> who
> > >>>>>>> wants to try blink and view the documents if they need some help
> > >> from
> > >>>>>>> doc.
> > >>>>>>>
> > >>>>>>> Best,
> > >>>>>>> Kurt
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> On Wed, Jan 23, 2019 at 6:09 PM Timo Walther <[hidden email]
> >
> > >>>>> wrote:
> > >>>>>>>> Hi Kurt,
> > >>>>>>>>
> > >>>>>>>> I would not make the Blink's documentation visible to users or
> > >>> search
> > >>>>>>>> engines via a website. Otherwise this would communicate that
> Blink
> > >>>>>>>> is an
> > >>>>>>>> official release. I would suggest to put the Blink docs into
> > >> `/docs`
> > >>>>>>>> and
> > >>>>>>>> people can build it with `./docs/build.sh -pi` if they are
> > >>>>> interested.
> > >>>>>>>> I would not invest time into setting up a docs infrastructure.
> > >>>>>>>>
> > >>>>>>>> Regards,
> > >>>>>>>> Timo
> > >>>>>>>>
> > >>>>>>>> On 23.01.19 at 08:56, Kurt Young wrote:
> > >>>>>>>>> Thanks @Stephan for this exciting announcement!
> > >>>>>>>>>
> > >>>>>>>>> From my point of view, I would prefer to use a branch. It makes
> > >> the
> > >>>>>>>> message
> > >>>>>>>>> "Blink is part of Flink" more straightforward and clear.
> > >>>>>>>>>
> > >>>>>>>>> Except for the location of blink codes, there are some other
> > >>>>> questions
> > >>>>>>>> like
> > >>>>>>>>> what version we should use, and where do we put blink's
> > >>>>> documents.
> > >>>>>>>>> Currently, we choose to use "1.5.1-blink-r0" as blink's version
> > >>> since
> > >>>>>>>> blink
> > >>>>>>>>> forked from Flink's 1.5.1. We also added some docs to blink
> just
> > >> as
> > >>>>>>>>> Flink
> > >>>>>>>>> did. Can blink use a website like
> > >>>>>>>>> "https://ci.apache.org/projects/flink/flink-docs-release-1.7/"
> > >> to
> > >>>>> put
> > >>>>>>>> all
> > >>>>>>>>> blink's docs, change it to something like
> > >>>>>>>>> https://ci.apache.org/projects/flink/flink-docs-blink-r0/ ?
> > >>>>>>>>>
> > >>>>>>>>> Best,
> > >>>>>>>>> Kurt
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>> On Wed, Jan 23, 2019 at 10:55 AM Hequn Cheng <
> > >> [hidden email]
> > >>>>>>>> wrote:
> > >>>>>>>>>> Hi all,
> > >>>>>>>>>>
> > >>>>>>>>>> @Stephan  Thanks a lot for driving these efforts. I think a
> lot
> > >> of
> > >>>>>>>> people
> > >>>>>>>>>> is already waiting for this.
> > >>>>>>>>>> +1 for opening the blink source code.
> > >>>>>>>>>> Both a separate repository or a special branch is ok for me.
> > >>>>>>>>>> Hopefully,
> > >>>>>>>>>> this will not last too long.
> > >>>>>>>>>>
> > >>>>>>>>>> Best, Hequn
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>> On Tue, Jan 22, 2019 at 11:35 PM Jark Wu <[hidden email]>
> > >>> wrote:
> > >>>>>>>>>>> Great news! Looking forward to the new wave of developments.
> > >>>>>>>>>>>
> > >>>>>>>>>>> If Blink needs to be continuously updated, fix bugs, release
> > >>>>>>>>>>> versions,
> > >>>>>>>>>>> maybe a separate repository is a better idea.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Best,
> > >>>>>>>>>>> Jark
> > >>>>>>>>>>>
> > >>>>>>>>>>> On Tue, 22 Jan 2019 at 18:29, Dominik Wosiński <
> > >> [hidden email]
> > >>>>>>>> wrote:
> > >>>>>>>>>>>> Hey!
> > >>>>>>>>>>>> I also think that creating the separate branch for Blink in
> > >>>>>>>>>>>> Flink repo
> > >>>>>>>>>>> is a
> > >>>>>>>>>>>> better idea than creating the fork as IMHO it will allow
> > >> merging
> > >>>>>>>>>> changes
> > >>>>>>>>>>>> more easily.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Best Regards,
> > >>>>>>>>>>>> Dom.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> On Tue, 22 Jan 2019 at 10:09, Ufuk Celebi <[hidden email]>
> > >>> wrote:
> > >>>>>>>>>>>>> Hey Stephan and others,
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> thanks for the summary. I'm very excited about the outlined
> > >>>>>>>>>>> improvements.
> > >>>>>>>>>>>>> :-)
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> Separate branch vs. fork: I'm fine with either of the
> > >>>>> suggestions.
> > >>>>>>>>>>>>> Depending on the expected strategy for merging the changes,
> > >>>>>>>>>>>>> expected
> > >>>>>>>>>>>>> number of additional changes, etc., either one or the other
> > >>>>>>>>>>>>> approach
> > >>>>>>>>>>>>> might be better suited.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> – Ufuk
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> On Tue, Jan 22, 2019 at 9:20 AM Kurt Young <
> [hidden email]
> > >>>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>>> Hi Driesprong,
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Glad to hear that you're interested with blink's codes.
> > >>>>> Actually,
> > >>>>>>>>>>> blink
> > >>>>>>>>>>>>>> only has one branch by itself, so either a separated repo
> > >> or a
> > >>>>>>>>>>> flink's
> > >>>>>>>>>>>>>> branch works for blink's code share.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Best,
> > >>>>>>>>>>>>>> Kurt
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> On Tue, Jan 22, 2019 at 2:30 PM Driesprong, Fokko
> > >>>>>>>>>>> <[hidden email]
> > >>>>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> Great news Stephan!
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> Why not make the code available by having a fork of Flink
> > >> on
> > >>>>>>>>>>>> Alibaba's
> > >>>>>>>>>>>>>>> Github account. This will allow us to do easy diff's in
> the
> > >>>>>>>>>> Github
> > >>>>>>>>>>> UI
> > >>>>>>>>>>>>> and
> > >>>>>>>>>>>>>>> create PR's of cherry-picked commits if needed. I can
> > >> imagine
> > >>>>>>>>>> that
> > >>>>>>>>>>>> the
> > >>>>>>>>>>>>>>> Blink codebase has a lot of branches by itself, so just
> > >>>>>>>>>>>>>>> pushing a
> > >>>>>>>>>>>>> couple of
> > >>>>>>>>>>>>>>> branches to the main Flink repo is not ideal. Looking
> > >> forward
> > >>>>> to
> > >>>>>>>>>>> it!
> > >>>>>>>>>>>>>>> Cheers, Fokko
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> On Tue, Jan 22, 2019 at 03:48, Shaoxuan Wang <
> > >>>>>>>>>>>> [hidden email]>
> > >>>>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>>>>> big +1 to contribute Blink codebase directly into the
> > >> Apache
> > >>>>>>>>>>> Flink
> > >>>>>>>>>>>>>>> project.
> > >>>>>>>>>>>>>>>> Looking forward to the new journey.
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> Regards,
> > >>>>>>>>>>>>>>>> Shaoxuan
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> On Tue, Jan 22, 2019 at 3:52 AM Xiaowei Jiang <
> > >>>>>>>>>>> [hidden email]>
> > >>>>>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>>>>>>     Thanks Stephan! We are hoping to make the process
> as
> > >>>>>>>>>>>>> non-disruptive as
> > >>>>>>>>>>>>>>>>> possible to the Flink community. Making the Blink
> > >> codebase
> > >>>>>>>>>>> public
> > >>>>>>>>>>>>> is
> > >>>>>>>>>>>>>>> the
> > >>>>>>>>>>>>>>>>> first step that hopefully facilitates further
> > >> discussions.
> > >>>>>>>>>>>>>>>>> Xiaowei
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>        On Monday, January 21, 2019, 11:46:28 AM PST,
> > >> Stephan
> > >>>>>>>>>> Ewen
> > >>>>>>>>>>> <
> > >>>>>>>>>>>>>>>>> [hidden email]> wrote:
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>     Dear Flink Community!
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> Some of you may have heard it already from
> announcements
> > >> or
> > >>>>>>>>>>> from
> > >>>>>>>>>>>> a
> > >>>>>>>>>>>>>>> Flink
> > >>>>>>>>>>>>>>>>> Forward talk:
> > >>>>>>>>>>>>>>>>> Alibaba has decided to open source its in-house
> > >>> improvements
> > >>>>>>>>>> to
> > >>>>>>>>>>>>> Flink,
> > >>>>>>>>>>>>>>>>> called Blink!
> > >>>>>>>>>>>>>>>>> First of all, big thanks to team that developed these
> > >>>>>>>>>>>> improvements
> > >>>>>>>>>>>>> and
> > >>>>>>>>>>>>>>>> made
> > >>>>>>>>>>>>>>>>> this
> > >>>>>>>>>>>>>>>>> contribution possible!
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> Blink has some very exciting enhancements, most
> > >> prominently
> > >>>>>>>>>> on
> > >>>>>>>>>>>> the
> > >>>>>>>>>>>>>>> Table
> > >>>>>>>>>>>>>>>>> API/SQL side
> > >>>>>>>>>>>>>>>>> and the unified execution of these programs. For batch
> > >>>>>>>>>>> (bounded)
> > >>>>>>>>>>>>> data,
> > >>>>>>>>>>>>>>>> the
> > >>>>>>>>>>>>>>>>> SQL execution
> > >>>>>>>>>>>>>>>>> has full TPC-DS coverage (which is a big deal), and the
> > >>>>>>>>>>> execution
> > >>>>>>>>>>>>> is
> > >>>>>>>>>>>>>>> more
> > >>>>>>>>>>>>>>>>> than 10x faster
> > >>>>>>>>>>>>>>>>> than the current SQL runtime in Flink. Blink has also
> > >> added
> > >>>>>>>>>>>>> support for
> > >>>>>>>>>>>>>>>>> catalogs,
> > >>>>>>>>>>>>>>>>> improved the failover speed of batch queries and the
> > >>> resource
> > >>>>>>>>>>>>>>> management.
> > >>>>>>>>>>>>>>>>> It also
> > >>>>>>>>>>>>>>>>> makes some good steps in the direction of more deeply
> > >>>>>>>>>> unifying
> > >>>>>>>>>>>> the
> > >>>>>>>>>>>>>>> batch
> > >>>>>>>>>>>>>>>>> and streaming
> > >>>>>>>>>>>>>>>>> execution.
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> The proposal is to merge Blink's enhancements into
> Flink,
> > >>> to
> > >>>>>>>>>>> give
> > >>>>>>>>>>>>>>> Flink's
> > >>>>>>>>>>>>>>>>> SQL/Table API and
> > >>>>>>>>>>>>>>>>> execution a big boost in usability and performance.
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> Just to avoid any confusion: This is not a suggested
> > >> change
> > >>>>>>>>>> of
> > >>>>>>>>>>>>> focus to
> > >>>>>>>>>>>>>>>>> batch processing,
> > >>>>>>>>>>>>>>>>> nor would this break with any of the streaming
> > >> architecture
> > >>>>>>>>>> and
> > >>>>>>>>>>>>> vision
> > >>>>>>>>>>>>>>> of
> > >>>>>>>>>>>>>>>>> Flink.
> > >>>>>>>>>>>>>>>>> This contribution follows very much the principle of
> > >> "batch
> > >>>>>>>>>> is
> > >>>>>>>>>>> a
> > >>>>>>>>>>>>>>> special
> > >>>>>>>>>>>>>>>>> case of streaming".
> > >>>>>>>>>>>>>>>>> As a special case, batch makes special optimizations
> > >>>>>>>>>> possible.
> > >>>>>>>>>>> In
> > >>>>>>>>>>>>> its
> > >>>>>>>>>>>>>>>>> current state,
> > >>>>>>>>>>>>>>>>> Flink does not exploit many of these optimizations.
> This
> > >>>>>>>>>>>>> contribution
> > >>>>>>>>>>>>>>>> adds
> > >>>>>>>>>>>>>>>>> exactly these
> > >>>>>>>>>>>>>>>>> optimizations and makes the streaming model of Flink
> > >>>>>>>>>> applicable
> > >>>>>>>>>>>> to
> > >>>>>>>>>>>>>>> harder
> > >>>>>>>>>>>>>>>>> batch use cases.
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> Assuming that the community is excited about this as
> > >> well,
> > >>>>>>>>>> and
> > >>>>>>>>>>> in
> > >>>>>>>>>>>>> favor
> > >>>>>>>>>>>>>>>> of
> > >>>>>>>>>>>>>>>>> these enhancements
> > >>>>>>>>>>>>>>>>> to Flink's capabilities, below are some thoughts on how
> > >>> this
> > >>>>>>>>>>>>>>> contribution
> > >>>>>>>>>>>>>>>>> and integration
> > >>>>>>>>>>>>>>>>> could work.
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> --- Making the code available ---
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> At the moment, the Blink code is in the form of a big
> > >> Flink
> > >>>>>>>>>>> fork
> > >>>>>>>>>>>>>>> (rather
> > >>>>>>>>>>>>>>>>> than isolated
> > >>>>>>>>>>>>>>>>> patches on top of Flink), so the integration is
> > >>> unfortunately
> > >>>>>>>>>>> not
> > >>>>>>>>>>>>> as
> > >>>>>>>>>>>>>>> easy
> > >>>>>>>>>>>>>>>>> as merging a
> > >>>>>>>>>>>>>>>>> few patches or pull requests.
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> To support a non-disruptive merge of such a big
> > >>>>>>>>>> contribution, I
> > >>>>>>>>>>>>> believe
> > >>>>>>>>>>>>>>>> it
> > >>>>>>>>>>>>>>>>> make sense to make
> > >>>>>>>>>>>>>>>>> the code of the fork available in the Flink project
> > >> first.
> > >>>>>>>>>>>>>>>>>    From there on, we can start to work on the details
> for
> > >>>>>>>>>> merging
> > >>>>>>>>>>>> the
> > >>>>>>>>>>>>>>>>> enhancements, including
> > >>>>>>>>>>>>>>>>> the refactoring of the necessary parts in the Flink
> > >> master
> > >>>>>>>>>> and
> > >>>>>>>>>>>> the
> > >>>>>>>>>>>>>>> Blink
> > >>>>>>>>>>>>>>>>> code to make a
> > >>>>>>>>>>>>>>>>> merge possible without repeatedly breaking
> compatibility.
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> The first question is where do we put the code of the
> > >> Blink
> > >>>>>>>>>>> fork
> > >>>>>>>>>>>>> during
> > >>>>>>>>>>>>>>>> the
> > >>>>>>>>>>>>>>>>> merging procedure?
> > >>>>>>>>>>>>>>>>> My first thought was to temporarily add a repository
> > >> (like
> > >>>>>>>>>>>>>>>>> "flink-blink-staging"), but we could
> > >>>>>>>>>>>>>>>>> also put it into a special branch in the main Flink
> > >>>>>>>>>> repository.
> > >>>>>>>>>>>>>>>>> I will start a separate thread about discussing a
> > >> possible
> > >>>>>>>>>>>>> strategy to
> > >>>>>>>>>>>>>>>>> handle and merge
> > >>>>>>>>>>>>>>>>> such a big contribution.
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> Best,
> > >>>>>>>>>>>>>>>>> Stephan
> > >>>>>>>>>>>>>>>>>
> > >>>>>>
> > >>>>>
> >
> > --
> Best,
> Kurt
>

Re: [ANNOUNCE] Contributing Alibaba's Blink

jincheng sun
In reply to this post by Stephan Ewen
Thanks Stephan,

The entire plan makes sense to me.

Regarding the branch name, how about using "blink-flink-1.5", which would mean
that the branch is based on flink-1.5? If the name is "blink-1.5", some
users will think that this is the version number of Alibaba's internal Blink
and will not associate it with the flink-1.5 branch. This is just
one concern of mine.

Wrt Offering Jars for Blink, from my point of view, if we do not
release the Blink code, we can write a blog post or documentation detailing how
to build and deploy Blink from source code. I think after we push the blink
branch, we will also do some bug fixes and small feature development (which is
urgently needed by users). So telling users how to build a release package from
source is very important. Something like the current "Building Flink from
Source" section of the Flink docs. In this way we are both user-friendly and
avoid any liability issues.

Wrt the "Docs for Flink", if we expect users to take advantage of the
functionality of blink, and the blink branch will also receive bugfix changes,
I suggest adding an address similar to
"https://ci.apache.org/projects/flink/flink-docs-master", e.g.
"https://ci.apache.org/projects/flink/flink-docs-blink", so users can have a
complete user experience, just like for every version published by Flink at
"https://ci.apache.org/projects/flink/flink-docs-release-XX". The
difference is that we do not declare a release and do not assume the quality
and responsibility of a release. So, I agree with
@Shaoxuan Wang <[hidden email]>'s suggestion. If I misunderstood what
you mean, please correct me. @Shaoxuan Wang <[hidden email]>

Regards,
Jincheng

On Tue, Jan 22, 2019 at 3:46 AM Stephan Ewen <[hidden email]> wrote:

> Dear Flink Community!
>
> Some of you may have heard it already from announcements or from a Flink
> Forward talk:
> Alibaba has decided to open source its in-house improvements to Flink,
> called Blink!
> First of all, big thanks to team that developed these improvements and made
> this
> contribution possible!
>
> Blink has some very exciting enhancements, most prominently on the Table
> API/SQL side
> and the unified execution of these programs. For batch (bounded) data, the
> SQL execution
> has full TPC-DS coverage (which is a big deal), and the execution is more
> than 10x faster
> than the current SQL runtime in Flink. Blink has also added support for
> catalogs,
> improved the failover speed of batch queries and the resource management.
> It also
> makes some good steps in the direction of more deeply unifying the batch
> and streaming
> execution.
>
> The proposal is to merge Blink's enhancements into Flink, to give Flink's
> SQL/Table API and
> execution a big boost in usability and performance.
>
> Just to avoid any confusion: This is not a suggested change of focus to
> batch processing,
> nor would this break with any of the streaming architecture and vision of
> Flink.
> This contribution follows very much the principle of "batch is a special
> case of streaming".
> As a special case, batch makes special optimizations possible. In its
> current state,
> Flink does not exploit many of these optimizations. This contribution adds
> exactly these
> optimizations and makes the streaming model of Flink applicable to harder
> batch use cases.
>
> Assuming that the community is excited about this as well, and in favor of
> these enhancements
> to Flink's capabilities, below are some thoughts on how this contribution
> and integration
> could work.
>
> --- Making the code available ---
>
> At the moment, the Blink code is in the form of a big Flink fork (rather
> than isolated
> patches on top of Flink), so the integration is unfortunately not as easy
> as merging a
> few patches or pull requests.
>
> To support a non-disruptive merge of such a big contribution, I believe it
> make sense to make
> the code of the fork available in the Flink project first.
> From there on, we can start to work on the details for merging the
> enhancements, including
> the refactoring of the necessary parts in the Flink master and the Blink
> code to make a
> merge possible without repeatedly breaking compatibility.
>
> The first question is where do we put the code of the Blink fork during the
> merging procedure?
> My first thought was to temporarily add a repository (like
> "flink-blink-staging"), but we could
> also put it into a special branch in the main Flink repository.
>
>
> I will start a separate thread about discussing a possible strategy to
> handle and merge
> such a big contribution.
>
> Best,
> Stephan
>

Re: [ANNOUNCE] Contributing Alibaba's Blink

Pēteris Kļaviņš
In reply to this post by Kurt Young
IMHO it would be a pity if we lost the Git history of changes made to the
Blink repository since the time that it was forked from Flink.

If the Blink repository is a true fork, i.e., it branched at a certain
time/commit id from the Flink repository, then there is no problem. The
entire Blink repository could be folded into the Flink repository, simply
because it actually *is* the Flink repository. Someone with access to both
repositories would point to the tip of each Blink repository branch to
bring in, and Git push them into the Flink repository one by
one.
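
As a rough illustration of the true-fork case, the steps could be planned as below. This is only a sketch under assumptions: the remote name "blink", the example URL, the branch list and the "blink-" prefix are placeholders, not anything agreed on in this thread. The helper only builds the git command lines and adds --dry-run to the push by default, so printing or even running them publishes nothing.

```python
from typing import List


def plan_fork_import(blink_url: str, branches: List[str],
                     remote: str = "blink", target: str = "origin",
                     dry_run: bool = True) -> List[List[str]]:
    """Build git commands that bring fork branches into a Flink clone.

    Hypothetical helper: remote names, URL and the "blink-" branch
    prefix are illustrative assumptions.
    """
    cmds = [
        # Register the fork as an extra remote and download its history.
        ["git", "remote", "add", remote, blink_url],
        ["git", "fetch", remote],
    ]
    for branch in branches:
        push = ["git", "push"]
        if dry_run:
            # --dry-run makes git report what it would push without pushing.
            push.append("--dry-run")
        # Push the fetched branch tip to a new, prefixed branch on the target.
        push += [target,
                 f"refs/remotes/{remote}/{branch}:refs/heads/blink-{branch}"]
        cmds.append(push)
    return cmds


if __name__ == "__main__":
    for cmd in plan_fork_import("https://example.invalid/blink.git", ["master"]):
        print(" ".join(cmd))
```

Because a true fork shares its early commits with Flink, the push only transfers the commits added after the fork point, and the history is kept as-is.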

If the Blink repository was taken as a source code snapshot at some point in
time/commit id, then it should be fairly easy to locate the precise commit
id within the Flink repository that the snapshot was taken from, and then the
Blink repository can be rebased on top of that commit id.
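
One possible way to locate that commit, sketched under assumptions: if the snapshot was committed unmodified, its tree id equals the tree id of the Flink commit it was copied from, so the two can be matched. The function names and ids below are hypothetical. The (commit, tree) pairs could come from something like `git log --format='%H %T'` in the Flink repository.

```python
from typing import Iterable, List, Optional, Tuple


def find_snapshot_base(flink_history: Iterable[Tuple[str, str]],
                       snapshot_tree: str) -> Optional[str]:
    """Return the first Flink commit whose tree id equals the snapshot's tree id.

    flink_history holds (commit_id, tree_id) pairs. Returns None when no
    commit matches, e.g. if the snapshot was edited before being committed.
    """
    for commit_id, tree_id in flink_history:
        if tree_id == snapshot_tree:
            return commit_id
    return None


def plan_rebase(flink_commit: str, blink_root: str, blink_branch: str) -> List[str]:
    """Build the git command that replays the commits after blink_root onto flink_commit."""
    return ["git", "rebase", "--onto", flink_commit, blink_root, blink_branch]


if __name__ == "__main__":
    # Placeholder ids, purely for illustration.
    history = [("c3", "t3"), ("c2", "t2"), ("c1", "t1")]
    base = find_snapshot_base(history, "t2")
    if base is not None:
        print(" ".join(plan_rebase(base, "blink-root", "blink-master")))
```

Note that more than one commit can share a tree id (after a revert, for example), and that a rebase rewrites the Blink commit ids, so this is a starting point rather than a finished procedure.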

Both these sorts of repository merge operations could be experimented on by
someone with access to both repositories, on their private machines, without
pushing the results to the Apache Flink repository. It's only the push to
Apache that is actually 'publishing' the changes to the world, and that
involves the appropriate licensing searches/permission requests/approvals.

Thanks,
Peter



--
Sent from: http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/