Hello,
the binary distribution that we release now contains quite a lot of optional components, including various filesystems, metric reporters and libraries. Most users will only use a fraction of these, so for them the rest pretty much only increases the size of flink-dist.

With Flink growing more and more in scope, I don't believe it is feasible to ship everything we have with every distribution. I instead suggest more of a "pick-what-you-need" model, where flink-dist is rather lean and additional components are downloaded separately and added by the user.

This would primarily affect the /opt directory, but could also be extended to cover flink-dist. For example, the yarn and mesos code could be split out into separate jars that could be added to lib manually.

Let me know what you think.

Regards,
Chesnay
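As an illustration of the "pick-what-you-need" step, adding a component by hand amounts to copying its jar into lib. The following is a minimal sketch, assuming a distribution root that contains `opt` and `lib` folders as described above; the function name `enable_optional_component` and the jar pattern are hypothetical and not part of Flink.

```python
import shutil
from pathlib import Path
from typing import List


def enable_optional_component(flink_home: str, jar_pattern: str) -> List[str]:
    """Copy jars matching jar_pattern from <flink_home>/opt into <flink_home>/lib.

    Hypothetical helper for illustration only; it is not shipped with Flink.
    Returns the file names of the jars that were copied.
    """
    opt_dir = Path(flink_home) / "opt"
    lib_dir = Path(flink_home) / "lib"
    lib_dir.mkdir(parents=True, exist_ok=True)

    copied = []
    # sorted() keeps the result deterministic across file systems
    for jar in sorted(opt_dir.glob(jar_pattern)):
        shutil.copy2(jar, lib_dir / jar.name)
        copied.append(jar.name)
    return copied
```

In a leaner model the jar would first be downloaded separately rather than found in `opt`, but the final step of placing it in `lib` would look the same.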
Hi Chesnay,
Thank you for the proposal. I think this is a good idea. We follow a similar approach already for Hadoop dependencies and connectors (although in application space).

+1

Fabian
I'm not sure if this is required. It's quite convenient to be able to just grab a single tarball and have everything you need.

I just did this for the latest binary release: it was 273 MB and took about 25 seconds to download. Of course I know connection speeds vary quite a bit, but I don't think 273 MB is onerous to download, and I like the simplicity of it the way it is.
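To put the 273 MB / 25 seconds figure in perspective, the arithmetic can be sketched as follows. This is a rough estimate that assumes 1 MB = 8 Mbit and ignores protocol overhead; the function names are hypothetical.

```python
def implied_bandwidth_mbit(size_mb: float, seconds: float) -> float:
    """Average bandwidth in Mbit/s implied by downloading size_mb in the given time."""
    return size_mb * 8 / seconds


def download_seconds(size_mb: float, bandwidth_mbit: float) -> float:
    """Estimated download time in seconds at a given bandwidth in Mbit/s."""
    return size_mb * 8 / bandwidth_mbit


if __name__ == "__main__":
    # Figures quoted in the message above: 273 MB in about 25 seconds.
    print(implied_bandwidth_mbit(273, 25))
    # The same tarball on a slower, assumed 10 Mbit/s connection.
    print(download_seconds(273, 10))
```

Under these assumptions the quoted download corresponds to roughly 87 Mbit/s, while a 10 Mbit/s link would need a little over three and a half minutes.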
In reply to this post by Fabian Hueske-2
Thanks Chesnay for raising this discussion thread. I think there are 3 major usage scenarios for the Flink binary distribution:

1. Use it to set up a standalone cluster
2. Use it to try out features of Flink, such as via scala-shell or sql-client
3. Downstream projects use it to integrate with their systems

I did a size estimation of the flink dist folder: the lib folder takes around 100M and the opt folder takes around 200M. Overall I agree with making a thin flink dist. So the next problem is which components to drop. I checked the opt folder, and I think the filesystem components and metrics components could be moved out, because they are pluggable components and are only used in scenario 1, I think (setting up a standalone cluster). Other components like flink-table, flink-ml and flink-gelly we should still keep IMHO, because new users may still use them to try out the features of Flink. For me, scala-shell is the first option for trying new features of Flink.

--
Best Regards

Jeff Zhang
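A size estimation like the one above can be reproduced with a short script. This is a minimal sketch, assuming an unpacked distribution with `lib` and `opt` subfolders; the function names are hypothetical, and the ~100M / ~200M figures come from the message, not from this code.

```python
import os
from typing import Dict, Iterable


def dir_size_bytes(path: str) -> int:
    """Sum the sizes of all regular files below path (symlinks are skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if not os.path.islink(file_path):
                total += os.path.getsize(file_path)
    return total


def folder_report(flink_home: str, subdirs: Iterable[str] = ("lib", "opt")) -> Dict[str, int]:
    """Return the size in bytes of each listed subfolder of flink_home."""
    return {sub: dir_size_bytes(os.path.join(flink_home, sub)) for sub in subdirs}
```

A missing subfolder simply reports 0 bytes, since `os.walk` yields nothing for a path that does not exist.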
There are some points where a leaner approach could help.
There are many libraries and connectors that are currently being added to Flink, which makes the "include all" approach not completely feasible in the long run:

- Connectors: For a proper experience with the Shell/CLI (for example for SQL) we need a lot of fat connector jars. These often come in multiple versions, which alone accounts for 100s of MBs of connector jars.
- The pre-bundled FileSystems are also on the verge of adding 100s of MBs themselves.
- The metric reporters are growing bit by bit as well.

The following could be a compromise:

The flink-dist would include
- the core flink libraries (core, apis, runtime, etc.)
- yarn / mesos etc. adapters
- examples (the examples should be a small set of self-contained programs without additional dependencies)
- default logging
- default metric reporter (jmx)
- shells (scala, sql)

The flink-dist would NOT include the following libs (and these would be offered for individual download)
- Hadoop libs
- the pre-shaded file systems
- the pre-packaged SQL connectors
- additional metric reporters
I like the idea of a leaner binary distribution. At the same time I agree with Jamie that the current binary is quite convenient and connection speeds should not be that big of a deal. Since the binary distribution is one of the first entry points for users, I'd like to keep it as user-friendly as possible.

What do you think about building a lean distribution by default and a "full" distribution that still bundles all the optional dependencies for releases? (If you don't think that's feasible I'm still +1 to only go with the "lean dist" approach.)

– Ufuk
+1 for Stephan's suggestion. For example, SQL connectors have never been part of the main distribution and nobody has complained about this so far. I think what is more important than a big dist bundle is a helpful "Downloads" page where users can easily find available filesystems, connectors and metric reporters. Not everyone checks Maven Central for available JAR files.

I just saw that we recently added an "Optional components" section [1]; we just need to make it more prominent. This is also done for the SQL connectors and formats [2].

[1] https://flink.apache.org/downloads.html
[2] https://ci.apache.org/projects/flink/flink-docs-release-1.7/dev/table/connect.html#dependencies

Regards,
Timo
In reply to this post by Ufuk Celebi-2
On Wed, Jan 23, 2019 at 11:01 AM Timo Walther <[hidden email]> wrote:
> I think what is more important than a big dist bundle is a helpful
> "Downloads" page where users can easily find available filesystems,
> connectors, metric reporters. Not everyone checks Maven central for
> available JAR files. I just saw that we added an "Optional components"
> section recently [1], we just need to make it more prominent. This is
> also done for the SQL connectors and formats [2].

+1 I fully agree with the importance of the Downloads page. We definitely need to make any optional dependencies that users need to download easy to find.
Ufuk's proposal (having a lean default release and a user convenience tarball) sounds good to me. That way advanced users won't be bothered by an unnecessarily large release and new users can benefit from having many useful extensions bundled in one tarball.

Cheers,
Till
+1 for trimming the size by default and offering the fat distribution as an alternative download
+1 for leaner distribution and a better 'download' webpage.
+1 for a full distribution if we can automate it alongside the leaner one. If we support both, I'd imagine release managers should be able to package the two distributions with a single parameter change instead of manually packaging the full distribution. How to achieve that needs to be evaluated and discussed; it could probably be something like 'mvn clean install -Dfull/-Dlean', but I'm not sure yet.
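The "single change of parameter" idea can be sketched independently of the build tool. The snippet below is only an illustration, assuming the include/exclude split from Stephan's compromise earlier in the thread; the component labels, the `components_for` function and the "lean"/"full" flavor names are hypothetical and do not correspond to actual Maven properties or Flink module names.

```python
from typing import List

# Labels taken from the compromise proposed earlier in the thread.
LEAN_COMPONENTS = [
    "core flink libraries",
    "yarn / mesos adapters",
    "examples",
    "default logging",
    "default metric reporter (jmx)",
    "shells (scala, sql)",
]

# Components that would be offered for individual download.
OPTIONAL_COMPONENTS = [
    "Hadoop libs",
    "pre-shaded file systems",
    "pre-packaged SQL connectors",
    "additional metric reporters",
]


def components_for(flavor: str) -> List[str]:
    """Return the component list for a distribution flavor ('lean' or 'full')."""
    if flavor == "lean":
        return list(LEAN_COMPONENTS)
    if flavor == "full":
        return LEAN_COMPONENTS + OPTIONAL_COMPONENTS
    raise ValueError(f"unknown distribution flavor: {flavor!r}")
```

Keeping the lean list as the base and deriving the full list from it means the two flavors cannot drift apart when a core component is added.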
+1 for the leaner distribution and for improving the "Download" page.
In reply to this post by Chesnay Schepler-3
Hi Chesnay,
Thank you for the proposal, I like it very much.

+1 for the leaner distribution.

About improving the "Download" page, I think we can add the connector download links to the "Optional components" section which @Timo Walther <[hidden email]> mentioned above.

Regards,
Jincheng
Hi Chesnay,
Thanks a lot for the proposal! +1 for a leaner flink-dist and for improving the "Download" page.
I think a leaner flink-dist would be very helpful. If we bundle all jars into a single one, this will easily cause class conflict problems.

Best,
Hequn
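The class conflict concern can be checked mechanically before jars are merged. The following is a minimal sketch, assuming that a "conflict" means the same `.class` entry appearing in more than one jar; the function name is hypothetical, and it relies only on jars being ZIP archives.

```python
import zipfile
from collections import defaultdict
from typing import Dict, Iterable, List


def find_class_conflicts(jar_paths: Iterable[str]) -> Dict[str, List[str]]:
    """Map each .class entry found in more than one jar to the jars containing it."""
    owners = defaultdict(list)
    for jar_path in jar_paths:
        with zipfile.ZipFile(jar_path) as jar:
            # set() guards against an archive listing the same entry twice
            for entry in set(jar.namelist()):
                if entry.endswith(".class"):
                    owners[entry].append(jar_path)
    return {entry: jars for entry, jars in owners.items() if len(jars) > 1}
```

This only detects identically named classes; it does not tell whether the duplicates differ in content or version, which is what actually causes trouble at runtime.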
Hi Chesnay,
Thanks for the proposal. +1 for making the distribution thinner.

Meanwhile, it would be useful to have all the peripheral libraries/jars hosted somewhere so users can download them from a centralized place. We can also encourage the community to contribute their libraries, such as connectors and other pluggables, to the same place (maybe in a separate category), so the community can share the commonly used libraries as well.

Thanks,
Jiangjie (Becket) Qin
In reply to this post by Jark Wu-2
It is not viable for us, as of right now, to release both a lean and a fat version of flink-dist. We don't have the required tooling to assemble a correct NOTICE file for that scenario.

Besides that, this would also go against recent efforts to reduce the total size of a Flink release, as we'd be increasing the total size again by roughly 60% (and naturally also increase the compile time of releases), which I'd like to avoid.

I like Stephan's compromise of excluding reporters and file-systems; this removes more than 100 MB from the distribution yet still retains all the user-facing APIs.

Do note that Hadoop will already not be included in convenience binaries for 1.8. This was the motivation behind the new section on the download page.