Flink deployment fabric script

Le Quoc Do
Hi Flinkers,

I'm starting to work with Flink and I would like to contribute to it.
However, I'm very new to Flink, so the first thing I could contribute
is a one-click style deployment script to deploy Flink, Spark and
Hadoop YARN on cluster and cloud computing environments (OpenStack-based
clouds). It's available here: https://bitbucket.org/lequocdo/flink-setup.
I added Spark to the script, since many people want to compare
performance between Spark and Flink.
I have tested the script with our cluster and an OpenStack-based cloud.
Please let me know whether this contribution makes sense. Your
feedback and comments will be greatly appreciated.

Thank you,
Do

Re: Flink deployment fabric script

Matthias J. Sax-2
Hi Do,

Thanks for your interest in Flink. It is great that you want to
contribute to the system.

Right now, I am not sure how your script could be integrated into Flink.
As a reference, please read the following guidelines:

https://flink.apache.org/how-to-contribute.html
https://flink.apache.org/contribute-code.html

Let's discuss it first. Can you describe what advantages
(differences/new features) your script offers compared to the
startup scripts that Flink already provides?


-Matthias


On 11/08/2015 07:52 PM, Le Quoc Do wrote:

> Hi Flinkers,
>
> I'm starting to work with Flink and I would like to contribute to it.
> However, I'm very new to Flink, so the first thing I could contribute
> is a one-click style deployment script to deploy Flink, Spark and
> Hadoop YARN on cluster and cloud computing environments (OpenStack-based
> clouds). It's available here: https://bitbucket.org/lequocdo/flink-setup.
> I added Spark to the script, since many people want to compare
> performance between Spark and Flink.
> I have tested the script with our cluster and an OpenStack-based cloud.
> Please let me know whether this contribution makes sense. Your
> feedback and comments will be greatly appreciated.
>
> Thank you,
> Do
>



Re: Flink deployment fabric script

mxm
In reply to this post by Le Quoc Do
Hi Do,

Thanks for the script. I'm sure it will be helpful to people who want
to set up their own cluster. Some people use a performance-testing
tool called Yoka, which also takes care of setting up a Flink and
Hadoop cluster. For example, the Flink part is available here:
https://github.com/mxm/yoka/blob/master/cluster/flink.py

Furthermore, Flink support is built into bdutil for Google
Compute Engine:
https://github.com/GoogleCloudPlatform/bdutil
https://ci.apache.org/projects/flink/flink-docs-release-0.10/setup/gce_setup.html

We have received some feedback that it would be nice to have a script
included in Flink which takes care of the cluster setup and
configuration. The scope of such a script would be to a) bring up
instances at a cloud provider, b) install Flink and its dependencies,
and c) configure defaults according to the cluster. As far as I can
see, the last two points are already somewhat covered in your script.

It would definitely make sense to link your script in the new
"External Projects" section. The main issue with including it directly
in Flink is that it needs to be maintained across different Flink
versions. The other issue I see is that its defaults might not work
for all users and cluster setups. It would be nice to also provide an
option to configure parameters explicitly.
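
To illustrate the point about defaults and explicit parameters, here is a minimal sketch in Python. It is not taken from Do's script or from Flink; the function names and the rule for deriving defaults (one task slot per core, parallelism equal to the total number of slots) are assumptions made for illustration. Only the three configuration keys are real Flink settings.

```python
# Hypothetical helpers for illustration only; they are not part of
# Flink or of the flink-setup script discussed in this thread.

def derive_defaults(master_host, worker_count, cores_per_worker):
    """Derive Flink settings from a simple cluster description.

    Assumption: one task slot per core, and a default parallelism
    equal to the total number of slots in the cluster.
    """
    return {
        "jobmanager.rpc.address": master_host,
        "taskmanager.numberOfTaskSlots": cores_per_worker,
        "parallelism.default": worker_count * cores_per_worker,
    }


def build_config(defaults, overrides=None):
    """Return a new dict in which explicit user overrides win over defaults."""
    merged = dict(defaults)
    merged.update(overrides or {})
    return merged


def render_flink_conf(settings):
    """Render settings as 'key: value' lines, as used in flink-conf.yaml."""
    return "".join(f"{key}: {value}\n" for key, value in sorted(settings.items()))


if __name__ == "__main__":
    defaults = derive_defaults("master-node", worker_count=4, cores_per_worker=8)
    # A user who knows the cluster better can override any derived value.
    config = build_config(defaults, {"parallelism.default": 16})
    print(render_flink_conf(config))
```

Keeping the derived defaults and the user overrides in separate steps means the script can still work out of the box, while anyone with an unusual cluster can change individual values without editing the script itself.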

Best,
Max

On Sun, Nov 8, 2015 at 7:52 PM, Le Quoc Do <[hidden email]> wrote:

> Hi Flinkers,
>
> I'm starting to work with Flink and I would like to contribute to it.
> However, I'm very new to Flink, so the first thing I could contribute
> is a one-click style deployment script to deploy Flink, Spark and
> Hadoop YARN on cluster and cloud computing environments (OpenStack-based
> clouds). It's available here: https://bitbucket.org/lequocdo/flink-setup.
> I added Spark to the script, since many people want to compare
> performance between Spark and Flink.
> I have tested the script with our cluster and an OpenStack-based cloud.
> Please let me know whether this contribution makes sense. Your
> feedback and comments will be greatly appreciated.
>
> Thank you,
> Do

Re: Flink deployment fabric script

Le Quoc Do
Hi Max,

Thank you for your comments.

You wrote:

> Hi Do,
> For example, the Flink part is available here:
> https://github.com/mxm/yoka/blob/master/cluster/flink.py


Nice one.

> The scope of such a script would be to a) bring up
> instances at a cloud provider b) install Flink and its dependencies
> c) configure defaults according to the cluster. The last two points I
> see are already somewhat covered in your script.


Yes, the script is designed for people who want to deploy Flink and
Hadoop on their own clusters. However, I could add the first point, too.

> It would definitely make sense to link your script in the new
> "External Projects" section.


Great!


> The other issue I see is that its defaults might not work
> for all users and cluster setups. It would be nice to also provide an
> option to configure parameters explicitly.
>

I agree, I will restructure it. I will get back to you soon.

Thank you very much,
Do