Hi Flinkers,
I have just started working with Flink and would like to contribute. Since I'm very new to Flink, the first thing I can contribute is a one-click deployment script that deploys Flink, Spark and Hadoop YARN on clusters and cloud environments (OpenStack-based clouds). It's available here: https://bitbucket.org/lequocdo/flink-setup. I added Spark to the script because many people want to compare the performance of Spark and Flink. I have tested the script on our cluster and on an OpenStack-based cloud. Please let me know whether this contribution makes sense. Your feedback and comments will be greatly appreciated.

Thank you,
Do
Hi Do,
Thanks for your interest in Flink. It is great that you want to contribute to the system. Right now, I am not sure how your script could be integrated into Flink. As a reference, please read the following guidelines:

https://flink.apache.org/how-to-contribute.html
https://flink.apache.org/contribute-code.html

Let's discuss it first. Can you describe what advantages (differences/new features) your script offers compared to the startup scripts that are already provided?

-Matthias

On 11/08/2015 07:52 PM, Le Quoc Do wrote:
> Hi Flinkers,
>
> I'm start working with Flink and I would like to contribute to Flink.
> However, I'm a very new Flinker, so the first thing I could contribute
> is a one-click
> style deployment script to deploy Flink, Spark and Hadoop Yarn on cluster
> and cloud computing environments (OpenStack based Cloud). It's available
> here: https://bitbucket.org/lequocdo/flink-setup.
> I added Spark to the script, since there are many people want to compare
> performance between Spark and Flink.
> I have tested the script with our cluster and a OpenStack based cloud.
> Please let me know if this contribution makes some senses or not. Your
> feedback and comments will be greatly appreciated.
>
> Thank you,
> Do
Hi Do,
Thanks for the script. I'm sure it will be helpful to people who want to set up their own cluster.

Some people use a performance-testing tool called Yoka, which also takes care of setting up a Flink and Hadoop cluster. For example, the Flink part is available here:
https://github.com/mxm/yoka/blob/master/cluster/flink.py

Furthermore, Flink support is built into bdutil for Google Compute Engine:
https://github.com/GoogleCloudPlatform/bdutil
https://ci.apache.org/projects/flink/flink-docs-release-0.10/setup/gce_setup.html

We have received some feedback that it would be nice to have a script included in Flink that takes care of cluster setup and configuration. The scope of such a script would be to

a) bring up instances at a cloud provider
b) install Flink and its dependencies
c) configure defaults according to the cluster.

As far as I can see, the last two points are already somewhat covered in your script.

It would definitely make sense to link your script in the new "External Projects" section. The main issue with including it directly in Flink is that it would need to be maintained across different Flink versions. The other issue I see is that its defaults might not work for all users and cluster setups. It would be nice to also provide an option to configure parameters explicitly.

Best,
Max
Hi Max,
Thank you for your comments.

You wrote:
> For example, the Flink part is available here:
> https://github.com/mxm/yoka/blob/master/cluster/flink.py

Nice one.

> The scope of such a script would be to a) bring up
> instances at a clouder provider b) install Flink and its dependencies
> c) configure defaults according to the cluster. The last two points I
> see are already somewhat covered in your script.

Yes, the script is designed for people who want to deploy Flink and Hadoop on their own clusters. However, I could add the first point, too.

> It would definitely make sense to link your script in the new
> "External Projects" section.

Great!

> The other issue I see is that its defaults might not work
> for all users and cluster setups. It would be nice to also provide an
> option to configure parameters explicitly.

I agree, I will restructure it. I will get back to you soon.

Thank you very much,
Do
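
The idea discussed in this thread, defaults derived from the cluster plus an option to set parameters explicitly, could look roughly like the sketch below. It is not taken from the flink-setup script or from Yoka. The function names, the command-line flags and the sizing heuristics (one slot per core, 80% of worker memory for the heap) are assumptions made up for illustration, and the configuration keys are the ones I believe Flink 0.10 reads from flink-conf.yaml, so please check them against the documentation of your Flink version.

```python
import argparse


def derive_defaults(master_host, num_workers, cores_per_worker, mem_mb_per_worker):
    """Derive Flink settings from the cluster layout (illustrative heuristics only)."""
    return {
        "jobmanager.rpc.address": master_host,
        # Assumption: one task slot per CPU core on each worker.
        "taskmanager.numberOfTaskSlots": cores_per_worker,
        "parallelism.default": num_workers * cores_per_worker,
        # Assumption: give Flink 80% of worker memory, leave the rest to the OS.
        "taskmanager.heap.mb": mem_mb_per_worker * 8 // 10,
    }


def apply_overrides(defaults, overrides):
    """Return a copy of `defaults` with explicit KEY=VALUE overrides applied."""
    config = dict(defaults)
    for item in overrides:
        key, sep, value = item.partition("=")
        if not sep or not key.strip():
            raise ValueError("expected KEY=VALUE, got %r" % item)
        config[key.strip()] = value.strip()
    return config


def render(config):
    """Render the settings as sorted 'key: value' lines (flink-conf.yaml style)."""
    return "\n".join("%s: %s" % (key, config[key]) for key in sorted(config))


def main(argv=None):
    """Parse the (hypothetical) command line and return the rendered settings."""
    parser = argparse.ArgumentParser(description="Generate Flink settings (sketch).")
    parser.add_argument("--master", required=True)
    parser.add_argument("--workers", type=int, required=True)
    parser.add_argument("--cores", type=int, required=True)
    parser.add_argument("--memory-mb", type=int, required=True)
    parser.add_argument("--set", action="append", default=[], metavar="KEY=VALUE",
                        help="override a derived default; may be repeated")
    args = parser.parse_args(argv)
    defaults = derive_defaults(args.master, args.workers, args.cores, args.memory_mb)
    return render(apply_overrides(defaults, args.set))
```

For example, `print(main(["--master", "node0", "--workers", "4", "--cores", "8", "--memory-mb", "4096", "--set", "taskmanager.heap.mb=2048"]))` keeps the derived address, slots and parallelism but uses the explicitly given heap size. Keeping the derivation and the overrides in separate functions means users whose clusters do not fit the heuristics can change individual values without editing the script.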