Hello,
I'm Dongwon Kim and I want to get involved in the Flink community.
Can anyone guide me through contributing to Flink with some starter issues?
Although my research interests lie in big data systems including Flink, Spark, MapReduce, and Tez, I've never participated in open source communities.

FYI, I've done the following things over the past few years:
- I've studied Apache Hadoop (MRv1, MRv2, and YARN), Apache Tez, and Apache Spark through the source code.
- My doctoral thesis is about improving the performance of MRv1 by making network pipelines between mappers and reducers, like what Flink does.
- I've used Ganglia to monitor cluster performance, and I've been interested in metrics and counters in big data systems.
- I gave a talk named "a comparative performance evaluation of Flink" at the last Flink Forward.

I would really appreciate it if someone could help me get involved in the most promising ASF project :-)

Greetings,
Dongwon Kim
Hi Dongwon,
Welcome to the Flink mailing list!
What kind of issues are you interested in?

- API / library features: DataSet API, DataStream API, SQL, StreamSQL, Graphs (Gelly)
- Processing runtime: Batch, Streaming
- Connectors to other systems: Stream sources/sinks
- Web dashboard
- Compatibility: Storm, Hadoop

You can also have a look at Flink's issue tracker, JIRA [1]. Right now, we have about 600 issues listed, covering every level of difficulty and effort.
If you find an issue that sounds interesting, just drop a note and we can give you some details about it if you want to learn more.

Best, Fabian

[1] https://issues.apache.org/jira/issues/?jql=project%20%3D%20FLINK%20AND%20resolution%20%3D%20Unresolved

In reply to Dongwon Kim's message of 2016-02-05 17:14 GMT+01:00.
Hi Dongwon,
Very cool that you decided to join the community. Btw: very nice talk at Flink Forward!

Fabian already pointed out the most important things. One more thing I wanted to add, just in case you are not aware of it already: there is a "How to contribute" section on the Flink web page:

https://flink.apache.org/how-to-contribute.html

This should also help you get started. Looking forward to your first pull request!

-Matthias

In reply to Fabian Hueske's message of 02/05/2016 08:55 PM.
In reply to this post by Fabian Hueske-2
Hi Dongwon Kim,
It's great to see you here. I really enjoyed your talk at Flink Forward; you did very good and detailed research on the different systems! (Those who didn't see the talk: go watch it on YouTube.)

Maybe you are interested in working on improving our monitoring / metrics system. People have asked for ways to expose the metrics to other systems (via JMX, Ganglia, ...). There were also questions about writing the metrics to disk (in a CSV file or so).
In reply to this post by Fabian Hueske-2
Hi Fabian, Matthias, Robert!
Thank you for welcoming me to the community :-)
I'm taking a look at JIRA and "How to contribute" as you suggested.
One trivial question: do I just need to make a pull request after figuring out an issue?
If so, I'll pick an issue, figure it out, and then make a pull request myself ;-)

Meanwhile, I also read the roadmap and found a few plans that caught my interest:
- Making YARN resource dynamic
- DataSet API Enhancements
- Expose more runtime metrics
Could any of you point me to new or existing issues regarding the above?

Thanks!

Dongwon
Hi Dongwon,
Yes, the steps are: pick an issue (by assigning it to yourself or commenting on it), make your changes, and send a pull request for it.

Welcome! :)

Regards,
Chiwan Park

In reply to Dongwon Kim's message of Feb 6, 2016, 3:31 PM.
Hi Chiwan!
That's what I wanted to know!
Thanks!

Dongwon Kim

In reply to Chiwan Park's message of 2016-02-06 22:00 GMT+09:00.
For the road map ideas, there are often no JIRAs created yet. Mostly, road map ideas are more complex things to get done, requiring design documents and discussions before the actual coding can start. Usually, we create the JIRA (or multiple JIRAs) during the design phase. So just watch the mailing list to keep track of the road map ideas you are interested in.

Of course, if you want to get started with any of those, you can start the discussion on the mailing list yourself and also start a design document, etc. Just be aware that this process will take some time, as the community will give you a lot of feedback.

If you want to get started more quickly, working on an existing JIRA with limited scope is a good starting point -- or you can just do both in parallel ;)

-Matthias

In reply to Dongwon Kim's message of 02/06/2016 02:10 PM.