(DEPRECATED) Apache Flink Mailing List archive.

Student looking to contribute to Stratosphere


Rohit Shinde

Student looking to contribute to Stratosphere

Hello everyone,

I came across Stratosphere while looking for GSOC organisations working in
Machine Learning. I learned that it has since become Apache Flink.

I am interested in this project:
https://github.com/stratosphere/stratosphere/wiki/Google-Summer-of-Code-2014#implement-one-or-multiple-machine-learning-algorithms-for-stratosphere

Background: I am proficient in C++, Java, Python and Scheme. I have taken
undergrad courses in machine learning and data mining. How can I contribute
to the above project?

Thank you,
Rohit Shinde.

Chiwan Park-2

Re: Student looking to contribute to Stratosphere

Hi, you can choose any unassigned issue related to the Flink Machine Learning Library (flink-ml) in JIRA. [1]
There are some starter issues in flink-ml, such as FLINK-1737 [2], FLINK-1748 [3], and FLINK-1994 [4].

First, it would be good to read the articles about contributing to Flink. [5][6]
Once you have decided on an issue to work on, please assign it to yourself. If you don’t have permission to
assign it, just comment on the issue. Someone will then grant you permission and assign
the issue to you.

Regards,
Chiwan Park

[1] https://issues.apache.org/jira/
[2] https://issues.apache.org/jira/browse/FLINK-1737
[3] https://issues.apache.org/jira/browse/FLINK-1748
[4] https://issues.apache.org/jira/browse/FLINK-1994
[5] http://flink.apache.org/how-to-contribute.html
[6] http://flink.apache.org/coding-guidelines.html

> On Jun 27, 2015, at 11:20 PM, Rohit Shinde <[hidden email]> wrote:

Rohit Shinde

Re: Student looking to contribute to Stratosphere

Hi,

Sorry for the brief hiatus. I was preparing for my GRE exam, but I am back.
I am starting to build Flink, and I have one question: is a single-node
Hadoop cluster configuration enough? I assume Hadoop is needed since it
is mentioned on the build page.

On Sat, Jun 27, 2015 at 8:02 PM, Chiwan Park <[hidden email]> wrote:


Márton Balassi

Re: Student looking to contribute to Stratosphere

Hi,

Hadoop is not a necessity for running Flink, but rather an option. Try the
steps of the setup guide. [1]
If you really need HDFS, though, then to get the best I/O performance I would
suggest having Hadoop on all the machines running Flink.

[1]
https://ci.apache.org/projects/flink/flink-docs-release-0.9/quickstart/setup_quickstart.html

On Jul 15, 2015 5:27 AM, "Rohit Shinde" <[hidden email]> wrote:


Kostas Tzoumas-2

Re: Student looking to contribute to Stratosphere

Hi Rohit,

If you are just working on your laptop, I personally find it much easier to
work without Hadoop and use the local file system or just Java collections
for testing and trying out ideas.

When you move to a cluster, it is common to use a Hadoop installation to
store large files in HDFS. There, you can run Flink jobs using Flink's YARN
mode.

Kostas

On Wed, Jul 15, 2015 at 8:22 AM, Márton Balassi <[hidden email]>
wrote:


Rohit Shinde

Re: Student looking to contribute to Stratosphere

What IDE should I use? There are various options and I already have Eclipse
Luna. The IDE page says that the Scala IDE is the best option. So should I go
with the Scala IDE? Will I be able to develop in Java later?

On Wed, Jul 15, 2015 at 4:44 PM, Kostas Tzoumas <[hidden email]> wrote:


Kostas Tzoumas-2

Re: Student looking to contribute to Stratosphere

IDE choice is up to you, with some limitations. See here for IDE setup
instructions:
https://ci.apache.org/projects/flink/flink-docs-release-0.9/internals/ide_setup.html

Scala IDE is not limited to Scala; it is based on Eclipse, so you can also
develop in Java. Most committers are using IntelliJ as far as I know.

On Wed, Jul 15, 2015 at 1:24 PM, Rohit Shinde <[hidden email]>
wrote:


Rohit Shinde

Re: Student looking to contribute to Stratosphere

I intend to work on this issue:
https://issues.apache.org/jira/browse/FLINK-1748

Could someone give me some pointers on how to approach this?

On Wed, Jul 15, 2015 at 4:58 PM, Kostas Tzoumas <[hidden email]> wrote:


Ufuk Celebi-2

Re: Student looking to contribute to Stratosphere

Hey Rohit,

It's best to discuss a specific issue *in* the issue
itself rather than on the mailing list.

In general, it's better to ask specific questions. But a general pointer
would be to look into the existing ML algorithm implementations and Stephan's
approximate PageRank implementation linked in the issue, and then think
about how to translate it into the ML library. This would also be a first
step toward asking more specific questions.

– Ufuk

On Wed, Jul 15, 2015 at 2:42 PM, Rohit Shinde <[hidden email]>
wrote:


Rohit Shinde

Re: Student looking to contribute to Stratosphere

Okay!

Thank you!

On Wed, Jul 15, 2015 at 6:22 PM, Ufuk Celebi <[hidden email]> wrote:
