Machine Learning Library Contribution

Nico Duldhardt
Hello everybody!

My name is Nico Duldhardt and I am currently studying Computer Science at
Leipzig University.
I would like to work on the Random Forest implementation for the Machine
Learning Library, Jira issue FLINK-1728
<https://issues.apache.org/jira/browse/FLINK-1728?jql=project%20%3D%20FLINK%20AND%20resolution%20%3D%20Unresolved%20AND%20component%20%3D%20%22Machine%20Learning%20Library%22%20AND%20text%20~%20%22random%20forest%22%20ORDER%20BY%20priority%20DESC>.
For my bachelor's thesis I am currently implementing a random forest using
Apache Flink to find duplicates in databases. Feel free to take a look at
the code here: GitHub <https://github.com/2start/TreeBasedLearning>.
Classification of numerical data already works, and a working example can
be found in the tests. The code is nowhere near ready for inclusion in the
library, but I have steadily improved my coding skills over the last few
months and feel ready to reimplement a random forest that meets the
project's standards.
I was told about Google Summer of Code yesterday and would prefer to write
the implementation as part of that program, which requires me to have a
mentor. According to the Mentor Guide
<https://community.apache.org/guide-to-being-a-mentor.html>: "Most mentors
spend between 3 and 5 hours per week with their students. Most of this time
is spent encouraging them."
I would gladly share the Google Summer of Code compensation with the mentor
in return for their time.
Feel free to ask me any further questions.

Best Regards

Nico Duldhardt, a guy who is already enthusiastically looking forward to
his first contribution to an open source project.

Re: Machine Learning Library Contribution

Fabian Hueske-2
Hi Nico,

Thanks for getting in touch with the Flink community!
You have dug out a pretty old JIRA issue. When the issue was created, most
of the development happened in the batch processing / machine learning area.
Since then, the focus of the project has shifted towards stream processing.
Most contributors and committers are working in that space.

I'm afraid I don't think any committer is working on ML algorithms on Flink
at the moment.
AFAIK, nobody has signed up as a GSoC mentor this year either.

Best, Fabian


2018-03-09 15:03 GMT+01:00 Nico Duldhardt <[hidden email]>:

> [...]