Hello everybody!
My name is Nico Duldhardt and I am currently studying Computer Science at Leipzig University.
I would like to work on the Random Forest implementation for the Machine Learning Library: Jira Issue 1728 <https://issues.apache.org/jira/browse/FLINK-1728?jql=project%20%3D%20FLINK%20AND%20resolution%20%3D%20Unresolved%20AND%20component%20%3D%20%22Machine%20Learning%20Library%22%20AND%20text%20~%20%22random%20forest%22%20ORDER%20BY%20priority%20DESC>.
For my bachelor's thesis I am currently implementing a random forest using Apache Flink to find duplicates in databases. Feel free to take a look at the code here: GitHub <https://github.com/2start/TreeBasedLearning>. The classification of numerical data is already working, and a working example can be found in the tests. The code is nowhere near something that could be put in the library, but I have continually improved my coding skills over the last months and feel ready to reimplement a random forest that meets the project's standards.
I was told about Google Summer of Code yesterday and would prefer to write the implementation in that context. The program requires me to have a mentor. Mentor Guide <https://community.apache.org/guide-to-being-a-mentor.html>: "Most mentors spend between 3 and 5 hours per week with their students. Most of this time is spent encouraging them."
I would gladly share the Google Summer of Code compensation with the mentor to compensate for their time.
Feel free to ask me any further questions.

Best Regards

Nico Duldhardt, a guy who is already enthusiastically looking forward to his first contribution to an open source project.
Hi Nico,
Thanks for getting in touch with the Flink community!
You have dug out a pretty old JIRA issue. When the issue was created, most of the development happened in the batch processing / machine learning area. Since then, the focus of the project has shifted towards stream processing, and most contributors and committers are working in that space.
Unfortunately, I don't think there is a committer working on ML algorithms on Flink at the moment. AFAIK, nobody has signed up as a GSoC mentor this year either.

Best, Fabian

2018-03-09 15:03 GMT+01:00 Nico Duldhardt <[hidden email]>: