Hi
I'm using Flink ML 9.2.1 to perform a multiple linear regression on a CSV data file.

The Scala sample code for it is pretty straightforward:

    val mlr = MultipleLinearRegression()

    val parameters = ParameterMap()

    parameters.add(MultipleLinearRegression.Stepsize, 2.0)
    parameters.add(MultipleLinearRegression.Iterations, 10)
    parameters.add(MultipleLinearRegression.ConvergenceThreshold, 0.001)
    val inputDS = env.fromCollection(data)

    mlr.fit(inputDS, parameters)

When I use Java (8), the fit method takes 3 parameters:
1. the dataset
2. the parameters
3. an object which implements the fitOperation interface

    multipleLinearRegression.fit(regressionDS, parameters, fitOperation);

Is there a need to implement the fitOperation interface, which has already been implemented in Flink's ML source code?

Another option is the MultipleLinearRegression.fitMLR() method, but I haven't found a way to pass the training dataset to it, either as a parameter or through a setter.

I'd be more than happy if you could guide me on how to implement this in Java.

Thanks

Hanan Meyer
Hello everyone.
Do you have a Java sample showing how to implement the Flink MultipleLinearRegression example?
Scala is great, but we would like to see an exact example that we could invoke from Java, if that is possible.
Thanks, and sorry for the interruption.

(In reply to Hanan Meyer, Thu, Sep 17, 2015 at 4:27 PM)

--
Regards
Alexey Sapozhnikov
CTO & Co-Founder
Scalabillit Inc
Aba Even 10-C, Herzelia, Israel
M : +972-52-2363823
E : [hidden email]
W : http://www.scalabill.it
YT - https://youtu.be/9Rj309PTOFA
Map: http://mapta.gs/Scalabillit
Revolutionizing Proof-of-Concept
Hi Alexey and Hanan,
One of FlinkML’s features is the flexible pipelining mechanism. It allows you to chain multiple transformers with a trailing predictor to form a data analysis pipeline. To support multiple input types, the actual program logic (matching on the type) is assembled at compile time by the Scala compiler using implicits. That is also why, in Java, you see the third parameter fitOperation when calling multipleLinearRegression.fit(); in Scala it is an implicit parameter. In theory, you can construct the pipelines yourself in Java by explicitly assembling the respective implicit operations. However, I would refrain from doing so, because it is error-prone and laborious.

At the moment, I don’t really see an easy way to port the pipelining mechanism to Java (8), because Java lacks implicits. However, what we could do is provide fit, predict and transform methods which can be used without the chaining support. You then lose the pipelining, but you can do it manually by calling the methods (e.g. fit and transform) for each stage. We could add a thin Java layer which calls the Scala methods with the correctly instantiated operations.

Cheers,
Till

(In reply to Alexey Sapozhnikov, Thu, Sep 17, 2015 at 7:05 PM)
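To make the point about implicits concrete, here is a minimal, self-contained Java sketch. It does not use FlinkML at all: FitOperation, MeanModel, MEAN_FIT_ON_DOUBLES and JavaMeanModel are hypothetical stand-ins invented for illustration, and the real FlinkML signatures may differ. It only shows the general pattern: an argument that the Scala compiler would resolve implicitly has to be passed by hand in Java, and a thin wrapper can hide it from the caller.

```java
import java.util.Arrays;
import java.util.List;

public class ExplicitFitOperationSketch {

    // Hypothetical stand-in for a type class: knows how to train
    // an estimator of type E on training data of type T.
    interface FitOperation<E, T> {
        void fit(E estimator, List<T> training);
    }

    // Toy estimator (not a FlinkML class): "learns" the mean of its training values.
    static class MeanModel {
        double mean = Double.NaN;

        // In Scala the last parameter could be declared implicit and be
        // filled in by the compiler. Java has no implicits, so the caller
        // must supply the operation explicitly.
        <T> void fit(List<T> training, FitOperation<MeanModel, T> op) {
            op.fit(this, training);
        }
    }

    // One concrete operation for one input type (Double).
    static final FitOperation<MeanModel, Double> MEAN_FIT_ON_DOUBLES =
        (model, data) -> model.mean =
            data.stream().mapToDouble(Double::doubleValue).average().orElse(Double.NaN);

    // Thin Java-facing layer: selects the operation so callers never see it.
    static class JavaMeanModel {
        private final MeanModel delegate = new MeanModel();

        void fit(List<Double> training) {
            delegate.fit(training, MEAN_FIT_ON_DOUBLES);
        }

        double mean() {
            return delegate.mean;
        }
    }

    public static void main(String[] args) {
        JavaMeanModel model = new JavaMeanModel();
        model.fit(Arrays.asList(1.0, 2.0, 3.0));
        System.out.println(model.mean());
    }
}
```

The wrapper fixes the input type to List<Double>, which is the trade-off of this approach: each supported input type needs its own overload or wrapper, because nothing selects the operation automatically.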
+1. Convenient creation of pipelines for Java is more of a long-term project, but we should make it possible to create pipelines manually in Java.

(In reply to Till Rohrmann, Fri, Sep 18, 2015 at 11:15 AM)
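As a rough illustration of what creating a pipeline manually could look like, the sketch below calls fit and transform on each stage by hand instead of chaining the stages. It is plain Java with no FlinkML dependency; MaxAbsScaler and MeanPredictor are hypothetical toy stages made up for this example, not classes from the library.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class ManualPipelineSketch {

    // Hypothetical transformer stage: rescales values by the largest
    // absolute value seen during fit().
    static class MaxAbsScaler {
        private double maxAbs = 1.0;

        void fit(List<Double> data) {
            maxAbs = data.stream().mapToDouble(d -> Math.abs(d)).max().orElse(1.0);
            if (maxAbs == 0.0) {
                maxAbs = 1.0; // avoid division by zero for all-zero input
            }
        }

        List<Double> transform(List<Double> data) {
            return data.stream().map(x -> x / maxAbs).collect(Collectors.toList());
        }
    }

    // Hypothetical predictor stage: always predicts the mean of its training values.
    static class MeanPredictor {
        private double mean = 0.0;

        void fit(List<Double> data) {
            mean = data.stream().mapToDouble(Double::doubleValue).average().orElse(0.0);
        }

        double predict() {
            return mean;
        }
    }

    public static void main(String[] args) {
        List<Double> raw = Arrays.asList(2.0, 4.0, 8.0);

        // Stage 1: fit the transformer, then apply it to the training data.
        MaxAbsScaler scaler = new MaxAbsScaler();
        scaler.fit(raw);
        List<Double> scaled = scaler.transform(raw);

        // Stage 2: fit the predictor on the output of stage 1.
        MeanPredictor predictor = new MeanPredictor();
        predictor.fit(scaled);

        System.out.println(scaled + " -> " + predictor.predict());
    }
}
```

The caller is responsible for feeding each stage's output into the next one and for applying the same fitted transformer to any data used for prediction later; that bookkeeping is what the chaining mechanism would otherwise do.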
Thank you very much for the clarifications.
-----Original Message-----
From: Theodore Vasiloudis [mailto:[hidden email]]
Sent: Friday, September 18, 2015 2:33 PM
To: [hidden email]
Cc: Hanan Meyer
Subject: Re: Flink ML linear regression issue