Implicit class RichExecutionEnvironment - Can't use MLUtils.readLibSVM(path) in QuickStart guide

Thomas FOURNIER
Hello,

Following the FlinkML QuickStart guide, I have to write the following:

val astroTrain: DataSet[LabeledVector] = MLUtils.readLibSVM(env, "src/main/resources/svmguide1")

instead of:

val astroTrain: DataSet[LabeledVector] = MLUtils.readLibSVM("src/main/resources/svmguide1")

Nonetheless, this implicit class in the ml package object

implicit class RichExecutionEnvironment(executionEnvironment: ExecutionEnvironment) {
  def readLibSVM(path: String): DataSet[LabeledVector] = {
    MLUtils.readLibSVM(executionEnvironment, path)
  }
}

is supposed to pimp MLUtils in the way we want.

Does this mean that RichExecutionEnvironment is not imported into scope?
What can be done to solve this?


Thanks

Regards
Thomas

Theodore Vasiloudis
This is caused by a missing wildcard import of the Scala API. It was
reported and has already been fixed on master [1].

[1] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/jira-Created-FLINK-4792-Update-documentation-QuickStart-FlinkML-td13936.html

Thomas FOURNIER
Yep, I've done that: import org.apache.flink.api.scala._

I reported this issue, but I still have the same problem.

My code is the following (with imports):

import org.apache.flink.api.scala._
import org.apache.flink.ml._

import org.apache.flink.ml.classification.SVM
import org.apache.flink.ml.common.LabeledVector
import org.apache.flink.ml.math.DenseVector
import org.apache.flink.ml.math.Vector

object App {

  def main(args: Array[String]) {

    val env = ExecutionEnvironment.getExecutionEnvironment
    val survival = env.readCsvFile[(String, String, String, String)]("src/main/resources/haberman.data", ",")


    val survivalLV = survival
      .map { tuple =>
        val list = tuple.productIterator.toList
        val numList = list.map(_.asInstanceOf[String].toDouble)
        LabeledVector(numList(3), DenseVector(numList.take(3).toArray))
      }



    val astroTrain: DataSet[LabeledVector] = MLUtils.readLibSVM(env, "src/main/resources/svmguide1")

    val astroTest: DataSet[(Vector, Double)] = MLUtils
      .readLibSVM(env, "src/main/resources/svmguide1.t")
      .map(l => (l.vector, l.label))

    val svm = SVM()
      .setBlocks(env.getParallelism)
      .setIterations(100)
      .setRegularization(0.001)
      .setStepsize(0.1)
      .setSeed(42)

    svm.fit(astroTrain)
    println(svm.toString)


    val predictionPairs = svm.evaluate(astroTest)
    predictionPairs.print()

  }
}



And I can't write:

MLUtils.readLibSVM("src/main/resources/svmguide1")

Theodore Vasiloudis
I copy-pasted your code into an example project and it compiles fine. Are
you sure your project imports are done correctly?

Here's the sbt file I'm using:

resolvers in ThisBuild ++= Seq(
  "Apache Development Snapshot Repository" at "https://repository.apache.org/content/repositories/snapshots/",
  Resolver.mavenLocal)

name := "Flink Project"

version := "0.1-SNAPSHOT"

organization := "org.example"

scalaVersion in ThisBuild := "2.11.7"

val flinkVersion = "1.1.0"

val flinkDependencies = Seq(
  "org.apache.flink" %% "flink-scala" % flinkVersion % "provided",
  "org.apache.flink" %% "flink-streaming-scala" % flinkVersion % "provided",
  "org.apache.flink" %% "flink-ml" % flinkVersion % "provided")

lazy val root = (project in file(".")).
  settings(
    libraryDependencies ++= flinkDependencies
  )

It comes from Till's Flink Quickstart project
<https://github.com/tillrohrmann/flink-project>.

The RichExecutionEnvironment comes from the org.apache.flink.ml._ import.


Theodore Vasiloudis
Oh, sorry, I just noticed the error. You should be calling
env.readLibSVM(path), the method defined in the implicit class you quoted.
The implicit class pimps the ExecutionEnvironment, not MLUtils, so
MLUtils.readLibSVM still requires the env as an argument.
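
To make the distinction concrete, here is a minimal, self-contained sketch of the same pattern. It uses stand-in types rather than the real Flink classes: Env, Utils, Syntax and RichEnv are hypothetical names chosen only for illustration, playing the roles of ExecutionEnvironment, MLUtils, the ml package object and RichExecutionEnvironment.

```scala
// Stand-in for ExecutionEnvironment (hypothetical, not a Flink class).
final class Env(val name: String)

// Stand-in for MLUtils: the environment is always an explicit argument.
object Utils {
  def read(env: Env, path: String): String = s"${env.name}:$path"
}

// Stand-in for the ml package object that holds the implicit class.
object Syntax {
  // Wraps Env, so the extra method becomes available on Env values only.
  implicit class RichEnv(env: Env) {
    def read(path: String): String = Utils.read(env, path)
  }
}

object Demo {
  import Syntax._ // brings RichEnv into scope, like import org.apache.flink.ml._

  def main(args: Array[String]): Unit = {
    val env = new Env("local")
    // Compiles: the compiler rewrites this to new RichEnv(env).read(...).
    println(env.read("svmguide1"))
    // The utility object is unchanged and still needs env explicitly.
    println(Utils.read(env, "svmguide1"))
    // Utils.read("svmguide1") // would not compile: Utils gained no new overload
  }
}
```

The implicit conversion is only tried on the value the method is called on (here env), which is why the one-argument form exists on the environment and not on the utility object.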
