[jira] [Created] (FLINK-4438) FlinkML Quickstart Guide implies incorrect type for test data


Shang Yuanchun (Jira)
Ahmad Ragab created FLINK-4438:
----------------------------------

             Summary: FlinkML Quickstart Guide implies incorrect type for test data
                 Key: FLINK-4438
                 URL: https://issues.apache.org/jira/browse/FLINK-4438
             Project: Flink
          Issue Type: Bug
          Components: Documentation
    Affects Versions: 1.2.0
            Reporter: Ahmad Ragab
            Priority: Minor
             Fix For: 1.2.0


https://ci.apache.org/projects/flink/flink-docs-master/apis/batch/libs/ml/quickstart.html

The documentation under the *LibSVM* section says:
----
We can simply import the dataset then using:

{code:java}
import org.apache.flink.ml.MLUtils

val astroTrain: DataSet[LabeledVector] = MLUtils.readLibSVM("/path/to/svmguide1")
val astroTest: DataSet[LabeledVector] = MLUtils.readLibSVM("/path/to/svmguide1.t")
{code}

This gives us two {{DataSet\[LabeledVector\]}} objects that we will use in the following section to create a classifier.
----
Test data generally would not be of type {{LabeledVector}}. As the other examples describe, it would be a {{DataSet\[Vector\]}}, since prediction is what generates the labels. Thus, after the file is read with {{MLUtils}}, each element should be mapped to its vector.
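
A minimal sketch of the mapping described above, not a proposed final wording for the guide. It assumes the Flink 1.x Scala batch API, in which {{MLUtils.readLibSVM}} takes an {{ExecutionEnvironment}} as its first argument (the quoted snippet omits it). The object name {{LibSvmTestData}} is hypothetical, and the paths are the placeholders from the quickstart.

```scala
import org.apache.flink.api.scala._
import org.apache.flink.ml.MLUtils
import org.apache.flink.ml.common.LabeledVector
import org.apache.flink.ml.math.Vector

// Hypothetical wrapper object, used only to make the sketch compilable.
object LibSvmTestData {
  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment

    // Training data keeps its labels.
    val astroTrain: DataSet[LabeledVector] =
      MLUtils.readLibSVM(env, "/path/to/svmguide1")

    // Test data: keep only the feature vectors, since prediction
    // is what should produce the labels.
    val astroTest: DataSet[Vector] =
      MLUtils.readLibSVM(env, "/path/to/svmguide1.t").map(lv => lv.vector)

    // Print a few vectors to trigger execution of the job.
    astroTest.first(5).print()
  }
}
```

If the labels of the test file are needed later to evaluate the predictions, mapping to {{(lv.vector, lv.label)}} pairs instead would retain them.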

Also, the preceding *Loading Data* section should include an example of using the {{Splitter}} to prepare the {{survivalLV}} data for use with a learner.
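
One way such a {{Splitter}} example might look, as a sketch under assumptions: it uses {{Splitter.trainTestSplit}} from {{org.apache.flink.ml.preprocessing}}, takes {{survivalLV}} to be the {{DataSet\[LabeledVector\]}} built in the *Loading Data* section, and picks an 80/20 split purely for illustration. The object and method names are hypothetical.

```scala
import org.apache.flink.api.scala._
import org.apache.flink.ml.common.LabeledVector
import org.apache.flink.ml.math.Vector
import org.apache.flink.ml.preprocessing.Splitter

// Hypothetical helper, used only to make the sketch compilable.
object SurvivalSplit {

  // survivalLV is assumed to be the labeled data set from the Loading Data section.
  def split(survivalLV: DataSet[LabeledVector])
      : (DataSet[LabeledVector], DataSet[Vector]) = {
    // The second argument is the fraction of elements assigned to the
    // training set; 0.8 is an arbitrary choice for this sketch.
    val parts = Splitter.trainTestSplit(survivalLV, 0.8)

    // The training part keeps its labels; the testing part is reduced
    // to feature vectors so that it can be passed to predict.
    (parts.training, parts.testing.map(_.vector))
  }
}
```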



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)