[jira] [Created] (FLINK-13924) Add summarizer and summary for sparse vector and dense vector.

Shang Yuanchun (Jira)
Xu Yang created FLINK-13924:
-------------------------------

             Summary: Add summarizer and summary for sparse vector and dense vector.
                 Key: FLINK-13924
                 URL: https://issues.apache.org/jira/browse/FLINK-13924
             Project: Flink
          Issue Type: Sub-task
          Components: Library / Machine Learning
            Reporter: Xu Yang


A summarizer is the class that calculates statistics, and a summary is the result class of a summarizer; the summary defines the methods for reading the statistics. The input data may mix dense and sparse vectors, and the vector sizes may differ, so if a DenseVectorSummarizer visits a sparse vector, it switches to a SparseVectorSummarizer.
The statistics include vectorSize, count, mean, variance, min, max, standardDeviation, normL1, and normL2.
 * Add SparseVectorSummarizer, which calculates statistics for sparse vectors.
 * Add SparseVectorSummary, which exposes the statistics of sparse vectors.
 * Add DenseVectorSummarizer, which calculates statistics for dense vectors.
 * Add DenseVectorSummary, which exposes the statistics of dense vectors.
 * Add StatisticsUtil, which provides utility functions for the summarizers and summaries.
 * Add VectorSummarizerUtil, which provides utility functions for VectorSummarizer.
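
For illustration only, the statistics listed above could be accumulated in a single pass roughly as in the sketch below. This is not the implementation proposed in this issue: the class name VectorSummarizerSketch and all of its methods are hypothetical, a single class stands in for the separate dense and sparse summarizers, the variance is assumed to be the sample variance (divisor count - 1), and entries that a vector does not store (including dimensions beyond a shorter vector's size) are assumed to count as zeros.

```java
import java.util.Arrays;

/** Hypothetical sketch of a per-dimension vector summarizer; not the FLINK-13924 classes. */
public class VectorSummarizerSketch {
    private long count;                        // number of vectors visited
    private int size;                          // largest vector size seen so far
    private double[] sum = new double[0];      // per-dimension sum of values
    private double[] squareSum = new double[0];// per-dimension sum of squared values
    private double[] absSum = new double[0];   // per-dimension sum of absolute values
    private double[] minStored = new double[0];// min over explicitly stored values
    private double[] maxStored = new double[0];// max over explicitly stored values
    private long[] numStored = new long[0];    // how many vectors stored a value here

    /** Visits a dense vector by treating every entry as explicitly stored. */
    public void visitDense(double[] values) {
        int[] indices = new int[values.length];
        for (int i = 0; i < indices.length; i++) {
            indices[i] = i;
        }
        visitSparse(values.length, indices, values);
    }

    /** Visits a sparse vector; assumes every index is smaller than vectorSize. */
    public void visitSparse(int vectorSize, int[] indices, double[] values) {
        if (vectorSize > size) {
            grow(vectorSize);
        }
        for (int k = 0; k < indices.length; k++) {
            int i = indices[k];
            double v = values[k];
            sum[i] += v;
            squareSum[i] += v * v;
            absSum[i] += Math.abs(v);
            minStored[i] = Math.min(minStored[i], v);
            maxStored[i] = Math.max(maxStored[i], v);
            numStored[i]++;
        }
        count++;
    }

    /** Enlarges the accumulators when a longer vector arrives. */
    private void grow(int newSize) {
        sum = Arrays.copyOf(sum, newSize);
        squareSum = Arrays.copyOf(squareSum, newSize);
        absSum = Arrays.copyOf(absSum, newSize);
        numStored = Arrays.copyOf(numStored, newSize);
        minStored = Arrays.copyOf(minStored, newSize);
        maxStored = Arrays.copyOf(maxStored, newSize);
        Arrays.fill(minStored, size, newSize, Double.POSITIVE_INFINITY);
        Arrays.fill(maxStored, size, newSize, Double.NEGATIVE_INFINITY);
        size = newSize;
    }

    public int vectorSize() { return size; }
    public long count() { return count; }
    public double mean(int i) { return sum[i] / count; }

    /** Sample variance; returns 0 when fewer than two vectors were visited. */
    public double variance(int i) {
        if (count < 2) {
            return 0.0;
        }
        double v = (squareSum[i] - sum[i] * sum[i] / count) / (count - 1);
        return Math.max(0.0, v); // guard against tiny negative rounding errors
    }

    public double standardDeviation(int i) { return Math.sqrt(variance(i)); }

    /** Minimum, counting an implicit zero if some vector did not store this dimension. */
    public double min(int i) {
        return numStored[i] < count ? Math.min(minStored[i], 0.0) : minStored[i];
    }

    /** Maximum, counting an implicit zero if some vector did not store this dimension. */
    public double max(int i) {
        return numStored[i] < count ? Math.max(maxStored[i], 0.0) : maxStored[i];
    }

    public double normL1(int i) { return absSum[i]; }
    public double normL2(int i) { return Math.sqrt(squareSum[i]); }
}
```

Keeping only running sums, extrema, and stored-value counts means that each vector costs time proportional to its stored entries, which is the main reason to handle sparse vectors separately from dense ones.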



--
This message was sent by Atlassian Jira
(v8.3.2#803003)