Xu Yang created FLINK-12671:
-------------------------------
Summary: Summarizer: summary statistics for Table
Key: FLINK-12671
URL: https://issues.apache.org/jira/browse/FLINK-12671
Project: Flink
Issue Type: Sub-task
Reporter: Xu Yang
Assignee: Xu Yang
Summarizer provides summary statistics for a Table. Users can easily get the total count and the basic column-wise metrics: max, min, mean, variance, standardDeviation, normL1, normL2, the number of missing values, and the number of valid values.
SparkML has the same functionality: http://spark.apache.org/docs/latest/ml-statistics.html#summarizer
Example:
Table input = …
TableSummary summary = new Summarizer(input).collectResult();
System.out.println(summary.mean("age")); // print the mean of the column named "age"
System.out.println(summary);
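
To make the listed column-wise metrics concrete, here is a minimal plain-Java sketch that computes them for a single numeric column in one pass. It is not the proposed Summarizer and does not use any Flink API: the class ColumnSummary and its of() method are hypothetical names introduced only for illustration. It assumes that a missing value is represented as null and that variance means the sample variance (n - 1 divisor); the issue does not specify either point.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper (not part of Flink): one-pass summary of one numeric column. */
final class ColumnSummary {
    final long count;        // total number of rows, including missing values
    final long numValid;     // rows with a non-null value
    final long numMissing;   // rows with a null value
    final double max, min, mean, variance, standardDeviation, normL1, normL2;

    private ColumnSummary(long count, long numValid, double max, double min,
                          double mean, double variance, double normL1, double normL2) {
        this.count = count;
        this.numValid = numValid;
        this.numMissing = count - numValid;
        this.max = max;
        this.min = min;
        this.mean = mean;
        this.variance = variance;
        this.standardDeviation = Math.sqrt(variance);
        this.normL1 = normL1;
        this.normL2 = normL2;
    }

    /** Assumption: null marks a missing value; variance uses the n - 1 divisor. */
    static ColumnSummary of(List<Double> column) {
        long valid = 0;
        double mean = 0.0, m2 = 0.0;   // Welford running mean and sum of squared deviations
        double l1 = 0.0, sumSq = 0.0;
        double max = Double.NEGATIVE_INFINITY, min = Double.POSITIVE_INFINITY;
        for (Double v : column) {
            if (v == null) {
                continue;              // missing values only contribute to count
            }
            valid++;
            double delta = v - mean;
            mean += delta / valid;
            m2 += delta * (v - mean);
            l1 += Math.abs(v);
            sumSq += v * v;
            max = Math.max(max, v);
            min = Math.min(min, v);
        }
        if (valid == 0) {              // no valid values: value metrics are undefined
            return new ColumnSummary(column.size(), 0, Double.NaN, Double.NaN,
                                     Double.NaN, Double.NaN, 0.0, 0.0);
        }
        // Assumption: the variance of a single valid value is reported as 0.
        double variance = valid > 1 ? m2 / (valid - 1) : 0.0;
        return new ColumnSummary(column.size(), valid, max, min,
                                 mean, variance, l1, Math.sqrt(sumSq));
    }

    public static void main(String[] args) {
        ColumnSummary age = ColumnSummary.of(Arrays.asList(1.0, null, 3.0, 5.0));
        System.out.println("count=" + age.count + " valid=" + age.numValid
                + " missing=" + age.numMissing);
        System.out.println("mean=" + age.mean + " variance=" + age.variance
                + " normL1=" + age.normL1);
    }
}
```

A distributed implementation would additionally need to merge partial summaries from parallel tasks; this sketch leaves that out.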