[jira] [Created] (FLINK-8950) "Materialize" Tables to avoid recomputation.


Shang Yuanchun (Jira)
Fabian Hueske created FLINK-8950:
------------------------------------

             Summary: "Materialize" Tables to avoid recomputation.
                 Key: FLINK-8950
                 URL: https://issues.apache.org/jira/browse/FLINK-8950
             Project: Flink
          Issue Type: New Feature
          Components: Table API & SQL
    Affects Versions: 1.5.0
            Reporter: Fabian Hueske


Currently, {{Table}} objects of the Table API / SQL are treated like virtual views, i.e., all relational operators that have been applied to them are only recorded, and they are translated when a {{Table}} is emitted to a {{TableSink}} or converted into a {{DataSet}} or {{DataStream}}.

If a {{Table}} is accessed twice, the (sub-)query that it represents is translated twice into a {{DataSet}} or {{DataStream}} program and hence also executed twice, which is inefficient. Currently, the only way to avoid this is to convert the {{Table}} into a {{DataSet}} or {{DataStream}}, which causes the optimizer to generate a plan, and then to register the result back as a new {{Table}}.

We should offer a method to internally "materialize" a {{Table}} object, i.e., to optimize it, generate a plan, and register the plan as an internal table. All queries / operations that are evaluated on the materialized {{Table}} will start from the same {{DataSet}} or {{DataStream}}, so that the shared (sub-)query is not computed multiple times.
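
To illustrate the intended semantics, here is a minimal sketch in plain Java that does not use any Flink classes. {{LazyTable}}, its {{materialize()}} method, and {{MaterializeSketch}} are hypothetical names invented for this example; they only model the "virtual view" behavior and the proposed caching, not how Flink would implement it.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.stream.Collectors;

// Toy stand-in for a Table (NOT Flink's class): a recorded query
// that is only evaluated when a result is requested.
final class LazyTable<T> {
    private final Supplier<List<T>> plan;

    LazyTable(Supplier<List<T>> plan) {
        this.plan = plan;
    }

    // Records one more operator on top of the existing plan; nothing runs yet.
    <R> LazyTable<R> map(Function<T, R> f) {
        return new LazyTable<>(
            () -> plan.get().stream().map(f).collect(Collectors.toList()));
    }

    // Evaluates the whole recorded plan, comparable to emitting to a sink.
    List<T> collect() {
        return plan.get();
    }

    // Hypothetical materialize(): the upstream plan is evaluated at most once,
    // and every later query starts from the cached result.
    LazyTable<T> materialize() {
        final Supplier<List<T>> upstream = plan;
        return new LazyTable<>(new Supplier<List<T>>() {
            private List<T> cached;

            @Override
            public synchronized List<T> get() {
                if (cached == null) {
                    cached = upstream.get();
                }
                return cached;
            }
        });
    }
}

final class MaterializeSketch {
    // Returns how often the shared (sub-)query ran for two downstream queries.
    static int runs(boolean materialize) {
        AtomicInteger evaluations = new AtomicInteger();
        LazyTable<Integer> base = new LazyTable<>(() -> {
            evaluations.incrementAndGet(); // count each evaluation of the sub-query
            return Arrays.asList(1, 2, 3);
        });
        LazyTable<Integer> shared = materialize ? base.materialize() : base;
        shared.map(x -> x * 2).collect(); // first access
        shared.map(x -> x + 1).collect(); // second access
        return evaluations.get();
    }

    public static void main(String[] args) {
        System.out.println("without materialize: " + runs(false));
        System.out.println("with materialize:    " + runs(true));
    }
}
```

In this toy model the shared sub-query is evaluated once per access without {{materialize()}} and only once in total with it. A real implementation would register an optimized plan rather than cache results in memory, which matters in particular for unbounded {{DataStream}} inputs.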



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)