[jira] [Created] (FLINK-11899) Introduce VectorizedColumnRowInputParquetFormat for blink runtime

Shang Yuanchun (Jira)
Jingsong Lee created FLINK-11899:
------------------------------------

             Summary: Introduce VectorizedColumnRowInputParquetFormat for blink runtime
                 Key: FLINK-11899
                 URL: https://issues.apache.org/jira/browse/FLINK-11899
             Project: Flink
          Issue Type: New Feature
            Reporter: Jingsong Lee
            Assignee: Jingsong Lee


VectorizedColumnRowInputParquetFormat is introduced to read Parquet data in batches.

Instead of materializing every field when a row is returned, the format uses the BaseRow abstraction to return a row-like view over the columnar batch, so a field is only read when it is actually accessed.

This should greatly improve scenarios with downstream filters, because the remaining fields of rows that are filtered out never need to be accessed.
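
To illustrate the idea, here is a minimal Java sketch of a row-like view over a columnar batch. It is not the proposed implementation and does not use Flink's BaseRow or any Parquet reader: the names IntColumn, Batch, RowView and filterAndProject are hypothetical, and the read counter exists only to make the saved field accesses visible.

```java
import java.util.ArrayList;
import java.util.List;

public class ColumnarRowSketch {

    /** Hypothetical int column vector; counts reads so the effect is observable. */
    static final class IntColumn {
        private final int[] values;
        int reads = 0;

        IntColumn(int... values) {
            this.values = values;
        }

        int get(int rowId) {
            reads++;
            return values[rowId];
        }
    }

    /** Hypothetical batch: a set of column vectors of equal length. */
    static final class Batch {
        final IntColumn[] columns;
        final int numRows;

        Batch(int numRows, IntColumn... columns) {
            this.numRows = numRows;
            this.columns = columns;
        }
    }

    /** Row-like view: holds only a batch reference and a row index, copies nothing. */
    static final class RowView {
        private final Batch batch;
        private int rowId;

        RowView(Batch batch) {
            this.batch = batch;
        }

        /** Re-points the same view object at another row of the batch. */
        RowView pointTo(int rowId) {
            this.rowId = rowId;
            return this;
        }

        /** Reads a field from its column vector only when asked. */
        int getInt(int ordinal) {
            return batch.columns[ordinal].get(rowId);
        }
    }

    /** Keeps rows whose filterCol value exceeds threshold and returns their projectCol value. */
    static List<Integer> filterAndProject(Batch batch, int filterCol, int threshold, int projectCol) {
        List<Integer> out = new ArrayList<>();
        RowView row = new RowView(batch);
        for (int i = 0; i < batch.numRows; i++) {
            row.pointTo(i);
            if (row.getInt(filterCol) > threshold) {
                // Only surviving rows touch the projected column.
                out.add(row.getInt(projectCol));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        IntColumn a = new IntColumn(1, 5, 2, 9);
        IntColumn b = new IntColumn(10, 20, 30, 40);
        IntColumn c = new IntColumn(7, 7, 7, 7);
        Batch batch = new Batch(4, a, b, c);

        System.out.println(filterAndProject(batch, 0, 4, 1)); // prints [20, 40]
        System.out.println(a.reads + " " + b.reads + " " + c.reads); // prints 4 2 0
    }
}
```

In this sketch the filter column is read once per row, the projected column only for the two rows that pass the filter, and the unused column not at all, which is the saving the description above refers to.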



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)