[jira] [Created] (FLINK-12796) Introduce BaseArray and BaseMap to reduce conversion overhead to blink


Shang Yuanchun (Jira)
Jingsong Lee created FLINK-12796:
------------------------------------

             Summary: Introduce BaseArray and BaseMap to reduce conversion overhead to blink
                 Key: FLINK-12796
                 URL: https://issues.apache.org/jira/browse/FLINK-12796
             Project: Flink
          Issue Type: Bug
          Components: Table SQL / Planner
            Reporter: Jingsong Lee
            Assignee: Jingsong Lee


Currently, in Flink's internal data format, the only array type is BinaryArray and the only map type is BinaryMap. If a user writes a UDAF that takes arrays as parameters and returns arrays, this leads to frequent conversion between Java arrays and BinaryArrays (each conversion copies the entire array), which is very time-consuming.

To avoid copying during conversion, BaseArray and BaseMap are introduced as internal formats.

BaseArray is the parent of GenericArray and BinaryArray, providing various read and write operations on an array.

GenericArray is a wrapper class that internally wraps a Java array. The wrapped array stores elements in the internal data format.

Conversion can be avoided when the element type is a primitive type or a type whose internal and external formats are the same. (For details, see DataFormatConverters.)

In our benchmark, the performance of UDAFs using primitive arrays improved by 10 times.
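To illustrate the idea, here is a minimal sketch of a common parent type with a wrapping implementation and a copying one. It is not the Flink code: the names SketchBaseArray, SketchGenericArray and SketchBinaryArray are hypothetical, the interface is reduced to two methods, and the byte layout is a simplification that only assumes the binary form keeps its own packed copy of the elements.

```java
import java.nio.ByteBuffer;

public class ArrayFormatsSketch {

    /** Hypothetical common parent: read access shared by both representations. */
    interface SketchBaseArray {
        int numElements();
        long getLong(int pos);
    }

    /** Wraps an existing long[] as-is; no element is copied. */
    static final class SketchGenericArray implements SketchBaseArray {
        private final long[] values;

        SketchGenericArray(long[] values) {
            this.values = values;
        }

        public int numElements() {
            return values.length;
        }

        public long getLong(int pos) {
            return values[pos];
        }
    }

    /** Keeps elements in a packed byte buffer, so construction copies every element. */
    static final class SketchBinaryArray implements SketchBaseArray {
        private final ByteBuffer buffer;
        private final int size;

        SketchBinaryArray(long[] values) {
            this.size = values.length;
            this.buffer = ByteBuffer.allocate(size * Long.BYTES);
            for (long v : values) {
                buffer.putLong(v); // one write per element: the full-array copy
            }
        }

        public int numElements() {
            return size;
        }

        public long getLong(int pos) {
            return buffer.getLong(pos * Long.BYTES); // absolute read
        }
    }

    /** Code written against the parent type works with either representation. */
    static long sum(SketchBaseArray array) {
        long total = 0;
        for (int i = 0; i < array.numElements(); i++) {
            total += array.getLong(i);
        }
        return total;
    }

    public static void main(String[] args) {
        long[] data = {1L, 2L, 3L};
        SketchBaseArray wrapped = new SketchGenericArray(data);
        SketchBaseArray copied = new SketchBinaryArray(data);

        data[0] = 10L; // visible through the wrapper, not through the copy

        System.out.println(sum(wrapped)); // prints 15
        System.out.println(sum(copied));  // prints 6
    }
}
```

The wrapper shares storage with the caller's array, which is why handing a primitive array to a function through such a wrapper costs nothing per element, while the packed form pays one write per element on every conversion.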



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)