Jingsong Lee created FLINK-12796:
------------------------------------
Summary: Introduce BaseArray and BaseMap to reduce conversion overhead to blink
Key: FLINK-12796
URL: https://issues.apache.org/jira/browse/FLINK-12796
Project: Flink
Issue Type: Bug
Components: Table SQL / Planner
Reporter: Jingsong Lee
Assignee: Jingsong Lee
Currently, in Flink's internal data format, the only array implementation is BinaryArray and the only map implementation is BinaryMap. If a user writes a UDAF that takes arrays as parameters and returns arrays, this leads to frequent conversions between Java arrays and BinaryArrays. Each conversion copies the entire array, which is very time-consuming.
To avoid copying during conversion, BaseArray and BaseMap are introduced as internal formats.
BaseArray is the parent of GenericArray and BinaryArray, providing various read and write operations on an array.
GenericArray is a wrapper class around a Java array. The wrapped array stores elements in the internal data format.
Conversion can be avoided when the element type is a primitive type or a type whose internal and external formats are the same. (For details, see DataFormatConverters.)
In our benchmark, the performance of a UDAF using primitive arrays improved by 10 times.
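
The relationship between BaseArray, GenericArray and BinaryArray described above can be illustrated with a minimal sketch. All names below (SketchArray, SketchGenericArray, SketchBinaryArray, BaseArraySketch) are hypothetical stand-ins and not the actual Flink classes; the byte layout is also simplified and is only meant to show why the binary form needs a full copy while the wrapper form does not.

```java
import java.nio.ByteBuffer;

// Hypothetical stand-in for the common parent (BaseArray in the issue).
interface SketchArray {
    int numElements();
    long getLong(int pos);
}

// Hypothetical stand-in for GenericArray: wraps the Java array, copies nothing.
final class SketchGenericArray implements SketchArray {
    private final long[] values;

    SketchGenericArray(long[] values) {
        this.values = values; // keeps a reference only
    }

    public int numElements() { return values.length; }
    public long getLong(int pos) { return values[pos]; }

    // Converting back to a Java array is also free for primitive elements.
    long[] unwrap() { return values; }
}

// Hypothetical stand-in for BinaryArray: elements live in a serialized buffer.
final class SketchBinaryArray implements SketchArray {
    private final ByteBuffer buffer;
    private final int size;

    private SketchBinaryArray(ByteBuffer buffer, int size) {
        this.buffer = buffer;
        this.size = size;
    }

    // Conversion from a Java array writes every element into the buffer.
    static SketchBinaryArray fromLongArray(long[] values) {
        ByteBuffer buf = ByteBuffer.allocate(values.length * Long.BYTES);
        for (long v : values) {
            buf.putLong(v);
        }
        return new SketchBinaryArray(buf, values.length);
    }

    public int numElements() { return size; }
    public long getLong(int pos) { return buffer.getLong(pos * Long.BYTES); }
}

public class BaseArraySketch {
    // Code written against the parent type works with either implementation.
    static long sum(SketchArray array) {
        long total = 0;
        for (int i = 0; i < array.numElements(); i++) {
            total += array.getLong(i);
        }
        return total;
    }

    public static void main(String[] args) {
        long[] data = {1L, 2L, 3L, 4L};
        SketchArray generic = new SketchGenericArray(data);         // no copy
        SketchArray binary = SketchBinaryArray.fromLongArray(data); // full copy
        System.out.println(sum(generic)); // prints 10
        System.out.println(sum(binary));  // prints 10
    }
}
```

In this sketch a function that receives a wrapped Java array can hand it straight back to user code through unwrap(), whereas the binary form has to be built element by element each time. That per-call copy is the overhead the issue aims to remove for UDAFs.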
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)