[jira] [Created] (FLINK-11425) Support of “Hash Teams” in Hybrid Hash Join


Shang Yuanchun (Jira)
LiuJi created FLINK-11425:
-----------------------------

             Summary: Support of “Hash Teams” in Hybrid Hash Join
                 Key: FLINK-11425
                 URL: https://issues.apache.org/jira/browse/FLINK-11425
             Project: Flink
          Issue Type: New Feature
          Components: Core, Optimizer
            Reporter: LiuJi


Hybrid Hash Join is already supported in the current version. The join starts operating in memory and gradually spills its contents to disk when memory is insufficient.
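
For readers unfamiliar with the technique, here is a minimal sketch of a hybrid hash join. It is not Flink's implementation: the class name HybridHashJoinSketch, the row layout {key, payload}, the row-count budget that stands in for a memory budget, and the "spill the largest partition" policy are all assumptions made for illustration, and spilling is only simulated with a flag instead of real disk I/O.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; this is not Flink's hash join implementation.
final class HybridHashJoinSketch {

    // Rows are {key, payload}. memoryBudgetRows is a stand-in for a byte budget.
    static List<String> join(List<String[]> build, List<String[]> probe,
                             int numPartitions, int memoryBudgetRows) {
        if (numPartitions < 1 || memoryBudgetRows < 0) {
            throw new IllegalArgumentException("need numPartitions >= 1 and budget >= 0");
        }
        List<List<String[]>> buildParts = new ArrayList<>();
        List<List<String[]>> deferredProbe = new ArrayList<>();
        boolean[] spilled = new boolean[numPartitions]; // true = "on disk" (simulated)
        for (int p = 0; p < numPartitions; p++) {
            buildParts.add(new ArrayList<>());
            deferredProbe.add(new ArrayList<>());
        }

        // Build phase: partition the build side; when the budget is exceeded,
        // mark the largest in-memory partition as spilled (assumed policy).
        int inMemoryRows = 0;
        for (String[] row : build) {
            int p = partitionOf(row[0], numPartitions);
            buildParts.get(p).add(row);
            if (!spilled[p]) inMemoryRows++;
            while (inMemoryRows > memoryBudgetRows) {
                int victim = -1;
                for (int i = 0; i < numPartitions; i++) {
                    if (!spilled[i] && (victim < 0
                            || buildParts.get(i).size() > buildParts.get(victim).size())) {
                        victim = i;
                    }
                }
                spilled[victim] = true;
                inMemoryRows -= buildParts.get(victim).size();
            }
        }

        // Probe phase: join against in-memory partitions immediately and
        // defer probe rows that hash to a spilled partition.
        List<String> out = new ArrayList<>();
        List<Map<String, List<String>>> tables = new ArrayList<>();
        for (int p = 0; p < numPartitions; p++) {
            tables.add(spilled[p] ? null : buildTable(buildParts.get(p)));
        }
        for (String[] row : probe) {
            int p = partitionOf(row[0], numPartitions);
            if (spilled[p]) deferredProbe.get(p).add(row);
            else emitMatches(tables.get(p), row, out);
        }

        // Final phase: process each spilled partition pair on its own.
        for (int p = 0; p < numPartitions; p++) {
            if (!spilled[p]) continue;
            Map<String, List<String>> table = buildTable(buildParts.get(p));
            for (String[] row : deferredProbe.get(p)) emitMatches(table, row, out);
        }
        return out;
    }

    static int partitionOf(String key, int n) {
        return Math.floorMod(key.hashCode(), n);
    }

    static Map<String, List<String>> buildTable(List<String[]> rows) {
        Map<String, List<String>> table = new HashMap<>();
        for (String[] r : rows) {
            table.computeIfAbsent(r[0], k -> new ArrayList<>()).add(r[1]);
        }
        return table;
    }

    // Emits "key:buildPayload:probePayload" for every build row matching the probe row.
    static void emitMatches(Map<String, List<String>> table, String[] probeRow, List<String> out) {
        List<String> matches = table.get(probeRow[0]);
        if (matches == null) return;
        for (String buildPayload : matches) {
            out.add(probeRow[0] + ":" + buildPayload + ":" + probeRow[1]);
        }
    }
}
```

The set of joined rows does not depend on the budget; only the order of emission and the amount of (simulated) spilling change.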

 

The current hash join supports only two inputs, so when a job contains multiple hash joins that share the same join keys, it consumes unnecessary resources (I/O, memory, etc.) because some of the data output by an upstream join may be useless for the downstream hash join.

 

Based on these observations, we want to provide a HashTeamManager that implements a multiway-input hash join by combining several two-way hash joins that have the same join keys. The HashTeamManager manages the relationships among multiple HashTables, improving memory efficiency and reducing I/O operations by joining multiple relations at one time.
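
To make the idea concrete, here is a rough sketch of how a team of inputs that share a join key could be joined partition by partition in one pass. This issue does not specify the HashTeamManager API, so everything below is hypothetical: the class HashTeamSketch, the method joinTeam, and the row layout {key, payload} are invented for illustration, only inner joins are covered, and all data stays in memory.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; not the proposed HashTeamManager and not Flink code.
final class HashTeamSketch {

    // Joins all inputs on their shared key. Rows are {key, payload}.
    // Assumes at least one input and numPartitions >= 1.
    static List<String> joinTeam(List<List<String[]>> inputs, int numPartitions) {
        // Partition every input exactly once with the same function, so that
        // partition p of each input can only match partition p of the others.
        // tables.get(i).get(p) maps key -> payloads of input i in partition p.
        List<List<Map<String, List<String>>>> tables = new ArrayList<>();
        for (List<String[]> input : inputs) {
            List<Map<String, List<String>>> parts = new ArrayList<>();
            for (int p = 0; p < numPartitions; p++) parts.add(new HashMap<>());
            for (String[] row : input) {
                int p = Math.floorMod(row[0].hashCode(), numPartitions);
                parts.get(p).computeIfAbsent(row[0], k -> new ArrayList<>()).add(row[1]);
            }
            tables.add(parts);
        }

        // Join one partition index at a time across the whole team. No
        // intermediate two-way result is materialized or re-partitioned.
        List<String> out = new ArrayList<>();
        for (int p = 0; p < numPartitions; p++) {
            for (String key : tables.get(0).get(p).keySet()) {
                List<String> prefixes = new ArrayList<>();
                prefixes.add(key);
                for (List<Map<String, List<String>>> input : tables) {
                    List<String> payloads =
                            input.get(p).getOrDefault(key, Collections.emptyList());
                    List<String> next = new ArrayList<>();
                    for (String prefix : prefixes) {
                        for (String payload : payloads) {
                            next.add(prefix + ":" + payload);
                        }
                    }
                    prefixes = next; // becomes empty if any input lacks the key
                }
                out.addAll(prefixes);
            }
        }
        return out;
    }
}
```

In this sketch a key that is missing from any team member is dropped before any output row is built, whereas a chain of two-way joins would first produce and forward the partial matches. A real implementation would additionally have to coordinate spilling so that the same partition index is spilled for every member of the team.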



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)