[jira] [Created] (FLINK-22767) Optimize the initialization of ExecutionSlotSharingGroupBuilder

classic Classic list List threaded Threaded
1 message Options
Reply | Threaded
Open this post in threaded view
|

[jira] [Created] (FLINK-22767) Optimize the initialization of ExecutionSlotSharingGroupBuilder

Shang Yuanchun (Jira)
Zhilong Hong created FLINK-22767:
------------------------------------

             Summary: Optimize the initialization of ExecutionSlotSharingGroupBuilder
                 Key: FLINK-22767
                 URL: https://issues.apache.org/jira/browse/FLINK-22767
             Project: Flink
          Issue Type: Sub-task
          Components: Runtime / Coordination
    Affects Versions: 1.13.0
            Reporter: Zhilong Hong


Based on the scheduler benchmark introduced in FLINK-21731, we find that during the initialization of ExecutionSlotSharingGroupBuilder, there's a procedure that has O(N^2) complexity: {{ExecutionSlotSharingGroupBuilder#tryFindAvailableProducerExecutionSlotSharingGroupFor}}. This initialization happens during the initialization of LocalInputPreferredSlotSharingStrategy. 

The original implementation is: 
{code:java}
for all SchedulingExecutionVertex in DefaultScheduler:
  for all consumed SchedulingResultPartition of the SchedulingExecutionVertex:
    get the result partition's producer vertex and determine the ExecutionSlotSharingGroup where the producer vertex locates is available for current vertex{code}
This procedure has O(N^2) complexity.

It's obvious that the result partitions in the same ConsumedPartitionGroup have the same producer vertex. So we can just iterate over the ConsumedPartitionGroups instead of all the consumed partitions. This will decrease the complexity from O(N^2) to O(N).

The optimization of this procedure will speed up the initialization of DefaultScheduler. It will accelerate the submission of a new job, especially for OLAP jobs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)