Zhu Zhu created FLINK-13056:
-------------------------------
Summary: Optimize region failover performance on calculating vertices to restart
Key: FLINK-13056
URL: https://issues.apache.org/jira/browse/FLINK-13056
Project: Flink
Issue Type: Sub-task
Components: Runtime / Coordination
Affects Versions: 1.9.0
Reporter: Zhu Zhu
Assignee: Zhu Zhu
Currently, some region boundary structures are recalculated on every region failover. This calculation can be expensive, since its cost grows with the number of execution edges.
We tested this on a sample case with 8,000 vertices and 16,000,000 edges, where calculating the vertices to restart took ~2.0s.
(More details: https://docs.google.com/document/d/197Ou-01h2obvxq8viKqg4FnOnsykOEKxk3r5WrVBPuA/edit?usp=sharing)
We therefore propose caching the region boundary structures to improve region failover performance.
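
To illustrate the idea, here is a minimal sketch of caching per-region boundary information, assuming a static topology. It is not the Flink implementation: the class RegionFailoverSketch and everything in it (Region, addRegion, addEdge, verticesToRestart, edgeScans) are hypothetical names made up for this example, and the restart rule is simplified to "the failed region plus all regions downstream of it". The point is only that the edge scan for a region runs once and later failovers reuse the cached result.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; none of these names are Flink classes.
class RegionFailoverSketch {

    /** A failover region: a group of vertices that restart together. */
    static final class Region {
        final List<Integer> vertices = new ArrayList<>();
    }

    private final Map<Integer, Region> regionOf = new HashMap<>();
    private final Map<Integer, List<Integer>> consumers = new HashMap<>();

    // Cache: region -> regions consuming its output. Valid while the topology is unchanged.
    private final Map<Region, Set<Region>> downstreamCache = new HashMap<>();

    // Counts how many times a region's edges were actually scanned.
    int edgeScans = 0;

    Region addRegion(int... vertexIds) {
        Region region = new Region();
        for (int v : vertexIds) {
            region.vertices.add(v);
            regionOf.put(v, region);
        }
        return region;
    }

    void addEdge(int producer, int consumer) {
        consumers.computeIfAbsent(producer, k -> new ArrayList<>()).add(consumer);
        downstreamCache.clear(); // topology changed, so cached boundaries are stale
    }

    /** Scans all outgoing edges of the region once, then serves from the cache. */
    private Set<Region> downstreamRegions(Region region) {
        Set<Region> cached = downstreamCache.get(region);
        if (cached != null) {
            return cached;
        }
        edgeScans++;
        Set<Region> result = new LinkedHashSet<>();
        for (int v : region.vertices) {
            for (int c : consumers.getOrDefault(v, Collections.<Integer>emptyList())) {
                Region target = regionOf.get(c);
                if (target != null && target != region) {
                    result.add(target);
                }
            }
        }
        downstreamCache.put(region, result);
        return result;
    }

    /** Simplified rule: restart the failed region and every region downstream of it. */
    Set<Integer> verticesToRestart(int failedVertex) {
        Set<Integer> toRestart = new HashSet<>();
        Region start = regionOf.get(failedVertex);
        if (start == null) {
            return toRestart;
        }
        Set<Region> visited = new HashSet<>();
        Deque<Region> queue = new ArrayDeque<>();
        visited.add(start);
        queue.add(start);
        while (!queue.isEmpty()) {
            Region region = queue.poll();
            toRestart.addAll(region.vertices);
            for (Region next : downstreamRegions(region)) {
                if (visited.add(next)) {
                    queue.add(next);
                }
            }
        }
        return toRestart;
    }
}
```

In this sketch, calling verticesToRestart twice for the same failed vertex scans the edges only during the first call. The second call walks the cached region-to-region sets, so its cost depends on the number of regions involved rather than on the execution edge count.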
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)