Zhu Zhu created FLINK-19994:
-------------------------------
Summary: All vertices in a DataSet iteration job will be eagerly scheduled
Key: FLINK-19994
URL: https://issues.apache.org/jira/browse/FLINK-19994
Project: Flink
Issue Type: Bug
Components: Runtime / Coordination
Affects Versions: 1.12.0
Reporter: Zhu Zhu
Fix For: 1.12.0
After switching to pipelined region scheduling, all vertices in a DataSet iteration job are eagerly scheduled. As a result, consumers of BLOCKING results can be deployed before the result is finished, which wastes resources. This is because all vertices are put into one pipelined region if the job contains a {{ColocationConstraint}}; see [PipelinedRegionComputeUtil|https://github.com/apache/flink/blob/c0f382f5f0072441ef8933f6993f1c34168004d6/flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/failover/flip1/PipelinedRegionComputeUtil.java#L52].
IIUC, this {{makeAllOneRegion()}} behavior was introduced to ensure that the co-located iteration head and tail are restarted together in pipelined region failover. However, given that edges within an iteration are always PIPELINED ([ref|https://github.com/apache/flink/blob/0523ef6451a93da450c6bdf5dd4757c3702f3962/flink-optimizer/src/main/java/org/apache/flink/optimizer/plantranslate/JobGraphGenerator.java#L1188]), the co-located iteration head and tail will always be in the same region. So I think we can drop the {{PipelinedRegionComputeUtil#makeAllOneRegion()}} code path and build regions in the same way regardless of whether there are co-location constraints.
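To illustrate the argument, here is a minimal, self-contained sketch of region building on a toy topology. It is not Flink's actual {{PipelinedRegionComputeUtil}} code: the class {{RegionSketch}}, the {{Edge}} type, the vertex names and the union-find approach are all assumptions made for illustration. It only shows that if vertices connected by PIPELINED edges are merged into one region, an iteration head and tail joined by a PIPELINED path end up in the same region without any special handling, while the BLOCKING producer stays in its own region.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only; none of these types are Flink classes. */
public class RegionSketch {

    enum ResultType { PIPELINED, BLOCKING }

    /** Hypothetical edge between two named vertices. */
    static final class Edge {
        final String source;
        final String target;
        final ResultType type;

        Edge(String source, String target, ResultType type) {
            this.source = source;
            this.target = target;
            this.type = type;
        }
    }

    /** Union-find lookup with path compression. */
    static String find(Map<String, String> parent, String vertex) {
        String p = parent.get(vertex);
        if (p.equals(vertex)) {
            return vertex;
        }
        String root = find(parent, p);
        parent.put(vertex, root);
        return root;
    }

    /** Maps each vertex to a representative vertex of its region. */
    static Map<String, String> computeRegions(List<String> vertices, List<Edge> edges) {
        Map<String, String> parent = new HashMap<>();
        for (String v : vertices) {
            parent.put(v, v); // every vertex starts in its own region
        }
        for (Edge e : edges) {
            // Only PIPELINED edges merge regions; BLOCKING edges are region boundaries.
            if (e.type == ResultType.PIPELINED) {
                parent.put(find(parent, e.source), find(parent, e.target));
            }
        }
        Map<String, String> regionOf = new HashMap<>();
        for (String v : vertices) {
            regionOf.put(v, find(parent, v));
        }
        return regionOf;
    }

    /** Toy job: source -BLOCKING-> head -PIPELINED-> body -PIPELINED-> tail -BLOCKING-> sink. */
    static Map<String, String> exampleJob() {
        List<String> vertices = List.of("source", "head", "body", "tail", "sink");
        List<Edge> edges = new ArrayList<>();
        edges.add(new Edge("source", "head", ResultType.BLOCKING));
        edges.add(new Edge("head", "body", ResultType.PIPELINED));
        edges.add(new Edge("body", "tail", ResultType.PIPELINED));
        edges.add(new Edge("tail", "sink", ResultType.BLOCKING));
        return computeRegions(vertices, edges);
    }

    public static void main(String[] args) {
        Map<String, String> regionOf = exampleJob();
        System.out.println("head and tail share a region: "
                + regionOf.get("head").equals(regionOf.get("tail")));
        System.out.println("source and head share a region: "
                + regionOf.get("source").equals(regionOf.get("head")));
    }
}
```

In this toy model, forcing all vertices into one region would additionally merge the source and sink with the iteration, which is the eager scheduling of BLOCKING consumers described above.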