[jira] [Created] (FLINK-15031) Calculate required shuffle memory before allocating slots in resources specified cases


Shang Yuanchun (Jira)
Zhu Zhu created FLINK-15031:
-------------------------------

             Summary: Calculate required shuffle memory before allocating slots in resources specified cases
                 Key: FLINK-15031
                 URL: https://issues.apache.org/jira/browse/FLINK-15031
             Project: Flink
          Issue Type: Sub-task
          Components: Runtime / Coordination
    Affects Versions: 1.10.0
            Reporter: Zhu Zhu
             Fix For: 1.10.0


When resources are specified, we expect resource usage to follow a declare-then-use pattern: no resource-related error should occur as long as nothing uses more resources than it declared. This ensures that a job does not fail merely because resources are limited.

Shuffle memory is the last missing piece for this goal at the moment. Tasks need a minimum number of network buffers to work. *Currently, a task can be deployed to a TM with insufficient network buffers and then fail on launch.* This can cause unnecessary failures and may even leave a job hanging forever, failing repeatedly as its tasks are deployed to a TM with too few network buffers.

To avoid that, we should calculate the network memory required by a task/SlotSharingGroup before allocating a slot for it with the {{ResourceProfile}}.
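
As a rough illustration of the calculation proposed above, the sketch below estimates the minimum shuffle memory of a task from its input gates and result partitions, then checks it against what a slot offers. Everything in it is an assumption made for illustration: the class {{ShuffleMemorySketch}}, its method names, the per-gate and per-partition formulas, and the constants (2 exclusive buffers per channel, 8 floating buffers per gate, 32 KB per buffer) are hypothetical and are not taken from this issue or from Flink's actual implementation.

```java
import java.util.List;

/** Hypothetical sketch only; not Flink's actual implementation. */
public final class ShuffleMemorySketch {

    // Assumed values, chosen for illustration only.
    public static final int BUFFERS_PER_CHANNEL = 2;
    public static final int FLOATING_BUFFERS_PER_GATE = 8;
    public static final long BUFFER_SIZE_BYTES = 32L * 1024L;

    private ShuffleMemorySketch() {}

    /**
     * Estimates the minimum number of network buffers for one task.
     * Assumed formula: each input gate needs exclusive buffers for every
     * channel plus a floating pool; each result partition needs one buffer
     * per subpartition plus one extra.
     */
    public static long requiredBuffers(
            List<Integer> channelsPerInputGate,
            List<Integer> subpartitionsPerResultPartition) {
        long buffers = 0;
        for (int channels : channelsPerInputGate) {
            buffers += (long) channels * BUFFERS_PER_CHANNEL + FLOATING_BUFFERS_PER_GATE;
        }
        for (int subpartitions : subpartitionsPerResultPartition) {
            buffers += subpartitions + 1L;
        }
        return buffers;
    }

    /** Converts the buffer estimate into bytes of shuffle memory. */
    public static long requiredBytes(
            List<Integer> channelsPerInputGate,
            List<Integer> subpartitionsPerResultPartition) {
        return requiredBuffers(channelsPerInputGate, subpartitionsPerResultPartition)
                * BUFFER_SIZE_BYTES;
    }

    /** Returns true if a slot offering the given shuffle memory can host the task. */
    public static boolean fits(long requiredBytes, long availableBytes) {
        return requiredBytes <= availableBytes;
    }
}
```

Under these assumptions, a task with one input gate of 4 channels and one result partition of 4 subpartitions would need (4 * 2 + 8) + (4 + 1) = 21 buffers. Running such a check before slot allocation, rather than at task launch, is what would let the scheduler skip a TM that has too few network buffers.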





--
This message was sent by Atlassian Jira
(v8.3.4#803005)