[jira] [Created] (FLINK-18228) Release pending pods/containers timely when pending slots changed


Shang Yuanchun (Jira)
Yang Wang created FLINK-18228:
---------------------------------

             Summary: Release pending pods/containers timely when pending slots changed
                 Key: FLINK-18228
                 URL: https://issues.apache.org/jira/browse/FLINK-18228
             Project: Flink
          Issue Type: Improvement
          Components: Deployment / Kubernetes, Deployment / YARN, Runtime / Coordination
    Affects Versions: 1.12.0
            Reporter: Yang Wang


Currently, when we deploy a session cluster on YARN/K8s and submit a job to the existing cluster, some pending pods/containers may be requested because there are not enough resources. Even if the job then fails with a slot allocation timeout or is canceled, the pending pods/containers remain. They can only be released via the TaskManager idle timeout, and only after they have been allocated and launched.

 

The way pending pods/containers are released could be improved. Once the pending slots change in the {{SlotManager}}, it could notify the {{ActiveResourceManager}} to take corresponding actions (e.g. release pending pods that are no longer needed). This would help a lot when the cluster is small and does not have many available resources.
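
To make the proposal concrete, here is a rough sketch of the notification flow. It is only an illustration under assumptions: {{PendingSlotListener}}, {{ResourceManagerStub}} and all of their methods are hypothetical names, not existing Flink classes or APIs, and the policy of releasing the most recently requested workers first is an arbitrary choice.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch: none of these types are actual Flink classes.
final class PendingWorkerSketch {

    /** Assumed callback that a slot manager would invoke whenever its pending slot count changes. */
    interface PendingSlotListener {
        void onPendingSlotsChanged(int pendingSlots);
    }

    /** Stand-in for a resource manager tracking workers that were requested but not yet allocated. */
    static final class ResourceManagerStub implements PendingSlotListener {
        private final int slotsPerWorker;
        private final Deque<String> pendingWorkers = new ArrayDeque<>();
        private final List<String> releasedWorkers = new ArrayList<>();
        private int nextWorkerId = 0;

        ResourceManagerStub(int slotsPerWorker) {
            if (slotsPerWorker <= 0) {
                throw new IllegalArgumentException("slotsPerWorker must be positive");
            }
            this.slotsPerWorker = slotsPerWorker;
        }

        /** Records a new pending pod/container request. */
        void requestWorker() {
            pendingWorkers.addLast("worker-" + nextWorkerId++);
        }

        /** Releases pending workers that exceed what the remaining pending slots require. */
        @Override
        public void onPendingSlotsChanged(int pendingSlots) {
            int slots = Math.max(0, pendingSlots);
            // Round up: a partially used worker is still needed.
            int neededWorkers = (slots + slotsPerWorker - 1) / slotsPerWorker;
            while (pendingWorkers.size() > neededWorkers) {
                // A real implementation would cancel the pod/container request here.
                releasedWorkers.add(pendingWorkers.removeLast());
            }
        }

        int pendingWorkerCount() {
            return pendingWorkers.size();
        }

        List<String> releasedWorkers() {
            return releasedWorkers;
        }
    }
}
```

With 2 slots per worker and 3 pending workers, a change to 3 pending slots would release one worker, and a change to 0 pending slots (e.g. the job was canceled) would release the remaining two without waiting for the idle timeout.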



--
This message was sent by Atlassian Jira
(v8.3.4#803005)