[jira] [Created] (FLINK-11149) Flink will request too more containers than it actually needs

Shang Yuanchun (Jira)
Fan Xinpu created FLINK-11149:
---------------------------------

             Summary: Flink will request too more containers than it actually needs
                 Key: FLINK-11149
                 URL: https://issues.apache.org/jira/browse/FLINK-11149
             Project: Flink
          Issue Type: Improvement
          Components: YARN
    Affects Versions: 1.7.0
            Reporter: Fan Xinpu


  As is known, Flink requests a new container when it is notified that an allocated container has completed. For example, if one container fails, Flink intends to request one replacement container from the RM, but it actually requests n+1 containers, where n is the number of containers requested so far since the cluster was created. This is not graceful.

  When requesting a container, Flink sends a ContainerRequest to the RM through the AMRMClient. The AMRMClient stores the ContainerRequest internally and expects it to be removed later, but Flink never removes it, so the stored ContainerRequests accumulate one by one to an unexpectedly large number.
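
  The accumulation described above can be illustrated with a small toy model. This is only a sketch: ToyRequestTracker and AccumulationDemo are hypothetical names, not the real Hadoop AMRMClient or Flink classes, and the model assumes that every request still stored on the client side is included in the next ask to the RM.

```java
import java.util.ArrayList;
import java.util.List;

// Toy stand-in for the client-side bookkeeping described above.
// This is NOT the real Hadoop AMRMClient; all names here are hypothetical.
class ToyRequestTracker {
    // Requests remembered by the client until someone removes them.
    private final List<String> pending = new ArrayList<>();

    // Register one more container request and return how many containers
    // the next ask to the RM would cover (everything still pending).
    int addRequestAndGetAsk(String request) {
        pending.add(request);
        return pending.size();
    }
}

class AccumulationDemo {
    // Simulate `initial` startup requests that are never removed,
    // followed by one replacement request after a container failure.
    static int askAfterOneFailure(int initial) {
        ToyRequestTracker tracker = new ToyRequestTracker();
        for (int i = 0; i < initial; i++) {
            tracker.addRequestAndGetAsk("startup-" + i);
        }
        return tracker.addRequestAndGetAsk("replacement");
    }

    public static void main(String[] args) {
        // With 100 stale requests, one replacement becomes an ask for 101.
        System.out.println(askAfterOneFailure(100));
    }
}
```

  In this model, a cluster that started with 100 containers asks for 101 when it only needs one replacement, which is the n+1 pattern reported here.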

  In our environment, a cluster initially allocated 100 containers. Later, when it requested one container from the RM, the RM returned more than 2000 containers, because the request actually carried more than 2000 ContainerRequests. Although Flink returns the excess containers, this behavior wastes time and resources on YARN.

  So perhaps Flink could remove the ContainerRequest once the request has been sent to the RM; Flink would then receive exactly the number of containers it explicitly asked for.
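
  A minimal sketch of the proposed behavior, again as a toy model rather than actual Flink or YARN code: ToyPendingRequests and RemovalDemo are hypothetical names, and the remove call only stands in for whatever removal the real client offers (the YARN AMRMClient has a removeContainerRequest method that would presumably be the counterpart). The exact point at which Flink should remove a request is an assumption here.

```java
import java.util.ArrayList;
import java.util.List;

// Toy bookkeeping with an explicit removal step.
// Hypothetical names; this is not the real AMRMClient.
class ToyPendingRequests {
    private final List<String> pending = new ArrayList<>();

    void add(String request) {
        pending.add(request);
    }

    // Stand-in for removing a request once it has been handled.
    boolean remove(String request) {
        return pending.remove(request);
    }

    // Number of containers the next ask to the RM would cover.
    int outstanding() {
        return pending.size();
    }
}

class RemovalDemo {
    // Simulate `initial` startup requests that are each removed after
    // being handled, followed by one replacement request.
    static int askAfterOneFailure(int initial) {
        ToyPendingRequests requests = new ToyPendingRequests();
        for (int i = 0; i < initial; i++) {
            String request = "startup-" + i;
            requests.add(request);
            requests.remove(request); // drop it so it is not asked for again
        }
        requests.add("replacement");
        return requests.outstanding();
    }

    public static void main(String[] args) {
        // Only the replacement request is still outstanding.
        System.out.println(askAfterOneFailure(100));
    }
}
```

  With removal in place, the toy model asks for a single container after one failure, regardless of how many containers were requested earlier.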



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)