[jira] [Created] (FLINK-18828) Terminate jobmanager process with zero exit code to avoid unexpected restarting by K8s

Shang Yuanchun (Jira)
Yang Wang created FLINK-18828:
---------------------------------

             Summary: Terminate jobmanager process with zero exit code to avoid unexpected restarting by K8s
                 Key: FLINK-18828
                 URL: https://issues.apache.org/jira/browse/FLINK-18828
             Project: Flink
          Issue Type: Improvement
          Components: Runtime / Coordination
    Affects Versions: 1.11.1, 1.10.1, 1.12.0
            Reporter: Yang Wang
             Fix For: 1.10.2, 1.12.0, 1.11.2


Currently, the Flink jobmanager process terminates with a non-zero exit code when the job reaches {{ApplicationStatus.FAILED}}. This is not ideal in a K8s deployment, because a non-zero exit code causes K8s to restart the process unexpectedly. Also, from the framework's perspective, a FAILED job does not mean that Flink itself has failed, so the exit code could still be 0.

> Note:
This is a special case for standalone deployments on K8s. For plain standalone, Yarn, Mesos, and native K8s deployments, terminating with a non-zero exit code is harmless, and a non-zero exit code can even help to check the job result quickly.
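
To make the proposal concrete, here is a minimal sketch of the exit-code decision described above. It is not Flink's actual code: the {{JobStatus}} enum, the {{exitCodeFor}} method, the {{restartsOnNonZeroExit}} flag, and the non-zero value 1 are all hypothetical names and values chosen for illustration.

```java
public class ExitCodeSketch {

    // Hypothetical stand-in for the job's final status; not Flink's ApplicationStatus.
    enum JobStatus { SUCCEEDED, FAILED }

    // Illustrative non-zero code only; the value Flink really uses is not assumed here.
    static final int JOB_FAILED_EXIT_CODE = 1;

    /**
     * Picks the process exit code for a finished job.
     *
     * @param status                final status of the job
     * @param restartsOnNonZeroExit hypothetical flag: true when the environment
     *                              (e.g. standalone on K8s) restarts the process
     *                              after a non-zero exit
     */
    static int exitCodeFor(JobStatus status, boolean restartsOnNonZeroExit) {
        if (status == JobStatus.SUCCEEDED) {
            return 0;
        }
        // The job failed, but the framework itself shut down cleanly.
        // Where a non-zero code triggers a restart, report 0 instead.
        return restartsOnNonZeroExit ? 0 : JOB_FAILED_EXIT_CODE;
    }

    public static void main(String[] args) {
        // Standalone on K8s: a failed job should not trigger a restart.
        System.out.println(exitCodeFor(JobStatus.FAILED, true));
        // Other deployments: the non-zero code stays available as a quick job-result check.
        System.out.println(exitCodeFor(JobStatus.FAILED, false));
    }
}
```

Under these assumptions, only the failed-job case on a restart-prone deployment changes behavior; everywhere else the non-zero code remains available for checking the job result.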



--
This message was sent by Atlassian Jira
(v8.3.4#803005)