[jira] [Created] (FLINK-19806) Job may try to leave SUSPENDED state in ExecutionGraph#failJob()

Shang Yuanchun (Jira)
Zhu Zhu created FLINK-19806:
-------------------------------

             Summary: Job may try to leave SUSPENDED state in ExecutionGraph#failJob()
                 Key: FLINK-19806
                 URL: https://issues.apache.org/jira/browse/FLINK-19806
             Project: Flink
          Issue Type: Bug
            Reporter: Zhu Zhu
            Assignee: Zhu Zhu


{{SUSPENDED}} is a terminal state: once a job enters it, the job is not supposed to leave it. However, {{ExecutionGraph#failJob()}} does not check for this state and may try to transition a job out of {{SUSPENDED}}. This can cause unexpected errors and may lead to a JobManager (JM) crash.
The problem becomes visible if we rework {{ExecutionGraphSuspendTest}} to be based on {{DefaultScheduler}}.
We should harden the check in {{ExecutionGraph#failJob()}}.
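
To illustrate the kind of hardening meant here, below is a minimal sketch of a guarded fail transition. It is not Flink's actual code: the class {{JobStateGuardSketch}}, its reduced {{JobStatus}} enum, and the boolean return value of {{failJob()}} are all assumptions made up for illustration. The only point it shows is that a fail request arriving after the job has reached a terminal state such as {{SUSPENDED}} is ignored instead of triggering a state transition.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical, simplified stand-in for a job state machine.
// This is NOT Flink's ExecutionGraph; names and signatures are assumptions.
final class JobStateGuardSketch {

    // Reduced set of states, just enough to show the guard.
    enum JobStatus {
        RUNNING(false),
        FAILING(false),
        SUSPENDED(true),
        FAILED(true);

        private final boolean terminal;

        JobStatus(boolean terminal) {
            this.terminal = terminal;
        }

        boolean isTerminal() {
            return terminal;
        }
    }

    private final AtomicReference<JobStatus> state =
            new AtomicReference<>(JobStatus.RUNNING);

    JobStatus getState() {
        return state.get();
    }

    // Moves the job to SUSPENDED unless it is already in a terminal state.
    void suspend() {
        while (true) {
            JobStatus current = state.get();
            if (current.isTerminal()) {
                return;
            }
            if (state.compareAndSet(current, JobStatus.SUSPENDED)) {
                return;
            }
        }
    }

    // Returns true if the job was moved to FAILING, false if the request
    // was ignored because the job is already terminal or already failing.
    boolean failJob(Throwable cause) {
        while (true) {
            JobStatus current = state.get();
            // The hardened check: never leave a terminal state such as SUSPENDED.
            if (current.isTerminal() || current == JobStatus.FAILING) {
                return false;
            }
            // Retry if another thread changed the state in between.
            if (state.compareAndSet(current, JobStatus.FAILING)) {
                return true;
            }
        }
    }
}
```

In this sketch, calling {{suspend()}} and then {{failJob(...)}} leaves the state at {{SUSPENDED}}, while calling {{failJob(...)}} on a running job moves it to {{FAILING}}. The compare-and-set loop is one possible way to make the check and the transition atomic; the real fix may be structured differently.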



--
This message was sent by Atlassian Jira
(v8.3.4#803005)