[jira] [Created] (FLINK-15169) Errors happen in the scheduling of DefaultScheduler is not shown in WebUI


Shang Yuanchun (Jira)
Zhu Zhu created FLINK-15169:
-------------------------------

             Summary: Errors happen in the scheduling of DefaultScheduler is not shown in WebUI
                 Key: FLINK-15169
                 URL: https://issues.apache.org/jira/browse/FLINK-15169
             Project: Flink
          Issue Type: Bug
          Components: Runtime / Coordination
    Affects Versions: 1.10.0
            Reporter: Zhu Zhu
             Fix For: 1.10.0


The WebUI relies on {{ExecutionGraph#failureInfo}} and {{Execution#failureCause}} to generate error info (via {{JobExceptionsHandler#createJobExceptionsInfo}}).
Errors that occur during scheduling in DefaultScheduler are not recorded in those fields, so they cannot be shown to users in the WebUI (or via REST queries).

To solve this:
1. Global failures should be recorded in {{ExecutionGraph#failureInfo}} via {{ExecutionGraph#initFailureCause}}, which can be exposed as {{SchedulerBase#initFailureCause}}.
2. For task failures, one solution I can think of is to avoid invoking {{DefaultScheduler#handleTaskFailure}} directly on the scheduler's internal failures. Instead, we can introduce {{ExecutionVertexOperations#fail(ExecutionVertex)}} to hand the error to the {{ExecutionVertex}} as a common failure.
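
The two points above could be sketched roughly as follows. This is only an illustration of the idea, not Flink code: every {{Sketch*}} class, the listener wiring, and the method {{onInternalSchedulingError}} are hypothetical stand-ins, and the real signatures (for example of the proposed {{ExecutionVertexOperations#fail}}) may well differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.function.BiConsumer;

// All "Sketch*" types are simplified, hypothetical stand-ins, not Flink's real classes.

/** Stand-in for ExecutionGraph: holds the global failure the WebUI would read. */
final class SketchExecutionGraph {
    private Throwable failureInfo; // plays the role of ExecutionGraph#failureInfo

    void initFailureCause(Throwable cause) {
        this.failureInfo = cause;
    }

    Optional<Throwable> getFailureInfo() {
        return Optional.ofNullable(failureInfo);
    }
}

/** Stand-in for ExecutionVertex/Execution: holds the per-task failure cause. */
final class SketchExecutionVertex {
    private final String name;
    private final BiConsumer<SketchExecutionVertex, Throwable> failureListener;
    private Throwable failureCause; // plays the role of Execution#failureCause

    SketchExecutionVertex(String name,
                          BiConsumer<SketchExecutionVertex, Throwable> failureListener) {
        this.name = name;
        this.failureListener = failureListener;
    }

    /** Common failure path: record the cause first, then notify the scheduler. */
    void fail(Throwable cause) {
        this.failureCause = cause;
        failureListener.accept(this, cause);
    }

    String getName() {
        return name;
    }

    Optional<Throwable> getFailureCause() {
        return Optional.ofNullable(failureCause);
    }
}

/** Stand-in for the scheduler. */
final class SketchScheduler {
    private final SketchExecutionGraph graph = new SketchExecutionGraph();
    private final List<String> handledTaskFailures = new ArrayList<>();

    SketchExecutionVertex newVertex(String name) {
        return new SketchExecutionVertex(name, this::handleTaskFailure);
    }

    /** Point 1: record the global failure before reacting to it. */
    void handleGlobalFailure(Throwable cause) {
        graph.initFailureCause(cause);
        // Restart / fail-over logic omitted in this sketch.
    }

    /** Point 2: internal errors go through the vertex, not straight to handleTaskFailure. */
    void onInternalSchedulingError(SketchExecutionVertex vertex, Throwable cause) {
        vertex.fail(cause);
    }

    private void handleTaskFailure(SketchExecutionVertex vertex, Throwable cause) {
        handledTaskFailures.add(vertex.getName());
        // Restart strategy omitted in this sketch.
    }

    SketchExecutionGraph getGraph() {
        return graph;
    }

    List<String> getHandledTaskFailures() {
        return handledTaskFailures;
    }
}
```

In this sketch, routing an internal scheduling error through {{SketchExecutionVertex#fail}} means the cause is stored on the vertex before the scheduler's failure handling runs, so anything that later reads the failure fields (as the exceptions handler does in Flink) would see it. Calling {{handleTaskFailure}} directly would skip that recording step, which is the gap this issue describes.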

cc [~gjy]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)