[jira] [Created] (FLINK-14606) Simplify params of Execution#processFail

Shang Yuanchun (Jira)
Zhu Zhu created FLINK-14606:
-------------------------------

             Summary: Simplify params of Execution#processFail
                 Key: FLINK-14606
                 URL: https://issues.apache.org/jira/browse/FLINK-14606
             Project: Flink
          Issue Type: Sub-task
          Components: Runtime / Coordination
    Affects Versions: 1.10.0
            Reporter: Zhu Zhu
             Fix For: 1.10.0


The three params {{fromSchedulerNg}}/{{releasePartitions}}/{{isCallback}} of Execution#processFail are quite a mess, even though they are correlated.
I'd propose to simplify the params of processFail by replacing those three params with a single {{isInternalFailure}}. {{isInternalFailure}} is true iff the failure is from the TM (strictly speaking, notified from SchedulerBase). This also hardens the handling of the case where a task is successfully deployed but the JM does not realize it (see #3 below).

Here's why these 3 params can be simplified:
1. {{fromSchedulerNg}}, true iff the failure is from the TM and isLegacyScheduling==false.
    It's only used like this: {{if (!fromSchedulerNg && !isLegacyScheduling())}}. So {{!fromSchedulerNg}} can be replaced with {{!isInternalFailure}} without changing the result, because the condition is false under legacy scheduling either way.

2. {{releasePartitions}}, true iff the failure is from the TM.
  Its value is always exactly the same as {{isInternalFailure}}, so we can drop it and use {{isInternalFailure}} instead.

3. {{isCallback}}, true iff the failure is from the TM or the task is not deployed.
    It's only used like this: {{(!isCallback && (current == RUNNING || current == DEPLOYING))}}.
    So replacing {{!isCallback}} with {{!isInternalFailure}} would be enough. The behavior differs only when deploying a task to a task manager fails, which previously set {{isCallback}} to true. However, it is safer to signal a cancel call in that case, since the deployment may actually have succeeded while the response was lost on the network.
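
The three points above can be checked with a small truth-table sketch. This is not the actual Flink code: the class name, the method names, and the inputs {{fromTm}}, {{legacy}}, {{deploymentFailed}} and {{runningOrDeploying}} are hypothetical, and the old parameter values are derived purely from the "true iff" descriptions in this issue.

```java
public class ProcessFailParamsSketch {

    // Old guard quoted in item 1 (hypothetical stand-alone model).
    static boolean oldSchedulerGuard(boolean fromSchedulerNg, boolean legacy) {
        return !fromSchedulerNg && !legacy;
    }

    // Proposed guard: isInternalFailure takes the place of fromSchedulerNg.
    static boolean newSchedulerGuard(boolean isInternalFailure, boolean legacy) {
        return !isInternalFailure && !legacy;
    }

    // Old guard quoted in item 3; runningOrDeploying models
    // (current == RUNNING || current == DEPLOYING).
    static boolean oldCancelGuard(boolean isCallback, boolean runningOrDeploying) {
        return !isCallback && runningOrDeploying;
    }

    // Proposed guard: isInternalFailure takes the place of isCallback.
    static boolean newCancelGuard(boolean isInternalFailure, boolean runningOrDeploying) {
        return !isInternalFailure && runningOrDeploying;
    }

    // Counts input combinations where the old and new scheduler guards disagree.
    static int schedulerGuardMismatches() {
        boolean[] values = {false, true};
        int mismatches = 0;
        for (boolean fromTm : values) {
            for (boolean legacy : values) {
                boolean fromSchedulerNg = fromTm && !legacy; // assumption, from item 1
                boolean isInternalFailure = fromTm;          // assumption, from the proposal
                if (oldSchedulerGuard(fromSchedulerNg, legacy)
                        != newSchedulerGuard(isInternalFailure, legacy)) {
                    mismatches++;
                }
            }
        }
        return mismatches;
    }

    // Counts input combinations where the old and new cancel guards disagree.
    static int cancelGuardMismatches() {
        boolean[] values = {false, true};
        int mismatches = 0;
        for (boolean fromTm : values) {
            for (boolean deploymentFailed : values) {
                for (boolean runningOrDeploying : values) {
                    boolean isCallback = fromTm || deploymentFailed; // assumption, from item 3
                    boolean isInternalFailure = fromTm;
                    if (oldCancelGuard(isCallback, runningOrDeploying)
                            != newCancelGuard(isInternalFailure, runningOrDeploying)) {
                        mismatches++;
                    }
                }
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        System.out.println("scheduler guard mismatches: " + schedulerGuardMismatches());
        System.out.println("cancel guard mismatches: " + cancelGuardMismatches());
    }
}
```

Under these assumptions the scheduler guard never changes, and the cancel guard changes in exactly one combination: the failure is not from the TM, the deployment failed, and the execution is RUNNING or DEPLOYING. That is the case where a cancel call would now be signaled.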

cc [~GJL]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)