[jira] [Created] (FLINK-21099) Introduce JobType to distinguish between batch and streaming jobs


Shang Yuanchun (Jira)
Till Rohrmann created FLINK-21099:
-------------------------------------

             Summary: Introduce JobType to distinguish between batch and streaming jobs
                 Key: FLINK-21099
                 URL: https://issues.apache.org/jira/browse/FLINK-21099
             Project: Flink
          Issue Type: Sub-task
          Components: Runtime / Coordination
    Affects Versions: 1.13.0
            Reporter: Till Rohrmann
             Fix For: 1.13.0


To distinguish between batch and streaming jobs, we propose to introduce an enum {{JobType}}, which is set in the {{JobGraph}} when the graph is created. With the {{JobType}}, it becomes possible to decide which scheduler to use depending on the nature of the job.
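
A minimal sketch of the idea, not the actual implementation: only the names JobType and JobGraph come from the proposal. The enum constants, the JobGraphSketch stand-in class, the SchedulerSelector helper and the scheduler labels it returns are all hypothetical and used purely for illustration.

```java
import java.util.Objects;

// Hypothetical constants; the proposal only names the enum itself.
enum JobType {
    BATCH,
    STREAMING
}

// Stand-in for the JobGraph, reduced to the single field discussed here.
final class JobGraphSketch {
    private final String jobName;
    private final JobType jobType;

    // The job type is fixed at creation time, as the proposal describes.
    JobGraphSketch(String jobName, JobType jobType) {
        this.jobName = Objects.requireNonNull(jobName, "jobName");
        this.jobType = Objects.requireNonNull(jobType, "jobType");
    }

    String getJobName() {
        return jobName;
    }

    JobType getJobType() {
        return jobType;
    }
}

// Hypothetical helper: maps the job type to a scheduler label.
final class SchedulerSelector {
    private SchedulerSelector() {}

    static String chooseScheduler(JobGraphSketch jobGraph) {
        switch (jobGraph.getJobType()) {
            case BATCH:
                return "batch-scheduler";
            case STREAMING:
                return "streaming-scheduler";
            default:
                throw new IllegalArgumentException(
                        "Unknown job type: " + jobGraph.getJobType());
        }
    }
}
```

Storing the type on the graph at creation time means the component that picks the scheduler does not need to inspect the job's topology again.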

For batch jobs (from the DataSet API), setting this field is trivial (in the JobGraphGenerator).

For streaming jobs the situation is more complicated, since FLIP-134 introduced support for bounded (batch) jobs in the DataStream API. For the DataStream API, we rely on the result of StreamGraphGenerator#shouldExecuteInBatchMode, which checks if the DataStream program has unbounded sources.
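
The boundedness check can be pictured roughly as follows. This is a simplified sketch under stated assumptions, not the logic of StreamGraphGenerator#shouldExecuteInBatchMode itself: the JobTypeFromSources class, its nested enums and the deriveJobType method are hypothetical, and the sketch ignores any execution mode the user may have configured explicitly.

```java
import java.util.List;

// Hypothetical helper illustrating "batch only if no source is unbounded".
final class JobTypeFromSources {
    enum Boundedness { BOUNDED, UNBOUNDED }

    // Local stand-in for the proposed JobType enum.
    enum JobType { BATCH, STREAMING }

    private JobTypeFromSources() {}

    // A single unbounded source makes the whole job a streaming job.
    // A program with no sources at all is treated as batch by this sketch.
    static JobType deriveJobType(List<Boundedness> sources) {
        boolean hasUnboundedSource =
                sources.stream().anyMatch(b -> b == Boundedness.UNBOUNDED);
        return hasUnboundedSource ? JobType.STREAMING : JobType.BATCH;
    }
}
```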

Lastly, the Blink Table API / SQL Planner also generates StreamGraph instances that contain batch jobs. We tag such a StreamGraph as a batch job in the ExecutorUtils.setBatchProperties() method.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)