[jira] [Created] (FLINK-22456) Support InitializeOnMaster and FinalizeOnMaster to be used in InputFormat

Shang Yuanchun (Jira)
Li created FLINK-22456:
--------------------------

             Summary: Support InitializeOnMaster and FinalizeOnMaster to be used in InputFormat
                 Key: FLINK-22456
                 URL: https://issues.apache.org/jira/browse/FLINK-22456
             Project: Flink
          Issue Type: Improvement
          Components: Runtime / Task
            Reporter: Li


        In _InputOutputFormatVertex_, _initializeGlobal_ and _finalizeGlobal_ are only called when the format is an _OutputFormat_; they are never called for an _InputFormat_.
        FLINK-1722 says that _HadoopOutputFormats_ use them to do work before and after the task, and that issue only added support for _initializeGlobal_ and _finalizeGlobal_ in _OutputFormat_.
        I don't know why _InputFormat_ is not supported. Can anyone tell me the reason?
        In any case, I think _InitializeOnMaster_ and _FinalizeOnMaster_ should also be supported in _InputFormat_.
        For example, in an offline (batch) job that uses _JdbcInputFormat_, the user could use _initializeGlobal_ to query the total record count for the job and then create the InputSplits based on that count. While the job is running, the user could expose a progress metric by dividing the number of records read so far by the total number of records, and could even estimate the remaining time of the job. Being able to view job progress and remaining time through external systems is very helpful for users.
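
        A minimal sketch of the idea, not actual Flink code. It assumes a local stand-in interface shaped like _InitializeOnMaster_; the names _InitializeOnMasterLike_, _CountingInputFormat_ and _totalFromSource_ are hypothetical, and the count is passed in directly instead of being queried over JDBC. For simplicity the sketch keeps master-side and task-side state in one object, whereas in a real job the total count would still have to be made available on the task side (for example by carrying it in the splits).

```java
import java.util.ArrayList;
import java.util.List;

public class JdbcProgressSketch {

    /** Hypothetical stand-in with the same shape as Flink's InitializeOnMaster hook. */
    interface InitializeOnMasterLike {
        void initializeGlobal(int parallelism);
    }

    /** Hypothetical input format that counts records once, up front. */
    static final class CountingInputFormat implements InitializeOnMasterLike {
        private final long totalFromSource; // stands in for the result of SELECT COUNT(*)
        private final List<long[]> splits = new ArrayList<>();
        private long totalCount;
        private long recordsRead;

        CountingInputFormat(long totalFromSource) {
            this.totalFromSource = totalFromSource;
        }

        /** Would run once before the tasks start: fetch the total and derive splits. */
        @Override
        public void initializeGlobal(int parallelism) {
            totalCount = totalFromSource;
            splits.clear();
            long base = totalCount / parallelism;
            long remainder = totalCount % parallelism;
            long start = 0;
            for (int i = 0; i < parallelism; i++) {
                // The first `remainder` splits take one extra record each.
                long size = base + (i < remainder ? 1 : 0);
                splits.add(new long[] {start, start + size}); // [start, end)
                start += size;
            }
        }

        List<long[]> splits() {
            return splits;
        }

        /** Called once per record emitted by the reader. */
        void recordRead() {
            recordsRead++;
        }

        /** Fraction of the job completed: records read divided by total records. */
        double progress() {
            return totalCount == 0 ? 1.0 : (double) recordsRead / totalCount;
        }

        /** Linear extrapolation of the remaining time from the elapsed time. */
        double estimatedRemainingMillis(long elapsedMillis) {
            if (recordsRead == 0) {
                return Double.POSITIVE_INFINITY; // no throughput observed yet
            }
            return (double) elapsedMillis * (totalCount - recordsRead) / recordsRead;
        }
    }

    public static void main(String[] args) {
        CountingInputFormat format = new CountingInputFormat(10);
        format.initializeGlobal(3);
        for (int i = 0; i < 5; i++) {
            format.recordRead();
        }
        System.out.println("splits: " + format.splits().size());
        System.out.println("progress: " + format.progress());
        System.out.println("remaining ms: " + format.estimatedRemainingMillis(1000));
    }
}
```

        The remaining-time estimate assumes a roughly constant read rate, so it is only an approximation.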



--
This message was sent by Atlassian Jira
(v8.3.4#803005)