[jira] [Created] (FLINK-19462) Checkpoint statistics for unfinished task snapshots


Shang Yuanchun (Jira)
Nico Kruber created FLINK-19462:
-----------------------------------

             Summary: Checkpoint statistics for unfinished task snapshots
                 Key: FLINK-19462
                 URL: https://issues.apache.org/jira/browse/FLINK-19462
             Project: Flink
          Issue Type: Improvement
          Components: Runtime / Checkpointing, Runtime / Metrics
            Reporter: Nico Kruber


If a checkpoint times out, the Web UI currently shows no stats for the tasks that had not yet finished their snapshots, so you have to dig through the logs (possibly at debug level) to find out what happened.

It would be nice to have these incomplete stats in the Web UI as well, so that you can quickly see what was going on. I can think of two ways to accomplish this:
 * the checkpoint coordinator could ask the TMs for them after failing the checkpoint, or
 * the TMs could send the stats when they notice that the checkpoint has been aborted
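
To make the difference between the two options concrete, here is a minimal, language-neutral sketch in Python. It is not Flink code: every name in it (PartialStats, TaskManager, Coordinator, query_stats, notify_aborted, fail_and_pull, fail_and_notify) is hypothetical, and the stats fields are only assumptions about what might be useful to show for an unfinished snapshot.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class PartialStats:
    """Hypothetical stats for one subtask whose snapshot did not finish."""
    subtask: str
    barriers_received: int
    barriers_expected: int
    sync_ms: Optional[int] = None  # None means the sync phase was never reached


class TaskManager:
    """Toy stand-in for a TM; holds stats of in-flight snapshots per checkpoint id."""

    def __init__(self, in_flight: Dict[int, List[PartialStats]]):
        self.in_flight = in_flight

    def query_stats(self, checkpoint_id: int) -> List[PartialStats]:
        # Option 1 (pull): answer a request from the coordinator.
        return list(self.in_flight.get(checkpoint_id, []))

    def notify_aborted(self, checkpoint_id: int, coordinator: "Coordinator") -> None:
        # Option 2 (push): on noticing the abort, send whatever was collected.
        coordinator.report(checkpoint_id, self.in_flight.pop(checkpoint_id, []))


class Coordinator:
    """Toy stand-in for the checkpoint coordinator."""

    def __init__(self, task_managers: List[TaskManager]):
        self.task_managers = task_managers
        self.failed_stats: Dict[int, List[PartialStats]] = {}

    def report(self, checkpoint_id: int, stats: List[PartialStats]) -> None:
        self.failed_stats.setdefault(checkpoint_id, []).extend(stats)

    def fail_and_pull(self, checkpoint_id: int) -> None:
        # Option 1: after failing the checkpoint, ask every TM for its stats.
        for tm in self.task_managers:
            self.report(checkpoint_id, tm.query_stats(checkpoint_id))

    def fail_and_notify(self, checkpoint_id: int) -> None:
        # Option 2: only announce the abort; the TMs report back on their own.
        for tm in self.task_managers:
            tm.notify_aborted(checkpoint_id, self)


def demo_task_managers() -> List[TaskManager]:
    return [
        TaskManager({7: [PartialStats("map (1/2)", 1, 2)]}),
        TaskManager({7: [PartialStats("map (2/2)", 2, 2, sync_ms=15)]}),
    ]


pull = Coordinator(demo_task_managers())
pull.fail_and_pull(7)

push = Coordinator(demo_task_managers())
push.fail_and_notify(7)
```

In this toy model both variants end up with the same stats at the coordinator. The difference is who initiates the transfer: the pull variant adds a request round trip after the failure, while the push variant relies on every TM actually noticing the abort.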

There may be more options, but I think this improvement would generally make debugging checkpoints easier.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)