Yu Yang created FLINK-20829:
-------------------------------
Summary: flink.jm.downtime metric is inaccurate in flink 1.9.1 and 1.11.1
Key: FLINK-20829
URL: https://issues.apache.org/jira/browse/FLINK-20829
Project: Flink
Issue Type: Bug
Components: API / Scala, Runtime / Metrics
Affects Versions: 1.11.1, 1.9.1
Reporter: Yu Yang
Attachments: Screen Shot 2021-01-01 at 2.38.39 PM.png
According to the comments in [DownTimeGauge.java|https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/metrics/DownTimeGauge.java#L28]:
A gauge that returns (in milliseconds) how long a job has not been not running any more, in case it is in a failing/recovering situation. Running jobs return naturally a value of zero.
We noticed that the Flink runtime reports an inaccurate value for the flink.jm.downtime metric. What Flink reports is actually the uptime in milliseconds before the application restarted, not the time the job has spent not running.
!Screen Shot 2021-01-01 at 2.38.39 PM.png|width=720!
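To make the discrepancy concrete, here is a minimal sketch of the two quantities being confused. It is not Flink's DownTimeGauge implementation: the class DowntimeSketch, its method names, and the timestamps are hypothetical and chosen only for illustration. The first method follows the documented semantics quoted above (zero while running, otherwise the time elapsed since the job stopped running); the second computes the value we appear to observe instead.

```java
// Hypothetical illustration only; this is NOT Flink's DownTimeGauge code.
final class DowntimeSketch {

    // Documented semantics: a running job reports 0; otherwise report how
    // long (in ms) the job has not been running.
    static long expectedDowntimeMillis(boolean running,
                                       long stoppedRunningAtMillis,
                                       long nowMillis) {
        if (running) {
            return 0L;
        }
        return Math.max(0L, nowMillis - stoppedRunningAtMillis);
    }

    // What the metric appears to report according to this issue: how long
    // the job had been running before it restarted.
    static long uptimeBeforeRestartMillis(long startedRunningAtMillis,
                                          long stoppedRunningAtMillis) {
        return Math.max(0L, stoppedRunningAtMillis - startedRunningAtMillis);
    }

    public static void main(String[] args) {
        // Assumed timeline: job starts at t=0, fails after 10 minutes,
        // and the metric is sampled 5 seconds after the failure.
        long started = 0L;
        long stopped = 600_000L;
        long now = 605_000L;

        System.out.println("expected downtime (ms): "
                + expectedDowntimeMillis(false, stopped, now));
        System.out.println("uptime before restart (ms): "
                + uptimeBeforeRestartMillis(started, stopped));
    }
}
```

With the assumed timeline, the documented downtime is 5000 ms, whereas the uptime before the restart is 600000 ms. A reported value of the second magnitude shortly after a failure would match the behavior described in this issue.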