Congxian Qiu(klion26) created FLINK-14035:
---------------------------------------------
Summary: Introduce/Change some log for snapshot to better analysis checkpoint problem
Key: FLINK-14035
URL: https://issues.apache.org/jira/browse/FLINK-14035
Project: Flink
Issue Type: Improvement
Components: Runtime / Checkpointing
Affects Versions: 1.10.0
Reporter: Congxian Qiu(klion26)
Currently, most checkpoint information is logged at debug level (especially on the TM side). If a checkpoint fails or takes too long and we want to track which step the checkpoint reached and how much time each step consumed, we have to restart the job with debug logging enabled. This issue proposes to improve the situation by changing some existing log statements from debug to info and adding some more debug logs. We have already changed these log levels in production at Alibaba, and it has caused no problems so far.
Detail
Change the logs below from debug level to info
* log about {{Starting checkpoint xxx}} on the TM side
* log about sync part complete on the TM side
* log about async part complete on the TM side
Add debug log
* log about receiving the barrier in exactly-once mode, to align with at-least-once mode
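
The proposal above can be illustrated with a minimal, self-contained sketch. It is not Flink code: the class name CheckpointLogSketch, the helper methods, and the message wording are all hypothetical, and it uses java.util.logging only to stay dependency-free (Flink itself logs through SLF4J). It shows the three TM-side messages emitted at INFO with their durations, and the barrier message kept at a debug-like level (FINE).

```java
import java.util.logging.Logger;

// Hypothetical sketch, not Flink code. All names and messages are assumptions.
class CheckpointLogSketch {
    private static final Logger LOG =
            Logger.getLogger(CheckpointLogSketch.class.getName());

    // Builds the message for one finished snapshot phase ("sync" or "async").
    static String phaseMessage(String phase, long checkpointId, long durationMs) {
        return "Checkpoint " + checkpointId + ": " + phase
                + " part complete, took " + durationMs + " ms";
    }

    // Runs one phase, measures its duration, and logs it at INFO
    // (the level this issue proposes instead of debug).
    static long runPhase(String phase, long checkpointId, Runnable work) {
        long start = System.nanoTime();
        work.run();
        long durationMs = (System.nanoTime() - start) / 1_000_000;
        LOG.info(phaseMessage(phase, checkpointId, durationMs));
        return durationMs;
    }

    public static void main(String[] args) {
        long checkpointId = 7L;
        // Proposed new debug-level log; FINE is hidden by the default JUL config.
        LOG.fine("Received barrier for checkpoint " + checkpointId
                + " (exactly-once mode)");
        LOG.info("Starting checkpoint " + checkpointId);
        runPhase("sync", checkpointId, () -> { /* synchronous snapshot work */ });
        runPhase("async", checkpointId, () -> { /* asynchronous upload work */ });
    }
}
```

With the phase messages at INFO, the time spent in each step is visible in the normal TM log without restarting the job with debug logging enabled.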
If this issue is valid, then I'm happy to contribute it.
--
This message was sent by Atlassian Jira
(v8.3.2#803003)