king's uncle created FLINK-14339:
------------------------------------

             Summary: The checkpoint ID count wrong on restore savepoint log
                 Key: FLINK-14339
                 URL: https://issues.apache.org/jira/browse/FLINK-14339
             Project: Flink
          Issue Type: Bug
          Components: Runtime / Checkpointing
    Affects Versions: 1.8.0
            Reporter: king's uncle


I saw the log below while testing a Flink restore from a savepoint.

{code:java}
[flink-akka.actor.default-dispatcher-2] INFO org.apache.flink.runtime.checkpoint.ZooKeeperCompletedCheckpointStore - Recovering checkpoints from ZooKeeper.
[flink-akka.actor.default-dispatcher-2] INFO org.apache.flink.runtime.checkpoint.ZooKeeperCompletedCheckpointStore - Found 0 checkpoints in ZooKeeper.
[flink-akka.actor.default-dispatcher-2] INFO org.apache.flink.runtime.checkpoint.ZooKeeperCompletedCheckpointStore - Trying to fetch 0 checkpoints from storage.
[flink-akka.actor.default-dispatcher-2] INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Starting job 00000000000000000000000000000000 from savepoint /nfsdata/ecs/flink-savepoints/flink-savepoint-test/00000000000000000000000000000000/201910080158/savepoint-000000-003c9b080832 (allowing non restored state)
[flink-akka.actor.default-dispatcher-2] INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Reset the checkpoint ID of job 00000000000000000000000000000000 to 12285.
[flink-akka.actor.default-dispatcher-2] INFO org.apache.flink.runtime.checkpoint.ZooKeeperCompletedCheckpointStore - Recovering checkpoints from ZooKeeper.
[flink-akka.actor.default-dispatcher-2] INFO org.apache.flink.runtime.checkpoint.ZooKeeperCompletedCheckpointStore - Found 1 checkpoints in ZooKeeper.
[flink-akka.actor.default-dispatcher-2] INFO org.apache.flink.runtime.checkpoint.ZooKeeperCompletedCheckpointStore - Trying to fetch 1 checkpoints from storage.
[flink-akka.actor.default-dispatcher-2] INFO org.apache.flink.runtime.checkpoint.ZooKeeperCompletedCheckpointStore - Trying to retrieve checkpoint 12284.
[flink-akka.actor.default-dispatcher-2] INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Restoring job 00000000000000000000000000000000 from latest valid checkpoint: Checkpoint 12284 @ 0 for 00000000000000000000000000000000.
{code}

The checkpoint that is finally restored has ID 12284, yet the log prints "Reset the checkpoint ID of job 00000000000000000000000000000000 to 12285". So I checked the source code.

{code:java}
// Reset the checkpoint ID counter
long nextCheckpointId = savepoint.getCheckpointID() + 1;
checkpointIdCounter.setCount(nextCheckpointId);

LOG.info("Reset the checkpoint ID of job {} to {}.", job, nextCheckpointId);
{code}

I think the message should print the savepoint's checkpoint ID instead of the next checkpoint ID.

{code:java}
LOG.info("Reset the checkpoint ID of job {} to {}.", job, savepoint.getCheckpointID());
{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
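For illustration only, the relationship between the two numbers in the report can be sketched as standalone Java. This is not Flink code: the class name, the method names, and the message wording are all hypothetical. It only assumes what the quoted snippet shows, namely that the counter is set to the savepoint's checkpoint ID plus one. A message that names both values is one possible way to avoid the confusion described above.

```java
// Standalone sketch; none of these names exist in Flink.
public class CheckpointIdLogSketch {

    /**
     * Mirrors the arithmetic in the quoted snippet: the counter is moved
     * one past the savepoint's checkpoint ID.
     */
    static long nextCheckpointId(long savepointCheckpointId) {
        return savepointCheckpointId + 1;
    }

    /**
     * Hypothetical message wording that reports both the restored ID and
     * the counter value, so neither number looks wrong in the log.
     */
    static String resetMessage(String job, long savepointCheckpointId) {
        return String.format(
            "Restored checkpoint %d of job %s; reset the checkpoint ID counter to %d.",
            savepointCheckpointId, job, nextCheckpointId(savepointCheckpointId));
    }

    public static void main(String[] args) {
        // Uses the IDs from the log in this report.
        System.out.println(resetMessage("00000000000000000000000000000000", 12284L));
    }
}
```

With the values from the log, the restored checkpoint is 12284 and the counter value is 12285, which matches both lines the report compares.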