Yu Li created FLINK-12699:
-----------------------------
Summary: Reduce CPU consumption when snapshotting/restoring the spilled key-group
Key: FLINK-12699
URL: https://issues.apache.org/jira/browse/FLINK-12699
Project: Flink
Issue Type: Sub-task
Components: Runtime / State Backends
Reporter: Yu Li
Assignee: Yu Li
We need to avoid unnecessary de/serialization when snapshotting/restoring a spilled state key-group. To achieve this, we need to:
1. Add meta information to the {{HeapKeyedStateBackend}} checkpoint on DFS that separates the on-heap and on-disk parts
2. Write the off-heap bytes directly to DFS when checkpointing and mark them as on-disk
3. When restoring the data from DFS, write the bytes directly onto disk if they are marked as on-disk
Note that we cannot simply copy the file, since we use mmap while also supporting copy-on-write.
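
The three steps above could be sketched roughly as follows. This is only an illustration under assumptions, not the actual Flink implementation: the class KeyGroupSnapshotSketch, the Location marker, and the stream layout (key-group id, marker byte, length, payload) are all hypothetical names and choices made up for this example. The point it shows is that an ON_DISK key-group travels as an opaque byte array in both directions, so no per-entry de/serialization is needed for it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical sketch; none of these names are taken from Flink. */
final class KeyGroupSnapshotSketch {

    /** Marker kept in the per-key-group meta information (step 1). */
    enum Location { ON_HEAP, ON_DISK }

    /** One key-group as read back from the checkpoint stream. */
    static final class KeyGroupData {
        final int keyGroupId;
        final Location location;
        final byte[] bytes;

        KeyGroupData(int keyGroupId, Location location, byte[] bytes) {
            this.keyGroupId = keyGroupId;
            this.location = location;
            this.bytes = bytes;
        }
    }

    /**
     * Step 2: write the meta information followed by the payload. For an
     * ON_DISK key-group the payload is the raw spilled bytes, copied as-is;
     * for ON_HEAP it would be the serialized entries (not shown here).
     */
    static void writeKeyGroup(DataOutputStream out, int keyGroupId,
                              Location location, byte[] bytes) throws IOException {
        out.writeInt(keyGroupId);
        out.writeByte(location.ordinal());
        out.writeInt(bytes.length);
        out.write(bytes);
    }

    /**
     * Step 3: read the meta information and the payload. The caller checks
     * the marker: ON_DISK bytes can go straight to a local spill file, while
     * ON_HEAP bytes still need to be deserialized.
     */
    static KeyGroupData readKeyGroup(DataInputStream in) throws IOException {
        int keyGroupId = in.readInt();
        Location location = Location.values()[in.readByte()];
        byte[] bytes = new byte[in.readInt()];
        in.readFully(bytes);
        return new KeyGroupData(keyGroupId, location, bytes);
    }

    /** In-memory round trip standing in for "write to DFS, then restore". */
    static KeyGroupData roundTrip(int keyGroupId, Location location, byte[] bytes) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(buffer)) {
                writeKeyGroup(out, keyGroupId, location, bytes);
            }
            return readKeyGroup(
                new DataInputStream(new ByteArrayInputStream(buffer.toByteArray())));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The sketch copies bytes through the stream rather than copying a file, in line with the note above that a plain file copy is not an option with mmap plus copy-on-write. How the bytes are collected from the mapped region in the real backend is left out here.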
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)