[jira] [Created] (FLINK-17861) Channel state handles, when inlined, duplicate underlying data

Shang Yuanchun (Jira)
Roman Khachatryan created FLINK-17861:
-----------------------------------------

             Summary: Channel state handles, when inlined, duplicate underlying data
                 Key: FLINK-17861
                 URL: https://issues.apache.org/jira/browse/FLINK-17861
             Project: Flink
          Issue Type: Bug
          Components: Runtime / Checkpointing, Runtime / Task
    Affects Versions: 1.11.0
            Reporter: Roman Khachatryan
            Assignee: Roman Khachatryan
             Fix For: 1.11.0


When a subtask snapshots its state, it creates one channelStateHandle per inputChannel/resultSubpartition. All handles of a single subtask share the same underlying streamStateHandle. This is an optimization that avoids creating too many files.

However, if the streamStateHandle is inlined (its size is below state.backend.fs.memory-threshold), then most of the bytes in the underlying streamStateHandle are duplicated across the channelStateHandles that share it.
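
To make the size effect concrete, here is a minimal Python model of the problem. It is not Flink code: the class names, the 20-byte header layout, and the assumption that every channel handle is written out on its own together with the inlined bytes it references are all hypothetical, chosen only to illustrate the arithmetic.

```python
import struct
from dataclasses import dataclass


@dataclass(frozen=True)
class InlinedStreamHandle:
    """Hypothetical stand-in for an inlined stream state handle: it holds the bytes itself."""
    data: bytes


@dataclass(frozen=True)
class ChannelHandle:
    """Hypothetical stand-in for a per-channel handle: a slice of a shared stream handle."""
    channel_index: int
    offset: int
    length: int
    underlying: InlinedStreamHandle


def build_handles(num_channels: int, bytes_per_channel: int) -> list:
    # One shared underlying handle for the whole subtask, one channel handle per channel.
    shared = InlinedStreamHandle(bytes(num_channels * bytes_per_channel))
    return [
        ChannelHandle(i, i * bytes_per_channel, bytes_per_channel, shared)
        for i in range(num_channels)
    ]


def serialize(handle: ChannelHandle) -> bytes:
    # Assumption: each handle is serialized independently, so the inlined
    # bytes of the shared handle are written once per channel handle.
    header = struct.pack(">iqq", handle.channel_index, handle.offset, handle.length)
    return header + handle.underlying.data


def total_serialized_size(handles: list) -> int:
    return sum(len(serialize(h)) for h in handles)


handles = build_handles(num_channels=10, bytes_per_channel=100)
payload = len(handles[0].underlying.data)   # 1000 bytes of actual channel state
total = total_serialized_size(handles)      # 10 * (20 + 1000) = 10200 bytes
```

In this model 1000 bytes of channel state turn into 10200 serialized bytes, because each of the 10 handles carries the full shared payload although it only needs its own 100-byte slice. The overhead grows with the number of channels per subtask.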



--
This message was sent by Atlassian Jira
(v8.3.4#803005)