[jira] [Created] (FLINK-17568) Task may consume data after checkpoint barrier before performing checkpoint for unaligned checkpoint

Shang Yuanchun (Jira)
Yingjie Cao created FLINK-17568:
-----------------------------------

             Summary: Task may consume data after checkpoint barrier before performing checkpoint for unaligned checkpoint
                 Key: FLINK-17568
                 URL: https://issues.apache.org/jira/browse/FLINK-17568
             Project: Flink
          Issue Type: Bug
          Components: Runtime / Checkpointing
    Affects Versions: 1.11.0
            Reporter: Yingjie Cao
             Fix For: 1.11.0


For unaligned checkpoints, a task may consume data that follows the checkpoint barrier before it performs the checkpoint, which leads to duplicated data being consumed and to corruption of the data stream.

More specifically, when the Netty thread notifies about the checkpoint barrier for the first time and enqueues a checkpointing task in the mailbox, the task thread may still be in its data consumption loop. If it then reads a new checkpoint barrier from another channel, it does not return to the mailbox; instead, it continues reading until all data is consumed or a full record is available. In the meantime, data after the checkpoint barrier may be read and consumed, which leads to inconsistency.
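
The ordering problem described above can be illustrated with a small single-threaded sketch. This is a simplified model, not Flink's actual code: `run_task`, `channel_b`, `mailbox` and the `yield_on_barrier` flag are hypothetical names, and the flag only shows what "returning to the mailbox" would change; it is not claimed to be the fix adopted for this issue.

```python
from collections import deque

BARRIER = object()  # stand-in for a checkpoint barrier event (hypothetical)


def run_task(yield_on_barrier):
    """Return the ordered list of events the simulated task thread performs."""
    # Channel B as seen by the task thread: a record, the barrier, another record.
    channel_b = deque(["b1", BARRIER, "b2"])
    # Assumption: the Netty thread already saw the barrier on another channel
    # and enqueued the checkpoint action; the task thread has not polled yet.
    mailbox = deque(["checkpoint"])
    events = []

    while channel_b or mailbox:
        # Inner consumption loop: keeps reading while the channel has data.
        while channel_b:
            item = channel_b.popleft()
            if item is BARRIER:
                if yield_on_barrier:
                    break  # illustrative remedy: return to the mailbox first
                continue  # problematic path: keep reading past the barrier
            events.append("consume " + item)
        # The mailbox is only drained once the inner loop exits.
        while mailbox:
            events.append(mailbox.popleft())
    return events


print(run_task(False))  # post-barrier record "b2" is consumed before the checkpoint
print(run_task(True))   # checkpoint runs before "b2" is consumed
```

In the first call the record after the barrier is consumed before the checkpoint action runs, which is the inconsistency reported here; in the second call the checkpoint runs first.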



--
This message was sent by Atlassian Jira
(v8.3.4#803005)