[jira] [Created] (FLINK-19113) Add support for checkpointing with selectable inputs

classic Classic list List threaded Threaded
1 message Options
Reply | Threaded
Open this post in threaded view
|

[jira] [Created] (FLINK-19113) Add support for checkpointing with selectable inputs

Shang Yuanchun (Jira)
Roman Khachatryan created FLINK-19113:
-----------------------------------------

             Summary: Add support for checkpointing with selectable inputs
                 Key: FLINK-19113
                 URL: https://issues.apache.org/jira/browse/FLINK-19113
             Project: Flink
          Issue Type: Improvement
          Components: Runtime / Checkpointing
    Affects Versions: 1.12.0
            Reporter: Roman Khachatryan


Currently, there is a validation in StreamingJobGraphGenerator that fails if an operator implements InputSelectable and checkpointing is enabled.

 

One issue when removing this validation is that with Unaligned checkpoints recovery would deadlock if there are records spanning multiple buffers.

Consider the following case:
- Two {{IntputGate}}s
- Input selection is not ALL (say FIRST initially)
- Unaligned Checkpoints ON
- on recovery, there are "parts" of records in all channels (actually 1 is enough I think)

On recovery,
1. {{StreamTask}} initiates recovery and scedule partition request upon it's end
2. All gates and channels will *receive* buffers from {{StateReader}}
3. All channels of a *single* gate will *consume* those state buffers - completing that gate's {{StateConsumedFuture}}
4. {{InputProcessor}} will return {{NOTHING_AVAILABLE}} (see {{StreamTwoInputProcessor.getInputStatus}})
5. {{StreamTask}} will suspend its default action
6. State of the 2nd gate won't be consumed - so its {{StateConsumedFuture}}s won't be completed - so no partitions will be requested (edited) 

 

A simple solution is to request partitions for each channel independently.

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)