(DEPRECATED) Apache Flink Mailing List archive.

[jira] [Created] (FLINK-11618) [state] Refactor operator state repartition mechanism

Shang Yuanchun (Jira)

Yun Tang created FLINK-11618:
--------------------------------

Summary: [state] Refactor operator state repartition mechanism
Key: FLINK-11618
URL: https://issues.apache.org/jira/browse/FLINK-11618
Project: Flink
Issue Type: Improvement
Components: State Backends, Checkpointing
Affects Versions: 1.7.0
Reporter: Yun Tang
Assignee: Yun Tang
Fix For: 1.8.0

Currently, the state assignment strategy for operator state is as follows:
* When parallelism is unchanged:
** If we only have even-split redistributed state, state assignment tries to keep the assignment the same as before (in practice it is not always the same).
** If we have union redistributed state, all operator state is redistributed according to the new state assignment.
* When parallelism has changed:
** All operator state is redistributed according to the new state assignment.
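
To make the two redistribution modes mentioned above concrete, here is a minimal sketch in plain Java. It is not Flink code: the class {{RedistributionModesSketch}} and its methods are hypothetical names, state partitions are modeled as strings, and the even split is simplified to round-robin, so the exact split produced by Flink's repartitioner may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: none of these names exist in Flink.
class RedistributionModesSketch {

    // Concatenate the state of all old subtasks into one logical list.
    static List<String> flatten(List<List<String>> previous) {
        List<String> all = new ArrayList<>();
        for (List<String> subtaskState : previous) {
            all.addAll(subtaskState);
        }
        return all;
    }

    // Even-split: every partition ends up on exactly one new subtask.
    // Assumption: round-robin stands in for the real splitting logic.
    static List<List<String>> evenSplit(List<List<String>> previous, int newParallelism) {
        List<String> all = flatten(previous);
        List<List<String>> result = new ArrayList<>();
        for (int i = 0; i < newParallelism; i++) {
            result.add(new ArrayList<>());
        }
        for (int i = 0; i < all.size(); i++) {
            result.get(i % newParallelism).add(all.get(i));
        }
        return result;
    }

    // Union: every new subtask receives the complete list of partitions.
    static List<List<String>> union(List<List<String>> previous, int newParallelism) {
        List<String> all = flatten(previous);
        List<List<String>> result = new ArrayList<>();
        for (int i = 0; i < newParallelism; i++) {
            result.add(new ArrayList<>(all));
        }
        return result;
    }
}
```

With a previous assignment of [[a, b], [c]] and a new parallelism of 3, {{evenSplit}} gives each subtask one partition, while {{union}} gives every subtask all three.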

There are two problems *when parallelism is unchanged*:
# If we only have even-split redistributed state, the current implementation cannot actually guarantee that the state assignment stays the same as before. This is because the current {{StateAssignmentOperation#collectPartitionableStates}} repartitions {{managedOperatorStates}} without subtask-index information. For example, suppose an operator with parallelism 2 has operator state where subtask-0's managed operator state is empty while subtask-1's is not. Although the new parallelism is still 2, after {{StateAssignmentOperation#collectPartitionableStates}}, subtask-0 would be assigned the managed operator state and subtask-1 would get none.
# We should only redistribute union state and leave even-split state untouched. Redistributing even-split state would cause unexpected behavior once {{RestartPipelinedRegionStrategy}} supports restoring state.
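
The first problem can be reproduced with a small toy model. The sketch below is not the actual {{StateAssignmentOperation}} code: the class {{OperatorStateAssignmentSketch}} and both methods are hypothetical, and state handles are modeled as lists of strings. It assumes that collecting state skips empty subtasks and that list position is then treated as the subtask index, which is how the example in the description plays out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Toy model only: none of these names exist in Flink.
class OperatorStateAssignmentSketch {

    // Index-blind collection: non-empty states are gathered into a flat
    // list, then entry i is handed to subtask i.
    static List<List<String>> assignWithoutSubtaskIndex(List<List<String>> previous) {
        List<List<String>> collected = new ArrayList<>();
        for (List<String> state : previous) {
            if (!state.isEmpty()) {
                // Position in 'collected' no longer matches the subtask index.
                collected.add(state);
            }
        }
        List<List<String>> result = new ArrayList<>();
        for (int subtask = 0; subtask < previous.size(); subtask++) {
            result.add(subtask < collected.size()
                    ? collected.get(subtask)
                    : Collections.<String>emptyList());
        }
        return result;
    }

    // Index-preserving assignment: each subtask keeps exactly what it had.
    static List<List<String>> assignKeepingSubtaskIndex(List<List<String>> previous) {
        List<List<String>> result = new ArrayList<>();
        for (List<String> state : previous) {
            result.add(new ArrayList<>(state));
        }
        return result;
    }

    public static void main(String[] args) {
        // subtask-0 has no managed operator state, subtask-1 has two partitions.
        List<List<String>> previous = Arrays.asList(
                Collections.<String>emptyList(),
                Arrays.asList("partition-a", "partition-b"));
        System.out.println(assignWithoutSubtaskIndex(previous));
        System.out.println(assignKeepingSubtaskIndex(previous));
    }
}
```

In this model the index-blind variant moves subtask-1's state to subtask-0 even though the parallelism is still 2, whereas the index-preserving variant leaves the assignment unchanged.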

We should fix both problems. This issue is a prerequisite of FLINK-10712 and FLINK-10713.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)