Stephan Ewen created FLINK-7231:
-----------------------------------
Summary: SlotSharingGroups are not always released in time for new restarts
Key: FLINK-7231
URL: https://issues.apache.org/jira/browse/FLINK-7231
Project: Flink
Issue Type: Bug
Components: Distributed Coordination
Affects Versions: 1.3.1
Reporter: Stephan Ewen
Assignee: Stephan Ewen
Fix For: 1.4.0, 1.3.2
When there are not enough resources to schedule the streaming program, a race condition can lead to a series of the following errors:
{code}
java.lang.IllegalStateException: SlotSharingGroup cannot clear task assignment, group still has allocated resources.
{code}
The job eventually recovers, but may go through many restart attempts in quick succession before doing so.
The root cause is that slots are not cleared before the next restart attempt.
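The ordering problem can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Flink's actual code or the actual fix: `ToyGroup`, `restartWithoutWaiting`, and `restartAfterRelease` are hypothetical names, and the release of slots is modeled as a plain `CompletableFuture`. It shows that clearing the task assignment while slots are still allocated fails with the error above, while clearing only after the release has completed succeeds.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.CompletableFuture;

// Toy model only: these classes are hypothetical and are NOT Flink's real SlotSharingGroup.
final class SlotReleaseSketch {

    /** Hypothetical stand-in for a slot sharing group that tracks allocated slots. */
    static final class ToyGroup {
        private final Set<Integer> allocatedSlots = new HashSet<>();

        synchronized void allocate(int slotId) {
            allocatedSlots.add(slotId);
        }

        synchronized void release(int slotId) {
            allocatedSlots.remove(slotId);
        }

        /** Fails if any slot is still allocated, mirroring the reported error message. */
        synchronized void clearTaskAssignment() {
            if (!allocatedSlots.isEmpty()) {
                throw new IllegalStateException(
                        "SlotSharingGroup cannot clear task assignment, "
                                + "group still has allocated resources.");
            }
        }
    }

    /** Racy restart: clears immediately, without waiting for the asynchronous release. */
    static boolean restartWithoutWaiting(ToyGroup group) {
        try {
            group.clearTaskAssignment();
            return true;
        } catch (IllegalStateException e) {
            return false; // in the reported scenario this would trigger another restart attempt
        }
    }

    /** Ordered restart: clears only after the release future has completed. */
    static boolean restartAfterRelease(ToyGroup group, CompletableFuture<Void> released) {
        return released.thenApply(ignored -> {
            group.clearTaskAssignment();
            return true;
        }).join();
    }

    public static void main(String[] args) {
        ToyGroup group = new ToyGroup();
        group.allocate(1);
        CompletableFuture<Void> released = new CompletableFuture<>();

        // Restart attempted while slot 1 is still allocated.
        System.out.println("restart before release succeeded: " + restartWithoutWaiting(group));

        // The slot is released, and only then is the restart attempted again.
        group.release(1);
        released.complete(null);
        System.out.println("restart after release succeeded: " + restartAfterRelease(group, released));
    }
}
```

In this toy model the first attempt reports `false` and the second reports `true`. How the real scheduler should sequence slot release and restart is left to the actual fix.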
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)