huweihua created FLINK-17341:
--------------------------------
Summary: freeSlot in TaskExecutor.closeJobManagerConnection cause ConcurrentModificationException
Key: FLINK-17341
URL:
https://issues.apache.org/jira/browse/FLINK-17341 Project: Flink
Issue Type: Bug
Reporter: huweihua
TaskExecutor may freeSlot when closeJobManagerConnection. freeSlot will modify the TaskSlotTable.slotsPerJob. this modify will cause ConcurrentModificationException.
{code:java}
Iterator<AllocationID> activeSlots = taskSlotTable.getActiveSlots(jobId);
final FlinkException freeingCause = new FlinkException("Slot could not be marked inactive.");
while (activeSlots.hasNext()) {
AllocationID activeSlot = activeSlots.next();
try {
if (!taskSlotTable.markSlotInactive(activeSlot, taskManagerConfiguration.getTimeout())) {
freeSlotInternal(activeSlot, freeingCause);
}
} catch (SlotNotFoundException e) {
log.debug("Could not mark the slot {} inactive.", jobId, e);
}
}
{code}
error log:
{code:java}
2020-04-21 23:37:11,363 ERROR org.apache.flink.runtime.rpc.akka.AkkaRpcActor - Caught exception while executing runnable in main thread.
java.util.ConcurrentModificationException
at java.util.HashMap$HashIterator.nextNode(HashMap.java:1437)
at java.util.HashMap$KeyIterator.next(HashMap.java:1461)
at org.apache.flink.runtime.taskexecutor.slot.TaskSlotTable$TaskSlotIterator.hasNext(TaskSlotTable.java:698)
at org.apache.flink.runtime.taskexecutor.slot.TaskSlotTable$AllocationIDIterator.hasNext(TaskSlotTable.java:652)
at org.apache.flink.runtime.taskexecutor.TaskExecutor.closeJobManagerConnection(TaskExecutor.java:1314)
at org.apache.flink.runtime.taskexecutor.TaskExecutor.access$1300(TaskExecutor.java:149)
at org.apache.flink.runtime.taskexecutor.TaskExecutor$JobLeaderListenerImpl.lambda$jobManagerLostLeadership$1(TaskExecutor.java:1726)
at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRunAsync(AkkaRpcActor.java:397)
at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:190)
at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:152)
at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26)
at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21)
at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)
{code}
--
This message was sent by Atlassian Jira
(v8.3.4#803005)