Zhu Zhu created FLINK-17019:
-------------------------------
Summary: Implement FIFO Physical Slot Assignment in SlotPoolImpl
Key: FLINK-17019
URL: https://issues.apache.org/jira/browse/FLINK-17019
Project: Flink
Issue Type: Sub-task
Components: Runtime / Coordination
Affects Versions: 1.11.0
Reporter: Zhu Zhu
Fix For: 1.11.0
The SlotPool should try to fulfill the oldest pending slot request whenever it receives an available slot, regardless of whether the slot was returned by a terminated task or newly offered by a task manager. This naturally ensures that slot requests of an earlier scheduled region are fulfilled before requests of a later scheduled region.
We only need to change the slot assignment logic on slot offers. This is because the fields {{pendingRequests}} and {{waitingForResourceManager}} store the pending requests in LinkedHashMaps. Therefore, {{tryFulfillSlotRequestOrMakeAvailable(...)}} will naturally fulfill the pending requests in insertion order.
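The insertion-order argument can be illustrated with a minimal Java sketch. It is not Flink's code: the class {{FifoPendingRequests}}, its methods, and the string request IDs are hypothetical stand-ins, and the only property it relies on is that {{java.util.LinkedHashMap}} iterates its entries in insertion order.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical stand-in for pending-request bookkeeping; not Flink's actual SlotPoolImpl. */
public class FifoPendingRequests {
    // Hypothetical request ID -> description, kept in insertion order by LinkedHashMap.
    private final Map<String, String> pending = new LinkedHashMap<>();

    /** Registers a pending request; earlier calls are considered older. */
    public void add(String requestId, String description) {
        pending.put(requestId, description);
    }

    /** Removes and returns the ID of the oldest pending request, if any. */
    public Optional<String> pollOldest() {
        Iterator<String> it = pending.keySet().iterator();
        if (!it.hasNext()) {
            return Optional.empty();
        }
        String oldest = it.next(); // first key in insertion order
        it.remove();
        return Optional.of(oldest);
    }

    public static void main(String[] args) {
        FifoPendingRequests requests = new FifoPendingRequests();
        requests.add("R1", "request of the earlier scheduled region");
        requests.add("R2", "request of the later scheduled region");
        // Each available slot goes to the oldest request that is still pending.
        System.out.println(requests.pollOldest());
        System.out.println(requests.pollOldest());
        System.out.println(requests.pollOldest());
    }
}
```

Because iteration order follows insertion order, taking the first entry is enough to pick the oldest request, and no separate queue or timestamp is needed.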
When a new slot is offered via {{SlotPoolImpl#offerSlot(...)}}, we should use it to fulfill the oldest fulfillable slot request directly by invoking {{tryFulfillSlotRequestOrMakeAvailable(...)}}.
If the AllocationID of the offered slot belongs to a pending request (say R1) that is different from the request being fulfilled (say R2), we should update the pending requests so that the AllocationID of R1 is replaced with the AllocationID of R2. This ensures that {{failAllocation(...)}} can still fail the affected slot allocation requests, which triggers restarting the tasks and re-allocating slots.
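The AllocationID swap described above can be sketched as follows. This is a simplified model under stated assumptions, not the actual SlotPoolImpl logic: {{AllocationRemapSketch}} and its methods are hypothetical, request and allocation IDs are plain strings, and every pending request is assumed to be fulfillable by any offered slot.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical model of the AllocationID remapping; not Flink's actual SlotPoolImpl. */
public class AllocationRemapSketch {
    // Hypothetical request ID -> AllocationID it is waiting for, in insertion (FIFO) order.
    private final Map<String, String> pending = new LinkedHashMap<>();

    public void addPendingRequest(String requestId, String allocationId) {
        pending.put(requestId, allocationId);
    }

    /** Fulfills the oldest pending request (R2) with the offered slot and returns its ID. */
    public Optional<String> offerSlot(String offeredAllocationId) {
        if (pending.isEmpty()) {
            return Optional.empty();
        }
        Map.Entry<String, String> oldest = pending.entrySet().iterator().next();
        String r2 = oldest.getKey();
        String r2AllocationId = oldest.getValue();
        String r1 = findByAllocationId(offeredAllocationId);
        pending.remove(r2);
        if (r1 != null && !r1.equals(r2)) {
            // R1's slot was handed to R2, so R1 now waits for R2's outstanding allocation.
            // Updating an existing key does not change its position in the LinkedHashMap.
            pending.put(r1, r2AllocationId);
        }
        return Optional.of(r2);
    }

    /** Fails the allocation and returns the pending request that was waiting for it, if any. */
    public Optional<String> failAllocation(String allocationId) {
        String requestId = findByAllocationId(allocationId);
        if (requestId == null) {
            return Optional.empty();
        }
        pending.remove(requestId);
        return Optional.of(requestId);
    }

    private String findByAllocationId(String allocationId) {
        for (Map.Entry<String, String> entry : pending.entrySet()) {
            if (entry.getValue().equals(allocationId)) {
                return entry.getKey();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        AllocationRemapSketch pool = new AllocationRemapSketch();
        pool.addPendingRequest("R2", "A2"); // older request
        pool.addPendingRequest("R1", "A1"); // newer request
        System.out.println(pool.offerSlot("A1"));      // slot A1 goes to the oldest request
        System.out.println(pool.failAllocation("A2")); // the failure now reaches R1
    }
}
```

Without the swap, R1 would stay mapped to an AllocationID that has already been consumed, and a later failure of R2's original allocation would find no pending request to fail.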
--
This message was sent by Atlassian Jira
(v8.3.4#803005)