[jira] [Created] (FLINK-17019) Implement FIFO Physical Slot Assignment in SlotPoolImpl

Shang Yuanchun (Jira)
Zhu Zhu created FLINK-17019:
-------------------------------

             Summary: Implement FIFO Physical Slot Assignment in SlotPoolImpl
                 Key: FLINK-17019
                 URL: https://issues.apache.org/jira/browse/FLINK-17019
             Project: Flink
          Issue Type: Sub-task
          Components: Runtime / Coordination
    Affects Versions: 1.11.0
            Reporter: Zhu Zhu
             Fix For: 1.11.0


The SlotPool should try to fulfill the oldest pending slot request whenever it receives an available slot, regardless of whether the slot was returned by a terminated task or newly offered by a task manager. This naturally ensures that slot requests of an earlier scheduled region are fulfilled before requests of a later scheduled region.

We only need to change the slot assignment logic on slot offers. This is because the fields {{pendingRequests}} and {{waitingForResourceManager}} store the pending requests in LinkedHashMaps. Therefore, {{tryFulfillSlotRequestOrMakeAvailable(...)}} will naturally fulfill the pending requests in insertion order.

When a new slot is offered via {{SlotPoolImpl#offerSlot(...)}}, we should use it to fulfill the oldest fulfillable slot request directly by invoking {{tryFulfillSlotRequestOrMakeAvailable(...)}}.
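
The FIFO behaviour described above can be illustrated with a minimal, self-contained sketch. It is not the SlotPoolImpl code: the class {{FifoPendingRequests}}, its methods, and the use of a plain integer as a stand-in for a resource profile are all hypothetical and chosen only for illustration. It only shows how iterating a LinkedHashMap yields the oldest fulfillable request first.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical, simplified stand-in for the pool's pending-request bookkeeping. */
class FifoPendingRequests {
    // LinkedHashMap iterates in insertion order, so the oldest request comes first.
    // Key: request id; value: required resources (an int is an assumption made here).
    private final LinkedHashMap<String, Integer> pending = new LinkedHashMap<>();

    void add(String requestId, int requiredResources) {
        pending.put(requestId, requiredResources);
    }

    /**
     * Removes and returns the oldest pending request that the offered slot can
     * satisfy, or an empty Optional if no pending request fits.
     */
    Optional<String> fulfillOldest(int offeredResources) {
        Iterator<Map.Entry<String, Integer>> it = pending.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Integer> entry = it.next();
            if (entry.getValue() <= offeredResources) {
                String requestId = entry.getKey(); // read the key before removing
                it.remove();
                return Optional.of(requestId);
            }
        }
        return Optional.empty();
    }

    int size() {
        return pending.size();
    }
}
```

In this sketch an older request that the offered slot cannot satisfy is skipped rather than blocking younger requests, which matches the "oldest fulfillable" wording above.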

If a pending request (say R1) exists with the AllocationID of the offered slot, and it is different from the request to fulfill (say R2), we should update R1 in the pending requests so that its AllocationID becomes the AllocationID of R2. This ensures that {{failAllocation(...)}} can fail slot allocation requests to trigger restarting tasks and re-allocating slots.
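
The AllocationID swap can be sketched as follows. This is an assumption-laden illustration, not the actual implementation: the class {{AllocationRemapSketch}} and its methods are hypothetical, request ids and AllocationIDs are modelled as plain strings, and every pending request is assumed to be fulfillable by any offered slot.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of re-keying a pending request when its slot goes to an older one. */
class AllocationRemapSketch {
    // Request id -> AllocationID the request is waiting on, in insertion (FIFO) order.
    final LinkedHashMap<String, String> allocationByRequest = new LinkedHashMap<>();

    void addPending(String requestId, String allocationId) {
        allocationByRequest.put(requestId, allocationId);
    }

    /** Gives the offered slot to the oldest pending request; returns its id, or null if none. */
    String offerSlot(String offeredAllocationId) {
        if (allocationByRequest.isEmpty()) {
            return null;
        }
        // R2 in the text: the oldest pending request, which receives the offered slot.
        Map.Entry<String, String> oldest = allocationByRequest.entrySet().iterator().next();
        String fulfilledRequest = oldest.getKey();
        String fulfilledAllocation = oldest.getValue();
        allocationByRequest.remove(fulfilledRequest);

        if (!fulfilledAllocation.equals(offeredAllocationId)) {
            // R1 in the text: the request that originally waited on the offered slot.
            // It now waits on R2's still outstanding allocation instead.
            for (Map.Entry<String, String> entry : allocationByRequest.entrySet()) {
                if (entry.getValue().equals(offeredAllocationId)) {
                    entry.setValue(fulfilledAllocation);
                    break;
                }
            }
        }
        return fulfilledRequest;
    }
}
```

After the swap, a later failure of R2's original allocation can be mapped back to R1, which is the request that is actually still waiting on it.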



--
This message was sent by Atlassian Jira
(v8.3.4#803005)