[jira] [Created] (FLINK-13371) Release partitions in JM if producer gets restarted


Shang Yuanchun (Jira)
Andrey Zagrebin created FLINK-13371:
---------------------------------------

             Summary: Release partitions in JM if producer gets restarted
                 Key: FLINK-13371
                 URL: https://issues.apache.org/jira/browse/FLINK-13371
             Project: Flink
          Issue Type: Bug
          Components: Runtime / Coordination, Runtime / Network
    Affects Versions: 1.9.0
            Reporter: Andrey Zagrebin


As discussed in FLINK-13245, the producer may not detect any consumption attempt at all if the consumer fails before the connection is established. This means we cannot fully rely on the shuffle service to release partitions on consumption when a consumer fails. When the producer restarts, it leaks the partitions from its previous attempt. Previously, an explicit release call in Execution.cancel/suspend covered this case. The JM therefore has to explicitly release all partitions produced by the previous task execution attempt when the producer restarts, including `released on consumption` partitions. For this change, we might need to track all partitions in PartitionTrackerImpl.
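
To make the proposal concrete, here is a minimal sketch of the bookkeeping it implies: the JM tracks every partition per producer execution attempt, whether or not it is `released on consumption`, and releases all of them when that attempt is replaced. This is an illustration only, not Flink's PartitionTrackerImpl. The class SketchPartitionTracker, its methods, and the string ids are hypothetical names invented for this example, and the actual release call to the shuffle service is reduced to returning the ids that would have to be released.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, NOT Flink's PartitionTrackerImpl: all names here are assumptions.
final class SketchPartitionTracker {

    // Hypothetical stand-in for a result partition produced by one execution attempt.
    static final class TrackedPartition {
        final String partitionId;
        final boolean releasedOnConsumption;

        TrackedPartition(String partitionId, boolean releasedOnConsumption) {
            this.partitionId = partitionId;
            this.releasedOnConsumption = releasedOnConsumption;
        }
    }

    // Producer execution attempt id -> partitions produced by that attempt.
    private final Map<String, List<TrackedPartition>> partitionsByAttempt = new HashMap<>();

    // Track every partition, including `released on consumption` ones, because the
    // producer may never observe a consumption attempt if the consumer fails early.
    void startTracking(String producerAttemptId, TrackedPartition partition) {
        partitionsByAttempt
                .computeIfAbsent(producerAttemptId, id -> new ArrayList<>())
                .add(partition);
    }

    // To be called when the producer restarts: forgets the previous attempt and returns
    // the ids of all its partitions, for which an explicit release would be issued.
    List<String> releaseAllForAttempt(String previousAttemptId) {
        List<TrackedPartition> partitions = partitionsByAttempt.remove(previousAttemptId);
        List<String> toRelease = new ArrayList<>();
        if (partitions != null) {
            for (TrackedPartition partition : partitions) {
                // No filtering on releasedOnConsumption: both kinds are released.
                toRelease.add(partition.partitionId);
            }
        }
        return toRelease;
    }

    boolean isTracking(String producerAttemptId) {
        return partitionsByAttempt.containsKey(producerAttemptId);
    }
}
```

In this sketch, releasing an attempt a second time returns an empty list, so a repeated cancel/suspend for the same attempt would not trigger duplicate release calls.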



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)