[jira] [Created] (FLINK-21942) KubernetesLeaderRetrievalDriver not closed after terminated which lead to connection leak

Shang Yuanchun (Jira)
Yi Tang created FLINK-21942:
-------------------------------

             Summary: KubernetesLeaderRetrievalDriver not closed after terminated which lead to connection leak
                 Key: FLINK-21942
                 URL: https://issues.apache.org/jira/browse/FLINK-21942
             Project: Flink
          Issue Type: Bug
            Reporter: Yi Tang


It looks like the KubernetesLeaderRetrievalDriver is not closed even after the KubernetesLeaderElectionDriver is closed and the job reaches a globally terminal state.
As a result, many ConfigMap watches stay active and keep their connections to K8s open.

When the number of connections exceeds the maximum number of concurrent requests, new ConfigMap watches cannot be started. Eventually, all newly submitted jobs time out.
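
To illustrate the failure mode described above, here is a minimal sketch in plain Java. It does not use Flink or the Kubernetes client: the class LeakDemo, the method runJobs, and the numbers (10 jobs, a limit of 4 concurrent connections, two watches per job) are all hypothetical and chosen only for the example. A Semaphore stands in for the client's limit on concurrent requests.

```java
import java.util.concurrent.Semaphore;

// Hypothetical model, not Flink code: each permit is one open watch connection.
final class LeakDemo {

    // Runs `jobs` jobs one after another and returns how many could start.
    // Each job is assumed to need one election watch and one retrieval watch.
    static int runJobs(int jobs, int maxConcurrent, boolean closeRetrievalWatch) {
        Semaphore connections = new Semaphore(maxConcurrent);
        int started = 0;
        for (int i = 0; i < jobs; i++) {
            if (!connections.tryAcquire()) {
                break; // no connection left for the election watch
            }
            if (!connections.tryAcquire()) {
                connections.release(); // give back the election watch
                break; // no connection left for the retrieval watch
            }
            started++;

            // The job reaches a globally terminal state.
            connections.release(); // the election side is closed
            if (closeRetrievalWatch) {
                connections.release(); // the retrieval side is closed as well
            }
            // Otherwise the retrieval watch keeps its connection forever.
        }
        return started;
    }

    public static void main(String[] args) {
        System.out.println("leaky: " + runJobs(10, 4, false));
        System.out.println("fixed: " + runJobs(10, 4, true));
    }
}
```

With these assumed numbers, the leaky variant can start only 3 of the 10 jobs before the connection limit is exhausted, while the variant that also closes the retrieval watch starts all 10.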

[~fly_in_gis] [~trohrmann] This may be related to FLINK-20695; could you confirm this issue?
But when many jobs are running in the same session cluster, the ConfigMap watches need to stay active. Maybe we should merge all ConfigMap watches into one?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)