Yi Tang created FLINK-21902:
-------------------------------
Summary: A deadlock while using K8s HA service
Key: FLINK-21902
URL: https://issues.apache.org/jira/browse/FLINK-21902
Project: Flink
Issue Type: Bug
Reporter: Yi Tang
The `KubernetesStateHandleStore` runs `checkAndUpdateConfigMap` on the same thread pool executor that the Dispatcher uses, which can lead to a deadlock.
Example:
{code:java}
private CompletableFuture<Void> removeJob(JobID jobId, CleanupJobState cleanupJobState) {
    final DispatcherJob job = checkNotNull(runningJobs.remove(jobId));
    final CompletableFuture<Void> jobTerminationFuture = job.closeAsync();
    return jobTerminationFuture.thenRunAsync(
            () -> cleanUpJobData(jobId, cleanupJobState.cleanupHAData), ioExecutor);
}
{code}
The cleanup eventually calls
{code:java}
public CompletableFuture<Boolean> checkAndUpdateConfigMap(
        String configMapName,
        Function<KubernetesConfigMap, Optional<KubernetesConfigMap>> function) {
    ...
    CompletableFuture.supplyAsync(..., kubeClientExecutorService)
    ...
}
{code}
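The `ioExecutor` and the `kubeClientExecutorService` are the same executor, so the cleanup task occupies a thread of the very pool that the `supplyAsync` call needs in order to run.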
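The pattern can be reproduced outside Flink with a minimal sketch. This is not Flink code: the class `SameExecutorDeadlockSketch` and the method `nestedCallCompletes` are hypothetical, the pool size of 1 is chosen only to make the effect deterministic, and it assumes the outer task blocks while waiting for the nested result. A timeout stands in for the real hang so that the program terminates.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class SameExecutorDeadlockSketch {

    /**
     * Runs an outer task that blocks on a nested task.
     * Returns true if the nested task finished in time, false if the outer
     * task timed out waiting for it (the stand-in for the deadlock).
     */
    static boolean nestedCallCompletes(boolean shareExecutor) throws Exception {
        // Assumption: a single-threaded pool, so one blocked task exhausts it.
        ExecutorService ioExecutor = Executors.newFixedThreadPool(1);
        ExecutorService kubeClientExecutorService =
                shareExecutor ? ioExecutor : Executors.newFixedThreadPool(1);
        try {
            CompletableFuture<Boolean> outer = CompletableFuture.supplyAsync(() -> {
                // Nested work, submitted the way checkAndUpdateConfigMap does.
                CompletableFuture<Boolean> inner =
                        CompletableFuture.supplyAsync(() -> true, kubeClientExecutorService);
                try {
                    // Blocking wait; without the timeout this would hang forever
                    // when both executors are the same single-threaded pool.
                    return inner.get(500, TimeUnit.MILLISECONDS);
                } catch (TimeoutException e) {
                    return false;
                } catch (Exception e) {
                    throw new RuntimeException(e);
                }
            }, ioExecutor);
            return outer.get(5, TimeUnit.SECONDS);
        } finally {
            ioExecutor.shutdownNow();
            kubeClientExecutorService.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("shared executor:   " + nestedCallCompletes(true));
        System.out.println("separate executor: " + nestedCallCompletes(false));
    }
}
```

With a shared executor the nested task should never start while the outer task is waiting, so the call reports `false`; with a separate executor it should report `true`. In a pool with more threads the same starvation would presumably need every thread to be blocked in this way at once.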