Weike Dong created FLINK-19677:
----------------------------------
Summary: TaskManager takes abnormally long time to register with JobManager on Kubernetes
Key: FLINK-19677
URL: https://issues.apache.org/jira/browse/FLINK-19677
Project: Flink
Issue Type: Bug
Components: Runtime / Task
Affects Versions: 1.11.2, 1.11.1, 1.11.0
Reporter: Weike Dong
During TaskManager registration, the JobManager creates a _TaskManagerLocation_ instance, which tries to obtain the TaskManager's hostname via a reverse DNS lookup.
However, this lookup always fails in a Kubernetes environment: the IPs of pods that are not exposed by a Service cannot be resolved to domain names by CoreDNS, so _InetAddress#getCanonicalHostName()_ takes ~5 seconds to return, blocking the whole registration process.
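To check whether a given environment is affected, the cost of the reverse lookup can be measured in isolation. The following is a minimal sketch, not Flink code: the class and method names are made up for illustration, and 127.0.0.1 is only a placeholder that should be replaced with the IP of an unexposed pod when run inside the cluster.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper for illustration; not part of Flink.
class ReverseLookupTiming {

    /** Returns how long a reverse DNS lookup for the raw IP address takes, in milliseconds. */
    static long timeReverseLookupMillis(byte[] rawAddress) {
        InetAddress address;
        try {
            // getByAddress only checks the length of the byte array; it performs no DNS query.
            address = InetAddress.getByAddress(rawAddress);
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("Not a valid IPv4/IPv6 address", e);
        }
        long start = System.nanoTime();
        // This is the call that triggers the reverse lookup. If no name can be
        // found, it falls back to the textual IP address instead of failing.
        String name = address.getCanonicalHostName();
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
        System.out.println(name + " resolved in " + elapsedMillis + " ms");
        return elapsedMillis;
    }

    public static void main(String[] args) {
        // Placeholder address; substitute a pod IP to reproduce the delay.
        timeReverseLookupMillis(new byte[] {127, 0, 0, 1});
    }
}
```

If the reported elapsed time is in the range of seconds and the printed name is just the IP address, the lookup is timing out rather than resolving.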
Therefore, Flink should provide a configuration parameter to turn off the reverse DNS lookup. Also, even when the hostname is actually needed, the lookup could be done lazily to avoid blocking the registration of other TaskManagers.
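Both proposals could be combined roughly as sketched below. This is only an illustration under assumptions, not the actual Flink implementation: _LazyHostName_, the _reverseLookupEnabled_ flag, and the injected resolver function are hypothetical names, and no real Flink configuration key is implied.

```java
import java.net.InetAddress;
import java.util.function.Function;

// Hypothetical sketch; the class name, flag, and resolver are assumptions.
class LazyHostName {

    private final InetAddress address;
    private final boolean reverseLookupEnabled;
    private final Function<InetAddress, String> resolver;
    // Cached result; stays null until the hostname is first requested.
    private volatile String cached;

    LazyHostName(
            InetAddress address,
            boolean reverseLookupEnabled,
            Function<InetAddress, String> resolver) {
        // The constructor does no DNS work, so creating the object cannot block registration.
        this.address = address;
        this.reverseLookupEnabled = reverseLookupEnabled;
        this.resolver = resolver;
    }

    /** Resolves the hostname on first use; later calls return the cached value. */
    String get() {
        String result = cached;
        if (result == null) {
            // With the lookup disabled, fall back to the textual IP address,
            // which getHostAddress() returns without any DNS query.
            result = reverseLookupEnabled
                    ? resolver.apply(address)
                    : address.getHostAddress();
            // Concurrent first calls may each resolve once; both store the same kind of value.
            cached = result;
        }
        return result;
    }

    public static void main(String[] args) {
        InetAddress addr = InetAddress.getLoopbackAddress();
        LazyHostName lazy =
                new LazyHostName(addr, true, InetAddress::getCanonicalHostName);
        // The reverse lookup only happens here, not at construction time.
        System.out.println(lazy.get());
    }
}
```

Passing the resolver in as a function keeps the sketch testable without a DNS server; in practice it would be _InetAddress#getCanonicalHostName()_, as in the usage shown in main.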
--
This message was sent by Atlassian Jira
(v8.3.4#803005)