[jira] [Created] (FLINK-18733) Jobmanager cannot start in HA mode with Zookeeper


Shang Yuanchun (Jira)
Leonid Ilyevsky created FLINK-18733:
---------------------------------------

             Summary: Jobmanager cannot start in HA mode with Zookeeper
                 Key: FLINK-18733
                 URL: https://issues.apache.org/jira/browse/FLINK-18733
             Project: Flink
          Issue Type: Bug
    Affects Versions: 1.11.1
            Reporter: Leonid Ilyevsky


When configured in HA mode, the Jobmanager cannot start at all. First, it issues warnings like this:

{{2020-07-27 08:58:23,197 WARN org.apache.flink.shaded.zookeeper3.org.apache.zookeeper.ClientCnxn [] - Session 0x0 for server nj1dvloglab01.liquidnet.biz/<unresolved>:2181, unexpected error, closing socket connection and attempting reconnect}}
{{java.lang.IllegalArgumentException: Unable to canonicalize address nj1dvloglab01.liquidnet.biz/<unresolved>:2181 because it's not resolvable}}
{{ at org.apache.flink.shaded.zookeeper3.org.apache.zookeeper.SaslServerPrincipal.getServerPrincipal(SaslServerPrincipal.java:65) ~[flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]}}
{{ at org.apache.flink.shaded.zookeeper3.org.apache.zookeeper.SaslServerPrincipal.getServerPrincipal(SaslServerPrincipal.java:41) ~[flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]}}
{{ at org.apache.flink.shaded.zookeeper3.org.apache.zookeeper.ClientCnxn$SendThread.startConnect(ClientCnxn.java:1001) ~[flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]}}
{{ at org.apache.flink.shaded.zookeeper3.org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1060) [flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]}}
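The warning says the address is "not resolvable", so one quick sanity check is whether the JVM running the Jobmanager can resolve the ZooKeeper host name at all. Below is a minimal sketch using only the standard `java.net.InetSocketAddress` API. The class name `ResolveCheck` and the default host and port are illustrative assumptions, and the sketch does not reproduce the code path inside the shaded ZooKeeper client.

```java
import java.net.InetSocketAddress;

public class ResolveCheck {
    // Returns true if the JVM could resolve the host name to an IP address.
    static boolean isResolvable(String host, int port) {
        // This constructor attempts name resolution; if it fails,
        // the resulting address is flagged as unresolved.
        return !new InetSocketAddress(host, port).isUnresolved();
    }

    public static void main(String[] args) {
        // Hypothetical defaults; pass the real ZooKeeper host and port as arguments.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 2181;
        System.out.println(host + ":" + port + " resolvable = " + isResolvable(host, port));
    }
}
```

If this reports the host as resolvable when run with the same Java installation as the Jobmanager, a plain DNS problem on that machine becomes less likely, although it does not rule out other causes.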

After a few attempts to connect to ZooKeeper, it finally fails:

{{2020-07-27 08:59:35,055 ERROR org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - Fatal error occurred in the cluster entrypoint.}}
{{org.apache.flink.util.FlinkException: Unhandled error in ZooKeeperLeaderElectionService: Ensure path threw exception}}
{{ at org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService.unhandledError(ZooKeeperLeaderElectionService.java:430) ~[flink-dist_2.12-1.11.1.jar:1.11.1]}}

The same HA configuration works fine for me in Flink 1.10.0.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)