Till Rohrmann created FLINK-22636:
-------------------------------------
Summary: Group job specific ZooKeeper HA services under common jobs/<JobID> zNode
Key: FLINK-22636
URL: https://issues.apache.org/jira/browse/FLINK-22636
Project: Flink
Issue Type: Improvement
Components: Runtime / Coordination
Affects Versions: 1.12.3, 1.13.0, 1.14.0
Reporter: Till Rohrmann
Fix For: 1.14.0
To make cleaning up the ZooKeeper HA services easier, I suggest grouping job-specific services under a common {{jobs/<JobID>}} zNode. That way, cleaning up the job-specific ZooKeeper data becomes trivial: simply delete the {{jobs/<JobID>}} node.
Currently, our ZooKeeper layout is not well structured: the data belonging to a single job is spread across several top-level zNodes. The current layout looks like this:
{code}
clusterID -> jobgraphs -> <job-id>
          -> checkpoints -> <job-id> -> checkpoint-1
          -> checkpoint-counter -> <job-id> -> counter
          -> leaderlatch -> dispatcher_lock
                         -> resource_manager_lock
                         -> <job-id>
          -> leader -> dispatcher_lock
                    -> resource_manager_lock
                    -> <job-id>
{code}
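To illustrate why cleanup is scattered today, here is a minimal Java sketch that lists the job-specific zNodes in the layout above. The class name {{CurrentLayoutPaths}}, the method {{jobPaths}} and the assumption that {{clusterID}} is the root of the path are hypothetical and only serve the illustration; this is not Flink's actual code.

```java
import java.util.List;

// Hypothetical helper, not part of Flink: enumerates the zNodes that
// belong to one job in the current layout described in this issue.
class CurrentLayoutPaths {

    // Assumption: the cluster id is the first path segment.
    static List<String> jobPaths(String clusterId, String jobId) {
        String root = "/" + clusterId;
        return List.of(
                root + "/jobgraphs/" + jobId,
                root + "/checkpoints/" + jobId,
                root + "/checkpoint-counter/" + jobId,
                root + "/leaderlatch/" + jobId,
                root + "/leader/" + jobId);
    }

    public static void main(String[] args) {
        // Every one of these paths has to be deleted separately.
        jobPaths("clusterID", "job-1").forEach(System.out::println);
    }
}
```

Each of the five paths lives under a different parent, so removing a job means touching five separate subtrees.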
The new layout could look like this:
{code}
clusterID -> jobgraphs -> <job-id>
          -> jobs -> <job-id> -> checkpoints -> checkpoint-1
                              -> checkpoint_id_counter -> counter
                              -> leader -> latch
                                        -> connection_info
          -> leader -> dispatcher -> latch
                                  -> connection_info
                    -> resource_manager -> latch
                                        -> connection_info
{code}
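As a rough sketch of what the proposed layout buys us, the following Java snippet builds the job-specific paths under {{jobs/<JobID>}}. The class {{JobZNodePaths}} and its methods are hypothetical names chosen for this example, and treating {{clusterID}} as the path root is an assumption; the real implementation may look different.

```java
import java.util.List;

// Hypothetical helper, not part of Flink: builds the per-job zNode paths
// of the proposed layout, where everything hangs off jobs/<job-id>.
class JobZNodePaths {

    // Assumption: the cluster id is the first path segment.
    static String jobRoot(String clusterId, String jobId) {
        return "/" + clusterId + "/jobs/" + jobId;
    }

    static List<String> jobPaths(String clusterId, String jobId) {
        String root = jobRoot(clusterId, jobId);
        return List.of(
                root + "/checkpoints",
                root + "/checkpoint_id_counter",
                root + "/leader/latch",
                root + "/leader/connection_info");
    }

    public static void main(String[] args) {
        String root = jobRoot("clusterID", "job-1");
        // All per-job paths share the same prefix, so one recursive
        // delete of the root removes them all.
        for (String path : jobPaths("clusterID", "job-1")) {
            System.out.println(path + " under " + root + ": " + path.startsWith(root + "/"));
        }
    }
}
```

With a Curator client, a single recursive delete of the path returned by {{jobRoot}} (for example via {{delete().deletingChildrenIfNeeded().forPath(...)}}) would then be enough for the HA services of a job. Note that in the layout sketched above the job graph still lives under {{jobgraphs/<job-id>}} and would need its own delete.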