[jira] [Created] (FLINK-17598) Implement FileSystemHAServices for native K8s setups


Shang Yuanchun (Jira)
Canbin Zheng created FLINK-17598:
------------------------------------

             Summary: Implement FileSystemHAServices for native K8s setups
                 Key: FLINK-17598
                 URL: https://issues.apache.org/jira/browse/FLINK-17598
             Project: Flink
          Issue Type: New Feature
          Components: Deployment / Kubernetes, Runtime / Coordination
            Reporter: Canbin Zheng


At the moment we use ZooKeeper as the distributed coordinator for the JobManager high availability services. In cloud-native environments, however, more and more users prefer *Kubernetes* as the underlying scheduler backend and *object storage* as the storage medium, and neither of these services requires a ZooKeeper deployment.

As a result, in K8s setups people have to deploy and maintain an additional ZooKeeper cluster solely to remove the JobManager single point of failure (SPOF). This ticket proposes a simplified FileSystem HA implementation with leader election removed, which saves the effort of deploying and maintaining ZooKeeper.

To achieve this, we plan to:
# Introduce {{FileSystemHaServices}}, which implements {{HighAvailabilityServices}}.
# Replace the Deployment with a StatefulSet to ensure *at most one* semantics, so that at most one JobManager instance runs at a time and concurrent access to the underlying FileSystem is avoided.
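
The two steps above could be sketched roughly as follows. This is only an illustration of the idea, not the proposed implementation: {{FileSystemHaSketch}}, {{LeaderContender}}, {{NoElectionLeaderService}}, {{FileStateStore}} and {{DEFAULT_LEADER_ID}} are hypothetical names and do not mirror Flink's actual interfaces. The sketch assumes the StatefulSet already guarantees a single JobManager, so leadership is granted immediately and recovery metadata is written as plain files to a shared directory.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;
import java.util.UUID;

/** Hypothetical sketch; none of these types are Flink's real API. */
public class FileSystemHaSketch {

    /** Hypothetical callback implemented by a component that wants leadership. */
    interface LeaderContender {
        void grantLeadership(UUID leaderSessionId);
    }

    /** "Election" without electing: the only contender always becomes leader. */
    static final class NoElectionLeaderService {
        // A fixed session ID suffices because no competing leader can exist.
        static final UUID DEFAULT_LEADER_ID = new UUID(0L, 0L);

        void start(LeaderContender contender) {
            contender.grantLeadership(DEFAULT_LEADER_ID);
        }
    }

    /** Stores recovery metadata as one file per key under a shared directory. */
    static final class FileStateStore {
        private final Path root;

        FileStateStore(Path root) throws IOException {
            this.root = Files.createDirectories(root);
        }

        // Assumes a single writer, which is what the StatefulSet is meant to ensure.
        void put(String key, String value) throws IOException {
            Files.write(root.resolve(key), value.getBytes(StandardCharsets.UTF_8));
        }

        Optional<String> get(String key) throws IOException {
            Path file = root.resolve(key);
            if (!Files.exists(file)) {
                return Optional.empty();
            }
            return Optional.of(new String(Files.readAllBytes(file), StandardCharsets.UTF_8));
        }
    }

    public static void main(String[] args) throws IOException {
        Path haDir = Files.createTempDirectory("flink-ha-sketch");

        // Leadership is granted synchronously, with no coordination service.
        new NoElectionLeaderService().start(
                id -> System.out.println("Granted leadership, session " + id));

        new FileStateStore(haDir).put("job-graph-0001", "serialized-job-graph");

        // Simulate a JobManager restart: a fresh store reads the same directory.
        Optional<String> recovered = new FileStateStore(haDir).get("job-graph-0001");
        System.out.println("Recovered: " + recovered.orElse("<nothing>"));
    }
}
```

In a real deployment the local temporary directory would be replaced by the shared file system or object store, and the single-writer assumption would hold only as long as the StatefulSet never runs two JobManager pods at once.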




--
This message was sent by Atlassian Jira
(v8.3.4#803005)