[jira] [Created] (FLINK-19206) Add an ability to set ownerReference manually in Kubernetes


Shang Yuanchun (Jira)
Mike Kaplinskiy created FLINK-19206:
---------------------------------------

             Summary: Add an ability to set ownerReference manually in Kubernetes
                 Key: FLINK-19206
                 URL: https://issues.apache.org/jira/browse/FLINK-19206
             Project: Flink
          Issue Type: Improvement
          Components: Deployment / Kubernetes
            Reporter: Mike Kaplinskiy


The current Kubernetes deployment creates a service that is set as the ownerReference of all the sub-objects (the JM and TM deployments and the REST service). However, something has to start the cluster in the first place. For a job cluster, that can be a Kubernetes Job, a CronJob, or a tool like Airflow. Unfortunately, any failure in the Flink job can cause these higher-level primitives to retry, which can leave behind many "stale clusters" that are never garbage-collected.

The proposal here is to add a configuration option to set the ownerReference of the Flink Service. This way the service (and, by extension, all the cluster components) gets deleted when the "parent" is deleted - including when the parent is itself a Kubernetes pod. For reference, Spark does something similar via {{spark.kubernetes.driver.pod.name}} (documented at [https://spark.apache.org/docs/latest/running-on-kubernetes.html#client-mode-executor-pod-garbage-collection]).
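
To make the proposal concrete, here is a rough sketch of what the resulting Service metadata might look like once an owner is attached. It only builds a manifest as a plain Python dict and is not Flink code. The helper name {{with_owner_reference}}, the pod name {{flink-launcher-abc12}}, the service name {{my-flink-cluster}} and the UID are all made up for illustration. The four keys of the owner reference (apiVersion, kind, name, uid) are the required fields of a Kubernetes ownerReference. Note that a namespaced owner must live in the same namespace as the object it owns.

```python
from typing import Any, Dict


def with_owner_reference(manifest: Dict[str, Any], owner: Dict[str, str]) -> Dict[str, Any]:
    """Return a copy of `manifest` whose metadata lists `owner` as an owner.

    Hypothetical helper for illustration only; not part of Flink or any
    Kubernetes client library.
    """
    # The required fields of a Kubernetes ownerReference.
    owner_ref = {
        "apiVersion": owner["apiVersion"],
        "kind": owner["kind"],
        "name": owner["name"],
        "uid": owner["uid"],
    }
    # Copy the metadata so the caller's manifest is left untouched.
    metadata = dict(manifest.get("metadata", {}))
    metadata["ownerReferences"] = list(metadata.get("ownerReferences", [])) + [owner_ref]
    return {**manifest, "metadata": metadata}


# Assumed parent: the pod (e.g. created by a Job or CronJob) that launched the cluster.
parent_pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "name": "flink-launcher-abc12",                 # made-up name
    "uid": "d9607e19-f88f-11e6-a518-42010a800195",  # made-up UID
}

# Assumed shape of the service Flink creates; only the metadata matters here.
flink_service = {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {"name": "my-flink-cluster"},
}

owned_service = with_owner_reference(flink_service, parent_pod)
```

With a reference like this in place, deleting the launcher pod would let the Kubernetes garbage collector remove the service, and with it the objects that already name the service as their owner.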



--
This message was sent by Atlassian Jira
(v8.3.4#803005)