Mike Kaplinskiy created FLINK-19206:
---------------------------------------
Summary: Add an ability to set ownerReference manually in Kubernetes
Key: FLINK-19206
URL: https://issues.apache.org/jira/browse/FLINK-19206
Project: Flink
Issue Type: Improvement
Components: Deployment / Kubernetes
Reporter: Mike Kaplinskiy
The current Kubernetes deployment creates a Service that is set as the ownerReference of all the sub-objects (the JM and TM deployments and the REST service). However, something has to start the cluster in the first place. For a job cluster, that can be a Kubernetes Job, a CronJob, or a tool like Airflow. Unfortunately, any failure in the Flink job can cause these higher-level primitives to retry, which can leave behind many "stale clusters" that are never garbage-collected.
The proposal here is to add a configuration option to set the ownerReference of the Flink Service. This way the Service (and by extension, all the cluster components) gets deleted together with the "parent", including when the parent is itself a Kubernetes pod. For reference, Spark does something similar via {{spark.kubernetes.driver.pod.name}} (documented at [https://spark.apache.org/docs/latest/running-on-kubernetes.html#client-mode-executor-pod-garbage-collection]).
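To illustrate the intended effect, here is a minimal sketch of what the Flink Service's metadata might look like once a parent is set as its owner. It only builds plain dictionaries and does not talk to a cluster. The helper names, the Service name, the launcher pod name, and the UID are all hypothetical placeholders; only the ownerReference field names (apiVersion, kind, name, uid, controller, blockOwnerDeletion) come from the Kubernetes API. How Flink would expose this as a configuration option is left open here.

```python
import copy
import json


def owner_reference(api_version, kind, name, uid,
                    controller=False, block_owner_deletion=True):
    """Build one entry for metadata.ownerReferences (hypothetical helper)."""
    return {
        "apiVersion": api_version,
        "kind": kind,
        "name": name,
        "uid": uid,
        "controller": controller,
        "blockOwnerDeletion": block_owner_deletion,
    }


def with_owner(manifest, ref):
    """Return a copy of the manifest with ref appended to its ownerReferences."""
    result = copy.deepcopy(manifest)
    refs = result.setdefault("metadata", {}).setdefault("ownerReferences", [])
    refs.append(ref)
    return result


# Hypothetical parent: the pod (e.g. created by a Job or CronJob) that launched
# the Flink cluster. In a real cluster the uid must be the live object's UID.
parent_ref = owner_reference(
    api_version="v1",
    kind="Pod",
    name="flink-launcher-abc12",
    uid="00000000-0000-0000-0000-000000000000",
)

# Hypothetical stand-in for the Service that Flink creates as the cluster root.
flink_service = {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {"name": "my-flink-cluster"},
}

owned_service = with_owner(flink_service, parent_ref)
print(json.dumps(owned_service["metadata"], indent=2))
```

With a reference like this in place, deleting the parent pod would let the Kubernetes garbage collector remove the Service, and through the existing ownerReferences, the JM and TM deployments as well. Note that for namespaced objects the owner has to live in the same namespace as the dependent.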
--
This message was sent by Atlassian Jira
(v8.3.4#803005)