[jira] [Created] (FLINK-19068) Filter verbose pod events for KubernetesResourceManagerDriver

Shang Yuanchun (Jira)
Xintong Song created FLINK-19068:
------------------------------------

             Summary: Filter verbose pod events for KubernetesResourceManagerDriver
                 Key: FLINK-19068
                 URL: https://issues.apache.org/jira/browse/FLINK-19068
             Project: Flink
          Issue Type: Improvement
          Components: Deployment / Kubernetes
            Reporter: Xintong Song


The status of a Kubernetes pod consists of many detailed fields. Currently, Flink receives pod {{MODIFIED}} events from the {{KubernetesPodsWatcher}} on every single change to these fields, many of which Flink does not care about.

The verbose events do not affect the functionality of Flink, but they pollute the logs with repeated messages: Flink only looks at the fields it is interested in, and those fields are identical across the redundant events.

E.g., when a task manager is stopped due to idle timeout, Flink receives 3 events:
* MODIFIED: container terminated
* MODIFIED: {{deletionGracePeriodSeconds}} changes from 30 to 0, which is a Kubernetes internal status change after containers are gracefully terminated
* DELETED: Flink removes metadata of the terminated pod

Of these 3 events, Flink is only interested in the first MODIFIED event, but it will try to process all of them because the container status is terminated in each one.

I propose to filter the verbose events in {{KubernetesResourceManagerDriver.PodCallbackHandlerImpl}}, so that only the status changes Flink is interested in are processed. This probably requires recording the status of all living pods and comparing it with incoming events to detect status changes.
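
A minimal sketch of the filtering idea, under stated assumptions. It is not the actual {{PodCallbackHandlerImpl}} code: the class {{PodEventFilter}}, the method {{shouldProcess}}, and the reduction of a pod status to a three-value {{PodState}} are all hypothetical names introduced for illustration. The sketch records the last relevant state of each living pod and reports an event as worth processing only when that state changes.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of event filtering; not Flink's actual implementation. */
final class PodEventFilter {

    /** Watch event types, as named in the issue text. */
    enum Action { ADDED, MODIFIED, DELETED }

    /** Assumed reduction of a pod status to the part Flink is interested in. */
    enum PodState { PENDING, RUNNING, TERMINATED }

    // Last relevant state recorded for each living pod, keyed by pod name.
    private final Map<String, PodState> lastSeen = new HashMap<>();

    /**
     * Returns true if the event carries a state change that has not been
     * processed yet, and false if it only repeats what is already known.
     */
    boolean shouldProcess(Action action, String podName, PodState state) {
        if (action == Action.DELETED) {
            // The pod is gone, so stop tracking it.
            PodState previous = lastSeen.remove(podName);
            // Skip untracked pods and pods whose termination was already handled.
            return previous != null && previous != PodState.TERMINATED;
        }
        // Record the new state and compare it with the previous one.
        PodState previous = lastSeen.put(podName, state);
        return previous != state;
    }
}
```

Applied to the idle-timeout example above, and assuming the pod was previously recorded as RUNNING, the first MODIFIED event (container terminated) would be processed, while the second MODIFIED event and the DELETED event would be dropped because the recorded state is already TERMINATED. A DELETED event for a pod that never reported termination would still be processed.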



--
This message was sent by Atlassian Jira
(v8.3.4#803005)