Marcos Klein created FLINK-14300:
------------------------------------
Summary: org.apache.flink.streaming.runtime.tasks.StreamTask#invoke leaks threads if org.apache.flink.streaming.runtime.tasks.OperatorChain fails to be constructed
Key: FLINK-14300
URL: https://issues.apache.org/jira/browse/FLINK-14300
Project: Flink
Issue Type: Bug
Components: Runtime / Task
Affects Versions: 1.9.0, 1.8.2, 1.8.1
Reporter: Marcos Klein
Attachments: thread-leak-patch.diff
In the *StreamTask#invoke* method, if an exception occurs while constructing the [operatorChain|https://github.com/apache/flink/blob/release-1.9.0/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/StreamTask.java#L370], the [exception handling|https://github.com/apache/flink/blob/release-1.9.0/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/StreamTask.java#L485-L491] fails to clean up the threads allocated for *StreamTask#recordWriters*. Because Flink keeps restarting the task and failing for the same cause, more threads leak with every restart attempt.
One example cause is a deserialization failure in a custom operator when restoring from a checkpoint.
Attached is a suggested fix for the master branch.
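The attached patch is not reproduced here. As a rough illustration of the failure mode only, the sketch below uses hypothetical stand-ins (ThreadLeakSketch, FlushingWriter, buildChain, runAttempt) rather than Flink's actual classes: writers that own a background thread are created first, the chain construction then throws, and the writers' threads stay alive unless the cleanup path closes them even when no chain exists.

```java
import java.util.ArrayList;
import java.util.List;

public class ThreadLeakSketch {

    /** Hypothetical stand-in for a record writer that owns a background flusher thread. */
    public static final class FlushingWriter implements AutoCloseable {
        private final Thread flusher;
        private volatile boolean running = true;

        public FlushingWriter(String name) {
            flusher = new Thread(() -> {
                while (running) {
                    try {
                        Thread.sleep(10);
                    } catch (InterruptedException e) {
                        return; // interrupted by close()
                    }
                }
            }, name);
            flusher.setDaemon(true); // keeps the demo JVM from hanging on leaked threads
            flusher.start();
        }

        public boolean isAlive() {
            return flusher.isAlive();
        }

        @Override
        public void close() {
            running = false;
            flusher.interrupt();
            try {
                flusher.join(1000);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    /** Hypothetical stand-in for chain construction that fails, e.g. on deserialization. */
    public static Object buildChain() {
        throw new IllegalStateException("simulated operator deserialization failure");
    }

    /** Runs one task attempt and returns its writers so their threads can be inspected. */
    public static List<FlushingWriter> runAttempt(boolean closeWritersOnFailure) {
        List<FlushingWriter> writers = new ArrayList<>();
        writers.add(new FlushingWriter("flusher-0"));
        writers.add(new FlushingWriter("flusher-1"));
        Object chain = null;
        try {
            chain = buildChain();
        } catch (IllegalStateException e) {
            // Swallowed for the demo; a real task would propagate the failure.
        } finally {
            // Assumed shape of a fix: release the writers even though no chain was built.
            if (chain == null && closeWritersOnFailure) {
                for (FlushingWriter writer : writers) {
                    writer.close();
                }
            }
        }
        return writers;
    }

    /** Counts writers whose flusher thread is still running. */
    public static long aliveCount(List<FlushingWriter> writers) {
        return writers.stream().filter(FlushingWriter::isAlive).count();
    }

    public static void main(String[] args) {
        List<FlushingWriter> leaked = runAttempt(false);
        List<FlushingWriter> cleaned = runAttempt(true);
        System.out.println("alive without cleanup: " + aliveCount(leaked));
        System.out.println("alive with cleanup: " + aliveCount(cleaned));
        leaked.forEach(FlushingWriter::close); // tidy up the demo's own leak
    }
}
```

Each call to runAttempt(false) leaves two more flusher threads running, which mirrors how repeated restarts would accumulate threads; runAttempt(true) leaves none.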