Igal Shilman created FLINK-18790:
------------------------------------
Summary: Set a connection timeout that is lower than the request timeout for remote functions
Key: FLINK-18790
URL: https://issues.apache.org/jira/browse/FLINK-18790
Project: Flink
Issue Type: Improvement
Components: Stateful Functions
Reporter: Igal Shilman
Fix For: statefun-2.2.0
Currently, for remote functions, the connection timeout is identical to the overall request timeout. This becomes a problem when a remote function sits behind a NAT, a load balancer, or in general anything that keeps the port open even though the remote function is no longer present or has been relocated. In that case, the entire request budget is spent waiting for a connection.
This is particularly the case in Kubernetes, when the pods behind a service are all ungracefully killed at once.
To fix that issue, I propose:
1) By default, use 10% of the total request timeout as the connection timeout.
2) Expose an explicit configuration parameter for the connection timeout.
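A rough sketch of how the two points could fit together, for illustration only. The class and method names below are hypothetical and not part of the Stateful Functions code base, and capping an explicitly configured value at the request timeout is an assumption added here, not something the proposal specifies.

```java
import java.time.Duration;
import java.util.Optional;

/** Hypothetical helper for illustration; not an existing Stateful Functions class. */
final class RemoteFunctionTimeouts {

    private RemoteFunctionTimeouts() {}

    /** Point 1: the default connection timeout is 10% of the total request timeout. */
    static Duration defaultConnectTimeout(Duration requestTimeout) {
        return requestTimeout.dividedBy(10);
    }

    /**
     * Point 2: an explicitly configured connection timeout takes precedence
     * over the 10% default.
     *
     * Assumption (not in the proposal): a configured value larger than the
     * request timeout is capped at the request timeout.
     */
    static Duration resolveConnectTimeout(Duration requestTimeout, Optional<Duration> configured) {
        Duration connect = configured.orElseGet(() -> defaultConnectTimeout(requestTimeout));
        return connect.compareTo(requestTimeout) > 0 ? requestTimeout : connect;
    }
}
```

With a request timeout of 60 seconds and no explicit setting, this yields a connection timeout of 6 seconds, so an unreachable function behind a load balancer would fail fast and leave most of the request budget for a retry.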
--
This message was sent by Atlassian Jira
(v8.3.4#803005)