Dan Hill created FLINK-19721:
--------------------------------
Summary: Speed up the frequency of checks in RpcGatewayRetriever
Key: FLINK-19721
URL: https://issues.apache.org/jira/browse/FLINK-19721
Project: Flink
Issue Type: Improvement
Components: Test Infrastructure
Affects Versions: 1.11.2, 1.11.1, 1.12.0
Reporter: Dan Hill
When writing Flink tests, I could reduce the latency of my 'waitForDone' calls by writing my own looping retry-sleep logic rather than relying on `TableResult.getJobClient().get().getJobExecutionResult(...)`. This is because [MiniCluster|https://github.com/apache/flink/blob/47ca19a74e11c72842124852875262959477c459/flink-runtime/src/main/java/org/apache/flink/runtime/minicluster/MiniCluster.java#L338] uses [RpcGatewayRetriever|https://github.com/apache/flink/blob/8674b69964eae50cad024f2c5caf92a71bf21a09/flink-runtime/src/main/java/org/apache/flink/runtime/webmonitor/retriever/impl/RpcGatewayRetriever.java], which has a fixed 20ms retry.
For a complex test, this can save 50ms-100ms per test run.
An easy fix is to change this to a retry with exponential backoff. This reduces the impact
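
For illustration only, here is a minimal sketch of what a retry-sleep loop with exponential backoff could look like. `BackoffPoller`, `await`, and `nextDelay` are hypothetical names, not Flink APIs, and the delay values in the usage note are assumptions rather than anything measured; this is not how RpcGatewayRetriever is implemented.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical helper (not part of Flink): polls a condition with exponentially growing sleeps. */
public final class BackoffPoller {

    private BackoffPoller() {}

    /** Doubles the current delay, capped at maxDelayMs. */
    public static long nextDelay(long currentDelayMs, long maxDelayMs) {
        return Math.min(currentDelayMs * 2, maxDelayMs);
    }

    /**
     * Polls {@code done} until it returns true, the timeout elapses, or the thread is interrupted.
     *
     * @return true if the condition became true in time, false otherwise
     */
    public static boolean await(
            BooleanSupplier done, long initialDelayMs, long maxDelayMs, long timeoutMs) {
        if (initialDelayMs <= 0 || maxDelayMs < initialDelayMs) {
            throw new IllegalArgumentException("need 0 < initialDelayMs <= maxDelayMs");
        }
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        long delayMs = initialDelayMs;
        while (!done.getAsBoolean()) {
            long remainingMs = (deadline - System.nanoTime()) / 1_000_000L;
            if (remainingMs <= 0) {
                return false; // timed out
            }
            try {
                // Never sleep past the deadline.
                Thread.sleep(Math.min(delayMs, remainingMs));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
            delayMs = nextDelay(delayMs, maxDelayMs);
        }
        return true;
    }
}
```

A test could then call something like `BackoffPoller.await(() -> jobIsDone(), 1, 20, 10_000)`, where `jobIsDone()` is a placeholder for whatever completion check the test uses. Starting at an assumed 1ms and capping at 20ms would mean short jobs are noticed quickly while long-running ones are polled no more often than with the current fixed interval.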