ZhangWei created FLINK-22049:
--------------------------------
Summary: Simplify calculating the starting index of the local key-group range in InternalTimerServiceImpl constructor.
Key: FLINK-22049
URL:
https://issues.apache.org/jira/browse/FLINK-22049 Project: Flink
Issue Type: Improvement
Components: API / DataStream
Reporter: ZhangWei
For org.apache.flink.streaming.api.operators.InternalTimerServiceImpl;
in current implementation, when we try to find the starting index of the local key-group range, we iterate over the KeyGroupRange and try to find the min value.
while this is unnecessary and wasteful, because the KeyGroupRange is often monotonically increasing and we can just get the startKeyGroup which is what we want.
It is even worse when we set a very large max parallelism(eg: 10k level) and a small parallelism(eg: 10 level), thus a KeyGroupRange may contain about 1k index in it. And then we iterate this 1k increasing numbers to find the min value.
so if the KeyGroupRange is not an EMPTY_KEY_GROUP_RANGE(i.e. KeyGroupRange#getNumberOfKeyGroups() > 0), we can just get the startKeyGroup.
if KeyGroupRange is just an EMPTY_KEY_GROUP_RANGE, use the Integer.MAX_VALUE.
I think this can avoid much unnecessary operation and save time.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)