[jira] [Created] (FLINK-22049) Simplify calculating the starting index of the local key-group range in InternalTimerServiceImpl constructor.


Shang Yuanchun (Jira)
ZhangWei created FLINK-22049:
--------------------------------

             Summary: Simplify calculating the starting index of the local key-group range in InternalTimerServiceImpl constructor.
                 Key: FLINK-22049
                 URL: https://issues.apache.org/jira/browse/FLINK-22049
             Project: Flink
          Issue Type: Improvement
          Components: API / DataStream
            Reporter: ZhangWei


This concerns org.apache.flink.streaming.api.operators.InternalTimerServiceImpl.

In the current implementation, to find the starting index of the local key-group range, the constructor iterates over the KeyGroupRange and takes the minimum value.

This is unnecessary and wasteful: the key groups in a KeyGroupRange are monotonically increasing, so the startKeyGroup is already the value we want.

It gets worse with a very large max parallelism (e.g. on the order of 10,000) and a small parallelism (e.g. on the order of 10): a KeyGroupRange may then contain about 1,000 indices (10,000 / 10), and we iterate over all 1,000 increasing numbers just to find the minimum.

So if the KeyGroupRange is not EMPTY_KEY_GROUP_RANGE (i.e. KeyGroupRange#getNumberOfKeyGroups() > 0), we can simply use its startKeyGroup.

If the KeyGroupRange is EMPTY_KEY_GROUP_RANGE, use Integer.MAX_VALUE.
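
To illustrate the proposal, here is a minimal, self-contained sketch. It does not use Flink itself: SimpleKeyGroupRange is a hypothetical stand-in for KeyGroupRange (an inclusive, contiguous range of ints), and the method names startIndexByIteration and startIndexDirect are made up for this example. The first method mirrors the loop described above, the second the suggested shortcut.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

public class KeyGroupStartIndexSketch {

    /** Hypothetical stand-in for Flink's KeyGroupRange: inclusive, contiguous, increasing. */
    static final class SimpleKeyGroupRange implements Iterable<Integer> {
        // Assumption for this sketch: an empty range is modeled as start=0, end=-1.
        static final SimpleKeyGroupRange EMPTY = new SimpleKeyGroupRange(0, -1);

        private final int startKeyGroup;
        private final int endKeyGroup;

        SimpleKeyGroupRange(int startKeyGroup, int endKeyGroup) {
            this.startKeyGroup = startKeyGroup;
            this.endKeyGroup = endKeyGroup;
        }

        int getStartKeyGroup() {
            return startKeyGroup;
        }

        int getNumberOfKeyGroups() {
            return 1 + endKeyGroup - startKeyGroup;
        }

        @Override
        public Iterator<Integer> iterator() {
            return new Iterator<Integer>() {
                private int next = startKeyGroup;

                @Override
                public boolean hasNext() {
                    return next <= endKeyGroup;
                }

                @Override
                public Integer next() {
                    if (!hasNext()) {
                        throw new NoSuchElementException();
                    }
                    return next++;
                }
            };
        }
    }

    /** Current approach as described: scan every key group and keep the minimum. */
    static int startIndexByIteration(SimpleKeyGroupRange range) {
        int startIdx = Integer.MAX_VALUE;
        for (int keyGroupIdx : range) {
            startIdx = Math.min(keyGroupIdx, startIdx);
        }
        return startIdx;
    }

    /** Proposed approach: read the start directly, falling back for an empty range. */
    static int startIndexDirect(SimpleKeyGroupRange range) {
        return range.getNumberOfKeyGroups() > 0
                ? range.getStartKeyGroup()
                : Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        // Roughly the case from the description: about 1,000 key groups in one range.
        SimpleKeyGroupRange range = new SimpleKeyGroupRange(1000, 1999);
        System.out.println(startIndexByIteration(range) == startIndexDirect(range));

        // Both approaches agree on the empty range as well.
        System.out.println(
                startIndexByIteration(SimpleKeyGroupRange.EMPTY)
                        == startIndexDirect(SimpleKeyGroupRange.EMPTY));
    }
}
```

In this sketch both methods return the same value for non-empty and empty ranges; the direct version just does it in constant time instead of one step per key group.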

I think this avoids a lot of unnecessary work and saves time.




--
This message was sent by Atlassian Jira
(v8.3.4#803005)