[jira] [Created] (FLINK-13034) Improve the performance when checking whether mapstate is empty for RocksDBStateBackend


Shang Yuanchun (Jira)
Yun Tang created FLINK-13034:
--------------------------------

             Summary: Improve the performance when checking whether mapstate is empty for RocksDBStateBackend
                 Key: FLINK-13034
                 URL: https://issues.apache.org/jira/browse/FLINK-13034
             Project: Flink
          Issue Type: Improvement
          Components: Runtime / State Backends
            Reporter: Yun Tang
            Assignee: Yun Tang


Currently, several places in the Flink source code check whether a map state is empty, e.g. [TemporalRowTimeJoinOperator|https://github.com/apache/flink/blob/8315f38e89f897e32cfa0f23990cb3fb44db0d72/flink-table/flink-table-runtime-blink/src/main/java/org/apache/flink/table/runtime/join/temporal/TemporalRowTimeJoinOperator.java#L192], [AbstractRowTimeUnboundedPrecedingOver|#L160)].
 Developers typically use the following expression for this check:
{code:java}
boolean noRecordsToProcess = !inputState.keys().iterator().hasNext();
{code}
However, with {{RocksDBStateBackend}}, {{inputState.keys().iterator().hasNext()}} actually triggers 1 {{seek}} and 128 {{next}} actions in [RocksDBMapState|https://github.com/apache/flink/blob/8315f38e89f897e32cfa0f23990cb3fb44db0d72/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBMapState.java#L483]. The {{next}} actions are redundant when we only need to know whether the state is empty.

I see two options to improve this:
 * Change {{RocksDBMapState}} back to the previous design, which first loads one element and then loads more elements in follow-up queries. However, this would affect the performance of other map state methods.
 * Add an {{isEmpty()}} method to the public evolving interface {{MapState}}, so that we can check whether the map state is empty without any redundant RocksDB actions.

I prefer the second option.
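
To illustrate the difference between the two access patterns, here is a minimal, self-contained sketch. It is a toy model, not Flink or RocksDB code: {{CountingMapState}}, {{keysIterator()}}, {{PREFETCH_SIZE}} and the counters are hypothetical names, a {{TreeMap}} stands in for RocksDB, and the batch size of 128 is simply taken from the number of {{next}} actions mentioned above. The {{isEmpty()}} shown here is only one possible shape for the method proposed in the second option.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.TreeMap;

/** Toy model only: a sorted in-memory map stands in for RocksDB. */
final class MapStateEmptinessSketch {

    /** Assumed batch size, mirroring the 128 next actions described in the issue. */
    static final int PREFETCH_SIZE = 128;

    /** Hypothetical stand-in for a RocksDB-backed map state that counts iterator actions. */
    static final class CountingMapState {
        private final TreeMap<String, String> store = new TreeMap<>();
        int seeks = 0;
        int nexts = 0;

        void put(String key, String value) {
            store.put(key, value);
        }

        /** Models the eager design: the first hasNext() pulls a whole batch into a cache. */
        Iterator<String> keysIterator() {
            return new Iterator<String>() {
                private List<String> cache; // filled on first access
                private int pos = 0;

                // Only the first batch is modelled; reloading later batches is omitted.
                private void loadFirstBatch() {
                    cache = new ArrayList<>();
                    Iterator<String> it = store.keySet().iterator();
                    seeks++; // one seek to position at the first entry
                    while (it.hasNext() && cache.size() < PREFETCH_SIZE) {
                        cache.add(it.next());
                        nexts++; // one next per entry copied into the cache
                    }
                }

                @Override
                public boolean hasNext() {
                    if (cache == null) {
                        loadFirstBatch();
                    }
                    return pos < cache.size();
                }

                @Override
                public String next() {
                    if (!hasNext()) {
                        throw new NoSuchElementException();
                    }
                    return cache.get(pos++);
                }
            };
        }

        /** Models the proposed check: position once and look at the first entry only. */
        boolean isEmpty() {
            seeks++; // one seek, no next
            return store.isEmpty();
        }
    }

    public static void main(String[] args) {
        CountingMapState viaIterator = new CountingMapState();
        CountingMapState viaIsEmpty = new CountingMapState();
        for (int i = 0; i < 200; i++) {
            viaIterator.put("key-" + i, "value");
            viaIsEmpty.put("key-" + i, "value");
        }

        boolean emptyA = !viaIterator.keysIterator().hasNext();
        boolean emptyB = viaIsEmpty.isEmpty();

        System.out.println("iterator check: empty=" + emptyA
                + " seeks=" + viaIterator.seeks + " nexts=" + viaIterator.nexts);
        System.out.println("isEmpty check:  empty=" + emptyB
                + " seeks=" + viaIsEmpty.seeks + " nexts=" + viaIsEmpty.nexts);
    }
}
```

In this model both checks return the same answer, but the iterator-based check pays for a full batch of {{next}} actions on any state that holds at least one batch of entries, while the dedicated method stops after positioning on the first entry.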

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)