[jira] [Created] (FLINK-19710) Avoid performance regression introduced by thread-local keyword of FRocksDB

Shang Yuanchun (Jira)
Yun Tang created FLINK-19710:
--------------------------------

             Summary: Avoid performance regression introduced by thread-local keyword of FRocksDB
                 Key: FLINK-19710
                 URL: https://issues.apache.org/jira/browse/FLINK-19710
             Project: Flink
          Issue Type: Improvement
            Reporter: Yun Tang
            Assignee: Yun Tang
             Fix For: 1.12.0


We planned to bump the base RocksDB version from 5.17.2 to 6.11.x. However, our own flink-benchmarks showed a performance regression between 5.17.2 and 5.18.3, which we reported to the RocksDB community in [rocksdb#5774|https://github.com/facebook/rocksdb/issues/5774]. Since RocksDB 5.18.3 is a bit old for the RocksDB community, and RocksDB's built-in db_bench tool cannot easily reproduce this regression, we did not get any effective help from the RocksDB community.

Since the code freeze of Flink release-1.12 is close, we had to figure it out ourselves. We first tried to use RocksDB's built-in db_bench tool to binary search the 160 commits between RocksDB 5.17.2 and 5.18.3, but the performance regression did not show up clearly there. After switching to our own flink-benchmarks, we finally identified the commit that introduced the nearly 10% performance regression: [replaced __thread with thread_local keyword|https://github.com/facebook/rocksdb/commit/d6ec288703c8fc53b54be9e3e3f3ffd6a7487c63].
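
For illustration, the commit search described above can be sketched as a binary search over an ordered commit range. This is only a minimal sketch, not the script we actually used: {{first_regressed}} and the {{is_regressed}} callback are hypothetical names, the callback stands in for "build this commit and run the benchmark", and the sketch assumes that every commit after the culprit is also slow.

```cpp
#include <cstddef>
#include <functional>

// Hypothetical helper: commits are indexed 0..n-1 from oldest to newest.
// is_regressed(i) stands in for "build commit i, run the benchmark, and
// compare against the baseline". Assumes that once a commit is slow, all
// later commits are slow as well.
std::size_t first_regressed(
    std::size_t n, const std::function<bool(std::size_t)>& is_regressed) {
  std::size_t lo = 0;  // every commit before lo is known to be fast
  std::size_t hi = n;  // every commit from hi onwards is known to be slow
  while (lo < hi) {
    const std::size_t mid = lo + (hi - lo) / 2;
    if (is_regressed(mid)) {
      hi = mid;      // the culprit is mid or an earlier commit
    } else {
      lo = mid + 1;  // the culprit is a later commit
    }
  }
  return lo;  // equals n if no commit in the range is slow
}
```

With 160 commits, such a search needs at most 8 benchmark runs, which only helps if the benchmark used can actually show the regression.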

The performance cost of {{thread_local}} is a known issue, documented in the [gcc-4.8 changes|https://gcc.gnu.org/gcc-4.8/changes.html#cxx], and it becomes more serious with [dynamic modules usage|http://david-grs.github.io/tls_performance_overhead_cost_linux/] (see also this [tls benchmark|https://testbit.eu/2015/thread-local-storage-benchmark]). That could explain why RocksDB's built-in db_bench tool cannot reproduce this regression: it is compiled in static mode, as recommended.
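
To make the difference between the two keywords concrete, here is a minimal sketch, not taken from the RocksDB code base. The variable and function names are hypothetical, and it assumes GCC or Clang, since {{__thread}} is a GNU extension. {{__thread}} only accepts constant initializers, while {{thread_local}} also allows dynamic initialization, which is what can force the compiler to add an initialization check or wrapper call on access.

```cpp
#include <cstdint>

// Hypothetical helper: a non-constant initializer, so the variable that
// uses it cannot be initialized statically.
std::uint64_t initial_value() { return 100; }

// GNU extension (GCC/Clang): only constant initializers are accepted, so
// an access needs no initialization check.
static __thread std::uint64_t tls_gnu = 0;

// C++11 keyword: dynamic initializers are accepted. The variable is
// initialized lazily in each thread, so the compiler may have to guard
// accesses to make sure the initialization has already happened.
thread_local std::uint64_t tls_std = initial_value();

// Increment the calling thread's copy n times and return its new value.
std::uint64_t bump_gnu(int n) {
  for (int i = 0; i < n; ++i) ++tls_gnu;
  return tls_gnu;
}

std::uint64_t bump_std(int n) {
  for (int i = 0; i < n; ++i) ++tls_std;
  return tls_std;
}
```

According to the gcc-4.8 notes linked above, the run-time penalty also applies to non-function-local {{thread_local}} variables defined in a different translation unit, even if they do not need dynamic initialization, and the notes suggest continuing to use {{__thread}} for variables with static initialization semantics.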

 

We plan to fix this in our FRocksDB branch first by reverting the related changes. My current local experiments show that the revert is effective at avoiding the performance regression.


