[jira] [Created] (FLINK-2533) Gap based random sample optimization


Shang Yuanchun (Jira)
Chengxiang Li created FLINK-2533:
------------------------------------

             Summary: Gap based random sample optimization
                 Key: FLINK-2533
                 URL: https://issues.apache.org/jira/browse/FLINK-2533
             Project: Flink
          Issue Type: Improvement
          Components: Core
            Reporter: Chengxiang Li
            Priority: Minor


For fraction-based random samplers such as BernoulliSampler and PoissonSampler, a gap-based sampler could use an O(k) sampling implementation, with k the number of sampled elements, instead of the current O(n) implementation, with n the number of input elements. It should perform better when the sample fraction is very small. [This blog|http://erikerlandson.github.io/blog/2014/09/11/faster-random-samples-with-gap-sampling/] describes gap-based random sampling in more detail.
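
To illustrate the idea, here is a minimal sketch of gap sampling for the Bernoulli case. It is not Flink's BernoulliSampler code: the class name GapSamplingSketch and the method gapSample are hypothetical, and the sketch assumes 0 < fraction < 1. Rather than drawing one random number per element, it draws the number of elements to skip before the next accepted one from a geometric distribution, so the number of random draws is proportional to the sample size and not to the input size.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Random;

// Hypothetical illustration class, not part of Flink.
final class GapSamplingSketch {

    // Returns a Bernoulli sample of the input in which each element is kept
    // with probability `fraction`. Assumes 0 < fraction < 1.
    static <T> List<T> gapSample(Iterator<T> input, double fraction, Random rnd) {
        if (!(fraction > 0.0 && fraction < 1.0)) {
            throw new IllegalArgumentException("fraction must be in (0, 1)");
        }
        List<T> sample = new ArrayList<>();
        // log of the probability that a single element is rejected
        final double logReject = Math.log(1.0 - fraction);

        while (true) {
            // u is uniform in (0, 1], so Math.log(u) is always finite
            double u = 1.0 - rnd.nextDouble();
            // Number of rejected elements before the next accepted one:
            // geometric with P(gap = g) = (1 - fraction)^g * fraction
            long gap = (long) Math.floor(Math.log(u) / logReject);

            // Skip `gap` elements without drawing further random numbers
            for (long i = 0; i < gap && input.hasNext(); i++) {
                input.next();
            }
            if (!input.hasNext()) {
                break;
            }
            sample.add(input.next());
        }
        return sample;
    }
}
```

With a plain Iterator the skipped elements are still advanced one by one, so the saving in this sketch is in random number generation and per-element comparisons. A source that supports cheap skipping would be needed to avoid touching the skipped elements at all.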



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)