Chengxiang Li created FLINK-2533:
------------------------------------
Summary: Gap based random sample optimization
Key: FLINK-2533
URL: https://issues.apache.org/jira/browse/FLINK-2533
Project: Flink
Issue Type: Improvement
Components: Core
Reporter: Chengxiang Li
Priority: Minor
For fraction-based random samplers such as BernoulliSampler and PoissonSampler, a gap-based random sampler could use an O(k) implementation (k being the number of sampled elements) in place of the current O(n) implementation (n being the number of input elements). It should perform better when the sample fraction is very small. [This blog|http://erikerlandson.github.io/blog/2014/09/11/faster-random-samples-with-gap-sampling/] describes gap-based random sampling in more detail.
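
To illustrate the idea for the Bernoulli case only, here is a minimal sketch in Python. It is not Flink code: the function name gap_bernoulli_sample and its parameters are hypothetical. Instead of drawing one random number per input element, it draws the number of elements to skip before the next accepted one from a geometric distribution, so it needs roughly one random draw per sampled element. On a plain iterator the skipped elements are still stepped over, so the saving here is in random number generation; a source with cheap skipping would benefit more.

```python
import math
import random
import sys
from itertools import islice


def gap_bernoulli_sample(items, fraction, rng=None):
    """Yield each element of `items` independently with probability `fraction`.

    Hypothetical helper for illustration; not part of any Flink API.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be in [0, 1]")
    rng = rng or random.Random()
    it = iter(items)
    if fraction == 0.0:
        return  # nothing is ever selected
    if fraction == 1.0:
        yield from it  # everything is selected
        return
    log_q = math.log1p(-fraction)  # log(1 - fraction), a negative number
    while True:
        u = 1.0 - rng.random()  # uniform in (0, 1], avoids log(0)
        # Number of rejected elements before the next accepted one
        # (geometric distribution), capped so islice accepts it.
        gap = min(int(math.log(u) / log_q), sys.maxsize - 1)
        # Skip `gap` elements and take the one after them.
        chunk = list(islice(it, gap, gap + 1))
        if not chunk:
            return  # input exhausted
        yield chunk[0]
```

For example, list(gap_bernoulli_sample(range(1000000), 0.001)) should return about 1000 elements in their original order while drawing about 1000 random numbers rather than one million.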
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)