[jira] [Created] (FLINK-22587) Support aggregations in batch mode with DataStream API

Shang Yuanchun (Jira)
Etienne Chauchot created FLINK-22587:
----------------------------------------

             Summary: Support aggregations in batch mode with DataStream API
                 Key: FLINK-22587
                 URL: https://issues.apache.org/jira/browse/FLINK-22587
             Project: Flink
          Issue Type: Improvement
          Components: API / DataStream
            Reporter: Etienne Chauchot


A pipeline like this, run *in batch mode*, outputs no data:
{code:java}
stream.join(otherStream)
    .where(<KeySelector>)
    .equalTo(<KeySelector>)
    .window(GlobalWindows.create())
    .apply(<JoinFunction>)
{code}
Indeed, the default trigger for _GlobalWindows_ is _NeverTrigger_, which never fires. If we set an _EventTimeTrigger_ instead, it fires on every element: in batch mode the watermark is set to +INF, so it has already passed the end of the global window when each new element arrives. A _ProcessingTimeTrigger_ never fires either, and all elapsed-time or delta-based triggers are unsuited to batch.
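
To make the reasoning above concrete, here is a toy model in plain Java. It does not use Flink classes: _ToyTrigger_, _Result_ and _countFirings_ are hypothetical names invented for this sketch. It assumes, as described above, that the watermark already sits at +INF (Long.MAX_VALUE) in batch mode, and it relies on the fact that a global window's max timestamp is Long.MAX_VALUE.

```java
// Toy model of the trigger behaviour described in this issue.
// None of these types are Flink classes; they only mimic the decision logic.
class GlobalWindowTriggerSketch {

    // Simplified stand-in for a trigger decision.
    enum Result { CONTINUE, FIRE }

    // Hypothetical trigger interface: decide per element, given the
    // window's max timestamp and the current watermark.
    interface ToyTrigger {
        Result onElement(long windowMaxTimestamp, long currentWatermark);
    }

    // Mimics a trigger that never fires.
    static final ToyTrigger NEVER = (maxTs, wm) -> Result.CONTINUE;

    // Mimics an event-time trigger: fire as soon as the watermark
    // has reached the end of the window.
    static final ToyTrigger EVENT_TIME =
            (maxTs, wm) -> maxTs <= wm ? Result.FIRE : Result.CONTINUE;

    // Feeds `elements` records into one global window and counts firings.
    static int countFirings(ToyTrigger trigger, int elements, long watermark) {
        long globalWindowMaxTimestamp = Long.MAX_VALUE; // end of a global window
        int firings = 0;
        for (int i = 0; i < elements; i++) {
            if (trigger.onElement(globalWindowMaxTimestamp, watermark) == Result.FIRE) {
                firings++;
            }
        }
        return firings;
    }

    public static void main(String[] args) {
        long batchWatermark = Long.MAX_VALUE; // assumption: +INF in batch mode
        System.out.println("never trigger firings: "
                + countFirings(NEVER, 3, batchWatermark));      // prints 0
        System.out.println("event-time trigger firings: "
                + countFirings(EVENT_TIME, 3, batchWatermark)); // prints 3
    }
}
```

Under these assumptions the never-firing trigger emits nothing and the event-time trigger fires once per element, so neither gives a single result at end of input.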

The same goes for _reduce()_ in place of join().

So I guess we are missing something for batch support in the DataStream API.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)