Hey guys,
Just a quick note on some upcoming API updates for the Streaming api. Now it will be possible to use field expressions for both grouping and aggregations in the streaming api. You can check it out here <https://github.com/mbalassi/incubator-flink/blob/daba36e142537ca0bd7e4d0f1209ce8b0ebecda5/flink-addons/flink-streaming/flink-streaming-examples/src/main/java/org/apache/flink/streaming/examples/wordcount/PojoWordCount.java#L102> . Or in a concise form: DataStream<Word> counts = text.flatMap(new Tokenizer()).groupBy("word") .sum("frequency"); I will still do some more testing before it will be available in the master branch. I am also planning to extend aggregations to more fields at the same time like sum(1,2,2) or max("a","c"). Regards, Gyula |
Hi Gyula,
great to see so much progress with Flink Streaming! Regarding the improved aggregations, there is another effort to improve the aggregations of the batch API, which was recently discussed on the mailing list [1]. I think it would make sense to try to keep the streaming and batch APIs close together. Would the proposed batch approach also work for the streaming case and if not what is missing? Cheers, Fabian [1] http://mail-archives.apache.org/mod_mbox/flink-dev/201410.mbox/%3C44B1AB07-F993-436F-AE23-8CC4CCC08A54%40tu-berlin.de%3E 2014-11-05 18:37 GMT+01:00 Gyula Fóra <[hidden email]>: > Hey guys, > > Just a quick note on some upcoming API updates for the Streaming api. > > Now it will be possible to use field expressions for both grouping and > aggregations in the streaming api. You can check it out here > < > https://github.com/mbalassi/incubator-flink/blob/daba36e142537ca0bd7e4d0f1209ce8b0ebecda5/flink-addons/flink-streaming/flink-streaming-examples/src/main/java/org/apache/flink/streaming/examples/wordcount/PojoWordCount.java#L102 > > > . > > Or in a concise form: > DataStream<Word> counts = text.flatMap(new Tokenizer()).groupBy("word") > .sum("frequency"); > > I will still do some more testing before it will be available in the master > branch. > > I am also planning to extend aggregations to more fields at the same time > like > sum(1,2,2) or max("a","c"). > > Regards, > Gyula > |
Hi Fabian,
The link you sent is broken but I think you are referring the conversation with Viktor. I agree that we should provide the same api for aggregations and I don’t think there is any reason why the proposed batch approach wouldn’t work on streaming :) When there is an agreement on the new aggregation api we will implement it for streaming. Cheers, Gyula > On 05 Nov 2014, at 19:26, Fabian Hueske <[hidden email]> wrote: > > Hi Gyula, > > great to see so much progress with Flink Streaming! > > Regarding the improved aggregations, there is another effort to improve the > aggregations of the batch API, which was recently discussed on the mailing > list [1]. > I think it would make sense to try to keep the streaming and batch APIs > close together. Would the proposed batch approach also work for the > streaming case and if not what is missing? > > Cheers, Fabian > > [1] > http://mail-archives.apache.org/mod_mbox/flink-dev/201410.mbox/%3C44B1AB07-F993-436F-AE23-8CC4CCC08A54%40tu-berlin.de%3E > > 2014-11-05 18:37 GMT+01:00 Gyula Fóra <[hidden email]>: > >> Hey guys, >> >> Just a quick note on some upcoming API updates for the Streaming api. >> >> Now it will be possible to use field expressions for both grouping and >> aggregations in the streaming api. You can check it out here >> < >> https://github.com/mbalassi/incubator-flink/blob/daba36e142537ca0bd7e4d0f1209ce8b0ebecda5/flink-addons/flink-streaming/flink-streaming-examples/src/main/java/org/apache/flink/streaming/examples/wordcount/PojoWordCount.java#L102 >>> >> . >> >> Or in a concise form: >> DataStream<Word> counts = text.flatMap(new Tokenizer()).groupBy("word") >> .sum("frequency"); >> >> I will still do some more testing before it will be available in the master >> branch. >> >> I am also planning to extend aggregations to more fields at the same time >> like >> sum(1,2,2) or max("a","c"). >> >> Regards, >> Gyula >> |
Yes, that's the thread I wanted to point to ;-)
2014-11-05 19:40 GMT+01:00 Gyula Fora <[hidden email]>: > Hi Fabian, > > The link you sent is broken but I think you are referring the conversation > with Viktor. > > I agree that we should provide the same api for aggregations and I don’t > think there is any reason why the proposed batch approach wouldn’t work on > streaming :) > > When there is an agreement on the new aggregation api we will implement it > for streaming. > > Cheers, > Gyula > > > On 05 Nov 2014, at 19:26, Fabian Hueske <[hidden email]> wrote: > > > > Hi Gyula, > > > > great to see so much progress with Flink Streaming! > > > > Regarding the improved aggregations, there is another effort to improve > the > > aggregations of the batch API, which was recently discussed on the > mailing > > list [1]. > > I think it would make sense to try to keep the streaming and batch APIs > > close together. Would the proposed batch approach also work for the > > streaming case and if not what is missing? > > > > Cheers, Fabian > > > > [1] > > > http://mail-archives.apache.org/mod_mbox/flink-dev/201410.mbox/%3C44B1AB07-F993-436F-AE23-8CC4CCC08A54%40tu-berlin.de%3E > > > > 2014-11-05 18:37 GMT+01:00 Gyula Fóra <[hidden email]>: > > > >> Hey guys, > >> > >> Just a quick note on some upcoming API updates for the Streaming api. > >> > >> Now it will be possible to use field expressions for both grouping and > >> aggregations in the streaming api. You can check it out here > >> < > >> > https://github.com/mbalassi/incubator-flink/blob/daba36e142537ca0bd7e4d0f1209ce8b0ebecda5/flink-addons/flink-streaming/flink-streaming-examples/src/main/java/org/apache/flink/streaming/examples/wordcount/PojoWordCount.java#L102 > >>> > >> . > >> > >> Or in a concise form: > >> DataStream<Word> counts = text.flatMap(new Tokenizer()).groupBy("word") > >> .sum("frequency"); > >> > >> I will still do some more testing before it will be available in the > master > >> branch. > >> > >> I am also planning to extend aggregations to more fields at the same > time > >> like > >> sum(1,2,2) or max("a","c"). > >> > >> Regards, > >> Gyula > >> > > |
In reply to this post by Gyula Fóra-2
Hi Gyula,
I will take a look at your code once I have figured out how to implement the new batch API. Best, Viktor
|
Free forum by Nabble | Edit this page |