Hey folks,
Please give feedback on FLIP-13!
https://cwiki.apache.org/confluence/display/FLINK/FLIP-13+Side+Outputs+in+Flink

JIRA task, with a link to the Google doc:
https://issues.apache.org/jira/browse/FLINK-4460

Thanks,
Chen Qin
Hi Chen,
thanks for this interesting proposal. I think side output would be a very valuable feature to have!

I went over the FLIP and have a few questions.

- Will multiple side outputs of the same type be supported?
- If I got it right, the FLIP proposes to change the signatures of many user-defined functions (FlatMapFunction, WindowFunction, ...). Most of these interfaces/classes are annotated with @Public, which means we cannot change them in the Flink 1.x release line. What would be the alternatives? I can think of a) casting the Collector into a RichCollector (as you do in your prototype) or b) retrieving the RichCollector from the RuntimeContext that a RichFunction provides.

I'm not so familiar with the internals of the DataStream API, so I leave comments on that to others.

Best, Fabian

2016-10-25 18:00 GMT+02:00 Chen Qin <[hidden email]>:
> Hey folks,
>
> Please give feedback on FLIP-13!
> [...]
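To make option a) concrete, here is a minimal sketch of the casting approach. It assumes a hypothetical RichCollector that extends the collector interface with a tagged collect method. Every type below (Collector, RichCollector, OutputTag, ListCollector) is a stand-in written for this example, not Flink's actual class, and the prototype may look different. The point is only that the user function keeps its existing signature and reaches the side output through a downcast.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// All types here are hypothetical stand-ins, not Flink's real classes.

/** Stand-in for the existing public collector interface; left unchanged. */
interface Collector<T> {
    void collect(T record);
}

/** Hypothetical typed handle that names one side output. */
final class OutputTag<S> {
    final String id;

    OutputTag(String id) {
        this.id = id;
    }
}

/** Hypothetical extension that adds side-output support. */
interface RichCollector<T> extends Collector<T> {
    <S> void collect(OutputTag<S> tag, S value);
}

/** In-memory collector, only here to make the sketch runnable. */
final class ListCollector<T> implements RichCollector<T> {
    final List<T> main = new ArrayList<>();
    final Map<String, List<Object>> side = new HashMap<>();

    @Override
    public void collect(T record) {
        main.add(record);
    }

    @Override
    public <S> void collect(OutputTag<S> tag, S value) {
        side.computeIfAbsent(tag.id, k -> new ArrayList<>()).add(value);
    }
}

final class CastingSketch {
    static final OutputTag<String> REJECTED = new OutputTag<>("rejected");

    /** Same shape as a flatMap(value, out) user function; the signature is untouched. */
    static void flatMap(String value, Collector<Integer> out) {
        try {
            out.collect(Integer.parseInt(value));
        } catch (NumberFormatException e) {
            // Option a): downcast to reach the side-output method.
            // If the runtime hands in a plain Collector, the record is dropped.
            if (out instanceof RichCollector) {
                ((RichCollector<Integer>) out).collect(REJECTED, value);
            }
        }
    }

    public static void main(String[] args) {
        ListCollector<Integer> out = new ListCollector<>();
        for (String s : new String[] {"1", "x", "2"}) {
            flatMap(s, out);
        }
        System.out.println("main=" + out.main + " side=" + out.side);
    }
}
```

The instanceof check shows the weak spot of this option: nothing in the function signature tells the user whether side outputs are available, so a failed cast can only be handled at runtime.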
Is this related only to the streaming API? This feature could be really useful for ETL scenarios with the DataSet API as well.

On Oct 26, 2016 22:29, "Fabian Hueske" <[hidden email]> wrote:
> Hi Chen,
>
> thanks for this interesting proposal. I think side output would be a very
> valuable feature to have!
> [...]
Hi CPC,
I agree, support for side outputs would be nice for the DataSet API as well. However, this is not easily possible because it would require an extensive rewrite of the DataSet optimizer. IMO, that's out of scope for this proposal.

Cheers, Fabian

2016-10-27 0:29 GMT+02:00 CPC <[hidden email]>:
> Is it just related to stream api? This feature could be really useful for
> etl scenarios with dataset api as well.
Hi Fabian
Thanks for your feedback, and sorry for the late reply. Some comments are inline. I will update the FLIP-13 wiki to reflect your comments.

> - Will multiple side outputs of the same type be supported?

It wasn't implemented in the prototype, but it should be easy to support: we have a unique id in the stream record.

> - If I got it right, the FLIP proposes to change the signatures of many
> user-defined functions (FlatMapFunction, WindowFunction, ...). Most of
> these interfaces/classes are annotated with @Public, which means we cannot
> change them in the Flink 1.x release line. What would be alternatives? I
> can think of a) casting the Collector into a RichCollector (as you do in
> your prototype) or

This is like a private magic API. It should be 100% compatible, but it is not a good implementation.

> b) retrieve the RichCollector from the RuntimeContext that a RichFunction
> provides.

This seems the better option, yet many highly used functions like FlatMap will not get support. To support them, we would need to create some redundant classes inheriting from RichFunction (e.g. implement RichFlatMap etc.). We might put these in a different package to isolate the impact of this change.

--
-Chen Qin
Adding another abstract method to the Collector interface is also considerably easier from an API backward-compatibility point of view. The cost could be either

1) many classes with an empty implementation of the <S> void collect(OutputTag<S> tag, S value) method, or
2) splitting the StreamRecord-related classes that implement the Collector interface from the graph-generator-related classes. For the StreamRecord ones, we might be able to implement collect(T out) by calling <S> void collect(OutputTag<S> tag, S value). For the graph-generator ones, keep it as it is.

On Wed, Nov 2, 2016 at 8:14 PM, Chen Qin <[hidden email]> wrote:
> Hi Fabian
>
> Thanks for your feedback. sorry for late reply.
> [...]

--
-Chen Qin
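The two costs described above can be sketched as follows. This is only an illustration under stated assumptions: Collector and OutputTag are simplified stand-ins declared in the example, while CountingCollector, RoutingCollector, the "main" tag id and the TaggedCollectSketch driver are hypothetical names that do not come from the Flink code base. The sketch also assumes that side outputs are told apart by the tag id, so two outputs of the same type (both String here) stay separate.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// All types here are hypothetical stand-ins, not Flink's real classes.

/** Typed handle for an output; outputs are distinguished by id (assumption). */
final class OutputTag<S> {
    final String id;

    OutputTag(String id) {
        this.id = id;
    }
}

/** Collector with the additional abstract method discussed in the thread. */
interface Collector<T> {
    void collect(T record);

    <S> void collect(OutputTag<S> tag, S value);
}

/** Cost 1): a collector that has no side outputs needs an empty implementation. */
final class CountingCollector<T> implements Collector<T> {
    int count;

    @Override
    public void collect(T record) {
        count++;
    }

    @Override
    public <S> void collect(OutputTag<S> tag, S value) {
        // Intentionally empty: this collector ignores side outputs.
    }
}

/** Cost 2): a record-emitting collector routes collect(T) through the tagged method. */
final class RoutingCollector<T> implements Collector<T> {
    private final OutputTag<T> mainTag = new OutputTag<>("main"); // hypothetical id
    final Map<String, List<Object>> outputs = new LinkedHashMap<>();

    @Override
    public void collect(T record) {
        collect(mainTag, record);
    }

    @Override
    public <S> void collect(OutputTag<S> tag, S value) {
        outputs.computeIfAbsent(tag.id, k -> new ArrayList<>()).add(value);
    }
}

final class TaggedCollectSketch {
    // Two side outputs of the same type, kept apart by their ids.
    static final OutputTag<String> LATE = new OutputTag<>("late");
    static final OutputTag<String> MALFORMED = new OutputTag<>("malformed");

    static void emit(Collector<Integer> out) {
        out.collect(1);
        out.collect(LATE, "a");
        out.collect(MALFORMED, "b");
        out.collect(2);
    }

    public static void main(String[] args) {
        RoutingCollector<Integer> routed = new RoutingCollector<>();
        emit(routed);
        System.out.println(routed.outputs);

        CountingCollector<Integer> counting = new CountingCollector<>();
        emit(counting);
        System.out.println(counting.count);
    }
}
```

In this shape user code needs no cast, but every existing Collector implementation has to pick one of the two treatments shown.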
Dear Flink community members,
Please review and comment on https://github.com/apache/flink/pull/2982

Thanks,
Chen