[DISCUSS] Change Streaming Operators to be Push-Only

classic Classic list List threaded Threaded
17 messages Options
Reply | Threaded
Open this post in threaded view
|

[DISCUSS] Change Streaming Operators to be Push-Only

Aljoscha Krettek-2
Hi Folks,
while working on introducing source-assigned timestamps into streaming
(https://issues.apache.org/jira/browse/FLINK-1967) I thought about how
the punctuations (low watermarks) can be pushed through the system.
The problem is, that operators can have two ways of getting input: 1.
They read directly from input iterators, and 2. They act as a
Collector and get elements via collect() from the previous operator in
a chain.

This makes it hard to push things through a chain that are not
elements, such as barriers and/or punctuations.

I propose to change all streaming operators to be push based, with a
slightly improved interface: In addition to collect(), which I would
call receiveElement() I would add receivePunctuation() and
receiveBarrier(). The first operator in the chain would also get data
from the outside invokable that reads from the input iterator and
calls receiveElement() for the first operator in a chain.

What do you think? I would of course be willing to implement this myself.

Cheers,
Aljoscha
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] Change Streaming Operators to be Push-Only

Kostas Tzoumas-2
Can you give us a rough idea of the pros and cons? Do we lose some
functionality by getting rid of iterations?

Kostas

On Tue, May 5, 2015 at 1:37 PM, Aljoscha Krettek <[hidden email]>
wrote:

> Hi Folks,
> while working on introducing source-assigned timestamps into streaming
> (https://issues.apache.org/jira/browse/FLINK-1967) I thought about how
> the punctuations (low watermarks) can be pushed through the system.
> The problem is, that operators can have two ways of getting input: 1.
> They read directly from input iterators, and 2. They act as a
> Collector and get elements via collect() from the previous operator in
> a chain.
>
> This makes it hard to push things through a chain that are not
> elements, such as barriers and/or punctuations.
>
> I propose to change all streaming operators to be push based, with a
> slightly improved interface: In addition to collect(), which I would
> call receiveElement() I would add receivePunctuation() and
> receiveBarrier(). The first operator in the chain would also get data
> from the outside invokable that reads from the input iterator and
> calls receiveElement() for the first operator in a chain.
>
> What do you think? I would of course be willing to implement this myself.
>
> Cheers,
> Aljoscha
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] Change Streaming Operators to be Push-Only

Aljoscha Krettek-2
What do you mean by "losing iterations"?

For the pros and cons:

Cons: I can't think of any, since most of the operators are chainable
already and already behave like a collector.

Pros:
 - Unified model for operators, chainable operators don't have to
worry about input iterators and the collect interface.
 - Enables features that we want in the future, such as barriers and
punctuations because they don't work with the
   simple Collector interface.
 - The while-loop is moved outside of the operators, now the Task (the
thing that runs Operators) can control the flow of data better and
deal with
   stuff like barriers and punctuations. If we want to keep the
main-loop inside each operator, then they all have to manage input
readers and inline events manually.

On Tue, May 5, 2015 at 2:41 PM, Kostas Tzoumas <[hidden email]> wrote:

> Can you give us a rough idea of the pros and cons? Do we lose some
> functionality by getting rid of iterations?
>
> Kostas
>
> On Tue, May 5, 2015 at 1:37 PM, Aljoscha Krettek <[hidden email]>
> wrote:
>
>> Hi Folks,
>> while working on introducing source-assigned timestamps into streaming
>> (https://issues.apache.org/jira/browse/FLINK-1967) I thought about how
>> the punctuations (low watermarks) can be pushed through the system.
>> The problem is, that operators can have two ways of getting input: 1.
>> They read directly from input iterators, and 2. They act as a
>> Collector and get elements via collect() from the previous operator in
>> a chain.
>>
>> This makes it hard to push things through a chain that are not
>> elements, such as barriers and/or punctuations.
>>
>> I propose to change all streaming operators to be push based, with a
>> slightly improved interface: In addition to collect(), which I would
>> call receiveElement() I would add receivePunctuation() and
>> receiveBarrier(). The first operator in the chain would also get data
>> from the outside invokable that reads from the input iterator and
>> calls receiveElement() for the first operator in a chain.
>>
>> What do you think? I would of course be willing to implement this myself.
>>
>> Cheers,
>> Aljoscha
>>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] Change Streaming Operators to be Push-Only

Gyula Fóra-2
I think this a good idea in general. I would try to minimize the methods we
include and make the ones that we keep very concrete. For instance i would
not have the receive barrier method as that is handled on a totally
different level already. And instead of punctuation I would directly add a
method to work on watermarks.

On Tuesday, May 5, 2015, Aljoscha Krettek <[hidden email]> wrote:

> What do you mean by "losing iterations"?
>
> For the pros and cons:
>
> Cons: I can't think of any, since most of the operators are chainable
> already and already behave like a collector.
>
> Pros:
>  - Unified model for operators, chainable operators don't have to
> worry about input iterators and the collect interface.
>  - Enables features that we want in the future, such as barriers and
> punctuations because they don't work with the
>    simple Collector interface.
>  - The while-loop is moved outside of the operators, now the Task (the
> thing that runs Operators) can control the flow of data better and
> deal with
>    stuff like barriers and punctuations. If we want to keep the
> main-loop inside each operator, then they all have to manage input
> readers and inline events manually.
>
> On Tue, May 5, 2015 at 2:41 PM, Kostas Tzoumas <[hidden email]
> <javascript:;>> wrote:
> > Can you give us a rough idea of the pros and cons? Do we lose some
> > functionality by getting rid of iterations?
> >
> > Kostas
> >
> > On Tue, May 5, 2015 at 1:37 PM, Aljoscha Krettek <[hidden email]
> <javascript:;>>
> > wrote:
> >
> >> Hi Folks,
> >> while working on introducing source-assigned timestamps into streaming
> >> (https://issues.apache.org/jira/browse/FLINK-1967) I thought about how
> >> the punctuations (low watermarks) can be pushed through the system.
> >> The problem is, that operators can have two ways of getting input: 1.
> >> They read directly from input iterators, and 2. They act as a
> >> Collector and get elements via collect() from the previous operator in
> >> a chain.
> >>
> >> This makes it hard to push things through a chain that are not
> >> elements, such as barriers and/or punctuations.
> >>
> >> I propose to change all streaming operators to be push based, with a
> >> slightly improved interface: In addition to collect(), which I would
> >> call receiveElement() I would add receivePunctuation() and
> >> receiveBarrier(). The first operator in the chain would also get data
> >> from the outside invokable that reads from the input iterator and
> >> calls receiveElement() for the first operator in a chain.
> >>
> >> What do you think? I would of course be willing to implement this
> myself.
> >>
> >> Cheers,
> >> Aljoscha
> >>
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] Change Streaming Operators to be Push-Only

Aljoscha Krettek-2
I'm using the term punctuation and watermark interchangeably here
because for practical purposes they do the same thing. I'm not sure
what you meant with your comment about those.

For the Operator interface I'm thinking about something like this:

abstract class OneInputStreamOperator<IN, OUT, F extends Function>  {
    public processElement(IN element);
    public processBarrier(...);
    public processPunctuation/lowWatermark(...):
}

The operator also has access to the TaskContext and ExecutionConfig
and Serializers. The operator would emit values using an emit() method
or the Collector interface, not sure about that yet.

On Tue, May 5, 2015 at 3:12 PM, Gyula Fóra <[hidden email]> wrote:

> I think this a good idea in general. I would try to minimize the methods we
> include and make the ones that we keep very concrete. For instance i would
> not have the receive barrier method as that is handled on a totally
> different level already. And instead of punctuation I would directly add a
> method to work on watermarks.
>
> On Tuesday, May 5, 2015, Aljoscha Krettek <[hidden email]> wrote:
>
>> What do you mean by "losing iterations"?
>>
>> For the pros and cons:
>>
>> Cons: I can't think of any, since most of the operators are chainable
>> already and already behave like a collector.
>>
>> Pros:
>>  - Unified model for operators, chainable operators don't have to
>> worry about input iterators and the collect interface.
>>  - Enables features that we want in the future, such as barriers and
>> punctuations because they don't work with the
>>    simple Collector interface.
>>  - The while-loop is moved outside of the operators, now the Task (the
>> thing that runs Operators) can control the flow of data better and
>> deal with
>>    stuff like barriers and punctuations. If we want to keep the
>> main-loop inside each operator, then they all have to manage input
>> readers and inline events manually.
>>
>> On Tue, May 5, 2015 at 2:41 PM, Kostas Tzoumas <[hidden email]
>> <javascript:;>> wrote:
>> > Can you give us a rough idea of the pros and cons? Do we lose some
>> > functionality by getting rid of iterations?
>> >
>> > Kostas
>> >
>> > On Tue, May 5, 2015 at 1:37 PM, Aljoscha Krettek <[hidden email]
>> <javascript:;>>
>> > wrote:
>> >
>> >> Hi Folks,
>> >> while working on introducing source-assigned timestamps into streaming
>> >> (https://issues.apache.org/jira/browse/FLINK-1967) I thought about how
>> >> the punctuations (low watermarks) can be pushed through the system.
>> >> The problem is, that operators can have two ways of getting input: 1.
>> >> They read directly from input iterators, and 2. They act as a
>> >> Collector and get elements via collect() from the previous operator in
>> >> a chain.
>> >>
>> >> This makes it hard to push things through a chain that are not
>> >> elements, such as barriers and/or punctuations.
>> >>
>> >> I propose to change all streaming operators to be push based, with a
>> >> slightly improved interface: In addition to collect(), which I would
>> >> call receiveElement() I would add receivePunctuation() and
>> >> receiveBarrier(). The first operator in the chain would also get data
>> >> from the outside invokable that reads from the input iterator and
>> >> calls receiveElement() for the first operator in a chain.
>> >>
>> >> What do you think? I would of course be willing to implement this
>> myself.
>> >>
>> >> Cheers,
>> >> Aljoscha
>> >>
>>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] Change Streaming Operators to be Push-Only

Gyula Fóra
What would the processBarrier method do?

On Tuesday, May 5, 2015, Aljoscha Krettek <[hidden email]> wrote:

> I'm using the term punctuation and watermark interchangeably here
> because for practical purposes they do the same thing. I'm not sure
> what you meant with your comment about those.
>
> For the Operator interface I'm thinking about something like this:
>
> abstract class OneInputStreamOperator<IN, OUT, F extends Function>  {
>     public processElement(IN element);
>     public processBarrier(...);
>     public processPunctuation/lowWatermark(...):
> }
>
> The operator also has access to the TaskContext and ExecutionConfig
> and Serializers. The operator would emit values using an emit() method
> or the Collector interface, not sure about that yet.
>
> On Tue, May 5, 2015 at 3:12 PM, Gyula Fóra <[hidden email]
> <javascript:;>> wrote:
> > I think this a good idea in general. I would try to minimize the methods
> we
> > include and make the ones that we keep very concrete. For instance i
> would
> > not have the receive barrier method as that is handled on a totally
> > different level already. And instead of punctuation I would directly add
> a
> > method to work on watermarks.
> >
> > On Tuesday, May 5, 2015, Aljoscha Krettek <[hidden email]
> <javascript:;>> wrote:
> >
> >> What do you mean by "losing iterations"?
> >>
> >> For the pros and cons:
> >>
> >> Cons: I can't think of any, since most of the operators are chainable
> >> already and already behave like a collector.
> >>
> >> Pros:
> >>  - Unified model for operators, chainable operators don't have to
> >> worry about input iterators and the collect interface.
> >>  - Enables features that we want in the future, such as barriers and
> >> punctuations because they don't work with the
> >>    simple Collector interface.
> >>  - The while-loop is moved outside of the operators, now the Task (the
> >> thing that runs Operators) can control the flow of data better and
> >> deal with
> >>    stuff like barriers and punctuations. If we want to keep the
> >> main-loop inside each operator, then they all have to manage input
> >> readers and inline events manually.
> >>
> >> On Tue, May 5, 2015 at 2:41 PM, Kostas Tzoumas <[hidden email]
> <javascript:;>
> >> <javascript:;>> wrote:
> >> > Can you give us a rough idea of the pros and cons? Do we lose some
> >> > functionality by getting rid of iterations?
> >> >
> >> > Kostas
> >> >
> >> > On Tue, May 5, 2015 at 1:37 PM, Aljoscha Krettek <[hidden email]
> <javascript:;>
> >> <javascript:;>>
> >> > wrote:
> >> >
> >> >> Hi Folks,
> >> >> while working on introducing source-assigned timestamps into
> streaming
> >> >> (https://issues.apache.org/jira/browse/FLINK-1967) I thought about
> how
> >> >> the punctuations (low watermarks) can be pushed through the system.
> >> >> The problem is, that operators can have two ways of getting input: 1.
> >> >> They read directly from input iterators, and 2. They act as a
> >> >> Collector and get elements via collect() from the previous operator
> in
> >> >> a chain.
> >> >>
> >> >> This makes it hard to push things through a chain that are not
> >> >> elements, such as barriers and/or punctuations.
> >> >>
> >> >> I propose to change all streaming operators to be push based, with a
> >> >> slightly improved interface: In addition to collect(), which I would
> >> >> call receiveElement() I would add receivePunctuation() and
> >> >> receiveBarrier(). The first operator in the chain would also get data
> >> >> from the outside invokable that reads from the input iterator and
> >> >> calls receiveElement() for the first operator in a chain.
> >> >>
> >> >> What do you think? I would of course be willing to implement this
> >> myself.
> >> >>
> >> >> Cheers,
> >> >> Aljoscha
> >> >>
> >>
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] Change Streaming Operators to be Push-Only

Aljoscha Krettek-2
I don't know, I just put that there because other people are working
on the checkpointing/barrier thing. So there would need to be some
functionality there at some point.

Or maybe it is not required there and can be handled in the
StreamTask. Others might know this better than I do right now.

On Tue, May 5, 2015 at 3:24 PM, Gyula Fóra <[hidden email]> wrote:

> What would the processBarrier method do?
>
> On Tuesday, May 5, 2015, Aljoscha Krettek <[hidden email]> wrote:
>
>> I'm using the term punctuation and watermark interchangeably here
>> because for practical purposes they do the same thing. I'm not sure
>> what you meant with your comment about those.
>>
>> For the Operator interface I'm thinking about something like this:
>>
>> abstract class OneInputStreamOperator<IN, OUT, F extends Function>  {
>>     public processElement(IN element);
>>     public processBarrier(...);
>>     public processPunctuation/lowWatermark(...):
>> }
>>
>> The operator also has access to the TaskContext and ExecutionConfig
>> and Serializers. The operator would emit values using an emit() method
>> or the Collector interface, not sure about that yet.
>>
>> On Tue, May 5, 2015 at 3:12 PM, Gyula Fóra <[hidden email]
>> <javascript:;>> wrote:
>> > I think this a good idea in general. I would try to minimize the methods
>> we
>> > include and make the ones that we keep very concrete. For instance i
>> would
>> > not have the receive barrier method as that is handled on a totally
>> > different level already. And instead of punctuation I would directly add
>> a
>> > method to work on watermarks.
>> >
>> > On Tuesday, May 5, 2015, Aljoscha Krettek <[hidden email]
>> <javascript:;>> wrote:
>> >
>> >> What do you mean by "losing iterations"?
>> >>
>> >> For the pros and cons:
>> >>
>> >> Cons: I can't think of any, since most of the operators are chainable
>> >> already and already behave like a collector.
>> >>
>> >> Pros:
>> >>  - Unified model for operators, chainable operators don't have to
>> >> worry about input iterators and the collect interface.
>> >>  - Enables features that we want in the future, such as barriers and
>> >> punctuations because they don't work with the
>> >>    simple Collector interface.
>> >>  - The while-loop is moved outside of the operators, now the Task (the
>> >> thing that runs Operators) can control the flow of data better and
>> >> deal with
>> >>    stuff like barriers and punctuations. If we want to keep the
>> >> main-loop inside each operator, then they all have to manage input
>> >> readers and inline events manually.
>> >>
>> >> On Tue, May 5, 2015 at 2:41 PM, Kostas Tzoumas <[hidden email]
>> <javascript:;>
>> >> <javascript:;>> wrote:
>> >> > Can you give us a rough idea of the pros and cons? Do we lose some
>> >> > functionality by getting rid of iterations?
>> >> >
>> >> > Kostas
>> >> >
>> >> > On Tue, May 5, 2015 at 1:37 PM, Aljoscha Krettek <[hidden email]
>> <javascript:;>
>> >> <javascript:;>>
>> >> > wrote:
>> >> >
>> >> >> Hi Folks,
>> >> >> while working on introducing source-assigned timestamps into
>> streaming
>> >> >> (https://issues.apache.org/jira/browse/FLINK-1967) I thought about
>> how
>> >> >> the punctuations (low watermarks) can be pushed through the system.
>> >> >> The problem is, that operators can have two ways of getting input: 1.
>> >> >> They read directly from input iterators, and 2. They act as a
>> >> >> Collector and get elements via collect() from the previous operator
>> in
>> >> >> a chain.
>> >> >>
>> >> >> This makes it hard to push things through a chain that are not
>> >> >> elements, such as barriers and/or punctuations.
>> >> >>
>> >> >> I propose to change all streaming operators to be push based, with a
>> >> >> slightly improved interface: In addition to collect(), which I would
>> >> >> call receiveElement() I would add receivePunctuation() and
>> >> >> receiveBarrier(). The first operator in the chain would also get data
>> >> >> from the outside invokable that reads from the input iterator and
>> >> >> calls receiveElement() for the first operator in a chain.
>> >> >>
>> >> >> What do you think? I would of course be willing to implement this
>> >> myself.
>> >> >>
>> >> >> Cheers,
>> >> >> Aljoscha
>> >> >>
>> >>
>>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] Change Streaming Operators to be Push-Only

Gyula Fóra
Yes, we dont need that method there. Snapshots are handled as a call to the
streamtask from the input reader.

On Tuesday, May 5, 2015, Aljoscha Krettek <[hidden email]> wrote:

> I don't know, I just put that there because other people are working
> on the checkpointing/barrier thing. So there would need to be some
> functionality there at some point.
>
> Or maybe it is not required there and can be handled in the
> StreamTask. Others might know this better than I do right now.
>
> On Tue, May 5, 2015 at 3:24 PM, Gyula Fóra <[hidden email]
> <javascript:;>> wrote:
> > What would the processBarrier method do?
> >
> > On Tuesday, May 5, 2015, Aljoscha Krettek <[hidden email]
> <javascript:;>> wrote:
> >
> >> I'm using the term punctuation and watermark interchangeably here
> >> because for practical purposes they do the same thing. I'm not sure
> >> what you meant with your comment about those.
> >>
> >> For the Operator interface I'm thinking about something like this:
> >>
> >> abstract class OneInputStreamOperator<IN, OUT, F extends Function>  {
> >>     public processElement(IN element);
> >>     public processBarrier(...);
> >>     public processPunctuation/lowWatermark(...):
> >> }
> >>
> >> The operator also has access to the TaskContext and ExecutionConfig
> >> and Serializers. The operator would emit values using an emit() method
> >> or the Collector interface, not sure about that yet.
> >>
> >> On Tue, May 5, 2015 at 3:12 PM, Gyula Fóra <[hidden email]
> <javascript:;>
> >> <javascript:;>> wrote:
> >> > I think this a good idea in general. I would try to minimize the
> methods
> >> we
> >> > include and make the ones that we keep very concrete. For instance i
> >> would
> >> > not have the receive barrier method as that is handled on a totally
> >> > different level already. And instead of punctuation I would directly
> add
> >> a
> >> > method to work on watermarks.
> >> >
> >> > On Tuesday, May 5, 2015, Aljoscha Krettek <[hidden email]
> <javascript:;>
> >> <javascript:;>> wrote:
> >> >
> >> >> What do you mean by "losing iterations"?
> >> >>
> >> >> For the pros and cons:
> >> >>
> >> >> Cons: I can't think of any, since most of the operators are chainable
> >> >> already and already behave like a collector.
> >> >>
> >> >> Pros:
> >> >>  - Unified model for operators, chainable operators don't have to
> >> >> worry about input iterators and the collect interface.
> >> >>  - Enables features that we want in the future, such as barriers and
> >> >> punctuations because they don't work with the
> >> >>    simple Collector interface.
> >> >>  - The while-loop is moved outside of the operators, now the Task
> (the
> >> >> thing that runs Operators) can control the flow of data better and
> >> >> deal with
> >> >>    stuff like barriers and punctuations. If we want to keep the
> >> >> main-loop inside each operator, then they all have to manage input
> >> >> readers and inline events manually.
> >> >>
> >> >> On Tue, May 5, 2015 at 2:41 PM, Kostas Tzoumas <[hidden email]
> <javascript:;>
> >> <javascript:;>
> >> >> <javascript:;>> wrote:
> >> >> > Can you give us a rough idea of the pros and cons? Do we lose some
> >> >> > functionality by getting rid of iterations?
> >> >> >
> >> >> > Kostas
> >> >> >
> >> >> > On Tue, May 5, 2015 at 1:37 PM, Aljoscha Krettek <
> [hidden email] <javascript:;>
> >> <javascript:;>
> >> >> <javascript:;>>
> >> >> > wrote:
> >> >> >
> >> >> >> Hi Folks,
> >> >> >> while working on introducing source-assigned timestamps into
> >> streaming
> >> >> >> (https://issues.apache.org/jira/browse/FLINK-1967) I thought
> about
> >> how
> >> >> >> the punctuations (low watermarks) can be pushed through the
> system.
> >> >> >> The problem is, that operators can have two ways of getting
> input: 1.
> >> >> >> They read directly from input iterators, and 2. They act as a
> >> >> >> Collector and get elements via collect() from the previous
> operator
> >> in
> >> >> >> a chain.
> >> >> >>
> >> >> >> This makes it hard to push things through a chain that are not
> >> >> >> elements, such as barriers and/or punctuations.
> >> >> >>
> >> >> >> I propose to change all streaming operators to be push based,
> with a
> >> >> >> slightly improved interface: In addition to collect(), which I
> would
> >> >> >> call receiveElement() I would add receivePunctuation() and
> >> >> >> receiveBarrier(). The first operator in the chain would also get
> data
> >> >> >> from the outside invokable that reads from the input iterator and
> >> >> >> calls receiveElement() for the first operator in a chain.
> >> >> >>
> >> >> >> What do you think? I would of course be willing to implement this
> >> >> myself.
> >> >> >>
> >> >> >> Cheers,
> >> >> >> Aljoscha
> >> >> >>
> >> >>
> >>
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] Change Streaming Operators to be Push-Only

Kostas Tzoumas-2
In reply to this post by Aljoscha Krettek-2
oops, meant "iterators" :-)

On Tue, May 5, 2015 at 3:04 PM, Aljoscha Krettek <[hidden email]>
wrote:

> What do you mean by "losing iterations"?
>
> For the pros and cons:
>
> Cons: I can't think of any, since most of the operators are chainable
> already and already behave like a collector.
>
> Pros:
>  - Unified model for operators, chainable operators don't have to
> worry about input iterators and the collect interface.
>  - Enables features that we want in the future, such as barriers and
> punctuations because they don't work with the
>    simple Collector interface.
>  - The while-loop is moved outside of the operators, now the Task (the
> thing that runs Operators) can control the flow of data better and
> deal with
>    stuff like barriers and punctuations. If we want to keep the
> main-loop inside each operator, then they all have to manage input
> readers and inline events manually.
>
> On Tue, May 5, 2015 at 2:41 PM, Kostas Tzoumas <[hidden email]>
> wrote:
> > Can you give us a rough idea of the pros and cons? Do we lose some
> > functionality by getting rid of iterations?
> >
> > Kostas
> >
> > On Tue, May 5, 2015 at 1:37 PM, Aljoscha Krettek <[hidden email]>
> > wrote:
> >
> >> Hi Folks,
> >> while working on introducing source-assigned timestamps into streaming
> >> (https://issues.apache.org/jira/browse/FLINK-1967) I thought about how
> >> the punctuations (low watermarks) can be pushed through the system.
> >> The problem is, that operators can have two ways of getting input: 1.
> >> They read directly from input iterators, and 2. They act as a
> >> Collector and get elements via collect() from the previous operator in
> >> a chain.
> >>
> >> This makes it hard to push things through a chain that are not
> >> elements, such as barriers and/or punctuations.
> >>
> >> I propose to change all streaming operators to be push based, with a
> >> slightly improved interface: In addition to collect(), which I would
> >> call receiveElement() I would add receivePunctuation() and
> >> receiveBarrier(). The first operator in the chain would also get data
> >> from the outside invokable that reads from the input iterator and
> >> calls receiveElement() for the first operator in a chain.
> >>
> >> What do you think? I would of course be willing to implement this
> myself.
> >>
> >> Cheers,
> >> Aljoscha
> >>
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] Change Streaming Operators to be Push-Only

Paris Carbone
In reply to this post by Aljoscha Krettek-2
I agree with Gyula on this one. Barriers should better not be exposed to the operator. They are system events for state management. Apart from that, watermark handling seems to be on a right track, I like it so far.

> On 05 May 2015, at 15:26, Aljoscha Krettek <[hidden email]> wrote:
>
> I don't know, I just put that there because other people are working
> on the checkpointing/barrier thing. So there would need to be some
> functionality there at some point.
>
> Or maybe it is not required there and can be handled in the
> StreamTask. Others might know this better than I do right now.
>
> On Tue, May 5, 2015 at 3:24 PM, Gyula Fóra <[hidden email]> wrote:
>> What would the processBarrier method do?
>>
>> On Tuesday, May 5, 2015, Aljoscha Krettek <[hidden email]> wrote:
>>
>>> I'm using the term punctuation and watermark interchangeably here
>>> because for practical purposes they do the same thing. I'm not sure
>>> what you meant with your comment about those.
>>>
>>> For the Operator interface I'm thinking about something like this:
>>>
>>> abstract class OneInputStreamOperator<IN, OUT, F extends Function>  {
>>>   public processElement(IN element);
>>>   public processBarrier(...);
>>>   public processPunctuation/lowWatermark(...):
>>> }
>>>
>>> The operator also has access to the TaskContext and ExecutionConfig
>>> and Serializers. The operator would emit values using an emit() method
>>> or the Collector interface, not sure about that yet.
>>>
>>> On Tue, May 5, 2015 at 3:12 PM, Gyula Fóra <[hidden email]
>>> <javascript:;>> wrote:
>>>> I think this a good idea in general. I would try to minimize the methods
>>> we
>>>> include and make the ones that we keep very concrete. For instance i
>>> would
>>>> not have the receive barrier method as that is handled on a totally
>>>> different level already. And instead of punctuation I would directly add
>>> a
>>>> method to work on watermarks.
>>>>
>>>> On Tuesday, May 5, 2015, Aljoscha Krettek <[hidden email]
>>> <javascript:;>> wrote:
>>>>
>>>>> What do you mean by "losing iterations"?
>>>>>
>>>>> For the pros and cons:
>>>>>
>>>>> Cons: I can't think of any, since most of the operators are chainable
>>>>> already and already behave like a collector.
>>>>>
>>>>> Pros:
>>>>> - Unified model for operators, chainable operators don't have to
>>>>> worry about input iterators and the collect interface.
>>>>> - Enables features that we want in the future, such as barriers and
>>>>> punctuations because they don't work with the
>>>>>  simple Collector interface.
>>>>> - The while-loop is moved outside of the operators, now the Task (the
>>>>> thing that runs Operators) can control the flow of data better and
>>>>> deal with
>>>>>  stuff like barriers and punctuations. If we want to keep the
>>>>> main-loop inside each operator, then they all have to manage input
>>>>> readers and inline events manually.
>>>>>
>>>>> On Tue, May 5, 2015 at 2:41 PM, Kostas Tzoumas <[hidden email]
>>> <javascript:;>
>>>>> <javascript:;>> wrote:
>>>>>> Can you give us a rough idea of the pros and cons? Do we lose some
>>>>>> functionality by getting rid of iterations?
>>>>>>
>>>>>> Kostas
>>>>>>
>>>>>> On Tue, May 5, 2015 at 1:37 PM, Aljoscha Krettek <[hidden email]
>>> <javascript:;>
>>>>> <javascript:;>>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Folks,
>>>>>>> while working on introducing source-assigned timestamps into
>>> streaming
>>>>>>> (https://issues.apache.org/jira/browse/FLINK-1967) I thought about
>>> how
>>>>>>> the punctuations (low watermarks) can be pushed through the system.
>>>>>>> The problem is, that operators can have two ways of getting input: 1.
>>>>>>> They read directly from input iterators, and 2. They act as a
>>>>>>> Collector and get elements via collect() from the previous operator
>>> in
>>>>>>> a chain.
>>>>>>>
>>>>>>> This makes it hard to push things through a chain that are not
>>>>>>> elements, such as barriers and/or punctuations.
>>>>>>>
>>>>>>> I propose to change all streaming operators to be push based, with a
>>>>>>> slightly improved interface: In addition to collect(), which I would
>>>>>>> call receiveElement() I would add receivePunctuation() and
>>>>>>> receiveBarrier(). The first operator in the chain would also get data
>>>>>>> from the outside invokable that reads from the input iterator and
>>>>>>> calls receiveElement() for the first operator in a chain.
>>>>>>>
>>>>>>> What do you think? I would of course be willing to implement this
>>>>> myself.
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Aljoscha
>>>>>>>
>>>>>
>>>

Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] Change Streaming Operators to be Push-Only

Aljoscha Krettek-2
There is no watermark handling yet. :D

But this would enable me to do this.

On Tue, May 5, 2015 at 3:39 PM, Paris Carbone <[hidden email]> wrote:

> I agree with Gyula on this one. Barriers should better not be exposed to the operator. They are system events for state management. Apart from that, watermark handling seems to be on a right track, I like it so far.
>
>> On 05 May 2015, at 15:26, Aljoscha Krettek <[hidden email]> wrote:
>>
>> I don't know, I just put that there because other people are working
>> on the checkpointing/barrier thing. So there would need to be some
>> functionality there at some point.
>>
>> Or maybe it is not required there and can be handled in the
>> StreamTask. Others might know this better than I do right now.
>>
>> On Tue, May 5, 2015 at 3:24 PM, Gyula Fóra <[hidden email]> wrote:
>>> What would the processBarrier method do?
>>>
>>> On Tuesday, May 5, 2015, Aljoscha Krettek <[hidden email]> wrote:
>>>
>>>> I'm using the term punctuation and watermark interchangeably here
>>>> because for practical purposes they do the same thing. I'm not sure
>>>> what you meant with your comment about those.
>>>>
>>>> For the Operator interface I'm thinking about something like this:
>>>>
>>>> abstract class OneInputStreamOperator<IN, OUT, F extends Function>  {
>>>>   public processElement(IN element);
>>>>   public processBarrier(...);
>>>>   public processPunctuation/lowWatermark(...):
>>>> }
>>>>
>>>> The operator also has access to the TaskContext and ExecutionConfig
>>>> and Serializers. The operator would emit values using an emit() method
>>>> or the Collector interface, not sure about that yet.
>>>>
>>>> On Tue, May 5, 2015 at 3:12 PM, Gyula Fóra <[hidden email]
>>>> <javascript:;>> wrote:
>>>>> I think this a good idea in general. I would try to minimize the methods
>>>> we
>>>>> include and make the ones that we keep very concrete. For instance i
>>>> would
>>>>> not have the receive barrier method as that is handled on a totally
>>>>> different level already. And instead of punctuation I would directly add
>>>> a
>>>>> method to work on watermarks.
>>>>>
>>>>> On Tuesday, May 5, 2015, Aljoscha Krettek <[hidden email]
>>>> <javascript:;>> wrote:
>>>>>
>>>>>> What do you mean by "losing iterations"?
>>>>>>
>>>>>> For the pros and cons:
>>>>>>
>>>>>> Cons: I can't think of any, since most of the operators are chainable
>>>>>> already and already behave like a collector.
>>>>>>
>>>>>> Pros:
>>>>>> - Unified model for operators, chainable operators don't have to
>>>>>> worry about input iterators and the collect interface.
>>>>>> - Enables features that we want in the future, such as barriers and
>>>>>> punctuations because they don't work with the
>>>>>>  simple Collector interface.
>>>>>> - The while-loop is moved outside of the operators, now the Task (the
>>>>>> thing that runs Operators) can control the flow of data better and
>>>>>> deal with
>>>>>>  stuff like barriers and punctuations. If we want to keep the
>>>>>> main-loop inside each operator, then they all have to manage input
>>>>>> readers and inline events manually.
>>>>>>
>>>>>> On Tue, May 5, 2015 at 2:41 PM, Kostas Tzoumas <[hidden email]
>>>> <javascript:;>
>>>>>> <javascript:;>> wrote:
>>>>>>> Can you give us a rough idea of the pros and cons? Do we lose some
>>>>>>> functionality by getting rid of iterations?
>>>>>>>
>>>>>>> Kostas
>>>>>>>
>>>>>>> On Tue, May 5, 2015 at 1:37 PM, Aljoscha Krettek <[hidden email]
>>>> <javascript:;>
>>>>>> <javascript:;>>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi Folks,
>>>>>>>> while working on introducing source-assigned timestamps into
>>>> streaming
>>>>>>>> (https://issues.apache.org/jira/browse/FLINK-1967) I thought about
>>>> how
>>>>>>>> the punctuations (low watermarks) can be pushed through the system.
>>>>>>>> The problem is, that operators can have two ways of getting input: 1.
>>>>>>>> They read directly from input iterators, and 2. They act as a
>>>>>>>> Collector and get elements via collect() from the previous operator
>>>> in
>>>>>>>> a chain.
>>>>>>>>
>>>>>>>> This makes it hard to push things through a chain that are not
>>>>>>>> elements, such as barriers and/or punctuations.
>>>>>>>>
>>>>>>>> I propose to change all streaming operators to be push based, with a
>>>>>>>> slightly improved interface: In addition to collect(), which I would
>>>>>>>> call receiveElement() I would add receivePunctuation() and
>>>>>>>> receiveBarrier(). The first operator in the chain would also get data
>>>>>>>> from the outside invokable that reads from the input iterator and
>>>>>>>> calls receiveElement() for the first operator in a chain.
>>>>>>>>
>>>>>>>> What do you think? I would of course be willing to implement this
>>>>>> myself.
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Aljoscha
>>>>>>>>
>>>>>>
>>>>
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] Change Streaming Operators to be Push-Only

Paris Carbone
By watermark handling I meant making punctuations explicit and forwarding/modifying them at the operator level. I think this is clear so far.

> On 05 May 2015, at 15:41, Aljoscha Krettek <[hidden email]> wrote:
>
> There is no watermark handling yet. :D
>
> But this would enable me to do this.
>
> On Tue, May 5, 2015 at 3:39 PM, Paris Carbone <[hidden email]> wrote:
>> I agree with Gyula on this one. Barriers should better not be exposed to the operator. They are system events for state management. Apart from that, watermark handling seems to be on a right track, I like it so far.
>>
>>> On 05 May 2015, at 15:26, Aljoscha Krettek <[hidden email]> wrote:
>>>
>>> I don't know, I just put that there because other people are working
>>> on the checkpointing/barrier thing. So there would need to be some
>>> functionality there at some point.
>>>
>>> Or maybe it is not required there and can be handled in the
>>> StreamTask. Others might know this better than I do right now.
>>>
>>> On Tue, May 5, 2015 at 3:24 PM, Gyula Fóra <[hidden email]> wrote:
>>>> What would the processBarrier method do?
>>>>
>>>> On Tuesday, May 5, 2015, Aljoscha Krettek <[hidden email]> wrote:
>>>>
>>>>> I'm using the term punctuation and watermark interchangeably here
>>>>> because for practical purposes they do the same thing. I'm not sure
>>>>> what you meant with your comment about those.
>>>>>
>>>>> For the Operator interface I'm thinking about something like this:
>>>>>
>>>>> abstract class OneInputStreamOperator<IN, OUT, F extends Function>  {
>>>>>  public processElement(IN element);
>>>>>  public processBarrier(...);
>>>>>  public processPunctuation/lowWatermark(...):
>>>>> }
>>>>>
>>>>> The operator also has access to the TaskContext and ExecutionConfig
>>>>> and Serializers. The operator would emit values using an emit() method
>>>>> or the Collector interface, not sure about that yet.
>>>>>
>>>>> On Tue, May 5, 2015 at 3:12 PM, Gyula Fóra <[hidden email]
>>>>> <javascript:;>> wrote:
>>>>>> I think this a good idea in general. I would try to minimize the methods
>>>>> we
>>>>>> include and make the ones that we keep very concrete. For instance i
>>>>> would
>>>>>> not have the receive barrier method as that is handled on a totally
>>>>>> different level already. And instead of punctuation I would directly add
>>>>> a
>>>>>> method to work on watermarks.
>>>>>>
>>>>>> On Tuesday, May 5, 2015, Aljoscha Krettek <[hidden email]
>>>>> <javascript:;>> wrote:
>>>>>>
>>>>>>> What do you mean by "losing iterations"?
>>>>>>>
>>>>>>> For the pros and cons:
>>>>>>>
>>>>>>> Cons: I can't think of any, since most of the operators are chainable
>>>>>>> already and already behave like a collector.
>>>>>>>
>>>>>>> Pros:
>>>>>>> - Unified model for operators, chainable operators don't have to
>>>>>>> worry about input iterators and the collect interface.
>>>>>>> - Enables features that we want in the future, such as barriers and
>>>>>>> punctuations because they don't work with the
>>>>>>> simple Collector interface.
>>>>>>> - The while-loop is moved outside of the operators, now the Task (the
>>>>>>> thing that runs Operators) can control the flow of data better and
>>>>>>> deal with
>>>>>>> stuff like barriers and punctuations. If we want to keep the
>>>>>>> main-loop inside each operator, then they all have to manage input
>>>>>>> readers and inline events manually.
>>>>>>>
>>>>>>> On Tue, May 5, 2015 at 2:41 PM, Kostas Tzoumas <[hidden email]
>>>>> <javascript:;>
>>>>>>> <javascript:;>> wrote:
>>>>>>>> Can you give us a rough idea of the pros and cons? Do we lose some
>>>>>>>> functionality by getting rid of iterations?
>>>>>>>>
>>>>>>>> Kostas
>>>>>>>>
>>>>>>>> On Tue, May 5, 2015 at 1:37 PM, Aljoscha Krettek <[hidden email]
>>>>> <javascript:;>
>>>>>>> <javascript:;>>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi Folks,
>>>>>>>>> while working on introducing source-assigned timestamps into
>>>>> streaming
>>>>>>>>> (https://issues.apache.org/jira/browse/FLINK-1967) I thought about
>>>>> how
>>>>>>>>> the punctuations (low watermarks) can be pushed through the system.
>>>>>>>>> The problem is, that operators can have two ways of getting input: 1.
>>>>>>>>> They read directly from input iterators, and 2. They act as a
>>>>>>>>> Collector and get elements via collect() from the previous operator
>>>>> in
>>>>>>>>> a chain.
>>>>>>>>>
>>>>>>>>> This makes it hard to push things through a chain that are not
>>>>>>>>> elements, such as barriers and/or punctuations.
>>>>>>>>>
>>>>>>>>> I propose to change all streaming operators to be push based, with a
>>>>>>>>> slightly improved interface: In addition to collect(), which I would
>>>>>>>>> call receiveElement() I would add receivePunctuation() and
>>>>>>>>> receiveBarrier(). The first operator in the chain would also get data
>>>>>>>>> from the outside invokable that reads from the input iterator and
>>>>>>>>> calls receiveElement() for the first operator in a chain.
>>>>>>>>>
>>>>>>>>> What do you think? I would of course be willing to implement this
>>>>>>> myself.
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>> Aljoscha
>>>>>>>>>
>>>>>>>
>>>>>
>>

Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] Change Streaming Operators to be Push-Only

Stephan Ewen
In reply to this post by Aljoscha Krettek-2
Does the operator have to know about barriers actually?

My first intuition would be that the operator reacts to a barrier the same
way as to a punctuation/watermark.

The outside driver handles the barriers as follows
 1) Punctuate operator
 2) Draw operator state snapshot
 3) send output barriers
 4) confirm after snapshot write is complete

That way, the operator needs not deal with checkpointing, but only allow
the driver to draw a state snapshot.



On Tue, May 5, 2015 at 3:41 PM, Aljoscha Krettek <[hidden email]>
wrote:

> There is no watermark handling yet. :D
>
> But this would enable me to do this.
>
> On Tue, May 5, 2015 at 3:39 PM, Paris Carbone <[hidden email]> wrote:
> > I agree with Gyula on this one. Barriers should better not be exposed to
> the operator. They are system events for state management. Apart from that,
> watermark handling seems to be on a right track, I like it so far.
> >
> >> On 05 May 2015, at 15:26, Aljoscha Krettek <[hidden email]> wrote:
> >>
> >> I don't know, I just put that there because other people are working
> >> on the checkpointing/barrier thing. So there would need to be some
> >> functionality there at some point.
> >>
> >> Or maybe it is not required there and can be handled in the
> >> StreamTask. Others might know this better than I do right now.
> >>
> >> On Tue, May 5, 2015 at 3:24 PM, Gyula Fóra <[hidden email]>
> wrote:
> >>> What would the processBarrier method do?
> >>>
> >>> On Tuesday, May 5, 2015, Aljoscha Krettek <[hidden email]> wrote:
> >>>
> >>>> I'm using the term punctuation and watermark interchangeably here
> >>>> because for practical purposes they do the same thing. I'm not sure
> >>>> what you meant with your comment about those.
> >>>>
> >>>> For the Operator interface I'm thinking about something like this:
> >>>>
> >>>> abstract class OneInputStreamOperator<IN, OUT, F extends Function>  {
> >>>>   public processElement(IN element);
> >>>>   public processBarrier(...);
> >>>>   public processPunctuation/lowWatermark(...):
> >>>> }
> >>>>
> >>>> The operator also has access to the TaskContext and ExecutionConfig
> >>>> and Serializers. The operator would emit values using an emit() method
> >>>> or the Collector interface, not sure about that yet.
> >>>>
> >>>> On Tue, May 5, 2015 at 3:12 PM, Gyula Fóra <[hidden email]
> >>>> <javascript:;>> wrote:
> >>>>> I think this a good idea in general. I would try to minimize the
> methods
> >>>> we
> >>>>> include and make the ones that we keep very concrete. For instance i
> >>>> would
> >>>>> not have the receive barrier method as that is handled on a totally
> >>>>> different level already. And instead of punctuation I would directly
> add
> >>>> a
> >>>>> method to work on watermarks.
> >>>>>
> >>>>> On Tuesday, May 5, 2015, Aljoscha Krettek <[hidden email]
> >>>> <javascript:;>> wrote:
> >>>>>
> >>>>>> What do you mean by "losing iterations"?
> >>>>>>
> >>>>>> For the pros and cons:
> >>>>>>
> >>>>>> Cons: I can't think of any, since most of the operators are
> chainable
> >>>>>> already and already behave like a collector.
> >>>>>>
> >>>>>> Pros:
> >>>>>> - Unified model for operators, chainable operators don't have to
> >>>>>> worry about input iterators and the collect interface.
> >>>>>> - Enables features that we want in the future, such as barriers and
> >>>>>> punctuations because they don't work with the
> >>>>>>  simple Collector interface.
> >>>>>> - The while-loop is moved outside of the operators, now the Task
> (the
> >>>>>> thing that runs Operators) can control the flow of data better and
> >>>>>> deal with
> >>>>>>  stuff like barriers and punctuations. If we want to keep the
> >>>>>> main-loop inside each operator, then they all have to manage input
> >>>>>> readers and inline events manually.
> >>>>>>
> >>>>>> On Tue, May 5, 2015 at 2:41 PM, Kostas Tzoumas <[hidden email]
> >>>> <javascript:;>
> >>>>>> <javascript:;>> wrote:
> >>>>>>> Can you give us a rough idea of the pros and cons? Do we lose some
> >>>>>>> functionality by getting rid of iterations?
> >>>>>>>
> >>>>>>> Kostas
> >>>>>>>
> >>>>>>> On Tue, May 5, 2015 at 1:37 PM, Aljoscha Krettek <
> [hidden email]
> >>>> <javascript:;>
> >>>>>> <javascript:;>>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>>> Hi Folks,
> >>>>>>>> while working on introducing source-assigned timestamps into
> >>>> streaming
> >>>>>>>> (https://issues.apache.org/jira/browse/FLINK-1967) I thought
> about
> >>>> how
> >>>>>>>> the punctuations (low watermarks) can be pushed through the
> system.
> >>>>>>>> The problem is, that operators can have two ways of getting
> input: 1.
> >>>>>>>> They read directly from input iterators, and 2. They act as a
> >>>>>>>> Collector and get elements via collect() from the previous
> operator
> >>>> in
> >>>>>>>> a chain.
> >>>>>>>>
> >>>>>>>> This makes it hard to push things through a chain that are not
> >>>>>>>> elements, such as barriers and/or punctuations.
> >>>>>>>>
> >>>>>>>> I propose to change all streaming operators to be push based,
> with a
> >>>>>>>> slightly improved interface: In addition to collect(), which I
> would
> >>>>>>>> call receiveElement() I would add receivePunctuation() and
> >>>>>>>> receiveBarrier(). The first operator in the chain would also get
> data
> >>>>>>>> from the outside invokable that reads from the input iterator and
> >>>>>>>> calls receiveElement() for the first operator in a chain.
> >>>>>>>>
> >>>>>>>> What do you think? I would of course be willing to implement this
> >>>>>> myself.
> >>>>>>>>
> >>>>>>>> Cheers,
> >>>>>>>> Aljoscha
> >>>>>>>>
> >>>>>>
> >>>>
> >
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] Change Streaming Operators to be Push-Only

Aljoscha Krettek-2
In reply to this post by Paris Carbone
Yes, because the handling of punctuations depends on the operator: A
MapOperator can just forward them while a windowed join or reduce can
only forward them after emitting the correct windows or results.

On Tue, May 5, 2015 at 3:58 PM, Paris Carbone <[hidden email]> wrote:

> By watermark handling I meant making punctuations explicit and forwarding/modifying them at the operator level. I think this is clear so far.
>> On 05 May 2015, at 15:41, Aljoscha Krettek <[hidden email]> wrote:
>>
>> There is no watermark handling yet. :D
>>
>> But this would enable me to do this.
>>
>> On Tue, May 5, 2015 at 3:39 PM, Paris Carbone <[hidden email]> wrote:
>>> I agree with Gyula on this one. Barriers should better not be exposed to the operator. They are system events for state management. Apart from that, watermark handling seems to be on a right track, I like it so far.
>>>
>>>> On 05 May 2015, at 15:26, Aljoscha Krettek <[hidden email]> wrote:
>>>>
>>>> I don't know, I just put that there because other people are working
>>>> on the checkpointing/barrier thing. So there would need to be some
>>>> functionality there at some point.
>>>>
>>>> Or maybe it is not required there and can be handled in the
>>>> StreamTask. Others might know this better than I do right now.
>>>>
>>>> On Tue, May 5, 2015 at 3:24 PM, Gyula Fóra <[hidden email]> wrote:
>>>>> What would the processBarrier method do?
>>>>>
>>>>> On Tuesday, May 5, 2015, Aljoscha Krettek <[hidden email]> wrote:
>>>>>
>>>>>> I'm using the term punctuation and watermark interchangeably here
>>>>>> because for practical purposes they do the same thing. I'm not sure
>>>>>> what you meant with your comment about those.
>>>>>>
>>>>>> For the Operator interface I'm thinking about something like this:
>>>>>>
>>>>>> abstract class OneInputStreamOperator<IN, OUT, F extends Function>  {
>>>>>>  public processElement(IN element);
>>>>>>  public processBarrier(...);
>>>>>>  public processPunctuation/lowWatermark(...):
>>>>>> }
>>>>>>
>>>>>> The operator also has access to the TaskContext and ExecutionConfig
>>>>>> and Serializers. The operator would emit values using an emit() method
>>>>>> or the Collector interface, not sure about that yet.
>>>>>>
>>>>>> On Tue, May 5, 2015 at 3:12 PM, Gyula Fóra <[hidden email]
>>>>>> <javascript:;>> wrote:
>>>>>>> I think this a good idea in general. I would try to minimize the methods
>>>>>> we
>>>>>>> include and make the ones that we keep very concrete. For instance i
>>>>>> would
>>>>>>> not have the receive barrier method as that is handled on a totally
>>>>>>> different level already. And instead of punctuation I would directly add
>>>>>> a
>>>>>>> method to work on watermarks.
>>>>>>>
>>>>>>> On Tuesday, May 5, 2015, Aljoscha Krettek <[hidden email]
>>>>>> <javascript:;>> wrote:
>>>>>>>
>>>>>>>> What do you mean by "losing iterations"?
>>>>>>>>
>>>>>>>> For the pros and cons:
>>>>>>>>
>>>>>>>> Cons: I can't think of any, since most of the operators are chainable
>>>>>>>> already and already behave like a collector.
>>>>>>>>
>>>>>>>> Pros:
>>>>>>>> - Unified model for operators, chainable operators don't have to
>>>>>>>> worry about input iterators and the collect interface.
>>>>>>>> - Enables features that we want in the future, such as barriers and
>>>>>>>> punctuations because they don't work with the
>>>>>>>> simple Collector interface.
>>>>>>>> - The while-loop is moved outside of the operators, now the Task (the
>>>>>>>> thing that runs Operators) can control the flow of data better and
>>>>>>>> deal with
>>>>>>>> stuff like barriers and punctuations. If we want to keep the
>>>>>>>> main-loop inside each operator, then they all have to manage input
>>>>>>>> readers and inline events manually.
>>>>>>>>
>>>>>>>> On Tue, May 5, 2015 at 2:41 PM, Kostas Tzoumas <[hidden email]
>>>>>> <javascript:;>
>>>>>>>> <javascript:;>> wrote:
>>>>>>>>> Can you give us a rough idea of the pros and cons? Do we lose some
>>>>>>>>> functionality by getting rid of iterations?
>>>>>>>>>
>>>>>>>>> Kostas
>>>>>>>>>
>>>>>>>>> On Tue, May 5, 2015 at 1:37 PM, Aljoscha Krettek <[hidden email]
>>>>>> <javascript:;>
>>>>>>>> <javascript:;>>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Folks,
>>>>>>>>>> while working on introducing source-assigned timestamps into
>>>>>> streaming
>>>>>>>>>> (https://issues.apache.org/jira/browse/FLINK-1967) I thought about
>>>>>> how
>>>>>>>>>> the punctuations (low watermarks) can be pushed through the system.
>>>>>>>>>> The problem is, that operators can have two ways of getting input: 1.
>>>>>>>>>> They read directly from input iterators, and 2. They act as a
>>>>>>>>>> Collector and get elements via collect() from the previous operator
>>>>>> in
>>>>>>>>>> a chain.
>>>>>>>>>>
>>>>>>>>>> This makes it hard to push things through a chain that are not
>>>>>>>>>> elements, such as barriers and/or punctuations.
>>>>>>>>>>
>>>>>>>>>> I propose to change all streaming operators to be push based, with a
>>>>>>>>>> slightly improved interface: In addition to collect(), which I would
>>>>>>>>>> call receiveElement() I would add receivePunctuation() and
>>>>>>>>>> receiveBarrier(). The first operator in the chain would also get data
>>>>>>>>>> from the outside invokable that reads from the input iterator and
>>>>>>>>>> calls receiveElement() for the first operator in a chain.
>>>>>>>>>>
>>>>>>>>>> What do you think? I would of course be willing to implement this
>>>>>>>> myself.
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>> Aljoscha
>>>>>>>>>>
>>>>>>>>
>>>>>>
>>>
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] Change Streaming Operators to be Push-Only

Aljoscha Krettek-2
So I gather I should go forward with this? If no-one objects I will
open a Jira and work on this.

On Tue, May 5, 2015 at 4:14 PM, Aljoscha Krettek <[hidden email]> wrote:

> Yes, because the handling of punctuations depends on the operator: A
> MapOperator can just forward them while a windowed join or reduce can
> only forward them after emitting the correct windows or results.
>
> On Tue, May 5, 2015 at 3:58 PM, Paris Carbone <[hidden email]> wrote:
>> By watermark handling I meant making punctuations explicit and forwarding/modifying them at the operator level. I think this is clear so far.
>>> On 05 May 2015, at 15:41, Aljoscha Krettek <[hidden email]> wrote:
>>>
>>> There is no watermark handling yet. :D
>>>
>>> But this would enable me to do this.
>>>
>>> On Tue, May 5, 2015 at 3:39 PM, Paris Carbone <[hidden email]> wrote:
>>>> I agree with Gyula on this one. Barriers should better not be exposed to the operator. They are system events for state management. Apart from that, watermark handling seems to be on a right track, I like it so far.
>>>>
>>>>> On 05 May 2015, at 15:26, Aljoscha Krettek <[hidden email]> wrote:
>>>>>
>>>>> I don't know, I just put that there because other people are working
>>>>> on the checkpointing/barrier thing. So there would need to be some
>>>>> functionality there at some point.
>>>>>
>>>>> Or maybe it is not required there and can be handled in the
>>>>> StreamTask. Others might know this better than I do right now.
>>>>>
>>>>> On Tue, May 5, 2015 at 3:24 PM, Gyula Fóra <[hidden email]> wrote:
>>>>>> What would the processBarrier method do?
>>>>>>
>>>>>> On Tuesday, May 5, 2015, Aljoscha Krettek <[hidden email]> wrote:
>>>>>>
>>>>>>> I'm using the term punctuation and watermark interchangeably here
>>>>>>> because for practical purposes they do the same thing. I'm not sure
>>>>>>> what you meant with your comment about those.
>>>>>>>
>>>>>>> For the Operator interface I'm thinking about something like this:
>>>>>>>
>>>>>>> abstract class OneInputStreamOperator<IN, OUT, F extends Function>  {
>>>>>>>  public processElement(IN element);
>>>>>>>  public processBarrier(...);
>>>>>>>  public processPunctuation/lowWatermark(...):
>>>>>>> }
>>>>>>>
>>>>>>> The operator also has access to the TaskContext and ExecutionConfig
>>>>>>> and Serializers. The operator would emit values using an emit() method
>>>>>>> or the Collector interface, not sure about that yet.
>>>>>>>
>>>>>>> On Tue, May 5, 2015 at 3:12 PM, Gyula Fóra <[hidden email]
>>>>>>> <javascript:;>> wrote:
>>>>>>>> I think this a good idea in general. I would try to minimize the methods
>>>>>>> we
>>>>>>>> include and make the ones that we keep very concrete. For instance i
>>>>>>> would
>>>>>>>> not have the receive barrier method as that is handled on a totally
>>>>>>>> different level already. And instead of punctuation I would directly add
>>>>>>> a
>>>>>>>> method to work on watermarks.
>>>>>>>>
>>>>>>>> On Tuesday, May 5, 2015, Aljoscha Krettek <[hidden email]
>>>>>>> <javascript:;>> wrote:
>>>>>>>>
>>>>>>>>> What do you mean by "losing iterations"?
>>>>>>>>>
>>>>>>>>> For the pros and cons:
>>>>>>>>>
>>>>>>>>> Cons: I can't think of any, since most of the operators are chainable
>>>>>>>>> already and already behave like a collector.
>>>>>>>>>
>>>>>>>>> Pros:
>>>>>>>>> - Unified model for operators, chainable operators don't have to
>>>>>>>>> worry about input iterators and the collect interface.
>>>>>>>>> - Enables features that we want in the future, such as barriers and
>>>>>>>>> punctuations because they don't work with the
>>>>>>>>> simple Collector interface.
>>>>>>>>> - The while-loop is moved outside of the operators, now the Task (the
>>>>>>>>> thing that runs Operators) can control the flow of data better and
>>>>>>>>> deal with
>>>>>>>>> stuff like barriers and punctuations. If we want to keep the
>>>>>>>>> main-loop inside each operator, then they all have to manage input
>>>>>>>>> readers and inline events manually.
>>>>>>>>>
>>>>>>>>> On Tue, May 5, 2015 at 2:41 PM, Kostas Tzoumas <[hidden email]
>>>>>>> <javascript:;>
>>>>>>>>> <javascript:;>> wrote:
>>>>>>>>>> Can you give us a rough idea of the pros and cons? Do we lose some
>>>>>>>>>> functionality by getting rid of iterations?
>>>>>>>>>>
>>>>>>>>>> Kostas
>>>>>>>>>>
>>>>>>>>>> On Tue, May 5, 2015 at 1:37 PM, Aljoscha Krettek <[hidden email]
>>>>>>> <javascript:;>
>>>>>>>>> <javascript:;>>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Folks,
>>>>>>>>>>> while working on introducing source-assigned timestamps into
>>>>>>> streaming
>>>>>>>>>>> (https://issues.apache.org/jira/browse/FLINK-1967) I thought about
>>>>>>> how
>>>>>>>>>>> the punctuations (low watermarks) can be pushed through the system.
>>>>>>>>>>> The problem is, that operators can have two ways of getting input: 1.
>>>>>>>>>>> They read directly from input iterators, and 2. They act as a
>>>>>>>>>>> Collector and get elements via collect() from the previous operator
>>>>>>> in
>>>>>>>>>>> a chain.
>>>>>>>>>>>
>>>>>>>>>>> This makes it hard to push things through a chain that are not
>>>>>>>>>>> elements, such as barriers and/or punctuations.
>>>>>>>>>>>
>>>>>>>>>>> I propose to change all streaming operators to be push based, with a
>>>>>>>>>>> slightly improved interface: In addition to collect(), which I would
>>>>>>>>>>> call receiveElement() I would add receivePunctuation() and
>>>>>>>>>>> receiveBarrier(). The first operator in the chain would also get data
>>>>>>>>>>> from the outside invokable that reads from the input iterator and
>>>>>>>>>>> calls receiveElement() for the first operator in a chain.
>>>>>>>>>>>
>>>>>>>>>>> What do you think? I would of course be willing to implement this
>>>>>>>>> myself.
>>>>>>>>>>>
>>>>>>>>>>> Cheers,
>>>>>>>>>>> Aljoscha
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>
>>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] Change Streaming Operators to be Push-Only

Stephan Ewen
Yep, I would say: Move ahead :-)

On Tue, May 5, 2015 at 4:48 PM, Aljoscha Krettek <[hidden email]>
wrote:

> So I gather I should go forward with this? If no-one objects I will
> open a Jira and work on this.
>
> On Tue, May 5, 2015 at 4:14 PM, Aljoscha Krettek <[hidden email]>
> wrote:
> > Yes, because the handling of punctuations depends on the operator: A
> > MapOperator can just forward them while a windowed join or reduce can
> > only forward them after emitting the correct windows or results.
> >
> > On Tue, May 5, 2015 at 3:58 PM, Paris Carbone <[hidden email]> wrote:
> >> By watermark handling I meant making punctuations explicit and
> forwarding/modifying them at the operator level. I think this is clear so
> far.
> >>> On 05 May 2015, at 15:41, Aljoscha Krettek <[hidden email]>
> wrote:
> >>>
> >>> There is no watermark handling yet. :D
> >>>
> >>> But this would enable me to do this.
> >>>
> >>> On Tue, May 5, 2015 at 3:39 PM, Paris Carbone <[hidden email]> wrote:
> >>>> I agree with Gyula on this one. Barriers should better not be exposed
> to the operator. They are system events for state management. Apart from
> that, watermark handling seems to be on a right track, I like it so far.
> >>>>
> >>>>> On 05 May 2015, at 15:26, Aljoscha Krettek <[hidden email]>
> wrote:
> >>>>>
> >>>>> I don't know, I just put that there because other people are working
> >>>>> on the checkpointing/barrier thing. So there would need to be some
> >>>>> functionality there at some point.
> >>>>>
> >>>>> Or maybe it is not required there and can be handled in the
> >>>>> StreamTask. Others might know this better than I do right now.
> >>>>>
> >>>>> On Tue, May 5, 2015 at 3:24 PM, Gyula Fóra <[hidden email]>
> wrote:
> >>>>>> What would the processBarrier method do?
> >>>>>>
> >>>>>> On Tuesday, May 5, 2015, Aljoscha Krettek <[hidden email]>
> wrote:
> >>>>>>
> >>>>>>> I'm using the term punctuation and watermark interchangeably here
> >>>>>>> because for practical purposes they do the same thing. I'm not sure
> >>>>>>> what you meant with your comment about those.
> >>>>>>>
> >>>>>>> For the Operator interface I'm thinking about something like this:
> >>>>>>>
> >>>>>>> abstract class OneInputStreamOperator<IN, OUT, F extends
> Function>  {
> >>>>>>>  public processElement(IN element);
> >>>>>>>  public processBarrier(...);
> >>>>>>>  public processPunctuation/lowWatermark(...):
> >>>>>>> }
> >>>>>>>
> >>>>>>> The operator also has access to the TaskContext and ExecutionConfig
> >>>>>>> and Serializers. The operator would emit values using an emit()
> method
> >>>>>>> or the Collector interface, not sure about that yet.
> >>>>>>>
> >>>>>>> On Tue, May 5, 2015 at 3:12 PM, Gyula Fóra <[hidden email]
> >>>>>>> <javascript:;>> wrote:
> >>>>>>>> I think this a good idea in general. I would try to minimize the
> methods
> >>>>>>> we
> >>>>>>>> include and make the ones that we keep very concrete. For
> instance i
> >>>>>>> would
> >>>>>>>> not have the receive barrier method as that is handled on a
> totally
> >>>>>>>> different level already. And instead of punctuation I would
> directly add
> >>>>>>> a
> >>>>>>>> method to work on watermarks.
> >>>>>>>>
> >>>>>>>> On Tuesday, May 5, 2015, Aljoscha Krettek <[hidden email]
> >>>>>>> <javascript:;>> wrote:
> >>>>>>>>
> >>>>>>>>> What do you mean by "losing iterations"?
> >>>>>>>>>
> >>>>>>>>> For the pros and cons:
> >>>>>>>>>
> >>>>>>>>> Cons: I can't think of any, since most of the operators are
> chainable
> >>>>>>>>> already and already behave like a collector.
> >>>>>>>>>
> >>>>>>>>> Pros:
> >>>>>>>>> - Unified model for operators, chainable operators don't have to
> >>>>>>>>> worry about input iterators and the collect interface.
> >>>>>>>>> - Enables features that we want in the future, such as barriers
> and
> >>>>>>>>> punctuations because they don't work with the
> >>>>>>>>> simple Collector interface.
> >>>>>>>>> - The while-loop is moved outside of the operators, now the Task
> (the
> >>>>>>>>> thing that runs Operators) can control the flow of data better
> and
> >>>>>>>>> deal with
> >>>>>>>>> stuff like barriers and punctuations. If we want to keep the
> >>>>>>>>> main-loop inside each operator, then they all have to manage
> input
> >>>>>>>>> readers and inline events manually.
> >>>>>>>>>
> >>>>>>>>> On Tue, May 5, 2015 at 2:41 PM, Kostas Tzoumas <
> [hidden email]
> >>>>>>> <javascript:;>
> >>>>>>>>> <javascript:;>> wrote:
> >>>>>>>>>> Can you give us a rough idea of the pros and cons? Do we lose
> some
> >>>>>>>>>> functionality by getting rid of iterations?
> >>>>>>>>>>
> >>>>>>>>>> Kostas
> >>>>>>>>>>
> >>>>>>>>>> On Tue, May 5, 2015 at 1:37 PM, Aljoscha Krettek <
> [hidden email]
> >>>>>>> <javascript:;>
> >>>>>>>>> <javascript:;>>
> >>>>>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Hi Folks,
> >>>>>>>>>>> while working on introducing source-assigned timestamps into
> >>>>>>> streaming
> >>>>>>>>>>> (https://issues.apache.org/jira/browse/FLINK-1967) I thought
> about
> >>>>>>> how
> >>>>>>>>>>> the punctuations (low watermarks) can be pushed through the
> system.
> >>>>>>>>>>> The problem is, that operators can have two ways of getting
> input: 1.
> >>>>>>>>>>> They read directly from input iterators, and 2. They act as a
> >>>>>>>>>>> Collector and get elements via collect() from the previous
> operator
> >>>>>>> in
> >>>>>>>>>>> a chain.
> >>>>>>>>>>>
> >>>>>>>>>>> This makes it hard to push things through a chain that are not
> >>>>>>>>>>> elements, such as barriers and/or punctuations.
> >>>>>>>>>>>
> >>>>>>>>>>> I propose to change all streaming operators to be push based,
> with a
> >>>>>>>>>>> slightly improved interface: In addition to collect(), which I
> would
> >>>>>>>>>>> call receiveElement() I would add receivePunctuation() and
> >>>>>>>>>>> receiveBarrier(). The first operator in the chain would also
> get data
> >>>>>>>>>>> from the outside invokable that reads from the input iterator
> and
> >>>>>>>>>>> calls receiveElement() for the first operator in a chain.
> >>>>>>>>>>>
> >>>>>>>>>>> What do you think? I would of course be willing to implement
> this
> >>>>>>>>> myself.
> >>>>>>>>>>>
> >>>>>>>>>>> Cheers,
> >>>>>>>>>>> Aljoscha
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>
> >>
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] Change Streaming Operators to be Push-Only

Aljoscha Krettek-2
There is already a Jira and a Pull Request:
https://github.com/apache/flink/pull/659

On Mon, May 11, 2015 at 6:29 PM, Stephan Ewen <[hidden email]> wrote:

> Yep, I would say: Move ahead :-)
>
> On Tue, May 5, 2015 at 4:48 PM, Aljoscha Krettek <[hidden email]>
> wrote:
>
>> So I gather I should go forward with this? If no-one objects I will
>> open a Jira and work on this.
>>
>> On Tue, May 5, 2015 at 4:14 PM, Aljoscha Krettek <[hidden email]>
>> wrote:
>> > Yes, because the handling of punctuations depends on the operator: A
>> > MapOperator can just forward them while a windowed join or reduce can
>> > only forward them after emitting the correct windows or results.
>> >
>> > On Tue, May 5, 2015 at 3:58 PM, Paris Carbone <[hidden email]> wrote:
>> >> By watermark handling I meant making punctuations explicit and
>> forwarding/modifying them at the operator level. I think this is clear so
>> far.
>> >>> On 05 May 2015, at 15:41, Aljoscha Krettek <[hidden email]>
>> wrote:
>> >>>
>> >>> There is no watermark handling yet. :D
>> >>>
>> >>> But this would enable me to do this.
>> >>>
>> >>> On Tue, May 5, 2015 at 3:39 PM, Paris Carbone <[hidden email]> wrote:
>> >>>> I agree with Gyula on this one. Barriers should better not be exposed
>> to the operator. They are system events for state management. Apart from
>> that, watermark handling seems to be on a right track, I like it so far.
>> >>>>
>> >>>>> On 05 May 2015, at 15:26, Aljoscha Krettek <[hidden email]>
>> wrote:
>> >>>>>
>> >>>>> I don't know, I just put that there because other people are working
>> >>>>> on the checkpointing/barrier thing. So there would need to be some
>> >>>>> functionality there at some point.
>> >>>>>
>> >>>>> Or maybe it is not required there and can be handled in the
>> >>>>> StreamTask. Others might know this better than I do right now.
>> >>>>>
>> >>>>> On Tue, May 5, 2015 at 3:24 PM, Gyula Fóra <[hidden email]>
>> wrote:
>> >>>>>> What would the processBarrier method do?
>> >>>>>>
>> >>>>>> On Tuesday, May 5, 2015, Aljoscha Krettek <[hidden email]>
>> wrote:
>> >>>>>>
>> >>>>>>> I'm using the term punctuation and watermark interchangeably here
>> >>>>>>> because for practical purposes they do the same thing. I'm not sure
>> >>>>>>> what you meant with your comment about those.
>> >>>>>>>
>> >>>>>>> For the Operator interface I'm thinking about something like this:
>> >>>>>>>
>> >>>>>>> abstract class OneInputStreamOperator<IN, OUT, F extends
>> Function>  {
>> >>>>>>>  public processElement(IN element);
>> >>>>>>>  public processBarrier(...);
>> >>>>>>>  public processPunctuation/lowWatermark(...):
>> >>>>>>> }
>> >>>>>>>
>> >>>>>>> The operator also has access to the TaskContext and ExecutionConfig
>> >>>>>>> and Serializers. The operator would emit values using an emit()
>> method
>> >>>>>>> or the Collector interface, not sure about that yet.
>> >>>>>>>
>> >>>>>>> On Tue, May 5, 2015 at 3:12 PM, Gyula Fóra <[hidden email]
>> >>>>>>> <javascript:;>> wrote:
>> >>>>>>>> I think this a good idea in general. I would try to minimize the
>> methods
>> >>>>>>> we
>> >>>>>>>> include and make the ones that we keep very concrete. For
>> instance i
>> >>>>>>> would
>> >>>>>>>> not have the receive barrier method as that is handled on a
>> totally
>> >>>>>>>> different level already. And instead of punctuation I would
>> directly add
>> >>>>>>> a
>> >>>>>>>> method to work on watermarks.
>> >>>>>>>>
>> >>>>>>>> On Tuesday, May 5, 2015, Aljoscha Krettek <[hidden email]
>> >>>>>>> <javascript:;>> wrote:
>> >>>>>>>>
>> >>>>>>>>> What do you mean by "losing iterations"?
>> >>>>>>>>>
>> >>>>>>>>> For the pros and cons:
>> >>>>>>>>>
>> >>>>>>>>> Cons: I can't think of any, since most of the operators are
>> chainable
>> >>>>>>>>> already and already behave like a collector.
>> >>>>>>>>>
>> >>>>>>>>> Pros:
>> >>>>>>>>> - Unified model for operators, chainable operators don't have to
>> >>>>>>>>> worry about input iterators and the collect interface.
>> >>>>>>>>> - Enables features that we want in the future, such as barriers
>> and
>> >>>>>>>>> punctuations because they don't work with the
>> >>>>>>>>> simple Collector interface.
>> >>>>>>>>> - The while-loop is moved outside of the operators, now the Task
>> (the
>> >>>>>>>>> thing that runs Operators) can control the flow of data better
>> and
>> >>>>>>>>> deal with
>> >>>>>>>>> stuff like barriers and punctuations. If we want to keep the
>> >>>>>>>>> main-loop inside each operator, then they all have to manage
>> input
>> >>>>>>>>> readers and inline events manually.
>> >>>>>>>>>
>> >>>>>>>>> On Tue, May 5, 2015 at 2:41 PM, Kostas Tzoumas <
>> [hidden email]
>> >>>>>>> <javascript:;>
>> >>>>>>>>> <javascript:;>> wrote:
>> >>>>>>>>>> Can you give us a rough idea of the pros and cons? Do we lose
>> some
>> >>>>>>>>>> functionality by getting rid of iterations?
>> >>>>>>>>>>
>> >>>>>>>>>> Kostas
>> >>>>>>>>>>
>> >>>>>>>>>> On Tue, May 5, 2015 at 1:37 PM, Aljoscha Krettek <
>> [hidden email]
>> >>>>>>> <javascript:;>
>> >>>>>>>>> <javascript:;>>
>> >>>>>>>>>> wrote:
>> >>>>>>>>>>
>> >>>>>>>>>>> Hi Folks,
>> >>>>>>>>>>> while working on introducing source-assigned timestamps into
>> >>>>>>> streaming
>> >>>>>>>>>>> (https://issues.apache.org/jira/browse/FLINK-1967) I thought
>> about
>> >>>>>>> how
>> >>>>>>>>>>> the punctuations (low watermarks) can be pushed through the
>> system.
>> >>>>>>>>>>> The problem is, that operators can have two ways of getting
>> input: 1.
>> >>>>>>>>>>> They read directly from input iterators, and 2. They act as a
>> >>>>>>>>>>> Collector and get elements via collect() from the previous
>> operator
>> >>>>>>> in
>> >>>>>>>>>>> a chain.
>> >>>>>>>>>>>
>> >>>>>>>>>>> This makes it hard to push things through a chain that are not
>> >>>>>>>>>>> elements, such as barriers and/or punctuations.
>> >>>>>>>>>>>
>> >>>>>>>>>>> I propose to change all streaming operators to be push based,
>> with a
>> >>>>>>>>>>> slightly improved interface: In addition to collect(), which I
>> would
>> >>>>>>>>>>> call receiveElement() I would add receivePunctuation() and
>> >>>>>>>>>>> receiveBarrier(). The first operator in the chain would also
>> get data
>> >>>>>>>>>>> from the outside invokable that reads from the input iterator
>> and
>> >>>>>>>>>>> calls receiveElement() for the first operator in a chain.
>> >>>>>>>>>>>
>> >>>>>>>>>>> What do you think? I would of course be willing to implement
>> this
>> >>>>>>>>> myself.
>> >>>>>>>>>>>
>> >>>>>>>>>>> Cheers,
>> >>>>>>>>>>> Aljoscha
>> >>>>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>
>> >>>>
>> >>
>>