[DISCUSS] Integrate new SourceReader with Mailbox Model in StreamTask

Zhijiang(wangzhijiang999)

Hi all,

As mentioned in FLIP-27 [1], the proposed SourceReader interface is part of the source interface refactoring.
AFAIK FLIP-27 has not been voted on yet and, as a whole, might still need more time than the
SourceReader discussion, which seems to have more or less converged.


We would like to start initial work on the runtime side, based on the SourceReader interface, to integrate
the execution of SourceReader with the existing mailbox thread model [2] in StreamTask. This is one of the steps
towards finally integrating all the actions in a task (processing timers, checkpoints) into a unified mailbox model.
The benefits are simpler processing logic, because a single thread handles all the actions without concurrency issues,
and, further, getting rid of the lock dependency, which might cause lock unfairness concerns in the checkpoint process.
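
To make the single-thread idea concrete, here is a minimal sketch of a mailbox-style loop. It is only an illustration under assumptions: the class MailboxLoopSketch, its methods, and the "letters" it runs are hypothetical names, not Flink's StreamTask or mailbox classes. Other threads only enqueue actions; the task thread alone runs them and processes records, so the shared state needs no lock.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical illustration only; not Flink's actual StreamTask or mailbox API. */
class MailboxLoopSketch {
    // Actions such as timer callbacks or checkpoint triggers; any thread may enqueue.
    private final BlockingQueue<Runnable> mailbox = new LinkedBlockingQueue<>();
    // Task state, touched only by the thread that runs runLoop(), so no lock is needed.
    final List<String> log = new ArrayList<>();
    private final int totalRecords;
    private int processed;

    MailboxLoopSketch(int totalRecords) {
        this.totalRecords = totalRecords;
    }

    /** Called from any thread: hand an action over to the task thread. */
    void send(Runnable letter) {
        mailbox.add(letter);
    }

    /** Runs on the task thread: drain pending actions, then process one record. */
    void runLoop() {
        while (processed < totalRecords || !mailbox.isEmpty()) {
            Runnable letter;
            while ((letter = mailbox.poll()) != null) {
                letter.run();
            }
            if (processed < totalRecords) {
                log.add("record-" + (++processed));
            }
        }
    }

    static List<String> demo() {
        MailboxLoopSketch task = new MailboxLoopSketch(2);
        // A separate "timer" thread only enqueues; it never touches task.log itself.
        Thread timer = new Thread(() -> task.send(() -> task.log.add("timer")));
        timer.start();
        try {
            timer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        task.send(() -> task.log.add("checkpoint"));
        task.runLoop();
        return task.log;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [timer, checkpoint, record-1, record-2]
    }
}
```

In this sketch the timer and checkpoint actions always run between records, never concurrently with them, which is the property the unified model is after.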

We still need to support the legacy source for some releases, since it will probably remain in use for a while,
especially in scenarios where performance is a concern.

Any feedback or comments on the design doc [3] are welcome.

[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface
[2] https://docs.google.com/document/d/1eDpsUKv2FqwZiS1Pm6gYO5eFHScBHfULKmH1-ZEWB4g/edit
[3] https://docs.google.com/document/d/13x9M7k1SRqkOFXP0bETcJemIRyJzoqGgkdy11pz5qHM/edit?usp=sharing

Best,
Zhijiang

Re: [DISCUSS] Integrate new SourceReader with Mailbox Model in StreamTask

Stephan Ewen
+1 to looking at the Source Reader interface as converged with respect to
its integration with the runtime.

Especially the semantics around the availability future and "emitNext" seem
to have reached consensus.
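
For readers who have not followed the FLIP-27 discussion, the following is a rough sketch of what an availability future combined with "emitNext" could look like. Everything except the name emitNext is an assumption made for illustration: the Reader interface, the Status enum, the isAvailable() method name, and QueueReader are hypothetical and may differ from the interface actually proposed. The idea shown is that the caller emits records while the reader reports more data, and otherwise waits on a future instead of blocking inside the reader.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;
import java.util.function.Consumer;

/** Hypothetical illustration only; names and signatures are assumptions. */
class AvailabilitySketch {
    enum Status { MORE_AVAILABLE, NOTHING_AVAILABLE, END_OF_INPUT }

    interface Reader<T> {
        /** Completes once emitNext can make progress again. */
        CompletableFuture<Void> isAvailable();

        /** Emits at most one record without blocking and reports what to do next. */
        Status emitNext(Consumer<T> output);
    }

    /** Toy in-memory reader, driven from a single thread in this sketch. */
    static class QueueReader implements Reader<String> {
        private final Queue<String> buffer = new ArrayDeque<>();
        private CompletableFuture<Void> available = new CompletableFuture<>();
        private boolean finished;

        void add(String record) {
            buffer.add(record);
            available.complete(null); // no-op if already completed
        }

        void finish() {
            finished = true;
            available.complete(null);
        }

        @Override
        public CompletableFuture<Void> isAvailable() {
            return available;
        }

        @Override
        public Status emitNext(Consumer<String> output) {
            String next = buffer.poll();
            if (next != null) {
                output.accept(next);
            }
            if (!buffer.isEmpty()) {
                return Status.MORE_AVAILABLE;
            }
            if (finished) {
                return Status.END_OF_INPUT;
            }
            // Nothing buffered: hand out a fresh future for the caller to wait on.
            available = new CompletableFuture<>();
            return Status.NOTHING_AVAILABLE;
        }
    }

    /** Emits records until the reader stops reporting MORE_AVAILABLE. */
    static Status drain(Reader<String> reader, List<String> out) {
        Status status;
        do {
            status = reader.emitNext(out::add);
        } while (status == Status.MORE_AVAILABLE);
        return status;
    }

    static List<String> demo() {
        QueueReader reader = new QueueReader();
        List<String> out = new ArrayList<>();
        reader.add("a");
        reader.add("b");
        Status first = drain(reader, out);
        out.add(first.name());
        CompletableFuture<Void> future = reader.isAvailable();
        out.add("done=" + future.isDone());
        reader.add("c");
        reader.finish();
        out.add("done=" + future.isDone());
        Status last = drain(reader, out);
        out.add(last.name());
        return out;
    }

    public static void main(String[] args) {
        // [a, b, NOTHING_AVAILABLE, done=false, done=true, c, END_OF_INPUT]
        System.out.println(demo());
    }
}
```

In a mailbox-style task, the NOTHING_AVAILABLE case would presumably register a callback on the future that resumes record processing, rather than polling it as the demo does.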

(Reply to zhijiang's message of Sat, Aug 10, 2019, 10:51 PM.)

Re: [DISCUSS] Integrate new SourceReader with Mailbox Model in StreamTask

Becket Qin
+1 as well. Starting the work in parallel may also give some insight into
whether some additional API on SourceReader is needed to support
the interaction between SourceReader and the runtime.

(Reply to Stephan Ewen's message of Mon, Aug 12, 2019, 11:29 AM.)