[DISCUSS] Proposal for Asynchronous I/O in FLINK

classic Classic list List threaded Threaded
9 messages Options
Reply | Threaded
Open this post in threaded view
|

[DISCUSS] Proposal for Asynchronous I/O in FLINK

David Wang
Hi ALL,

Recently, we have designed and implemented a new proposal in FLINK to
support Asynchronous Operation while streaming. The main feature of this
proposal is to introduce async i/o operation in FLINK to boost the TPS of
streaming job without delaying the checkpoint, and provide an easy way for
the FLINK users to implement their async i/o codes in FLINK job.

Here is the link to Google Doc:
https://docs.google.com/document/d/1Lr9UYXEz6s6R_3PWg3bZQLF3upGaNEkc0rQCFSzaYDI/edit

Any feedback is appreciated.

Thanks,
David
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] Proposal for Asynchronous I/O in FLINK

Henry Saputra
HI David,

Thanks so much for the interest to contribute to Apache Flink.

To help review and DISCUSS for new feature in Flink, please do submit FLIP
[1] proposal.

It will help the PMCs managing the new feature proposals and keep resources
with ASF realm.


Thanks,

Henry

[1]
https://cwiki.apache.org/confluence/display/FLINK/Flink+Improvement+Proposals

On Sun, Sep 11, 2016 at 7:56 PM, Jeffery David <[hidden email]> wrote:

> Hi ALL,
>
> Recently, we have designed and implemented a new proposal in FLINK to
> support Asynchronous Operation while streaming. The main feature of this
> proposal is to introduce async i/o operation in FLINK to boost the TPS of
> streaming job without delaying the checkpoint, and provide an easy way for
> the FLINK users to implement their async i/o codes in FLINK job.
>
> Here is the link to Google Doc:
> https://docs.google.com/document/d/1Lr9UYXEz6s6R_
> 3PWg3bZQLF3upGaNEkc0rQCFSzaYDI/edit
>
> Any feedback is appreciated.
>
> Thanks,
> David
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] Proposal for Asynchronous I/O in FLINK

David Wang
Hi Henry,

Here is the FLIP:
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65870673

Thanks,
David

2016-09-15 12:26 GMT+08:00 Henry Saputra <[hidden email]>:

> HI David,
>
> Thanks so much for the interest to contribute to Apache Flink.
>
> To help review and DISCUSS for new feature in Flink, please do submit FLIP
> [1] proposal.
>
> It will help the PMCs managing the new feature proposals and keep resources
> with ASF realm.
>
>
> Thanks,
>
> Henry
>
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/
> Flink+Improvement+Proposals
>
> On Sun, Sep 11, 2016 at 7:56 PM, Jeffery David <[hidden email]> wrote:
>
> > Hi ALL,
> >
> > Recently, we have designed and implemented a new proposal in FLINK to
> > support Asynchronous Operation while streaming. The main feature of this
> > proposal is to introduce async i/o operation in FLINK to boost the TPS of
> > streaming job without delaying the checkpoint, and provide an easy way
> for
> > the FLINK users to implement their async i/o codes in FLINK job.
> >
> > Here is the link to Google Doc:
> > https://docs.google.com/document/d/1Lr9UYXEz6s6R_
> > 3PWg3bZQLF3upGaNEkc0rQCFSzaYDI/edit
> >
> > Any feedback is appreciated.
> >
> > Thanks,
> > David
> >
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] Proposal for Asynchronous I/O in FLINK

Fabian Hueske-2
Hi David,

thanks for the FLIP! It looks pretty good. A few questions / suggestions:

- It would be good to make the number of concurrent AsyncFunction calls
configurable. Maybe overload the unorderedWait and orderedWait methods with
an additional int parameter?
- Do you plan to also add a RichFunction variant of AsyncFunction?
- Does it make sense to add a timeout thread for async calls to be able to
cancel a request?

Best, Fabian


2016-09-18 12:22 GMT+02:00 David Wang <[hidden email]>:

> Hi Henry,
>
> Here is the FLIP:
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65870673
>
> Thanks,
> David
>
> 2016-09-15 12:26 GMT+08:00 Henry Saputra <[hidden email]>:
>
> > HI David,
> >
> > Thanks so much for the interest to contribute to Apache Flink.
> >
> > To help review and DISCUSS for new feature in Flink, please do submit
> FLIP
> > [1] proposal.
> >
> > It will help the PMCs managing the new feature proposals and keep
> resources
> > with ASF realm.
> >
> >
> > Thanks,
> >
> > Henry
> >
> > [1]
> > https://cwiki.apache.org/confluence/display/FLINK/
> > Flink+Improvement+Proposals
> >
> > On Sun, Sep 11, 2016 at 7:56 PM, Jeffery David <[hidden email]>
> wrote:
> >
> > > Hi ALL,
> > >
> > > Recently, we have designed and implemented a new proposal in FLINK to
> > > support Asynchronous Operation while streaming. The main feature of
> this
> > > proposal is to introduce async i/o operation in FLINK to boost the TPS
> of
> > > streaming job without delaying the checkpoint, and provide an easy way
> > for
> > > the FLINK users to implement their async i/o codes in FLINK job.
> > >
> > > Here is the link to Google Doc:
> > > https://docs.google.com/document/d/1Lr9UYXEz6s6R_
> > > 3PWg3bZQLF3upGaNEkc0rQCFSzaYDI/edit
> > >
> > > Any feedback is appreciated.
> > >
> > > Thanks,
> > > David
> > >
> >
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] Proposal for Asynchronous I/O in FLINK

David Wang
Hi Fabian,

Thanks for your review.

- It would be good to make the number of concurrent AsyncFunction calls
configurable. Maybe overload the unorderedWait and orderedWait methods with
an additional int parameter?

*That is right, the concurrent number is confiurable. I have updated the
FLIP to provide extra methods to do so.*

- Do you plan to also add a RichFunction variant of AsyncFunction?

*Good point. Providing a rich function version could provide more control
in asyncInvoke().*

 - Does it make sense to add a timeout thread for async calls to be able to
cancel a request?

*I think it is Async Client's reponsibility to handle timeout issue, but
not FLINK. In AsyncFunction, user can configure connection timeout while
setting up the function. AsyncWaitOperator provides a way to propogate
timeout errors from async operation.*

Thanks,
David

2016-09-20 5:06 GMT+08:00 Fabian Hueske <[hidden email]>:

> Hi David,
>
> thanks for the FLIP! It looks pretty good. A few questions / suggestions:
>
> - It would be good to make the number of concurrent AsyncFunction calls
> configurable. Maybe overload the unorderedWait and orderedWait methods with
> an additional int parameter?
> - Do you plan to also add a RichFunction variant of AsyncFunction?
> - Does it make sense to add a timeout thread for async calls to be able to
> cancel a request?
>
> Best, Fabian
>
>
> 2016-09-18 12:22 GMT+02:00 David Wang <[hidden email]>:
>
> > Hi Henry,
> >
> > Here is the FLIP:
> > https://cwiki.apache.org/confluence/pages/viewpage.
> action?pageId=65870673
> >
> > Thanks,
> > David
> >
> > 2016-09-15 12:26 GMT+08:00 Henry Saputra <[hidden email]>:
> >
> > > HI David,
> > >
> > > Thanks so much for the interest to contribute to Apache Flink.
> > >
> > > To help review and DISCUSS for new feature in Flink, please do submit
> > FLIP
> > > [1] proposal.
> > >
> > > It will help the PMCs managing the new feature proposals and keep
> > resources
> > > with ASF realm.
> > >
> > >
> > > Thanks,
> > >
> > > Henry
> > >
> > > [1]
> > > https://cwiki.apache.org/confluence/display/FLINK/
> > > Flink+Improvement+Proposals
> > >
> > > On Sun, Sep 11, 2016 at 7:56 PM, Jeffery David <[hidden email]>
> > wrote:
> > >
> > > > Hi ALL,
> > > >
> > > > Recently, we have designed and implemented a new proposal in FLINK to
> > > > support Asynchronous Operation while streaming. The main feature of
> > this
> > > > proposal is to introduce async i/o operation in FLINK to boost the
> TPS
> > of
> > > > streaming job without delaying the checkpoint, and provide an easy
> way
> > > for
> > > > the FLINK users to implement their async i/o codes in FLINK job.
> > > >
> > > > Here is the link to Google Doc:
> > > > https://docs.google.com/document/d/1Lr9UYXEz6s6R_
> > > > 3PWg3bZQLF3upGaNEkc0rQCFSzaYDI/edit
> > > >
> > > > Any feedback is appreciated.
> > > >
> > > > Thanks,
> > > > David
> > > >
> > >
> >
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] Proposal for Asynchronous I/O in FLINK

Shannon Carey
David,

I just wanted to say "thanks" for making this proposal! I'm also interested in performing nonblocking I/O (multiplexing threads/reactive programming) within Flink operators so that we can, for example, communicate with external web services with Netty/RxNetty without blocking an entire Flink slot (aka a thread) while we wait for the operation to complete. It looks like your FLIP will enable that use case.

I'm not sure whether it will be possible to share one Netty EventLoopGroup (or the equivalent for any other non-blocking framework, connection pool, etc.) among multiple slots in a single JVM though. Flink supports open/close operation on a RichFunction, but that's on a per-slot basis. I don't know of a way to open/close objects on a per-job-JVM basis. But I suppose that's an issue that should be discussed and resolved separately.

-Shannon

Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] Proposal for Asynchronous I/O in FLINK

Stephan Ewen
@Shannon: One could have a "static" broker to share the same netty across
slots in the same JVM. Implicitly, Flink does the same with broadcast
variables.

On Wed, Sep 21, 2016 at 5:04 PM, Shannon Carey <[hidden email]> wrote:

> David,
>
> I just wanted to say "thanks" for making this proposal! I'm also
> interested in performing nonblocking I/O (multiplexing threads/reactive
> programming) within Flink operators so that we can, for example,
> communicate with external web services with Netty/RxNetty without blocking
> an entire Flink slot (aka a thread) while we wait for the operation to
> complete. It looks like your FLIP will enable that use case.
>
> I'm not sure whether it will be possible to share one Netty EventLoopGroup
> (or the equivalent for any other non-blocking framework, connection pool,
> etc.) among multiple slots in a single JVM though. Flink supports
> open/close operation on a RichFunction, but that's on a per-slot basis. I
> don't know of a way to open/close objects on a per-job-JVM basis. But I
> suppose that's an issue that should be discussed and resolved separately.
>
> -Shannon
>
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] Proposal for Asynchronous I/O in FLINK

David Wang
Hi Shannon,

That's right. This FLIP aims to boost TPS of the task workers with async
i/o operation.

As what Stephan has mentioned, by placing static attribute to shared
resources(like event pool, connection), it is possible to share those
resources among different slots in the same JVM.

I will make a note in the FLIP about how to share resources ;D

Thanks,
David

2016-09-22 1:46 GMT+08:00 Stephan Ewen <[hidden email]>:

> @Shannon: One could have a "static" broker to share the same netty across
> slots in the same JVM. Implicitly, Flink does the same with broadcast
> variables.
>
> On Wed, Sep 21, 2016 at 5:04 PM, Shannon Carey <[hidden email]> wrote:
>
> > David,
> >
> > I just wanted to say "thanks" for making this proposal! I'm also
> > interested in performing nonblocking I/O (multiplexing threads/reactive
> > programming) within Flink operators so that we can, for example,
> > communicate with external web services with Netty/RxNetty without
> blocking
> > an entire Flink slot (aka a thread) while we wait for the operation to
> > complete. It looks like your FLIP will enable that use case.
> >
> > I'm not sure whether it will be possible to share one Netty
> EventLoopGroup
> > (or the equivalent for any other non-blocking framework, connection pool,
> > etc.) among multiple slots in a single JVM though. Flink supports
> > open/close operation on a RichFunction, but that's on a per-slot basis. I
> > don't know of a way to open/close objects on a per-job-JVM basis. But I
> > suppose that's an issue that should be discussed and resolved separately.
> >
> > -Shannon
> >
> >
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] Proposal for Asynchronous I/O in FLINK

Shannon Carey
Not to derail this thread onto another topic but the problem with using a static instance is that there's no way to shut it down when the job stops. So if, for example, it starts threads, I don't think those threads will stop when the job stops. I'm not very well versed in how various Java 8 implementations perform unloading of classloaders & class definitions/statics therein, but it seems problematic unless the job provides a shutdown hook to which user code can subscribe.




On 9/21/16, 8:05 PM, "David Wang" <[hidden email]> wrote:

>Hi Shannon,
>
>That's right. This FLIP aims to boost TPS of the task workers with async
>i/o operation.
>
>As what Stephan has mentioned, by placing static attribute to shared
>resources(like event pool, connection), it is possible to share those
>resources among different slots in the same JVM.
>
>I will make a note in the FLIP about how to share resources ;D
>
>Thanks,
>David
>
>2016-09-22 1:46 GMT+08:00 Stephan Ewen <[hidden email]>:
>
>> @Shannon: One could have a "static" broker to share the same netty across
>> slots in the same JVM. Implicitly, Flink does the same with broadcast
>> variables.
>>
>> On Wed, Sep 21, 2016 at 5:04 PM, Shannon Carey <[hidden email]> wrote:
>>
>> > David,
>> >
>> > I just wanted to say "thanks" for making this proposal! I'm also
>> > interested in performing nonblocking I/O (multiplexing threads/reactive
>> > programming) within Flink operators so that we can, for example,
>> > communicate with external web services with Netty/RxNetty without
>> blocking
>> > an entire Flink slot (aka a thread) while we wait for the operation to
>> > complete. It looks like your FLIP will enable that use case.
>> >
>> > I'm not sure whether it will be possible to share one Netty
>> EventLoopGroup
>> > (or the equivalent for any other non-blocking framework, connection pool,
>> > etc.) among multiple slots in a single JVM though. Flink supports
>> > open/close operation on a RichFunction, but that's on a per-slot basis. I
>> > don't know of a way to open/close objects on a per-job-JVM basis. But I
>> > suppose that's an issue that should be discussed and resolved separately.
>> >
>> > -Shannon
>> >
>> >
>>