(DEPRECATED) Apache Flink Mailing List archive.

Question about Async IO

Classic

List

Threaded

5 messages Options

Gyula Fóra-2

Question about Async IO

Hi,

I was looking at the AsyncFunction interface and try to wrap my head around
the implementation and the assumptions and I have some questions, maybe
somebody could help me out :)

What happens if the user does not collect any data or set a future to do so
in the invoke method?
Also what happens if I create more than one Future?

It seems that the "streamRecordBufferEntry" logic assumes that there will
be a Future that eventually collects 1 thing or the user does this
directly.
Do I understand correctly? If not I am probably missing the part where the
buffer entry is removed immediately if no async request was made.

Thank you!
Gyula

Till Rohrmann

Re: Question about Async IO

Hi Gyula,

the assumption is that the AsyncCollector is either completed by the user
or, if you have a timeout defined, that it will be completed with a timeout
exception. This means that if you have no timeout defined, then you have to
make sure that the collector is completed. Otherwise you will have
lingering state which is never cleared. In that sense it follows the
semantics of normal futures.

What do you mean by creating more than one future? More than one future
which completes the AsyncCollector? If that's the case, then the first
future which completes will also complete the AsyncCollector and the result
of the other future should be ignored.

Cheers,
Till

On Mon, Feb 20, 2017 at 2:53 PM, Gyula Fóra <[hidden email]> wrote:

> Hi,
>
> I was looking at the AsyncFunction interface and try to wrap my head around
> the implementation and the assumptions and I have some questions, maybe
> somebody could help me out :)
>
> What happens if the user does not collect any data or set a future to do so
> in the invoke method?
> Also what happens if I create more than one Future?
>
> It seems that the "streamRecordBufferEntry" logic assumes that there will
> be a Future that eventually collects 1 thing or the user does this
> directly.
> Do I understand correctly? If not I am probably missing the part where the
> buffer entry is removed immediately if no async request was made.
>
> Thank you!
> Gyula
>

Gyula Fóra

Re: Question about Async IO

Hi Till,

Thanks, for the explanation!

How do I express if I don't want to collect any elements in the async
collector? Like 0 output from a flatmap.

Also it doesn't seem to be specified anywhere that the AsyncCollector is
"completed", it is just a collector. You should be able to collect multiple
things to it, but it actually won't work if you try to do that from more
than one Future.

I wonder if it would make sense to change the API to make this more
specific otherwise we might keep a lot of unnecessary state or have
potential leaks depending on the usage.

Just my thoughts, now I also understand the current rationale just I didn't
completely get it for the first pass.

Gyula

Till Rohrmann <[hidden email]> ezt írta (időpont: 2017. febr. 20., H,
15:35):

> Hi Gyula,
>
> the assumption is that the AsyncCollector is either completed by the user
> or, if you have a timeout defined, that it will be completed with a timeout
> exception. This means that if you have no timeout defined, then you have to
> make sure that the collector is completed. Otherwise you will have
> lingering state which is never cleared. In that sense it follows the
> semantics of normal futures.
>
> What do you mean by creating more than one future? More than one future
> which completes the AsyncCollector? If that's the case, then the first
> future which completes will also complete the AsyncCollector and the result
> of the other future should be ignored.
>
> Cheers,
> Till
>
>
> On Mon, Feb 20, 2017 at 2:53 PM, Gyula Fóra <[hidden email]> wrote:
>
> > Hi,
> >
> > I was looking at the AsyncFunction interface and try to wrap my head
> around
> > the implementation and the assumptions and I have some questions, maybe
> > somebody could help me out :)
> >
> > What happens if the user does not collect any data or set a future to do
> so
> > in the invoke method?
> > Also what happens if I create more than one Future?
> >
> > It seems that the "streamRecordBufferEntry" logic assumes that there
> will
> > be a Future that eventually collects 1 thing or the user does this
> > directly.
> > Do I understand correctly? If not I am probably missing the part where
> the
> > buffer entry is removed immediately if no async request was made.
> >
> > Thank you!
> > Gyula
> >
>

Till Rohrmann

Re: Question about Async IO

In order to output 0 elements you have to pass an empty collection to the
`collect` method.

You're right that our online documentation is lacking the fact that you're
only supposed to call `collect` once. It's actually documented in the
JavaDocs of this method. We should change this.

You're also right that the name `AsyncCollector` is definitely not the best
name for reflecting what it actually represents. I think the initial idea
was to make it look similar to the existing functions which are given a
`Collector`. Actually it is a promise/completable future and as such we
should maybe name it "ResultPromise"/"ResultFuture". I've opened a JIRA for
this [1].

[1] https://issues.apache.org/jira/browse/FLINK-5851

Cheers,
Till

On Mon, Feb 20, 2017 at 3:41 PM, Gyula Fóra <[hidden email]> wrote:

> Hi Till,
>
> Thanks, for the explanation!
>
> How do I express if I don't want to collect any elements in the async
> collector? Like 0 output from a flatmap.
>
> Also it doesn't seem to be specified anywhere that the AsyncCollector is
> "completed", it is just a collector. You should be able to collect multiple
> things to it, but it actually won't work if you try to do that from more
> than one Future.
>
> I wonder if it would make sense to change the API to make this more
> specific otherwise we might keep a lot of unnecessary state or have
> potential leaks depending on the usage.
>
> Just my thoughts, now I also understand the current rationale just I didn't
> completely get it for the first pass.
>
> Gyula
>
> Till Rohrmann <[hidden email]> ezt írta (időpont: 2017. febr. 20.,
> H,
> 15:35):
>
> > Hi Gyula,
> >
> > the assumption is that the AsyncCollector is either completed by the user
> > or, if you have a timeout defined, that it will be completed with a
> timeout
> > exception. This means that if you have no timeout defined, then you have
> to
> > make sure that the collector is completed. Otherwise you will have
> > lingering state which is never cleared. In that sense it follows the
> > semantics of normal futures.
> >
> > What do you mean by creating more than one future? More than one future
> > which completes the AsyncCollector? If that's the case, then the first
> > future which completes will also complete the AsyncCollector and the
> result
> > of the other future should be ignored.
> >
> > Cheers,
> > Till
> >
> >
> > On Mon, Feb 20, 2017 at 2:53 PM, Gyula Fóra <[hidden email]> wrote:
> >
> > > Hi,
> > >
> > > I was looking at the AsyncFunction interface and try to wrap my head
> > around
> > > the implementation and the assumptions and I have some questions, maybe
> > > somebody could help me out :)
> > >
> > > What happens if the user does not collect any data or set a future to
> do
> > so
> > > in the invoke method?
> > > Also what happens if I create more than one Future?
> > >
> > > It seems that the "streamRecordBufferEntry" logic assumes that there
> > will
> > > be a Future that eventually collects 1 thing or the user does this
> > > directly.
> > > Do I understand correctly? If not I am probably missing the part where
> > the
> > > buffer entry is removed immediately if no async request was made.
> > >
> > > Thank you!
> > > Gyula
> > >
> >
>

Stephan Ewen

Re: Question about Async IO

You can also issue multiple calls in one "invoke()" call (have multiple
Futures) and then chain these futures and return only something
once all Futures are complete.

On Mon, Feb 20, 2017 at 4:01 PM, Till Rohrmann <[hidden email]> wrote:

> In order to output 0 elements you have to pass an empty collection to the
> `collect` method.
>
> You're right that our online documentation is lacking the fact that you're
> only supposed to call `collect` once. It's actually documented in the
> JavaDocs of this method. We should change this.
>
> You're also right that the name `AsyncCollector` is definitely not the best
> name for reflecting what it actually represents. I think the initial idea
> was to make it look similar to the existing functions which are given a
> `Collector`. Actually it is a promise/completable future and as such we
> should maybe name it "ResultPromise"/"ResultFuture". I've opened a JIRA
> for
> this [1].
>
> [1] https://issues.apache.org/jira/browse/FLINK-5851
>
> Cheers,
> Till
>
> On Mon, Feb 20, 2017 at 3:41 PM, Gyula Fóra <[hidden email]> wrote:
>
> > Hi Till,
> >
> > Thanks, for the explanation!
> >
> > How do I express if I don't want to collect any elements in the async
> > collector? Like 0 output from a flatmap.
> >
> > Also it doesn't seem to be specified anywhere that the AsyncCollector is
> > "completed", it is just a collector. You should be able to collect
> multiple
> > things to it, but it actually won't work if you try to do that from more
> > than one Future.
> >
> > I wonder if it would make sense to change the API to make this more
> > specific otherwise we might keep a lot of unnecessary state or have
> > potential leaks depending on the usage.
> >
> > Just my thoughts, now I also understand the current rationale just I
> didn't
> > completely get it for the first pass.
> >
> > Gyula
> >
> > Till Rohrmann <[hidden email]> ezt írta (időpont: 2017. febr. 20.,
> > H,
> > 15:35):
> >
> > > Hi Gyula,
> > >
> > > the assumption is that the AsyncCollector is either completed by the
> user
> > > or, if you have a timeout defined, that it will be completed with a
> > timeout
> > > exception. This means that if you have no timeout defined, then you
> have
> > to
> > > make sure that the collector is completed. Otherwise you will have
> > > lingering state which is never cleared. In that sense it follows the
> > > semantics of normal futures.
> > >
> > > What do you mean by creating more than one future? More than one future
> > > which completes the AsyncCollector? If that's the case, then the first
> > > future which completes will also complete the AsyncCollector and the
> > result
> > > of the other future should be ignored.
> > >
> > > Cheers,
> > > Till
> > >
> > >
> > > On Mon, Feb 20, 2017 at 2:53 PM, Gyula Fóra <[hidden email]> wrote:
> > >
> > > > Hi,
> > > >
> > > > I was looking at the AsyncFunction interface and try to wrap my head
> > > around
> > > > the implementation and the assumptions and I have some questions,
> maybe
> > > > somebody could help me out :)
> > > >
> > > > What happens if the user does not collect any data or set a future to
> > do
> > > so
> > > > in the invoke method?
> > > > Also what happens if I create more than one Future?
> > > >
> > > > It seems that the "streamRecordBufferEntry" logic assumes that there
> > > will
> > > > be a Future that eventually collects 1 thing or the user does this
> > > > directly.
> > > > Do I understand correctly? If not I am probably missing the part
> where
> > > the
> > > > buffer entry is removed immediately if no async request was made.
> > > >
> > > > Thank you!
> > > > Gyula
> > > >
> > >
> >
>