[Discussion] More Deep Learning usages on Apache Flink


Qing Lan
Hi all,

On behalf of the AWS DJL team, I would like to discuss Apache Flink's ML integration development. We would like to contribute more Deep Learning (DL) based applications to Flink, supporting engines including but not limited to TensorFlow, PyTorch, Apache MXNet, and Apache TVM through DJL. Do you have any thoughts on having these DL engines in the Flink ML module?

Here is an example using Apache Flink to do Sentiment Analysis with PyTorch: https://github.com/aws-samples/djl-demo/tree/master/flink/sentiment-analysis

Some background about DJL: DJL (https://github.com/awslabs/djl) is an open-source project (licensed Apache 2.0) that aims to bring DL applications into the Java world. It offers full multi-threading and low-memory inference across all supported DL engines and has been used in online services, streaming applications, and distributed inference.

Thanks,
Qing

Re: [Discussion] More Deep Learning usages on Apache Flink

Becket Qin
Hi Qing,

Thanks for raising the discussion. It is great to know the project DJL.

If I understand correctly, the discussion is mostly about inference. DJL
essentially provides a uniform Java API for people to use different deep
learning engines. It is useful for people to combine Flink and DJL so they
can essentially have a "deep learning UDF" in their Flink job to do
inference. That is what the Sentiment Analysis example does, and it makes
a lot of sense to me.

Personally, I think it is already simple enough for people to leverage DJL
for inference via Flink UDFs. But there are a few things we could do to
make this solution more visible if we want to:

1. Add a built-in DJLPredictorMapper which wraps the DJL predictor. This
makes the solution a bit more visible to users, but I am not sure it is
worth doing, because it would introduce an external dependency on DJL into
Flink, which is something we may want to avoid.
2. Add the DJLPredictorMapper to a third-party project (personally I don't
think this is necessary; a code snippet example seems good enough), list it
on the flink-packages website[1], and add a Flink ML use case page to the
Flink website to advertise this usage alongside other Flink ML usages.

I am in favor of option 2 here.
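For illustration, a wrapper along the lines of option 1 might look like the following rough sketch (the class name, model URL, and input/output types here are my assumptions for the example, not an existing API):

```java
import ai.djl.inference.Predictor;
import ai.djl.modality.Classifications;
import ai.djl.repository.zoo.Criteria;
import ai.djl.repository.zoo.ZooModel;
import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.configuration.Configuration;

/** Hypothetical mapper that runs a DJL model on each record of a stream. */
public class DJLPredictorMapper extends RichMapFunction<String, Classifications> {

    private transient ZooModel<String, Classifications> model;
    private transient Predictor<String, Classifications> predictor;

    @Override
    public void open(Configuration parameters) throws Exception {
        // Load the model once per task instance; the model URL is a placeholder.
        Criteria<String, Classifications> criteria = Criteria.builder()
                .setTypes(String.class, Classifications.class)
                .optModelUrls("file:///path/to/model") // placeholder
                .build();
        model = criteria.loadModel();
        predictor = model.newPredictor();
    }

    @Override
    public Classifications map(String input) throws Exception {
        // Per-record inference stays cheap: the heavy lifting happened in open().
        return predictor.predict(input);
    }

    @Override
    public void close() {
        if (predictor != null) {
            predictor.close();
        }
        if (model != null) {
            model.close();
        }
    }
}
```

The key point is the lifecycle: the model is loaded once in open() and released in close(), so map() only pays for a single predict() call per record.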

Apart from that, I am very curious about the exact latency and performance
overhead of IPC in DJL - I assume there is IPC between the JVM and other
processes under the hood.

Thanks,

Jiangjie (Becket) Qin

[1] https://flink-packages.org/

On Fri, Jan 15, 2021 at 10:12 AM Qing Lan <[hidden email]> wrote:


Re: [Discussion] More Deep Learning usages on Apache Flink

Qing Lan
Hi Becket,

Regarding the IPC: DJL uses JNI/JNA to call the DL engines' C/C++ APIs directly, so the latency between C++ and Java is minimal (~10ns). Performance-wise, DJL offers true multi-threaded Java inference: you load a model once and use it from as many threads as you like. It has been used for streaming tasks with a 1ms budget, where DJL consumes 400ns or less per inference for recommendation models.
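To make the threading model concrete, here is a self-contained toy sketch in plain Java - the Model class below is a stand-in for a loaded DJL model, not DJL's actual API - showing the load-once, use-from-many-threads pattern:

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class SharedModelDemo {

    /** Toy stand-in for a loaded model: immutable state, so it is safe to share. */
    static final class Model {
        private final double weight;

        Model(double weight) {
            this.weight = weight;
        }

        double predict(double x) {
            return weight * x;
        }
    }

    public static void main(String[] args) throws Exception {
        // "Load the model once" ...
        Model model = new Model(2.0);

        // ... then run inference from as many threads as you like.
        ExecutorService pool = Executors.newFixedThreadPool(4);
        List<Future<Double>> results = IntStream.range(0, 8)
                .mapToObj(i -> pool.submit(() -> model.predict(i)))
                .collect(Collectors.toList());

        for (int i = 0; i < results.size(); i++) {
            System.out.println("input=" + i + " output=" + results.get(i).get());
        }
        pool.shutdown();
    }
}
```

In real DJL usage each thread would hold its own lightweight predictor over the shared model; the toy above only illustrates why sharing read-only model state across threads is safe.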

Thanks,
Qing

> On Jan 15, 2021, at 12:36 AM, Becket Qin <[hidden email]> wrote:

Re: [Discussion] More Deep Learning usages on Apache Flink

Becket Qin
Hi Qing,

Thanks for the numbers - they look very good. I am wondering if we can
integrate DJL with some of the existing Flink AI ecosystem projects.

For example, the project flink-ai-extended [1] provides the capability to
run a distributed TF/PyTorch cluster on top of Flink, which allows people
to combine the data processing and ML training / inference in the same
Flink project. Leveraging DJL might help solve a few issues there, for
instance:
1. flink-ai-extended uses the TF Java API[2] for inference. Unfortunately,
the TF Java API is not very stable and is sometimes missing functionality.
2. So far there is only TF/PyTorch integration; by leveraging DJL, users
would gain access to more DL engines.

Do you have time for an online meeting to discuss the details a bit further?

Thanks,

Jiangjie (Becket) Qin

[1] https://github.com/alibaba/flink-ai-extended
[2] https://www.tensorflow.org/install/lang_java

On Fri, Jan 15, 2021 at 4:23 PM Qing Lan <[hidden email]> wrote:
