Apache Flink and Kudu integration

Apache Flink and Kudu integration

ruben.casado.tejedor
Hi all,

Is there any PoC for reading from and writing to Kudu? I think the Kafka-Flink-Kudu flow is an interesting pattern. I would like to evaluate it, so please let me know if there is any existing attempt, to avoid starting from scratch. Any advice is welcome :)

Best


----------------------------------------
Rubén Casado Tejedor, PhD
> accenture digital
Big Data Manager
Tel.: +34 629 009 429
Email: [hidden email]


________________________________

This message is for the designated recipient only and may contain privileged, proprietary, or otherwise confidential information. If you have received it in error, please notify the sender immediately and delete the original. Any other use of the e-mail by you is prohibited. Where allowed by local law, electronic communications with Accenture and its affiliates, including e-mail and instant messaging (including content), may be scanned by our systems for the purposes of information security and assessment of internal compliance with Accenture policy.
______________________________________________________________________________________

www.accenture.com

Re: Apache Flink and Kudu integration

Márton Balassi
Hi Ruben,

I am not currently aware of such an effort, but I definitely agree that
it is an interesting pattern to investigate. For inspiration, you could have
a look at the Spark connector implementations to see the Kudu APIs in use.
For that I would recommend the DataSource API implementation [1] that is now
part of Spark, or Ted Malaska's prototype [2], which is a bit less complex and
thus might be easier to read.

Let us know if you decide to give the implementation a try.

[1]
https://kudu.apache.org/docs/developing.html#_kudu_integration_with_spark
[2] https://github.com/tmalaska/SparkOnKudu

Best,

Marton

On Fri, Oct 28, 2016 at 8:33 AM, <[hidden email]> wrote:

> Hi all,
>
> Is there any PoC for reading from and writing to Kudu? I think the
> Kafka-Flink-Kudu flow is an interesting pattern. I would like to evaluate it,
> so please let me know if there is any existing attempt, to avoid starting
> from scratch. Any advice is welcome :)

RE: Apache Flink and Kudu integration

ruben.casado.tejedor
Hi,

I am starting a PoC to do that. I will try to develop both a source and a sink. I will let you know as soon as I have something ;-)

Best

-----Original Message-----
From: Márton Balassi [mailto:[hidden email]]
Sent: Friday, October 28, 2016 8:50
To: [hidden email]
Subject: Re: Apache Flink and Kudu integration

Hi Ruben,

I am not currently aware of such an effort, but I definitely agree that it is an interesting pattern to investigate. For inspiration, you could have a look at the Spark connector implementations to see the Kudu APIs in use.
For that I would recommend the DataSource API implementation [1] that is now part of Spark, or Ted Malaska's prototype [2], which is a bit less complex and thus might be easier to read.

Let us know if you decide to give the implementation a try.

[1] https://kudu.apache.org/docs/developing.html#_kudu_integration_with_spark
[2] https://github.com/tmalaska/SparkOnKudu

Best,

Marton


Re: Apache Flink and Kudu integration

Fabian Hueske-2
Hi Ruben,

That sounds great!
In case you are planning to contribute your connector, you should have a
look at Apache Bahir [1].
Bahir is a project that collects connectors and other extensions of
distributed analytics platforms (currently Flink and Spark).

As of now, it offers Flink connectors to Redis, Flume, and ActiveMQ.

Cheers,
Fabian

[1] http://bahir.apache.org/



2016-11-10 8:56 GMT+01:00 <[hidden email]>:

> Hi,
>
> I am starting a PoC to do that. I will try to develop both a source and a
> sink. I will let you know as soon as I have something ;-)
>
> Best

Re: Apache Flink and Kudu integration

Márton Balassi
Hi Ruben,

Thanks. Let us know how you progress. Funnily enough, today I am playing with
the Spark Kudu connector, so I would be interested in checking out your code
as soon as you have something tangible.

Best,

Marton

On Thu, Nov 10, 2016 at 10:19 AM, Fabian Hueske <[hidden email]> wrote:

> Hi Ruben,
>
> That sounds great!
> In case you are planning to contribute your connector, you should have a
> look at Apache Bahir [1].
> Bahir is a project that collects connectors and other extensions of
> distributed analytics platforms (currently Flink and Spark).
>
> As of now, it offers Flink connectors to Redis, Flume, and ActiveMQ.
>
> Cheers,
> Fabian
>
> [1] http://bahir.apache.org/