On integrating Flink with Apache NiFi


On integrating Flink with Apache NiFi

Slim Baltagi
Hi Flink experts!

I came across Apache NiFi: https://nifi.apache.org and https://www.youtube.com/watch?v=sQCgtCoZyFQ
In the NiFi project, there is an open JIRA issue, https://issues.apache.org/jira/browse/NIFI-823, to evaluate / provide integration with Apache Flink.

Is the integration of Flink with NiFi on the roadmap of the Apache Flink project?

Expected advantages for Flink when integrating with NiFi would be:
- GUI: Web-based user interface to design data flows
- Dynamic dataflows: Modify dataflow at runtime
- Security: Authentication, authorization, encryption
- Data Provenance: Track data flow from beginning to end

What do you think?

Thanks

Slim Baltagi



Re: On integrating Flink with Apache NiFi

Flavio Pompermaier
As a user, I would find that very helpful!
On 19 Sep 2015 12:34, "Slim Baltagi" <[hidden email]> wrote:


Re: On integrating Flink with Apache NiFi

mnxfst
Hi Slim,

thanks for sharing the presentation. Like Flavio I see the (UI)
capabilities provided by Apache NiFi as very helpful for (enterprise)
customers.

At the Otto Group we are currently thinking about how to track data through a
stream-based system like we do in the batch world using staging layers.
Apache NiFi seems to provide a feasible approach which gives lots of
insights into your running system. Especially in those cases where
topologies are still under development, or where people are curious about how
these mysterious streaming systems work, UI-supported development and
tracking / monitoring is a big win.

Having these requirements in mind, I don't see an integration point with
Apache NiFi, as the project provides not only a UI/orchestration layer
but also a runtime etc. An integration like the one with Spark (shown
right at the beginning of the demo) seems to be possible, but what I'd like
to see is a more Apache NiFi-like UI layer provided exclusively for Apache
Flink. It would bring Flink very (!!!) much closer to the enterprise market
as it solves some crucial requirements when it comes to development and
monitoring: no need for developers with in-depth programming knowledge,
since topologies may be built visually, and no need for a deep technical
understanding when it comes to monitoring ...

From my point of view, solving the issue includes the following aspects:

* providing an appropriate UI layer to visualize data flow / operator
graphs along with metrics (system & user-defined) exported by operators
* message tracking by integrating K/V stores and index management tools

...both probably worth more than a year of development time ;-)

But if there is an effort to implement & integrate such a feature,
text me if dev support is required.

Best regards,
  Christian


2015-09-19 12:42 GMT+02:00 Flavio Pompermaier <[hidden email]>:


Re: On integrating Flink with Apache NiFi

Flavio Pompermaier
I saw that Cascading now supports Flink, so maybe you could think about
programming against a Cascading abstraction to also get Spark and Tez
compatibility for free! What do you think?
On 22 Sep 2015 11:17, "Christian Kreutzfeldt" <[hidden email]> wrote:


Re: On integrating Flink with Apache NiFi

Kostas Tzoumas-2
I had a discussion with Joe from the NiFi community, and they are
interested in contributing a connector between NiFi and Flink. I created a
JIRA issue for that: https://issues.apache.org/jira/browse/FLINK-2740

I believe that this is the easiest and most useful integration point to
begin with, as NiFi and Flink are two quite different beasts, and ideally
you would want to use them in conjunction.

A UI to compose Flink programs would be a somewhat different topic, as that
would need to be more tailored to Flink's capabilities.

@Flavio: I am not sure I understood the question regarding Cascading.
Perhaps start a new thread about it?

On Tue, Sep 22, 2015 at 7:05 PM, Flavio Pompermaier <[hidden email]>
wrote:


Re: On integrating Flink with Apache NiFi

Flavio Pompermaier
Hi Kostas,
my question is related to the fact that Flink is now available as a Cascading
engine.
So, if I had to write a general-purpose dataflow UI, I think I'd use
Cascading or Dataflow,
so that I could draw my pipeline with the UI and write the execution code
using just one API.

Obviously this introduces some latency in the dependency update process but,
from the UI developer's perspective, I don't have to develop 3 different
connectors (Flink, Spark and Tez, for example) because they come for free
with the Cascading or Dataflow APIs. Does that make sense?

Best,
Flavio

On Tue, Sep 22, 2015 at 10:58 PM, Kostas Tzoumas <[hidden email]>
wrote:


Re: On integrating Flink with Apache NiFi

Fabian Hueske-2
That depends on your requirements.
A common interface is usually less feature-rich than the processing engines
it abstracts. For example, Cascading does not support iterations or
streaming.
If you are fine with the features (and limitations) of the common
interface, your approach might make sense.

Cheers, Fabian

2015-09-23 9:41 GMT+02:00 Flavio Pompermaier <[hidden email]>:
