Externalizing the Flink connectors

Externalizing the Flink connectors

mxm
Hi squirrels,

By now, we have numerous connectors which let you read data into
Flink or write data out of Flink.

On the streaming side we have

- RollingSink
- Flume
- Kafka
- NiFi
- RabbitMQ
- Twitter

On the batch side we have

- Avro
- Hadoop compatibility
- HBase
- HCatalog
- JDBC


Many times we would have liked to release updates to the connectors or
even create new ones in between Flink releases. This is currently not
possible because the connectors are part of the main repository.

Therefore, I have created a new repository at
https://git-wip-us.apache.org/repos/asf/flink-connectors.git. The idea
is to externalize the connectors to this repository. We can then
update and release them independently of the main Flink repository. I
think this will give us more flexibility in the development process.

What do you think about this idea?

Cheers,
Max

Re: Externalizing the Flink connectors

Fabian Hueske-2
Sounds like a good idea to me.

+1

Fabian

2015-12-10 15:31 GMT+01:00 Maximilian Michels <[hidden email]>:


Re: Externalizing the Flink connectors

Stephan Ewen
I like this a lot. It has multiple advantages:

  - Obviously more frequent connector updates without being forced to go to
a snapshot version
  - Reduced complexity and build time of the core Flink repository

We should make sure that, for example, 0.10.x connectors always work with
0.10.x Flink core releases.

Would we lose test coverage by putting the connectors into a separate
repository/Maven project?



On Thu, Dec 10, 2015 at 3:45 PM, Fabian Hueske <[hidden email]> wrote:


Re: Externalizing the Flink connectors

Aljoscha Krettek-2
In reply to this post by Fabian Hueske-2
We would need a stable interface between the connectors and Flink, and very good checks to ensure that we don't inadvertently break things.


Re: Externalizing the Flink connectors

jun aoki
The pluggable architecture is great! (Why don't we call it Flink plugins?
My 2 cents.)
It would be nice to come up with an idea of what the directory structure
should look like before we start moving connectors (plugins) over.
I also wonder what to do about versioning.
At some point, for example, a Twitter v1 connector could be compatible with
Flink 0.10 while a Flume v2 connector is compatible with trunk, etc. This
should be taken into consideration in either the directory structure or
the branching strategy.

On Thu, Dec 10, 2015 at 7:12 AM, Aljoscha Krettek <[hidden email]>
wrote:



--
-jun

Re: Externalizing the Flink connectors

Till Rohrmann
+1 from my side as well. Good idea.

On Thu, Dec 10, 2015 at 11:00 PM, jun aoki <[hidden email]> wrote:


Re: Externalizing the Flink connectors

mxm
In reply to this post by jun aoki
We should have release branches which are in sync with the release
branches in the main repository. Connectors should be compatible
across minor releases. The versioning could be of the form
"flinkversion-connectorversion", e.g. 0.10-connector1.
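As a sketch of how that scheme might look from a user's build file (the
artifact coordinates and version string below are hypothetical, assuming the
Kafka connector were the first release against the 0.10 line):

```xml
<!-- Hypothetical Maven coordinates for an externalized connector,
     versioned as "flinkversion-connectorversion". -->
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-connector-kafka</artifactId>
  <version>0.10-connector1</version>
</dependency>
```

Under this scheme, a user could tell at a glance which Flink release line a
connector targets, while the connector itself can still be re-released
independently.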

>The pluggable architecture is great! (why Don't we call it Flink plugins? my 2 cents)

We can still change the name. IMHO "Plugins" is a bit broad since this
is currently only targeted at the connectors included in Flink.

>Would we lose test coverage by putting the connectors into a separate repository/Maven project?

Not necessarily. Two possibilities:

1) Run a connectors test jar during the normal Travis tests in the
main repository
2) Trigger a Travis test run at the connectors repository upon a
commit into the main repository

Option 1 seems like the better alternative because we would
immediately see if a change breaks the connectors. Of course, if
changes are made in the connectors repository, we would also run tests
with the main repository.
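Option 1 could be sketched in the main repository's Travis configuration
roughly as follows. This is only a sketch: the clone URL is the connectors
repository mentioned above, but the Maven goals and the assumption that the
connectors build resolves the freshly installed SNAPSHOT artifacts are mine.

```yaml
# Sketch: after building and testing Flink itself, check out the
# externalized connectors repository and run its tests against the
# locally installed SNAPSHOT artifacts.
install:
  - mvn -B -DskipTests install
script:
  - mvn -B verify
  # Pull the externalized connectors and test them against this build.
  - git clone https://git-wip-us.apache.org/repos/asf/flink-connectors.git
  - cd flink-connectors && mvn -B verify
```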


Re: Externalizing the Flink connectors

Henry Saputra
I had a small chat with Till about how to help manage Flink ML library
contributions, which use Flink ML as a dependency.

I suppose if this approach is the way to go for the Flink connectors,
could we do the same for the Flink ML libraries?


- Henry

On Fri, Dec 11, 2015 at 1:33 AM, Maximilian Michels <[hidden email]> wrote:


Re: Externalizing the Flink connectors

mxm
Yes, absolutely. Setting up another repository for Flink ML would be no problem.

On Sat, Dec 12, 2015 at 1:52 AM, Henry Saputra <[hidden email]> wrote:


Re: Externalizing the Flink connectors

Robert Metzger
Regarding Max's suggestion to have version-compatible connectors: I'm not
sure we are able to maintain all connectors across different releases. I
think it's okay to have a document describing the minimum required Flink
version for each connector.

With the interface stability guarantees from 1.0 on, the number of breaking
changes will go down.

I'm against the name "plugins" because everything (documentation, code,
code comments, ...) is called "connectors" and renaming would be a pretty
breaking change. I also think that "connector" describes much better what
the whole thing is about.



On Mon, Dec 14, 2015 at 10:20 AM, Maximilian Michels <[hidden email]> wrote:


Re: Externalizing the Flink connectors

mxm
>
> Regarding Max's suggestion to have version-compatible connectors: I'm not
> sure we are able to maintain all connectors across different releases.
>

That was not my suggestion. Whenever we release, existing connectors should
be compatible with that release. Otherwise, they should be removed from the
release branch. This doesn't imply every connector version should be
compatible across all releases.

On Mon, Dec 14, 2015 at 10:39 AM, Robert Metzger <[hidden email]>
wrote:

Re: Externalizing the Flink connectors

Henry Saputra
Yes, that would be the way to go.

We could follow the Cask CDAP Hydrator plugins repository [1], which
supports different plugins running in their main CDAP Hydrator [2] product.

- Henry

[1] https://github.com/caskdata/hydrator-plugins
[2] https://github.com/caskdata/cdap

On Mon, Dec 14, 2015 at 1:49 AM, Maximilian Michels <[hidden email]> wrote:

>>
>> Regarding Max suggestion to have version compatible connectors: I'm not
>> sure if we are able to maintain all connectors across different releases.
>>
>
> That was not my suggestion. Whenever we release, existing connectors should
> be compatible with that release. Otherwise, they should be removed from the
> release branch. This doesn't imply every connector version should be
> compatible across all releases.
>
> On Mon, Dec 14, 2015 at 10:39 AM, Robert Metzger <[hidden email]>
> wrote:
>> Regarding Max's suggestion to have version-compatible connectors: I'm
>> not sure if we are able to maintain all connectors across different
>> releases. I think it's okay to have a document describing the minimum
>> required Flink version for each connector.
>>
>> With the interface stability guarantees from 1.0 on, the number of
>> breaking changes will go down.
>>
>> I'm against the name "plugins" because everything (documentation, code,
>> code comments, ...) is called "connectors", and renaming would be quite
>> disruptive. I also think that "connector" describes much better what
>> the whole thing is about.
>>
>>
>>
>> On Mon, Dec 14, 2015 at 10:20 AM, Maximilian Michels <[hidden email]>
>> wrote:
>>
>>> Yes, absolutely. Setting up another repository for Flink ML would be no
>>> problem.
>>>
>>> On Sat, Dec 12, 2015 at 1:52 AM, Henry Saputra <[hidden email]>
>>> wrote:
>>> > I had a small chat with Till about how to help manage contributions
>>> > to the Flink ML libraries, which use Flink ML as a dependency.
>>> >
>>> > I suppose, if this approach is the way to go for the Flink
>>> > connectors, could we do the same for the Flink ML libraries?
>>> >
>>> >
>>> > - Henry
>>> >
>>> > On Fri, Dec 11, 2015 at 1:33 AM, Maximilian Michels <[hidden email]>
>>> > wrote:
>>> >> We should have release branches which are in sync with the release
>>> >> branches in the main repository. Connectors should be compatible
>>> >> across minor releases. The versioning could be of the form
>>> >> "flinkversion-connectorversion", e.g. 0.10-connector1.
>>> >>
>>> >>>The pluggable architecture is great! (why don't we call it Flink
>>> >>>plugins? my 2 cents)
>>> >>
>>> >> We can still change the name. IMHO "Plugins" is a bit broad since this
>>> >> is currently only targeted at the connectors included in Flink.
>>> >>
>>> >>>Would we lose test coverage by putting the connectors into a
>>> >>>separate repository/Maven project?
>>> >>
>>> >> Not necessarily. Two possibilities:
>>> >>
>>> >> 1) Run a connectors test jar during the normal Travis tests in the
>>> >> main repository
>>> >> 2) Trigger a Travis test run at the connectors repository upon a
>>> >> commit into the main repository
>>> >>
>>> >> Option 1 seems like the better alternative because we would
>>> >> immediately see if a change breaks the connectors. Of course, if
>>> >> changes are made in the connectors repository, we would also run tests
>>> >> with the main repository.
>>> >>
>>> >> On Thu, Dec 10, 2015 at 11:00 PM, jun aoki <[hidden email]> wrote:
>>> >>> The pluggable architecture is great! (why don't we call it Flink
>>> >>> plugins? my 2 cents)
>>> >>> It would be nice if we came up with an idea of what the directory
>>> >>> structure should look like before we start adding connectors
>>> >>> (plugins).
>>> >>> I also wonder what to do about versioning.
>>> >>> At some point, for example, the Twitter v1 connector could be
>>> >>> compatible with Flink 0.10 while the Flume v2 connector could be
>>> >>> compatible with trunk, etc. This should be taken into consideration
>>> >>> in either the directory structure or the branching strategy.
>>> >>>
>>> >>> On Thu, Dec 10, 2015 at 7:12 AM, Aljoscha Krettek <[hidden email]>
>>> >>> wrote:
>>> >>>
>>> >>>> We would need to have a stable interface between the connectors
>>> >>>> and Flink and have very good checks that ensure that we don't
>>> >>>> inadvertently break things.
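A stable surface of the kind Aljoscha describes could be as small as a frozen sink SPI that external connectors implement. The sketch below is purely illustrative: the interface and class names are made up for this example and are not actual Flink API; the point is only that once such an interface is published, its signatures stay fixed so externally released connectors keep compiling against newer Flink versions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical "frozen" connector SPI (illustrative, not real Flink API):
// once published, these signatures never change, so connectors released
// from the external repository remain compatible with newer Flink.
interface ConnectorSink<T> {
    void open() throws Exception;
    void invoke(T record) throws Exception; // called once per element
    void close() throws Exception;
}

// A toy connector that would live in the external repository.
class CollectingSink implements ConnectorSink<String> {
    final List<String> collected = new ArrayList<>();
    @Override public void open() {}
    @Override public void invoke(String record) { collected.add(record); }
    @Override public void close() {}
}

public class ConnectorSketch {
    // Drives any ConnectorSink through its lifecycle, standing in for
    // the runtime side of the stable interface.
    public static List<String> runPipeline(String... records) throws Exception {
        CollectingSink sink = new CollectingSink();
        sink.open();
        for (String r : records) {
            sink.invoke(r);
        }
        sink.close();
        return sink.collected;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runPipeline("a", "b", "c")); // [a, b, c]
    }
}
```

With a contract like this, the "very good checks" could simply be compatibility tests in the main repository that exercise each external connector through the frozen interface.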