Tagging Flink classes with InterfaceAudience and InterfaceStability

classic Classic list List threaded Threaded
25 messages Options
12
Reply | Threaded
Open this post in threaded view
|

Re: Tagging Flink classes with InterfaceAudience and InterfaceStability

Robert Metzger
I left out the classes in memory except for the Input/Output views.

Or course, we should try to minimize the stable classes, but user programs
get a lot of extension points in Flink ;)

I've opened a pull request with my current suggestion:
https://github.com/apache/flink/pull/1426


On Tue, Dec 1, 2015 at 2:13 PM, Maximilian Michels <[hidden email]> wrote:

> Thanks for the explanation. I was just wondering how you approached
> the problem. Going through class-by-class makes sense.
>
> Not sure whether we want to make "org.apache.flink.core.memory" and
> "org.apache.flink.api.common.distributions" stable interfaces. I would
> think these qualify more as experimental public.
>
> Shouldn't we try to minimize the number of classes we declare stable?
> Of course it is nice to offer a stable interface to developers but it
> might also come in the way of the Flink developers..
>
>
> On Mon, Nov 30, 2015 at 10:31 AM, Robert Metzger <[hidden email]>
> wrote:
> > Hi Max,
> >
> > classes in flink-scala are annotated as well, and its also in the list :)
> >
> > I considered classes in flink-core, flink-runtime, flink-scala,
> > flink-streaming-java, flink-streaming-scala,
> > flink-connector-kafka, flink-connector-filesystem, flink-avro and
> > flink-hadoop-compatibility.
> > I think there is no clear definition for a public interface, that's why I
> > just decided on a class-by-class basis.
> >
> > Classes I left out / I was uncertain with:
> >
> >    - org.apache.flink.api.common.distributions
> >    - only some Input/output classes in org.apache.flink.api.common.io
> >    - org.apache.flink.api.common.operators
> >    - only the TypeInformation in org.apache.flink.api.common.typeinfo
> (not
> >    the Atomic, basic, integer, .. type infos)
> >    - most in org.apache.flink.core.memory (except Input/output view)
> >    - I didn’t add the parsers in org.apache.flink.types.parser
> >
> >
> >
> >
> > On Mon, Nov 30, 2015 at 10:19 AM, Maximilian Michels <[hidden email]>
> wrote:
> >
> >> Thank for your getting us started on annotating the API. The list
> >> looks good so far. I have the feeling it could even be extended a bit.
> >> Just curious, how did you choose which classes you annotate? Did you
> >> go through all the classes in flink-core, flink-java, and
> >> flink-clients Maven projects?
> >>
> >> What about flink-scala? Shouldn't it be annotated as well?
> >>
> >> On Fri, Nov 27, 2015 at 4:47 PM, Robert Metzger <[hidden email]>
> >> wrote:
> >> > Okay, I'll introduce an annotation for experimental interfaces and
> I'll
> >> > make everything we have deprecated experimental.
> >> >
> >> > On Fri, Nov 27, 2015 at 10:40 AM, Aljoscha Krettek <
> [hidden email]>
> >> > wrote:
> >> >
> >> >> I still think we also need an annotation to mark public interfaces as
> >> >> experimental. For example for the windowing/triggers I would like to
> use
> >> >> that.
> >> >> > On 25 Nov 2015, at 01:23, Robert Metzger <[hidden email]>
> wrote:
> >> >> >
> >> >> > Thank you Nick. I'll look into the check_compatiblilty.sh script to
> >> see
> >> >> > which tools are used.
> >> >> > I think finding breaking API changes immediately is a better
> process
> >> then
> >> >> > reworking the APIs before a release.
> >> >> >
> >> >> > As you can see from my email response times (2 days since your
> email),
> >> >> I'm
> >> >> > probably too overloaded right now to participate in the Yetus
> project
> >> ...
> >> >> > Sadly.
> >> >> >
> >> >> >
> >> >> > @others: I know its not the most interesting thing to go through my
> >> list
> >> >> of
> >> >> > stable interfaces, but keep in mind that we have to maintain the
> stuff
> >> >> for
> >> >> > probably quite some time, so it would be good to have more than one
> >> pair
> >> >> of
> >> >> > eyes looking at it :)
> >> >> >
> >> >> >
> >> >> > On Mon, Nov 23, 2015 at 6:20 PM, Nick Dimiduk <[hidden email]>
> >> >> wrote:
> >> >> >
> >> >> >>>
> >> >> >>> Do you know if Hadoop/HBase is also using a maven plugin to fail
> a
> >> >> build
> >> >> >> on
> >> >> >>> breaking API changes? I would really like to have such a
> >> functionality
> >> >> in
> >> >> >>> Flink, because we can spot breaking changes very early.
> >> >> >>
> >> >> >>
> >> >> >> I don't think we have maven integration for this as of yet. We
> >> release
> >> >> >> managers run a script $HBASE/dev-support/check_compatibility.sh
> that
> >> >> >> generates a source and binary compatibility report. Issues are
> then
> >> >> >> resolved during the period leading up to the release candidate.
> >> >> >>
> >> >> >> I think Hadoop is relying on a "QA bot" which reads patches from
> JIRA
> >> >> and
> >> >> >>> then does these
> >> >> >>> checks?
> >> >> >>>
> >> >> >>
> >> >> >> The "QA bot" is just a collection of shell scripts used during
> "Patch
> >> >> >> Available" status when a patch has been attached to JIRA or when
> a PR
> >> >> has
> >> >> >> been submitted through github. The check_compatibility script
> could
> >> be
> >> >> >> included in that automation, I don't see why not. Maybe you'd
> like to
> >> >> open
> >> >> >> a YETUS ticket :)
> >> >> >>
> >> >> >> I've pushed a branch to my own GitHub account with all classes I
> >> would
> >> >> make
> >> >> >>> public annotated:
> >> >> >>>
> >> >> >>>
> >> >> >>
> >> >>
> >>
> https://github.com/apache/flink/compare/master...rmetzger:interface_stability_revapi?expand=1
> >> >> >>> Since this is really hard to read, I (half-automated) generated
> the
> >> >> >>> following list of annotated classes:
> >> >> >>>
> >> >> >>>
> >> >> >>
> >> >>
> >>
> https://github.com/rmetzger/flink/blob/interface_stability_revapi/annotations.md
> >> >> >>>
> >> >> >>> Please let me know if you would like to include or exclude
> classes
> >> from
> >> >> >>> that list.
> >> >> >>> Also, let me know which methods (in stable classes) you would
> mark
> >> as
> >> >> >>> experimental.
> >> >> >>>
> >> >> >>>
> >> >> >>>
> >> >> >>>
> >> >> >>> On Wed, Nov 11, 2015 at 10:17 AM, Aljoscha Krettek <
> >> >> [hidden email]>
> >> >> >>> wrote:
> >> >> >>>
> >> >> >>>> +1 for some way of declaring public interfaces as experimental.
> >> >> >>>>
> >> >> >>>>> On 10 Nov 2015, at 22:24, Stephan Ewen <[hidden email]>
> wrote:
> >> >> >>>>>
> >> >> >>>>> I think we need anyways an annotation "@PublicExperimental".
> >> >> >>>>>
> >> >> >>>>> We can make this annotation such that it can be added to
> methods
> >> and
> >> >> >>> can
> >> >> >>>>> use that to declare
> >> >> >>>>> Methods in an otherwise public class (such as DataSet) as
> >> >> >> experimental.
> >> >> >>>>>
> >> >> >>>>> On Tue, Nov 10, 2015 at 10:19 PM, Fabian Hueske <
> >> [hidden email]>
> >> >> >>>> wrote:
> >> >> >>>>>
> >> >> >>>>>> I am not sure if we always should declare complete classes as
> >> >> >>>>>> @PublicInterface.
> >> >> >>>>>> This does definitely make sense for interfaces and abstract
> >> classes
> >> >> >>>> such as
> >> >> >>>>>> MapFunction or InputFormat but not necessarily for classes
> such
> >> as
> >> >> >>>> DataSet
> >> >> >>>>>> that we might want to extend by methods which should not
> >> immediately
> >> >> >>> be
> >> >> >>>>>> considered as stable.
> >> >> >>>>>>
> >> >> >>>>>>
> >> >> >>>>>> 2015-11-10 21:36 GMT+01:00 Vasiliki Kalavri <
> >> >> >>> [hidden email]
> >> >> >>>>> :
> >> >> >>>>>>
> >> >> >>>>>>> Yes, my opinion is that we shouldn't declare the Gelly API
> >> frozen
> >> >> >>> yet.
> >> >> >>>>>>> We can reconsider when we're closer to the 1.0 release, but
> if
> >> >> >>>> possible,
> >> >> >>>>>> I
> >> >> >>>>>>> would give it some more time.
> >> >> >>>>>>>
> >> >> >>>>>>> -V.
> >> >> >>>>>>>
> >> >> >>>>>>> On 10 November 2015 at 21:06, Stephan Ewen <[hidden email]
> >
> >> >> >> wrote:
> >> >> >>>>>>>
> >> >> >>>>>>>> I think no component should be forced to be stable. It
> should
> >> be
> >> >> >> an
> >> >> >>>>>>>> individual decision for each component, and in some cases
> even
> >> for
> >> >> >>>>>>>> individual classes.
> >> >> >>>>>>>>
> >> >> >>>>>>>> @Vasia If you think Gelly should not be declared
> >> interface-frozen,
> >> >> >>>> then
> >> >> >>>>>>>> this is a good point to raise and this should definitely be
> >> >> >>> reflected.
> >> >> >>>>>>>> There is no point in declaring certain APIs as frozen when
> we
> >> are
> >> >> >>> not
> >> >> >>>>>> yet
> >> >> >>>>>>>> confident they have converged.
> >> >> >>>>>>>>
> >> >> >>>>>>>>
> >> >> >>>>>>>>
> >> >> >>>>>>>> On Tue, Nov 10, 2015 at 8:39 PM, Vasiliki Kalavri <
> >> >> >>>>>>>> [hidden email]
> >> >> >>>>>>>>> wrote:
> >> >> >>>>>>>>
> >> >> >>>>>>>>> Hi Robert,
> >> >> >>>>>>>>>
> >> >> >>>>>>>>> thanks for bringing this up!
> >> >> >>>>>>>>>
> >> >> >>>>>>>>> I generally like the idea, but I wouldn't rush to annotate
> the
> >> >> >>> Gelly
> >> >> >>>>>>>>> classes yet. Gelly hasn't had that many users and I'm quite
> >> sure
> >> >> >>>>>> we'll
> >> >> >>>>>>>> find
> >> >> >>>>>>>>> things to improve as it gets more exposure.
> >> >> >>>>>>>>> TBH, I think it's quite unfair to force Gelly (also e.g.
> ML,
> >> >> >> Table)
> >> >> >>>>>> to
> >> >> >>>>>>> a
> >> >> >>>>>>>>> "1.0" status (from an API stability point of view) since
> >> they're
> >> >> >>>>>> really
> >> >> >>>>>>>>> young compared to the other Flink APIs.
> >> >> >>>>>>>>>
> >> >> >>>>>>>>> Cheers,
> >> >> >>>>>>>>> Vasia.
> >> >> >>>>>>>>> On Nov 10, 2015 8:04 PM, "Robert Metzger" <
> >> [hidden email]>
> >> >> >>>>>> wrote:
> >> >> >>>>>>>>>
> >> >> >>>>>>>>>> Hi everyone,
> >> >> >>>>>>>>>>
> >> >> >>>>>>>>>> I would like to bring this discussion back to your
> attention
> >> as
> >> >> >> we
> >> >> >>>>>>> seem
> >> >> >>>>>>>>> to
> >> >> >>>>>>>>>> approach the 1.0 release of Flink.
> >> >> >>>>>>>>>>
> >> >> >>>>>>>>>> My suggestion back in January was to annotate all classes,
> >> but I
> >> >> >>>>>>> think
> >> >> >>>>>>>>>> it'll be more feasible to just annotate public classes.
> >> >> >>>>>>>>>> So how about adding an annotation @PublicInterface
> >> >> >>>>>>>>>>
> >> >> >>>>>>>>>> For @PublicInterface, I would annotate classes such as:
> >> DataSet,
> >> >> >>>>>>>>>> DataStream, ExecutionEnvironment, InputFormat,
> MapFunction,
> >> >> >>>>>>> FileSystems
> >> >> >>>>>>>>> but
> >> >> >>>>>>>>>> also Gelly for example.
> >> >> >>>>>>>>>>
> >> >> >>>>>>>>>> I would not annotate as public components such as ML,
> Storm
> >> >> >>>>>>>>> compatibility,
> >> >> >>>>>>>>>> internals from runtime, yarn, optimizer.
> >> >> >>>>>>>>>>
> >> >> >>>>>>>>>>
> >> >> >>>>>>>>>> From a tooling perspective, I've looked into different
> maven
> >> >> >>>>>> plugins
> >> >> >>>>>>>> and
> >> >> >>>>>>>>>> java libraries and I found
> https://github.com/siom79/japicmp
> >> to
> >> >> >>> be
> >> >> >>>>>>> the
> >> >> >>>>>>>>>> closest to our needs. I actually opened a pull request to
> the
> >> >> >>>>>> project
> >> >> >>>>>>>> to
> >> >> >>>>>>>>>> allow inclusion/exclusion of classes based on annotations.
> >> Lets
> >> >> >>>>>> hope
> >> >> >>>>>>> it
> >> >> >>>>>>>>>> gets merged.
> >> >> >>>>>>>>>>
> >> >> >>>>>>>>>> Does everybody agree with adding just the @PublicInterface
> >> >> >>>>>>> annotation?
> >> >> >>>>>>>>>>
> >> >> >>>>>>>>>> Note that I'll add the annotation on a class-level, making
> >> the
> >> >> >>>>>> entire
> >> >> >>>>>>>>> class
> >> >> >>>>>>>>>> either public or private (from a stability point of view).
> >> If we
> >> >> >>>>>>> need a
> >> >> >>>>>>>>>> more fine-grained annotation, we have to add a second
> >> >> >>>>>>> @PrivateInterface
> >> >> >>>>>>>>>> annotation which we'll only apply to certain methods.
> >> >> >>>>>>>>>>
> >> >> >>>>>>>>>> The next step is that I'm going to open a pull request
> with
> >> all
> >> >> >>>>>>> classes
> >> >> >>>>>>>>>> annotated that I consider public.
> >> >> >>>>>>>>>>
> >> >> >>>>>>>>>>
> >> >> >>>>>>>>>> On Fri, Jan 30, 2015 at 7:10 PM, Henry Saputra <
> >> >> >>>>>>>> [hidden email]>
> >> >> >>>>>>>>>> wrote:
> >> >> >>>>>>>>>>
> >> >> >>>>>>>>>>> I like the idea. But would love to have different name
> for
> >> the
> >> >> >>>>>>>>>>> "LimitedPrivate" to make it easier to distinguish.
> >> >> >>>>>>>>>>> How about "Module" or "Package" ?
> >> >> >>>>>>>>>>>
> >> >> >>>>>>>>>>> - Henry
> >> >> >>>>>>>>>>>
> >> >> >>>>>>>>>>> On Wed, Jan 28, 2015 at 1:29 AM, Robert Metzger <
> >> >> >>>>>>> [hidden email]
> >> >> >>>>>>>>>
> >> >> >>>>>>>>>>> wrote:
> >> >> >>>>>>>>>>>> I think in Hadoop they use LimitedPrivate for the
> different
> >> >> >>>>>>>>> components
> >> >> >>>>>>>>>> of
> >> >> >>>>>>>>>>>> the project.
> >> >> >>>>>>>>>>>> For example LimitedPrivate("yarn").
> >> >> >>>>>>>>>>>> Here is a very good documentation on the topic:
> >> >> >>>>>>>>>>>>
> >> >> >>>>>>>>>>>
> >> >> >>>>>>>>>>
> >> >> >>>>>>>>>
> >> >> >>>>>>>>
> >> >> >>>>>>>
> >> >> >>>>>>
> >> >> >>>>
> >> >> >>>
> >> >> >>
> >> >>
> >>
> https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/InterfaceClassification.html
> >> >> >>>>>>>>>>>>
> >> >> >>>>>>>>>>>> On Tue, Jan 27, 2015 at 3:58 PM, Alexander Alexandrov <
> >> >> >>>>>>>>>>>> [hidden email]> wrote:
> >> >> >>>>>>>>>>>>
> >> >> >>>>>>>>>>>>> I don't get the difference between Private and
> >> >> >> LimitedPrivate,
> >> >> >>>>>>> but
> >> >> >>>>>>>>>>>>> otherwise seems like quite a nice idea.
> >> >> >>>>>>>>>>>>>
> >> >> >>>>>>>>>>>>> It will be also good if we can agree upon what these
> tags
> >> >> >>>>>>> actually
> >> >> >>>>>>>>>> mean
> >> >> >>>>>>>>>>> and
> >> >> >>>>>>>>>>>>> add this meaning to the documentation.
> >> >> >>>>>>>>>>>>>
> >> >> >>>>>>>>>>>>> 2015-01-27 15:46 GMT+01:00 Robert Metzger <
> >> >> >>>>>> [hidden email]
> >> >> >>>>>>>> :
> >> >> >>>>>>>>>>>>>
> >> >> >>>>>>>>>>>>>> Hi,
> >> >> >>>>>>>>>>>>>>
> >> >> >>>>>>>>>>>>>> Hadoop has annotations for tagging the stability and
> >> >> >>>>>> audience
> >> >> >>>>>>> of
> >> >> >>>>>>>>>>> classes
> >> >> >>>>>>>>>>>>>> and methods.
> >> >> >>>>>>>>>>>>>>
> >> >> >>>>>>>>>>>>>> Through that, you can have @InterfaceAudience.Public,
> >> >> >>>>>> Private,
> >> >> >>>>>>>>>>>>>> LimitedPrivate
> >> >> >>>>>>>>>>>>>> and also @InterfaceStability.Evolving, Unstable, and
> >> Stable.
> >> >> >>>>>>>>>>>>>>
> >> >> >>>>>>>>>>>>>> I guess there are tools which allow to automatically
> >> check
> >> >> >>>>>> if
> >> >> >>>>>>>>>>> interfaces,
> >> >> >>>>>>>>>>>>>> which are marked as Stable have been changed between
> >> >> >>>>>> releases
> >> >> >>>>>>>> (or
> >> >> >>>>>>>>> by
> >> >> >>>>>>>>>>> pull
> >> >> >>>>>>>>>>>>>> requests).
> >> >> >>>>>>>>>>>>>>
> >> >> >>>>>>>>>>>>>> I think such annotations are crucial if we want to
> >> guarantee
> >> >> >>>>>>>>>> interface
> >> >> >>>>>>>>>>>>>> stability between releases.
> >> >> >>>>>>>>>>>>>>
> >> >> >>>>>>>>>>>>>> What do you think? Should we add those annotations?
> Which
> >> >> >>>>>> one
> >> >> >>>>>>>>> would
> >> >> >>>>>>>>>>> you
> >> >> >>>>>>>>>>>>>> like to add?
> >> >> >>>>>>>>>>>>>>
> >> >> >>>>>>>>>>>>>>
> >> >> >>>>>>>>>>>>>> Robert
> >> >> >>>>>>>>>>>>>>
> >> >> >>>>>>>>>>>>>
> >> >> >>>>>>>>>>>
> >> >> >>>>>>>>>>
> >> >> >>>>>>>>>
> >> >> >>>>>>>>
> >> >> >>>>>>>
> >> >> >>>>>>
> >> >> >>>>
> >> >> >>>>
> >> >> >>>
> >> >> >>
> >> >>
> >> >>
> >>
>
Reply | Threaded
Open this post in threaded view
|

Re: Tagging Flink classes with InterfaceAudience and InterfaceStability

Robert Metzger
The PR has been open for a while, Stephan, Aljoscha and Henry looked over
it and I've addressed all comments by them:
https://github.com/apache/flink/pull/1427

I would like to merge the pull request adding the "flink-annotations"
module and annotating all classes in "flink-core" soon (say, next 24-48
hours).
We can still make changes once its merged, but I would like to get this one
done and start discussing the remaining modules and merge them as well
afterwards.



On Tue, Dec 1, 2015 at 2:20 PM, Robert Metzger <[hidden email]> wrote:

> I left out the classes in memory except for the Input/Output views.
>
> Or course, we should try to minimize the stable classes, but user programs
> get a lot of extension points in Flink ;)
>
> I've opened a pull request with my current suggestion:
> https://github.com/apache/flink/pull/1426
>
>
> On Tue, Dec 1, 2015 at 2:13 PM, Maximilian Michels <[hidden email]> wrote:
>
>> Thanks for the explanation. I was just wondering how you approached
>> the problem. Going through class-by-class makes sense.
>>
>> Not sure whether we want to make "org.apache.flink.core.memory" and
>> "org.apache.flink.api.common.distributions" stable interfaces. I would
>> think these qualify more as experimental public.
>>
>> Shouldn't we try to minimize the number of classes we declare stable?
>> Of course it is nice to offer a stable interface to developers but it
>> might also come in the way of the Flink developers..
>>
>>
>> On Mon, Nov 30, 2015 at 10:31 AM, Robert Metzger <[hidden email]>
>> wrote:
>> > Hi Max,
>> >
>> > classes in flink-scala are annotated as well, and its also in the list
>> :)
>> >
>> > I considered classes in flink-core, flink-runtime, flink-scala,
>> > flink-streaming-java, flink-streaming-scala,
>> > flink-connector-kafka, flink-connector-filesystem, flink-avro and
>> > flink-hadoop-compatibility.
>> > I think there is no clear definition for a public interface, that's why
>> I
>> > just decided on a class-by-class basis.
>> >
>> > Classes I left out / I was uncertain with:
>> >
>> >    - org.apache.flink.api.common.distributions
>> >    - only some Input/output classes in org.apache.flink.api.common.io
>> >    - org.apache.flink.api.common.operators
>> >    - only the TypeInformation in org.apache.flink.api.common.typeinfo
>> (not
>> >    the Atomic, basic, integer, .. type infos)
>> >    - most in org.apache.flink.core.memory (except Input/output view)
>> >    - I didn’t add the parsers in org.apache.flink.types.parser
>> >
>> >
>> >
>> >
>> > On Mon, Nov 30, 2015 at 10:19 AM, Maximilian Michels <[hidden email]>
>> wrote:
>> >
>> >> Thank for your getting us started on annotating the API. The list
>> >> looks good so far. I have the feeling it could even be extended a bit.
>> >> Just curious, how did you choose which classes you annotate? Did you
>> >> go through all the classes in flink-core, flink-java, and
>> >> flink-clients Maven projects?
>> >>
>> >> What about flink-scala? Shouldn't it be annotated as well?
>> >>
>> >> On Fri, Nov 27, 2015 at 4:47 PM, Robert Metzger <[hidden email]>
>> >> wrote:
>> >> > Okay, I'll introduce an annotation for experimental interfaces and
>> I'll
>> >> > make everything we have deprecated experimental.
>> >> >
>> >> > On Fri, Nov 27, 2015 at 10:40 AM, Aljoscha Krettek <
>> [hidden email]>
>> >> > wrote:
>> >> >
>> >> >> I still think we also need an annotation to mark public interfaces
>> as
>> >> >> experimental. For example for the windowing/triggers I would like
>> to use
>> >> >> that.
>> >> >> > On 25 Nov 2015, at 01:23, Robert Metzger <[hidden email]>
>> wrote:
>> >> >> >
>> >> >> > Thank you Nick. I'll look into the check_compatiblilty.sh script
>> to
>> >> see
>> >> >> > which tools are used.
>> >> >> > I think finding breaking API changes immediately is a better
>> process
>> >> then
>> >> >> > reworking the APIs before a release.
>> >> >> >
>> >> >> > As you can see from my email response times (2 days since your
>> email),
>> >> >> I'm
>> >> >> > probably too overloaded right now to participate in the Yetus
>> project
>> >> ...
>> >> >> > Sadly.
>> >> >> >
>> >> >> >
>> >> >> > @others: I know its not the most interesting thing to go through
>> my
>> >> list
>> >> >> of
>> >> >> > stable interfaces, but keep in mind that we have to maintain the
>> stuff
>> >> >> for
>> >> >> > probably quite some time, so it would be good to have more than
>> one
>> >> pair
>> >> >> of
>> >> >> > eyes looking at it :)
>> >> >> >
>> >> >> >
>> >> >> > On Mon, Nov 23, 2015 at 6:20 PM, Nick Dimiduk <[hidden email]
>> >
>> >> >> wrote:
>> >> >> >
>> >> >> >>>
>> >> >> >>> Do you know if Hadoop/HBase is also using a maven plugin to
>> fail a
>> >> >> build
>> >> >> >> on
>> >> >> >>> breaking API changes? I would really like to have such a
>> >> functionality
>> >> >> in
>> >> >> >>> Flink, because we can spot breaking changes very early.
>> >> >> >>
>> >> >> >>
>> >> >> >> I don't think we have maven integration for this as of yet. We
>> >> release
>> >> >> >> managers run a script $HBASE/dev-support/check_compatibility.sh
>> that
>> >> >> >> generates a source and binary compatibility report. Issues are
>> then
>> >> >> >> resolved during the period leading up to the release candidate.
>> >> >> >>
>> >> >> >> I think Hadoop is relying on a "QA bot" which reads patches from
>> JIRA
>> >> >> and
>> >> >> >>> then does these
>> >> >> >>> checks?
>> >> >> >>>
>> >> >> >>
>> >> >> >> The "QA bot" is just a collection of shell scripts used during
>> "Patch
>> >> >> >> Available" status when a patch has been attached to JIRA or when
>> a PR
>> >> >> has
>> >> >> >> been submitted through github. The check_compatibility script
>> could
>> >> be
>> >> >> >> included in that automation, I don't see why not. Maybe you'd
>> like to
>> >> >> open
>> >> >> >> a YETUS ticket :)
>> >> >> >>
>> >> >> >> I've pushed a branch to my own GitHub account with all classes I
>> >> would
>> >> >> make
>> >> >> >>> public annotated:
>> >> >> >>>
>> >> >> >>>
>> >> >> >>
>> >> >>
>> >>
>> https://github.com/apache/flink/compare/master...rmetzger:interface_stability_revapi?expand=1
>> >> >> >>> Since this is really hard to read, I (half-automated) generated
>> the
>> >> >> >>> following list of annotated classes:
>> >> >> >>>
>> >> >> >>>
>> >> >> >>
>> >> >>
>> >>
>> https://github.com/rmetzger/flink/blob/interface_stability_revapi/annotations.md
>> >> >> >>>
>> >> >> >>> Please let me know if you would like to include or exclude
>> classes
>> >> from
>> >> >> >>> that list.
>> >> >> >>> Also, let me know which methods (in stable classes) you would
>> mark
>> >> as
>> >> >> >>> experimental.
>> >> >> >>>
>> >> >> >>>
>> >> >> >>>
>> >> >> >>>
>> >> >> >>> On Wed, Nov 11, 2015 at 10:17 AM, Aljoscha Krettek <
>> >> >> [hidden email]>
>> >> >> >>> wrote:
>> >> >> >>>
>> >> >> >>>> +1 for some way of declaring public interfaces as experimental.
>> >> >> >>>>
>> >> >> >>>>> On 10 Nov 2015, at 22:24, Stephan Ewen <[hidden email]>
>> wrote:
>> >> >> >>>>>
>> >> >> >>>>> I think we need anyways an annotation "@PublicExperimental".
>> >> >> >>>>>
>> >> >> >>>>> We can make this annotation such that it can be added to
>> methods
>> >> and
>> >> >> >>> can
>> >> >> >>>>> use that to declare
>> >> >> >>>>> Methods in an otherwise public class (such as DataSet) as
>> >> >> >> experimental.
>> >> >> >>>>>
>> >> >> >>>>> On Tue, Nov 10, 2015 at 10:19 PM, Fabian Hueske <
>> >> [hidden email]>
>> >> >> >>>> wrote:
>> >> >> >>>>>
>> >> >> >>>>>> I am not sure if we always should declare complete classes as
>> >> >> >>>>>> @PublicInterface.
>> >> >> >>>>>> This does definitely make sense for interfaces and abstract
>> >> classes
>> >> >> >>>> such as
>> >> >> >>>>>> MapFunction or InputFormat but not necessarily for classes
>> such
>> >> as
>> >> >> >>>> DataSet
>> >> >> >>>>>> that we might want to extend by methods which should not
>> >> immediately
>> >> >> >>> be
>> >> >> >>>>>> considered as stable.
>> >> >> >>>>>>
>> >> >> >>>>>>
>> >> >> >>>>>> 2015-11-10 21:36 GMT+01:00 Vasiliki Kalavri <
>> >> >> >>> [hidden email]
>> >> >> >>>>> :
>> >> >> >>>>>>
>> >> >> >>>>>>> Yes, my opinion is that we shouldn't declare the Gelly API
>> >> frozen
>> >> >> >>> yet.
>> >> >> >>>>>>> We can reconsider when we're closer to the 1.0 release, but
>> if
>> >> >> >>>> possible,
>> >> >> >>>>>> I
>> >> >> >>>>>>> would give it some more time.
>> >> >> >>>>>>>
>> >> >> >>>>>>> -V.
>> >> >> >>>>>>>
>> >> >> >>>>>>> On 10 November 2015 at 21:06, Stephan Ewen <
>> [hidden email]>
>> >> >> >> wrote:
>> >> >> >>>>>>>
>> >> >> >>>>>>>> I think no component should be forced to be stable. It
>> should
>> >> be
>> >> >> >> an
>> >> >> >>>>>>>> individual decision for each component, and in some cases
>> even
>> >> for
>> >> >> >>>>>>>> individual classes.
>> >> >> >>>>>>>>
>> >> >> >>>>>>>> @Vasia If you think Gelly should not be declared
>> >> interface-frozen,
>> >> >> >>>> then
>> >> >> >>>>>>>> this is a good point to raise and this should definitely be
>> >> >> >>> reflected.
>> >> >> >>>>>>>> There is no point in declaring certain APIs as frozen when
>> we
>> >> are
>> >> >> >>> not
>> >> >> >>>>>> yet
>> >> >> >>>>>>>> confident they have converged.
>> >> >> >>>>>>>>
>> >> >> >>>>>>>>
>> >> >> >>>>>>>>
>> >> >> >>>>>>>> On Tue, Nov 10, 2015 at 8:39 PM, Vasiliki Kalavri <
>> >> >> >>>>>>>> [hidden email]
>> >> >> >>>>>>>>> wrote:
>> >> >> >>>>>>>>
>> >> >> >>>>>>>>> Hi Robert,
>> >> >> >>>>>>>>>
>> >> >> >>>>>>>>> thanks for bringing this up!
>> >> >> >>>>>>>>>
>> >> >> >>>>>>>>> I generally like the idea, but I wouldn't rush to
>> annotate the
>> >> >> >>> Gelly
>> >> >> >>>>>>>>> classes yet. Gelly hasn't had that many users and I'm
>> quite
>> >> sure
>> >> >> >>>>>> we'll
>> >> >> >>>>>>>> find
>> >> >> >>>>>>>>> things to improve as it gets more exposure.
>> >> >> >>>>>>>>> TBH, I think it's quite unfair to force Gelly (also e.g.
>> ML,
>> >> >> >> Table)
>> >> >> >>>>>> to
>> >> >> >>>>>>> a
>> >> >> >>>>>>>>> "1.0" status (from an API stability point of view) since
>> >> they're
>> >> >> >>>>>> really
>> >> >> >>>>>>>>> young compared to the other Flink APIs.
>> >> >> >>>>>>>>>
>> >> >> >>>>>>>>> Cheers,
>> >> >> >>>>>>>>> Vasia.
>> >> >> >>>>>>>>> On Nov 10, 2015 8:04 PM, "Robert Metzger" <
>> >> [hidden email]>
>> >> >> >>>>>> wrote:
>> >> >> >>>>>>>>>
>> >> >> >>>>>>>>>> Hi everyone,
>> >> >> >>>>>>>>>>
>> >> >> >>>>>>>>>> I would like to bring this discussion back to your
>> attention
>> >> as
>> >> >> >> we
>> >> >> >>>>>>> seem
>> >> >> >>>>>>>>> to
>> >> >> >>>>>>>>>> approach the 1.0 release of Flink.
>> >> >> >>>>>>>>>>
>> >> >> >>>>>>>>>> My suggestion back in January was to annotate all
>> classes,
>> >> but I
>> >> >> >>>>>>> think
>> >> >> >>>>>>>>>> it'll be more feasible to just annotate public classes.
>> >> >> >>>>>>>>>> So how about adding an annotation @PublicInterface
>> >> >> >>>>>>>>>>
>> >> >> >>>>>>>>>> For @PublicInterface, I would annotate classes such as:
>> >> DataSet,
>> >> >> >>>>>>>>>> DataStream, ExecutionEnvironment, InputFormat,
>> MapFunction,
>> >> >> >>>>>>> FileSystems
>> >> >> >>>>>>>>> but
>> >> >> >>>>>>>>>> also Gelly for example.
>> >> >> >>>>>>>>>>
>> >> >> >>>>>>>>>> I would not annotate as public components such as ML,
>> Storm
>> >> >> >>>>>>>>> compatibility,
>> >> >> >>>>>>>>>> internals from runtime, yarn, optimizer.
>> >> >> >>>>>>>>>>
>> >> >> >>>>>>>>>>
>> >> >> >>>>>>>>>> From a tooling perspective, I've looked into different
>> maven
>> >> >> >>>>>> plugins
>> >> >> >>>>>>>> and
>> >> >> >>>>>>>>>> java libraries and I found
>> https://github.com/siom79/japicmp
>> >> to
>> >> >> >>> be
>> >> >> >>>>>>> the
>> >> >> >>>>>>>>>> closest to our needs. I actually opened a pull request
>> to the
>> >> >> >>>>>> project
>> >> >> >>>>>>>> to
>> >> >> >>>>>>>>>> allow inclusion/exclusion of classes based on
>> annotations.
>> >> Lets
>> >> >> >>>>>> hope
>> >> >> >>>>>>> it
>> >> >> >>>>>>>>>> gets merged.
>> >> >> >>>>>>>>>>
>> >> >> >>>>>>>>>> Does everybody agree with adding just the
>> @PublicInterface
>> >> >> >>>>>>> annotation?
>> >> >> >>>>>>>>>>
>> >> >> >>>>>>>>>> Note that I'll add the annotation on a class-level,
>> making
>> >> the
>> >> >> >>>>>> entire
>> >> >> >>>>>>>>> class
>> >> >> >>>>>>>>>> either public or private (from a stability point of
>> view).
>> >> If we
>> >> >> >>>>>>> need a
>> >> >> >>>>>>>>>> more fine-grained annotation, we have to add a second
>> >> >> >>>>>>> @PrivateInterface
>> >> >> >>>>>>>>>> annotation which we'll only apply to certain methods.
>> >> >> >>>>>>>>>>
>> >> >> >>>>>>>>>> The next step is that I'm going to open a pull request
>> with
>> >> all
>> >> >> >>>>>>> classes
>> >> >> >>>>>>>>>> annotated that I consider public.
>> >> >> >>>>>>>>>>
>> >> >> >>>>>>>>>>
>> >> >> >>>>>>>>>> On Fri, Jan 30, 2015 at 7:10 PM, Henry Saputra <
>> >> >> >>>>>>>> [hidden email]>
>> >> >> >>>>>>>>>> wrote:
>> >> >> >>>>>>>>>>
>> >> >> >>>>>>>>>>> I like the idea. But would love to have different name
>> for
>> >> the
>> >> >> >>>>>>>>>>> "LimitedPrivate" to make it easier to distinguish.
>> >> >> >>>>>>>>>>> How about "Module" or "Package" ?
>> >> >> >>>>>>>>>>>
>> >> >> >>>>>>>>>>> - Henry
>> >> >> >>>>>>>>>>>
>> >> >> >>>>>>>>>>> On Wed, Jan 28, 2015 at 1:29 AM, Robert Metzger <
>> >> >> >>>>>>> [hidden email]
>> >> >> >>>>>>>>>
>> >> >> >>>>>>>>>>> wrote:
>> >> >> >>>>>>>>>>>> I think in Hadoop they use LimitedPrivate for the
>> different
>> >> >> >>>>>>>>> components
>> >> >> >>>>>>>>>> of
>> >> >> >>>>>>>>>>>> the project.
>> >> >> >>>>>>>>>>>> For example LimitedPrivate("yarn").
>> >> >> >>>>>>>>>>>> Here is a very good documentation on the topic:
>> >> >> >>>>>>>>>>>>
>> >> >> >>>>>>>>>>>
>> >> >> >>>>>>>>>>
>> >> >> >>>>>>>>>
>> >> >> >>>>>>>>
>> >> >> >>>>>>>
>> >> >> >>>>>>
>> >> >> >>>>
>> >> >> >>>
>> >> >> >>
>> >> >>
>> >>
>> https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/InterfaceClassification.html
>> >> >> >>>>>>>>>>>>
>> >> >> >>>>>>>>>>>> On Tue, Jan 27, 2015 at 3:58 PM, Alexander Alexandrov <
>> >> >> >>>>>>>>>>>> [hidden email]> wrote:
>> >> >> >>>>>>>>>>>>
>> >> >> >>>>>>>>>>>>> I don't get the difference between Private and
>> >> >> >> LimitedPrivate,
>> >> >> >>>>>>> but
>> >> >> >>>>>>>>>>>>> otherwise seems like quite a nice idea.
>> >> >> >>>>>>>>>>>>>
>> >> >> >>>>>>>>>>>>> It will be also good if we can agree upon what these
>> tags
>> >> >> >>>>>>> actually
>> >> >> >>>>>>>>>> mean
>> >> >> >>>>>>>>>>> and
>> >> >> >>>>>>>>>>>>> add this meaning to the documentation.
>> >> >> >>>>>>>>>>>>>
>> >> >> >>>>>>>>>>>>> 2015-01-27 15:46 GMT+01:00 Robert Metzger <
>> >> >> >>>>>> [hidden email]
>> >> >> >>>>>>>> :
>> >> >> >>>>>>>>>>>>>
>> >> >> >>>>>>>>>>>>>> Hi,
>> >> >> >>>>>>>>>>>>>>
>> >> >> >>>>>>>>>>>>>> Hadoop has annotations for tagging the stability and
>> >> >> >>>>>> audience
>> >> >> >>>>>>> of
>> >> >> >>>>>>>>>>> classes
>> >> >> >>>>>>>>>>>>>> and methods.
>> >> >> >>>>>>>>>>>>>>
>> >> >> >>>>>>>>>>>>>> Through that, you can have @InterfaceAudience.Public,
>> >> >> >>>>>> Private,
>> >> >> >>>>>>>>>>>>>> LimitedPrivate
>> >> >> >>>>>>>>>>>>>> and also @InterfaceStability.Evolving, Unstable, and
>> >> Stable.
>> >> >> >>>>>>>>>>>>>>
>> >> >> >>>>>>>>>>>>>> I guess there are tools which allow to automatically
>> >> check
>> >> >> >>>>>> if
>> >> >> >>>>>>>>>>> interfaces,
>> >> >> >>>>>>>>>>>>>> which are marked as Stable have been changed between
>> >> >> >>>>>> releases
>> >> >> >>>>>>>> (or
>> >> >> >>>>>>>>> by
>> >> >> >>>>>>>>>>> pull
>> >> >> >>>>>>>>>>>>>> requests).
>> >> >> >>>>>>>>>>>>>>
>> >> >> >>>>>>>>>>>>>> I think such annotations are crucial if we want to
>> >> guarantee
>> >> >> >>>>>>>>>> interface
>> >> >> >>>>>>>>>>>>>> stability between releases.
>> >> >> >>>>>>>>>>>>>>
>> >> >> >>>>>>>>>>>>>> What do you think? Should we add those annotations?
>> Which
>> >> >> >>>>>> one
>> >> >> >>>>>>>>> would
>> >> >> >>>>>>>>>>> you
>> >> >> >>>>>>>>>>>>>> like to add?
>> >> >> >>>>>>>>>>>>>>
>> >> >> >>>>>>>>>>>>>>
>> >> >> >>>>>>>>>>>>>> Robert
>> >> >> >>>>>>>>>>>>>>
>> >> >> >>>>>>>>>>>>>
>> >> >> >>>>>>>>>>>
>> >> >> >>>>>>>>>>
>> >> >> >>>>>>>>>
>> >> >> >>>>>>>>
>> >> >> >>>>>>>
>> >> >> >>>>>>
>> >> >> >>>>
>> >> >> >>>>
>> >> >> >>>
>> >> >> >>
>> >> >>
>> >> >>
>> >>
>>
>
>
mxm
Reply | Threaded
Open this post in threaded view
|

Re: Tagging Flink classes with InterfaceAudience and InterfaceStability

mxm
I've also had a look. +1 to merge from my side.

On Tue, Dec 15, 2015 at 3:27 PM, Robert Metzger <[hidden email]> wrote:

> The PR has been open for a while, Stephan, Aljoscha and Henry looked over
> it and I've addressed all comments by them:
> https://github.com/apache/flink/pull/1427
>
> I would like to merge the pull request adding the "flink-annotations"
> module and annotating all classes in "flink-core" soon (say, next 24-48
> hours).
> We can still make changes once its merged, but I would like to get this one
> done and start discussing the remaining modules and merge them as well
> afterwards.
>
>
>
> On Tue, Dec 1, 2015 at 2:20 PM, Robert Metzger <[hidden email]> wrote:
>
>> I left out the classes in memory except for the Input/Output views.
>>
>> Or course, we should try to minimize the stable classes, but user programs
>> get a lot of extension points in Flink ;)
>>
>> I've opened a pull request with my current suggestion:
>> https://github.com/apache/flink/pull/1426
>>
>>
>> On Tue, Dec 1, 2015 at 2:13 PM, Maximilian Michels <[hidden email]> wrote:
>>
>>> Thanks for the explanation. I was just wondering how you approached
>>> the problem. Going through class-by-class makes sense.
>>>
>>> Not sure whether we want to make "org.apache.flink.core.memory" and
>>> "org.apache.flink.api.common.distributions" stable interfaces. I would
>>> think these qualify more as experimental public.
>>>
>>> Shouldn't we try to minimize the number of classes we declare stable?
>>> Of course it is nice to offer a stable interface to developers but it
>>> might also come in the way of the Flink developers..
>>>
>>>
>>> On Mon, Nov 30, 2015 at 10:31 AM, Robert Metzger <[hidden email]>
>>> wrote:
>>> > Hi Max,
>>> >
>>> > classes in flink-scala are annotated as well, and its also in the list
>>> :)
>>> >
>>> > I considered classes in flink-core, flink-runtime, flink-scala,
>>> > flink-streaming-java, flink-streaming-scala,
>>> > flink-connector-kafka, flink-connector-filesystem, flink-avro and
>>> > flink-hadoop-compatibility.
>>> > I think there is no clear definition for a public interface, that's why
>>> I
>>> > just decided on a class-by-class basis.
>>> >
>>> > Classes I left out / I was uncertain with:
>>> >
>>> >    - org.apache.flink.api.common.distributions
>>> >    - only some Input/output classes in org.apache.flink.api.common.io
>>> >    - org.apache.flink.api.common.operators
>>> >    - only the TypeInformation in org.apache.flink.api.common.typeinfo
>>> (not
>>> >    the Atomic, basic, integer, .. type infos)
>>> >    - most in org.apache.flink.core.memory (except Input/output view)
>>> >    - I didn’t add the parsers in org.apache.flink.types.parser
>>> >
>>> >
>>> >
>>> >
>>> > On Mon, Nov 30, 2015 at 10:19 AM, Maximilian Michels <[hidden email]>
>>> wrote:
>>> >
>>> >> Thank for your getting us started on annotating the API. The list
>>> >> looks good so far. I have the feeling it could even be extended a bit.
>>> >> Just curious, how did you choose which classes you annotate? Did you
>>> >> go through all the classes in flink-core, flink-java, and
>>> >> flink-clients Maven projects?
>>> >>
>>> >> What about flink-scala? Shouldn't it be annotated as well?
>>> >>
>>> >> On Fri, Nov 27, 2015 at 4:47 PM, Robert Metzger <[hidden email]>
>>> >> wrote:
>>> >> > Okay, I'll introduce an annotation for experimental interfaces and
>>> I'll
>>> >> > make everything we have deprecated experimental.
>>> >> >
>>> >> > On Fri, Nov 27, 2015 at 10:40 AM, Aljoscha Krettek <
>>> [hidden email]>
>>> >> > wrote:
>>> >> >
>>> >> >> I still think we also need an annotation to mark public interfaces
>>> as
>>> >> >> experimental. For example for the windowing/triggers I would like
>>> to use
>>> >> >> that.
>>> >> >> > On 25 Nov 2015, at 01:23, Robert Metzger <[hidden email]>
>>> wrote:
>>> >> >> >
>>> >> >> > Thank you Nick. I'll look into the check_compatiblilty.sh script
>>> to
>>> >> see
>>> >> >> > which tools are used.
>>> >> >> > I think finding breaking API changes immediately is a better
>>> process
>>> >> then
>>> >> >> > reworking the APIs before a release.
>>> >> >> >
>>> >> >> > As you can see from my email response times (2 days since your
>>> email),
>>> >> >> I'm
>>> >> >> > probably too overloaded right now to participate in the Yetus
>>> project
>>> >> ...
>>> >> >> > Sadly.
>>> >> >> >
>>> >> >> >
>>> >> >> > @others: I know its not the most interesting thing to go through
>>> my
>>> >> list
>>> >> >> of
>>> >> >> > stable interfaces, but keep in mind that we have to maintain the
>>> stuff
>>> >> >> for
>>> >> >> > probably quite some time, so it would be good to have more than
>>> one
>>> >> pair
>>> >> >> of
>>> >> >> > eyes looking at it :)
>>> >> >> >
>>> >> >> >
>>> >> >> > On Mon, Nov 23, 2015 at 6:20 PM, Nick Dimiduk <[hidden email]
>>> >
>>> >> >> wrote:
>>> >> >> >
>>> >> >> >>>
>>> >> >> >>> Do you know if Hadoop/HBase is also using a maven plugin to
>>> fail a
>>> >> >> build
>>> >> >> >> on
>>> >> >> >>> breaking API changes? I would really like to have such a
>>> >> functionality
>>> >> >> in
>>> >> >> >>> Flink, because we can spot breaking changes very early.
>>> >> >> >>
>>> >> >> >>
>>> >> >> >> I don't think we have maven integration for this as of yet. We
>>> >> release
>>> >> >> >> managers run a script $HBASE/dev-support/check_compatibility.sh
>>> that
>>> >> >> >> generates a source and binary compatibility report. Issues are
>>> then
>>> >> >> >> resolved during the period leading up to the release candidate.
>>> >> >> >>
>>> >> >> >> I think Hadoop is relying on a "QA bot" which reads patches from
>>> JIRA
>>> >> >> and
>>> >> >> >>> then does these
>>> >> >> >>> checks?
>>> >> >> >>>
>>> >> >> >>
>>> >> >> >> The "QA bot" is just a collection of shell scripts used during
>>> "Patch
>>> >> >> >> Available" status when a patch has been attached to JIRA or when
>>> a PR
>>> >> >> has
>>> >> >> >> been submitted through github. The check_compatibility script
>>> could
>>> >> be
>>> >> >> >> included in that automation, I don't see why not. Maybe you'd
>>> like to
>>> >> >> open
>>> >> >> >> a YETUS ticket :)
>>> >> >> >>
>>> >> >> >> I've pushed a branch to my own GitHub account with all classes I
>>> >> would
>>> >> >> make
>>> >> >> >>> public annotated:
>>> >> >> >>>
>>> >> >> >>>
>>> >> >> >>
>>> >> >>
>>> >>
>>> https://github.com/apache/flink/compare/master...rmetzger:interface_stability_revapi?expand=1
>>> >> >> >>> Since this is really hard to read, I (half-automated) generated
>>> the
>>> >> >> >>> following list of annotated classes:
>>> >> >> >>>
>>> >> >> >>>
>>> >> >> >>
>>> >> >>
>>> >>
>>> https://github.com/rmetzger/flink/blob/interface_stability_revapi/annotations.md
>>> >> >> >>>
>>> >> >> >>> Please let me know if you would like to include or exclude
>>> classes
>>> >> from
>>> >> >> >>> that list.
>>> >> >> >>> Also, let me know which methods (in stable classes) you would
>>> mark
>>> >> as
>>> >> >> >>> experimental.
>>> >> >> >>>
>>> >> >> >>>
>>> >> >> >>>
>>> >> >> >>>
>>> >> >> >>> On Wed, Nov 11, 2015 at 10:17 AM, Aljoscha Krettek <
>>> >> >> [hidden email]>
>>> >> >> >>> wrote:
>>> >> >> >>>
>>> >> >> >>>> +1 for some way of declaring public interfaces as experimental.
>>> >> >> >>>>
>>> >> >> >>>>> On 10 Nov 2015, at 22:24, Stephan Ewen <[hidden email]>
>>> wrote:
>>> >> >> >>>>>
>>> >> >> >>>>> I think we need anyways an annotation "@PublicExperimental".
>>> >> >> >>>>>
>>> >> >> >>>>> We can make this annotation such that it can be added to
>>> methods
>>> >> and
>>> >> >> >>> can
>>> >> >> >>>>> use that to declare
>>> >> >> >>>>> Methods in an otherwise public class (such as DataSet) as
>>> >> >> >> experimental.
>>> >> >> >>>>>
>>> >> >> >>>>> On Tue, Nov 10, 2015 at 10:19 PM, Fabian Hueske <
>>> >> [hidden email]>
>>> >> >> >>>> wrote:
>>> >> >> >>>>>
>>> >> >> >>>>>> I am not sure if we always should declare complete classes as
>>> >> >> >>>>>> @PublicInterface.
>>> >> >> >>>>>> This does definitely make sense for interfaces and abstract
>>> >> classes
>>> >> >> >>>> such as
>>> >> >> >>>>>> MapFunction or InputFormat but not necessarily for classes
>>> such
>>> >> as
>>> >> >> >>>> DataSet
>>> >> >> >>>>>> that we might want to extend by methods which should not
>>> >> immediately
>>> >> >> >>> be
>>> >> >> >>>>>> considered as stable.
>>> >> >> >>>>>>
>>> >> >> >>>>>>
>>> >> >> >>>>>> 2015-11-10 21:36 GMT+01:00 Vasiliki Kalavri <
>>> >> >> >>> [hidden email]
>>> >> >> >>>>> :
>>> >> >> >>>>>>
>>> >> >> >>>>>>> Yes, my opinion is that we shouldn't declare the Gelly API
>>> >> frozen
>>> >> >> >>> yet.
>>> >> >> >>>>>>> We can reconsider when we're closer to the 1.0 release, but
>>> if
>>> >> >> >>>> possible,
>>> >> >> >>>>>> I
>>> >> >> >>>>>>> would give it some more time.
>>> >> >> >>>>>>>
>>> >> >> >>>>>>> -V.
>>> >> >> >>>>>>>
>>> >> >> >>>>>>> On 10 November 2015 at 21:06, Stephan Ewen <
>>> [hidden email]>
>>> >> >> >> wrote:
>>> >> >> >>>>>>>
>>> >> >> >>>>>>>> I think no component should be forced to be stable. It
>>> should
>>> >> be
>>> >> >> >> an
>>> >> >> >>>>>>>> individual decision for each component, and in some cases
>>> even
>>> >> for
>>> >> >> >>>>>>>> individual classes.
>>> >> >> >>>>>>>>
>>> >> >> >>>>>>>> @Vasia If you think Gelly should not be declared
>>> >> interface-frozen,
>>> >> >> >>>> then
>>> >> >> >>>>>>>> this is a good point to raise and this should definitely be
>>> >> >> >>> reflected.
>>> >> >> >>>>>>>> There is no point in declaring certain APIs as frozen when
>>> we
>>> >> are
>>> >> >> >>> not
>>> >> >> >>>>>> yet
>>> >> >> >>>>>>>> confident they have converged.
>>> >> >> >>>>>>>>
>>> >> >> >>>>>>>>
>>> >> >> >>>>>>>>
>>> >> >> >>>>>>>> On Tue, Nov 10, 2015 at 8:39 PM, Vasiliki Kalavri <
>>> >> >> >>>>>>>> [hidden email]
>>> >> >> >>>>>>>>> wrote:
>>> >> >> >>>>>>>>
>>> >> >> >>>>>>>>> Hi Robert,
>>> >> >> >>>>>>>>>
>>> >> >> >>>>>>>>> thanks for bringing this up!
>>> >> >> >>>>>>>>>
>>> >> >> >>>>>>>>> I generally like the idea, but I wouldn't rush to
>>> annotate the
>>> >> >> >>> Gelly
>>> >> >> >>>>>>>>> classes yet. Gelly hasn't had that many users and I'm
>>> quite
>>> >> sure
>>> >> >> >>>>>> we'll
>>> >> >> >>>>>>>> find
>>> >> >> >>>>>>>>> things to improve as it gets more exposure.
>>> >> >> >>>>>>>>> TBH, I think it's quite unfair to force Gelly (also e.g.
>>> ML,
>>> >> >> >> Table)
>>> >> >> >>>>>> to
>>> >> >> >>>>>>> a
>>> >> >> >>>>>>>>> "1.0" status (from an API stability point of view) since
>>> >> they're
>>> >> >> >>>>>> really
>>> >> >> >>>>>>>>> young compared to the other Flink APIs.
>>> >> >> >>>>>>>>>
>>> >> >> >>>>>>>>> Cheers,
>>> >> >> >>>>>>>>> Vasia.
>>> >> >> >>>>>>>>> On Nov 10, 2015 8:04 PM, "Robert Metzger" <
>>> >> [hidden email]>
>>> >> >> >>>>>> wrote:
>>> >> >> >>>>>>>>>
>>> >> >> >>>>>>>>>> Hi everyone,
>>> >> >> >>>>>>>>>>
>>> >> >> >>>>>>>>>> I would like to bring this discussion back to your
>>> attention
>>> >> as
>>> >> >> >> we
>>> >> >> >>>>>>> seem
>>> >> >> >>>>>>>>> to
>>> >> >> >>>>>>>>>> approach the 1.0 release of Flink.
>>> >> >> >>>>>>>>>>
>>> >> >> >>>>>>>>>> My suggestion back in January was to annotate all
>>> classes,
>>> >> but I
>>> >> >> >>>>>>> think
>>> >> >> >>>>>>>>>> it'll be more feasible to just annotate public classes.
>>> >> >> >>>>>>>>>> So how about adding an annotation @PublicInterface
>>> >> >> >>>>>>>>>>
>>> >> >> >>>>>>>>>> For @PublicInterface, I would annotate classes such as:
>>> >> DataSet,
>>> >> >> >>>>>>>>>> DataStream, ExecutionEnvironment, InputFormat,
>>> MapFunction,
>>> >> >> >>>>>>> FileSystems
>>> >> >> >>>>>>>>> but
>>> >> >> >>>>>>>>>> also Gelly for example.
>>> >> >> >>>>>>>>>>
>>> >> >> >>>>>>>>>> I would not annotate as public components such as ML,
>>> Storm
>>> >> >> >>>>>>>>> compatibility,
>>> >> >> >>>>>>>>>> internals from runtime, yarn, optimizer.
>>> >> >> >>>>>>>>>>
>>> >> >> >>>>>>>>>>
>>> >> >> >>>>>>>>>> From a tooling perspective, I've looked into different
>>> maven
>>> >> >> >>>>>> plugins
>>> >> >> >>>>>>>> and
>>> >> >> >>>>>>>>>> java libraries and I found
>>> https://github.com/siom79/japicmp
>>> >> to
>>> >> >> >>> be
>>> >> >> >>>>>>> the
>>> >> >> >>>>>>>>>> closest to our needs. I actually opened a pull request
>>> to the
>>> >> >> >>>>>> project
>>> >> >> >>>>>>>> to
>>> >> >> >>>>>>>>>> allow inclusion/exclusion of classes based on
>>> annotations.
>>> >> Lets
>>> >> >> >>>>>> hope
>>> >> >> >>>>>>> it
>>> >> >> >>>>>>>>>> gets merged.
>>> >> >> >>>>>>>>>>
>>> >> >> >>>>>>>>>> Does everybody agree with adding just the
>>> @PublicInterface
>>> >> >> >>>>>>> annotation?
>>> >> >> >>>>>>>>>>
>>> >> >> >>>>>>>>>> Note that I'll add the annotation on a class-level,
>>> making
>>> >> the
>>> >> >> >>>>>> entire
>>> >> >> >>>>>>>>> class
>>> >> >> >>>>>>>>>> either public or private (from a stability point of
>>> view).
>>> >> If we
>>> >> >> >>>>>>> need a
>>> >> >> >>>>>>>>>> more fine-grained annotation, we have to add a second
>>> >> >> >>>>>>> @PrivateInterface
>>> >> >> >>>>>>>>>> annotation which we'll only apply to certain methods.
>>> >> >> >>>>>>>>>>
>>> >> >> >>>>>>>>>> The next step is that I'm going to open a pull request
>>> with
>>> >> all
>>> >> >> >>>>>>> classes
>>> >> >> >>>>>>>>>> annotated that I consider public.
>>> >> >> >>>>>>>>>>
>>> >> >> >>>>>>>>>>
>>> >> >> >>>>>>>>>> On Fri, Jan 30, 2015 at 7:10 PM, Henry Saputra <
>>> >> >> >>>>>>>> [hidden email]>
>>> >> >> >>>>>>>>>> wrote:
>>> >> >> >>>>>>>>>>
>>> >> >> >>>>>>>>>>> I like the idea. But would love to have different name
>>> for
>>> >> the
>>> >> >> >>>>>>>>>>> "LimitedPrivate" to make it easier to distinguish.
>>> >> >> >>>>>>>>>>> How about "Module" or "Package" ?
>>> >> >> >>>>>>>>>>>
>>> >> >> >>>>>>>>>>> - Henry
>>> >> >> >>>>>>>>>>>
>>> >> >> >>>>>>>>>>> On Wed, Jan 28, 2015 at 1:29 AM, Robert Metzger <
>>> >> >> >>>>>>> [hidden email]
>>> >> >> >>>>>>>>>
>>> >> >> >>>>>>>>>>> wrote:
>>> >> >> >>>>>>>>>>>> I think in Hadoop they use LimitedPrivate for the
>>> different
>>> >> >> >>>>>>>>> components
>>> >> >> >>>>>>>>>> of
>>> >> >> >>>>>>>>>>>> the project.
>>> >> >> >>>>>>>>>>>> For example LimitedPrivate("yarn").
>>> >> >> >>>>>>>>>>>> Here is a very good documentation on the topic:
>>> >> >> >>>>>>>>>>>>
>>> >> >> >>>>>>>>>>>
>>> >> >> >>>>>>>>>>
>>> >> >> >>>>>>>>>
>>> >> >> >>>>>>>>
>>> >> >> >>>>>>>
>>> >> >> >>>>>>
>>> >> >> >>>>
>>> >> >> >>>
>>> >> >> >>
>>> >> >>
>>> >>
>>> https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/InterfaceClassification.html
>>> >> >> >>>>>>>>>>>>
>>> >> >> >>>>>>>>>>>> On Tue, Jan 27, 2015 at 3:58 PM, Alexander Alexandrov <
>>> >> >> >>>>>>>>>>>> [hidden email]> wrote:
>>> >> >> >>>>>>>>>>>>
>>> >> >> >>>>>>>>>>>>> I don't get the difference between Private and
>>> >> >> >> LimitedPrivate,
>>> >> >> >>>>>>> but
>>> >> >> >>>>>>>>>>>>> otherwise seems like quite a nice idea.
>>> >> >> >>>>>>>>>>>>>
>>> >> >> >>>>>>>>>>>>> It will be also good if we can agree upon what these
>>> tags
>>> >> >> >>>>>>> actually
>>> >> >> >>>>>>>>>> mean
>>> >> >> >>>>>>>>>>> and
>>> >> >> >>>>>>>>>>>>> add this meaning to the documentation.
>>> >> >> >>>>>>>>>>>>>
>>> >> >> >>>>>>>>>>>>> 2015-01-27 15:46 GMT+01:00 Robert Metzger <
>>> >> >> >>>>>> [hidden email]
>>> >> >> >>>>>>>> :
>>> >> >> >>>>>>>>>>>>>
>>> >> >> >>>>>>>>>>>>>> Hi,
>>> >> >> >>>>>>>>>>>>>>
>>> >> >> >>>>>>>>>>>>>> Hadoop has annotations for tagging the stability and
>>> >> >> >>>>>> audience
>>> >> >> >>>>>>> of
>>> >> >> >>>>>>>>>>> classes
>>> >> >> >>>>>>>>>>>>>> and methods.
>>> >> >> >>>>>>>>>>>>>>
>>> >> >> >>>>>>>>>>>>>> Through that, you can have @InterfaceAudience.Public,
>>> >> >> >>>>>> Private,
>>> >> >> >>>>>>>>>>>>>> LimitedPrivate
>>> >> >> >>>>>>>>>>>>>> and also @InterfaceStability.Evolving, Unstable, and
>>> >> Stable.
>>> >> >> >>>>>>>>>>>>>>
>>> >> >> >>>>>>>>>>>>>> I guess there are tools which allow to automatically
>>> >> check
>>> >> >> >>>>>> if
>>> >> >> >>>>>>>>>>> interfaces,
>>> >> >> >>>>>>>>>>>>>> which are marked as Stable have been changed between
>>> >> >> >>>>>> releases
>>> >> >> >>>>>>>> (or
>>> >> >> >>>>>>>>> by
>>> >> >> >>>>>>>>>>> pull
>>> >> >> >>>>>>>>>>>>>> requests).
>>> >> >> >>>>>>>>>>>>>>
>>> >> >> >>>>>>>>>>>>>> I think such annotations are crucial if we want to
>>> >> guarantee
>>> >> >> >>>>>>>>>> interface
>>> >> >> >>>>>>>>>>>>>> stability between releases.
>>> >> >> >>>>>>>>>>>>>>
>>> >> >> >>>>>>>>>>>>>> What do you think? Should we add those annotations?
>>> Which
>>> >> >> >>>>>> one
>>> >> >> >>>>>>>>> would
>>> >> >> >>>>>>>>>>> you
>>> >> >> >>>>>>>>>>>>>> like to add?
>>> >> >> >>>>>>>>>>>>>>
>>> >> >> >>>>>>>>>>>>>>
>>> >> >> >>>>>>>>>>>>>> Robert
>>> >> >> >>>>>>>>>>>>>>
>>> >> >> >>>>>>>>>>>>>
>>> >> >> >>>>>>>>>>>
>>> >> >> >>>>>>>>>>
>>> >> >> >>>>>>>>>
>>> >> >> >>>>>>>>
>>> >> >> >>>>>>>
>>> >> >> >>>>>>
>>> >> >> >>>>
>>> >> >> >>>>
>>> >> >> >>>
>>> >> >> >>
>>> >> >>
>>> >> >>
>>> >>
>>>
>>
>>
Reply | Threaded
Open this post in threaded view
|

Re: Tagging Flink classes with InterfaceAudience and InterfaceStability

Robert Metzger
Fabian took another look into the interface stability PR for flink-core.

There are the following issues which we need to decide upon:

1) The "ExecutionMode" class name is misleading because its for batch-jobs
only.
I suggest to make the  ExecutionMode class an "@Experimental" interface
(and the ExecutionConfig.setExecutionMode/getExecutionMode methods).
This way, we can still rename the ExecutionMode if necessary.


2) Remove the "JobExecutionResult.getIntCounterResult()" method.
I suggest to make the method @Experimental and @Deprecated


3) Does it make sense to extend the Partitioner.partition(K key, int
numPartition) method by a parameter for the current partitionId?
I remember as well that there was a user asking about this. How much effort
would it be to add the parameter?

4) We need to think about whether we change the RichGroupReduceFunction to
implement the regular combine function (combining into one value) rather
than the group combine function. While API breaking, it allows for more
efficient pre-aggregations.


Fabian suggested to introduce an additional annotation @Internal for
marking internal developer APIs within @Public classes. Without the
additional annotation, we would need to mark stable internal APIs as
@Experimental even though they are stable within Flink.

On Tue, Dec 15, 2015 at 3:46 PM, Maximilian Michels <[hidden email]> wrote:

> I've also had a look. +1 to merge from my side.
>
> On Tue, Dec 15, 2015 at 3:27 PM, Robert Metzger <[hidden email]>
> wrote:
> > The PR has been open for a while, Stephan, Aljoscha and Henry looked over
> > it and I've addressed all comments by them:
> > https://github.com/apache/flink/pull/1427
> >
> > I would like to merge the pull request adding the "flink-annotations"
> > module and annotating all classes in "flink-core" soon (say, next 24-48
> > hours).
> > We can still make changes once its merged, but I would like to get this
> one
> > done and start discussing the remaining modules and merge them as well
> > afterwards.
> >
> >
> >
> > On Tue, Dec 1, 2015 at 2:20 PM, Robert Metzger <[hidden email]>
> wrote:
> >
> >> I left out the classes in memory except for the Input/Output views.
> >>
> >> Or course, we should try to minimize the stable classes, but user
> programs
> >> get a lot of extension points in Flink ;)
> >>
> >> I've opened a pull request with my current suggestion:
> >> https://github.com/apache/flink/pull/1426
> >>
> >>
> >> On Tue, Dec 1, 2015 at 2:13 PM, Maximilian Michels <[hidden email]>
> wrote:
> >>
> >>> Thanks for the explanation. I was just wondering how you approached
> >>> the problem. Going through class-by-class makes sense.
> >>>
> >>> Not sure whether we want to make "org.apache.flink.core.memory" and
> >>> "org.apache.flink.api.common.distributions" stable interfaces. I would
> >>> think these qualify more as experimental public.
> >>>
> >>> Shouldn't we try to minimize the number of classes we declare stable?
> >>> Of course it is nice to offer a stable interface to developers but it
> >>> might also come in the way of the Flink developers..
> >>>
> >>>
> >>> On Mon, Nov 30, 2015 at 10:31 AM, Robert Metzger <[hidden email]>
> >>> wrote:
> >>> > Hi Max,
> >>> >
> >>> > classes in flink-scala are annotated as well, and its also in the
> list
> >>> :)
> >>> >
> >>> > I considered classes in flink-core, flink-runtime, flink-scala,
> >>> > flink-streaming-java, flink-streaming-scala,
> >>> > flink-connector-kafka, flink-connector-filesystem, flink-avro and
> >>> > flink-hadoop-compatibility.
> >>> > I think there is no clear definition for a public interface, that's
> why
> >>> I
> >>> > just decided on a class-by-class basis.
> >>> >
> >>> > Classes I left out / I was uncertain with:
> >>> >
> >>> >    - org.apache.flink.api.common.distributions
> >>> >    - only some Input/output classes in
> org.apache.flink.api.common.io
> >>> >    - org.apache.flink.api.common.operators
> >>> >    - only the TypeInformation in org.apache.flink.api.common.typeinfo
> >>> (not
> >>> >    the Atomic, basic, integer, .. type infos)
> >>> >    - most in org.apache.flink.core.memory (except Input/output view)
> >>> >    - I didn’t add the parsers in org.apache.flink.types.parser
> >>> >
> >>> >
> >>> >
> >>> >
> >>> > On Mon, Nov 30, 2015 at 10:19 AM, Maximilian Michels <[hidden email]
> >
> >>> wrote:
> >>> >
> >>> >> Thank for your getting us started on annotating the API. The list
> >>> >> looks good so far. I have the feeling it could even be extended a
> bit.
> >>> >> Just curious, how did you choose which classes you annotate? Did you
> >>> >> go through all the classes in flink-core, flink-java, and
> >>> >> flink-clients Maven projects?
> >>> >>
> >>> >> What about flink-scala? Shouldn't it be annotated as well?
> >>> >>
> >>> >> On Fri, Nov 27, 2015 at 4:47 PM, Robert Metzger <
> [hidden email]>
> >>> >> wrote:
> >>> >> > Okay, I'll introduce an annotation for experimental interfaces and
> >>> I'll
> >>> >> > make everything we have deprecated experimental.
> >>> >> >
> >>> >> > On Fri, Nov 27, 2015 at 10:40 AM, Aljoscha Krettek <
> >>> [hidden email]>
> >>> >> > wrote:
> >>> >> >
> >>> >> >> I still think we also need an annotation to mark public
> interfaces
> >>> as
> >>> >> >> experimental. For example for the windowing/triggers I would like
> >>> to use
> >>> >> >> that.
> >>> >> >> > On 25 Nov 2015, at 01:23, Robert Metzger <[hidden email]>
> >>> wrote:
> >>> >> >> >
> >>> >> >> > Thank you Nick. I'll look into the check_compatiblilty.sh
> script
> >>> to
> >>> >> see
> >>> >> >> > which tools are used.
> >>> >> >> > I think finding breaking API changes immediately is a better
> >>> process
> >>> >> then
> >>> >> >> > reworking the APIs before a release.
> >>> >> >> >
> >>> >> >> > As you can see from my email response times (2 days since your
> >>> email),
> >>> >> >> I'm
> >>> >> >> > probably too overloaded right now to participate in the Yetus
> >>> project
> >>> >> ...
> >>> >> >> > Sadly.
> >>> >> >> >
> >>> >> >> >
> >>> >> >> > @others: I know its not the most interesting thing to go
> through
> >>> my
> >>> >> list
> >>> >> >> of
> >>> >> >> > stable interfaces, but keep in mind that we have to maintain
> the
> >>> stuff
> >>> >> >> for
> >>> >> >> > probably quite some time, so it would be good to have more than
> >>> one
> >>> >> pair
> >>> >> >> of
> >>> >> >> > eyes looking at it :)
> >>> >> >> >
> >>> >> >> >
> >>> >> >> > On Mon, Nov 23, 2015 at 6:20 PM, Nick Dimiduk <
> [hidden email]
> >>> >
> >>> >> >> wrote:
> >>> >> >> >
> >>> >> >> >>>
> >>> >> >> >>> Do you know if Hadoop/HBase is also using a maven plugin to
> >>> fail a
> >>> >> >> build
> >>> >> >> >> on
> >>> >> >> >>> breaking API changes? I would really like to have such a
> >>> >> functionality
> >>> >> >> in
> >>> >> >> >>> Flink, because we can spot breaking changes very early.
> >>> >> >> >>
> >>> >> >> >>
> >>> >> >> >> I don't think we have maven integration for this as of yet. We
> >>> >> release
> >>> >> >> >> managers run a script
> $HBASE/dev-support/check_compatibility.sh
> >>> that
> >>> >> >> >> generates a source and binary compatibility report. Issues are
> >>> then
> >>> >> >> >> resolved during the period leading up to the release
> candidate.
> >>> >> >> >>
> >>> >> >> >> I think Hadoop is relying on a "QA bot" which reads patches
> from
> >>> JIRA
> >>> >> >> and
> >>> >> >> >>> then does these
> >>> >> >> >>> checks?
> >>> >> >> >>>
> >>> >> >> >>
> >>> >> >> >> The "QA bot" is just a collection of shell scripts used during
> >>> "Patch
> >>> >> >> >> Available" status when a patch has been attached to JIRA or
> when
> >>> a PR
> >>> >> >> has
> >>> >> >> >> been submitted through github. The check_compatibility script
> >>> could
> >>> >> be
> >>> >> >> >> included in that automation, I don't see why not. Maybe you'd
> >>> like to
> >>> >> >> open
> >>> >> >> >> a YETUS ticket :)
> >>> >> >> >>
> >>> >> >> >> I've pushed a branch to my own GitHub account with all
> classes I
> >>> >> would
> >>> >> >> make
> >>> >> >> >>> public annotated:
> >>> >> >> >>>
> >>> >> >> >>>
> >>> >> >> >>
> >>> >> >>
> >>> >>
> >>>
> https://github.com/apache/flink/compare/master...rmetzger:interface_stability_revapi?expand=1
> >>> >> >> >>> Since this is really hard to read, I (half-automated)
> generated
> >>> the
> >>> >> >> >>> following list of annotated classes:
> >>> >> >> >>>
> >>> >> >> >>>
> >>> >> >> >>
> >>> >> >>
> >>> >>
> >>>
> https://github.com/rmetzger/flink/blob/interface_stability_revapi/annotations.md
> >>> >> >> >>>
> >>> >> >> >>> Please let me know if you would like to include or exclude
> >>> classes
> >>> >> from
> >>> >> >> >>> that list.
> >>> >> >> >>> Also, let me know which methods (in stable classes) you would
> >>> mark
> >>> >> as
> >>> >> >> >>> experimental.
> >>> >> >> >>>
> >>> >> >> >>>
> >>> >> >> >>>
> >>> >> >> >>>
> >>> >> >> >>> On Wed, Nov 11, 2015 at 10:17 AM, Aljoscha Krettek <
> >>> >> >> [hidden email]>
> >>> >> >> >>> wrote:
> >>> >> >> >>>
> >>> >> >> >>>> +1 for some way of declaring public interfaces as
> experimental.
> >>> >> >> >>>>
> >>> >> >> >>>>> On 10 Nov 2015, at 22:24, Stephan Ewen <[hidden email]>
> >>> wrote:
> >>> >> >> >>>>>
> >>> >> >> >>>>> I think we need anyways an annotation
> "@PublicExperimental".
> >>> >> >> >>>>>
> >>> >> >> >>>>> We can make this annotation such that it can be added to
> >>> methods
> >>> >> and
> >>> >> >> >>> can
> >>> >> >> >>>>> use that to declare
> >>> >> >> >>>>> Methods in an otherwise public class (such as DataSet) as
> >>> >> >> >> experimental.
> >>> >> >> >>>>>
> >>> >> >> >>>>> On Tue, Nov 10, 2015 at 10:19 PM, Fabian Hueske <
> >>> >> [hidden email]>
> >>> >> >> >>>> wrote:
> >>> >> >> >>>>>
> >>> >> >> >>>>>> I am not sure if we always should declare complete
> classes as
> >>> >> >> >>>>>> @PublicInterface.
> >>> >> >> >>>>>> This does definitely make sense for interfaces and
> abstract
> >>> >> classes
> >>> >> >> >>>> such as
> >>> >> >> >>>>>> MapFunction or InputFormat but not necessarily for classes
> >>> such
> >>> >> as
> >>> >> >> >>>> DataSet
> >>> >> >> >>>>>> that we might want to extend by methods which should not
> >>> >> immediately
> >>> >> >> >>> be
> >>> >> >> >>>>>> considered as stable.
> >>> >> >> >>>>>>
> >>> >> >> >>>>>>
> >>> >> >> >>>>>> 2015-11-10 21:36 GMT+01:00 Vasiliki Kalavri <
> >>> >> >> >>> [hidden email]
> >>> >> >> >>>>> :
> >>> >> >> >>>>>>
> >>> >> >> >>>>>>> Yes, my opinion is that we shouldn't declare the Gelly
> API
> >>> >> frozen
> >>> >> >> >>> yet.
> >>> >> >> >>>>>>> We can reconsider when we're closer to the 1.0 release,
> but
> >>> if
> >>> >> >> >>>> possible,
> >>> >> >> >>>>>> I
> >>> >> >> >>>>>>> would give it some more time.
> >>> >> >> >>>>>>>
> >>> >> >> >>>>>>> -V.
> >>> >> >> >>>>>>>
> >>> >> >> >>>>>>> On 10 November 2015 at 21:06, Stephan Ewen <
> >>> [hidden email]>
> >>> >> >> >> wrote:
> >>> >> >> >>>>>>>
> >>> >> >> >>>>>>>> I think no component should be forced to be stable. It
> >>> should
> >>> >> be
> >>> >> >> >> an
> >>> >> >> >>>>>>>> individual decision for each component, and in some
> cases
> >>> even
> >>> >> for
> >>> >> >> >>>>>>>> individual classes.
> >>> >> >> >>>>>>>>
> >>> >> >> >>>>>>>> @Vasia If you think Gelly should not be declared
> >>> >> interface-frozen,
> >>> >> >> >>>> then
> >>> >> >> >>>>>>>> this is a good point to raise and this should
> definitely be
> >>> >> >> >>> reflected.
> >>> >> >> >>>>>>>> There is no point in declaring certain APIs as frozen
> when
> >>> we
> >>> >> are
> >>> >> >> >>> not
> >>> >> >> >>>>>> yet
> >>> >> >> >>>>>>>> confident they have converged.
> >>> >> >> >>>>>>>>
> >>> >> >> >>>>>>>>
> >>> >> >> >>>>>>>>
> >>> >> >> >>>>>>>> On Tue, Nov 10, 2015 at 8:39 PM, Vasiliki Kalavri <
> >>> >> >> >>>>>>>> [hidden email]
> >>> >> >> >>>>>>>>> wrote:
> >>> >> >> >>>>>>>>
> >>> >> >> >>>>>>>>> Hi Robert,
> >>> >> >> >>>>>>>>>
> >>> >> >> >>>>>>>>> thanks for bringing this up!
> >>> >> >> >>>>>>>>>
> >>> >> >> >>>>>>>>> I generally like the idea, but I wouldn't rush to
> >>> annotate the
> >>> >> >> >>> Gelly
> >>> >> >> >>>>>>>>> classes yet. Gelly hasn't had that many users and I'm
> >>> quite
> >>> >> sure
> >>> >> >> >>>>>> we'll
> >>> >> >> >>>>>>>> find
> >>> >> >> >>>>>>>>> things to improve as it gets more exposure.
> >>> >> >> >>>>>>>>> TBH, I think it's quite unfair to force Gelly (also
> e.g.
> >>> ML,
> >>> >> >> >> Table)
> >>> >> >> >>>>>> to
> >>> >> >> >>>>>>> a
> >>> >> >> >>>>>>>>> "1.0" status (from an API stability point of view)
> since
> >>> >> they're
> >>> >> >> >>>>>> really
> >>> >> >> >>>>>>>>> young compared to the other Flink APIs.
> >>> >> >> >>>>>>>>>
> >>> >> >> >>>>>>>>> Cheers,
> >>> >> >> >>>>>>>>> Vasia.
> >>> >> >> >>>>>>>>> On Nov 10, 2015 8:04 PM, "Robert Metzger" <
> >>> >> [hidden email]>
> >>> >> >> >>>>>> wrote:
> >>> >> >> >>>>>>>>>
> >>> >> >> >>>>>>>>>> Hi everyone,
> >>> >> >> >>>>>>>>>>
> >>> >> >> >>>>>>>>>> I would like to bring this discussion back to your
> >>> attention
> >>> >> as
> >>> >> >> >> we
> >>> >> >> >>>>>>> seem
> >>> >> >> >>>>>>>>> to
> >>> >> >> >>>>>>>>>> approach the 1.0 release of Flink.
> >>> >> >> >>>>>>>>>>
> >>> >> >> >>>>>>>>>> My suggestion back in January was to annotate all
> >>> classes,
> >>> >> but I
> >>> >> >> >>>>>>> think
> >>> >> >> >>>>>>>>>> it'll be more feasible to just annotate public
> classes.
> >>> >> >> >>>>>>>>>> So how about adding an annotation @PublicInterface
> >>> >> >> >>>>>>>>>>
> >>> >> >> >>>>>>>>>> For @PublicInterface, I would annotate classes such
> as:
> >>> >> DataSet,
> >>> >> >> >>>>>>>>>> DataStream, ExecutionEnvironment, InputFormat,
> >>> MapFunction,
> >>> >> >> >>>>>>> FileSystems
> >>> >> >> >>>>>>>>> but
> >>> >> >> >>>>>>>>>> also Gelly for example.
> >>> >> >> >>>>>>>>>>
> >>> >> >> >>>>>>>>>> I would not annotate as public components such as ML,
> >>> Storm
> >>> >> >> >>>>>>>>> compatibility,
> >>> >> >> >>>>>>>>>> internals from runtime, yarn, optimizer.
> >>> >> >> >>>>>>>>>>
> >>> >> >> >>>>>>>>>>
> >>> >> >> >>>>>>>>>> From a tooling perspective, I've looked into different
> >>> maven
> >>> >> >> >>>>>> plugins
> >>> >> >> >>>>>>>> and
> >>> >> >> >>>>>>>>>> java libraries and I found
> >>> https://github.com/siom79/japicmp
> >>> >> to
> >>> >> >> >>> be
> >>> >> >> >>>>>>> the
> >>> >> >> >>>>>>>>>> closest to our needs. I actually opened a pull request
> >>> to the
> >>> >> >> >>>>>> project
> >>> >> >> >>>>>>>> to
> >>> >> >> >>>>>>>>>> allow inclusion/exclusion of classes based on
> >>> annotations.
> >>> >> Lets
> >>> >> >> >>>>>> hope
> >>> >> >> >>>>>>> it
> >>> >> >> >>>>>>>>>> gets merged.
> >>> >> >> >>>>>>>>>>
> >>> >> >> >>>>>>>>>> Does everybody agree with adding just the
> >>> @PublicInterface
> >>> >> >> >>>>>>> annotation?
> >>> >> >> >>>>>>>>>>
> >>> >> >> >>>>>>>>>> Note that I'll add the annotation on a class-level,
> >>> making
> >>> >> the
> >>> >> >> >>>>>> entire
> >>> >> >> >>>>>>>>> class
> >>> >> >> >>>>>>>>>> either public or private (from a stability point of
> >>> view).
> >>> >> If we
> >>> >> >> >>>>>>> need a
> >>> >> >> >>>>>>>>>> more fine-grained annotation, we have to add a second
> >>> >> >> >>>>>>> @PrivateInterface
> >>> >> >> >>>>>>>>>> annotation which we'll only apply to certain methods.
> >>> >> >> >>>>>>>>>>
> >>> >> >> >>>>>>>>>> The next step is that I'm going to open a pull request
> >>> with
> >>> >> all
> >>> >> >> >>>>>>> classes
> >>> >> >> >>>>>>>>>> annotated that I consider public.
> >>> >> >> >>>>>>>>>>
> >>> >> >> >>>>>>>>>>
> >>> >> >> >>>>>>>>>> On Fri, Jan 30, 2015 at 7:10 PM, Henry Saputra <
> >>> >> >> >>>>>>>> [hidden email]>
> >>> >> >> >>>>>>>>>> wrote:
> >>> >> >> >>>>>>>>>>
> >>> >> >> >>>>>>>>>>> I like the idea. But would love to have different
> name
> >>> for
> >>> >> the
> >>> >> >> >>>>>>>>>>> "LimitedPrivate" to make it easier to distinguish.
> >>> >> >> >>>>>>>>>>> How about "Module" or "Package" ?
> >>> >> >> >>>>>>>>>>>
> >>> >> >> >>>>>>>>>>> - Henry
> >>> >> >> >>>>>>>>>>>
> >>> >> >> >>>>>>>>>>> On Wed, Jan 28, 2015 at 1:29 AM, Robert Metzger <
> >>> >> >> >>>>>>> [hidden email]
> >>> >> >> >>>>>>>>>
> >>> >> >> >>>>>>>>>>> wrote:
> >>> >> >> >>>>>>>>>>>> I think in Hadoop they use LimitedPrivate for the
> >>> different
> >>> >> >> >>>>>>>>> components
> >>> >> >> >>>>>>>>>> of
> >>> >> >> >>>>>>>>>>>> the project.
> >>> >> >> >>>>>>>>>>>> For example LimitedPrivate("yarn").
> >>> >> >> >>>>>>>>>>>> Here is a very good documentation on the topic:
> >>> >> >> >>>>>>>>>>>>
> >>> >> >> >>>>>>>>>>>
> >>> >> >> >>>>>>>>>>
> >>> >> >> >>>>>>>>>
> >>> >> >> >>>>>>>>
> >>> >> >> >>>>>>>
> >>> >> >> >>>>>>
> >>> >> >> >>>>
> >>> >> >> >>>
> >>> >> >> >>
> >>> >> >>
> >>> >>
> >>>
> https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/InterfaceClassification.html
> >>> >> >> >>>>>>>>>>>>
> >>> >> >> >>>>>>>>>>>> On Tue, Jan 27, 2015 at 3:58 PM, Alexander
> Alexandrov <
> >>> >> >> >>>>>>>>>>>> [hidden email]> wrote:
> >>> >> >> >>>>>>>>>>>>
> >>> >> >> >>>>>>>>>>>>> I don't get the difference between Private and
> >>> >> >> >> LimitedPrivate,
> >>> >> >> >>>>>>> but
> >>> >> >> >>>>>>>>>>>>> otherwise seems like quite a nice idea.
> >>> >> >> >>>>>>>>>>>>>
> >>> >> >> >>>>>>>>>>>>> It will be also good if we can agree upon what
> these
> >>> tags
> >>> >> >> >>>>>>> actually
> >>> >> >> >>>>>>>>>> mean
> >>> >> >> >>>>>>>>>>> and
> >>> >> >> >>>>>>>>>>>>> add this meaning to the documentation.
> >>> >> >> >>>>>>>>>>>>>
> >>> >> >> >>>>>>>>>>>>> 2015-01-27 15:46 GMT+01:00 Robert Metzger <
> >>> >> >> >>>>>> [hidden email]
> >>> >> >> >>>>>>>> :
> >>> >> >> >>>>>>>>>>>>>
> >>> >> >> >>>>>>>>>>>>>> Hi,
> >>> >> >> >>>>>>>>>>>>>>
> >>> >> >> >>>>>>>>>>>>>> Hadoop has annotations for tagging the stability
> and
> >>> >> >> >>>>>> audience
> >>> >> >> >>>>>>> of
> >>> >> >> >>>>>>>>>>> classes
> >>> >> >> >>>>>>>>>>>>>> and methods.
> >>> >> >> >>>>>>>>>>>>>>
> >>> >> >> >>>>>>>>>>>>>> Through that, you can have
> @InterfaceAudience.Public,
> >>> >> >> >>>>>> Private,
> >>> >> >> >>>>>>>>>>>>>> LimitedPrivate
> >>> >> >> >>>>>>>>>>>>>> and also @InterfaceStability.Evolving, Unstable,
> and
> >>> >> Stable.
> >>> >> >> >>>>>>>>>>>>>>
> >>> >> >> >>>>>>>>>>>>>> I guess there are tools which allow to
> automatically
> >>> >> check
> >>> >> >> >>>>>> if
> >>> >> >> >>>>>>>>>>> interfaces,
> >>> >> >> >>>>>>>>>>>>>> which are marked as Stable have been changed
> between
> >>> >> >> >>>>>> releases
> >>> >> >> >>>>>>>> (or
> >>> >> >> >>>>>>>>> by
> >>> >> >> >>>>>>>>>>> pull
> >>> >> >> >>>>>>>>>>>>>> requests).
> >>> >> >> >>>>>>>>>>>>>>
> >>> >> >> >>>>>>>>>>>>>> I think such annotations are crucial if we want to
> >>> >> guarantee
> >>> >> >> >>>>>>>>>> interface
> >>> >> >> >>>>>>>>>>>>>> stability between releases.
> >>> >> >> >>>>>>>>>>>>>>
> >>> >> >> >>>>>>>>>>>>>> What do you think? Should we add those
> annotations?
> >>> Which
> >>> >> >> >>>>>> one
> >>> >> >> >>>>>>>>> would
> >>> >> >> >>>>>>>>>>> you
> >>> >> >> >>>>>>>>>>>>>> like to add?
> >>> >> >> >>>>>>>>>>>>>>
> >>> >> >> >>>>>>>>>>>>>>
> >>> >> >> >>>>>>>>>>>>>> Robert
> >>> >> >> >>>>>>>>>>>>>>
> >>> >> >> >>>>>>>>>>>>>
> >>> >> >> >>>>>>>>>>>
> >>> >> >> >>>>>>>>>>
> >>> >> >> >>>>>>>>>
> >>> >> >> >>>>>>>>
> >>> >> >> >>>>>>>
> >>> >> >> >>>>>>
> >>> >> >> >>>>
> >>> >> >> >>>>
> >>> >> >> >>>
> >>> >> >> >>
> >>> >> >>
> >>> >> >>
> >>> >>
> >>>
> >>
> >>
>
Reply | Threaded
Open this post in threaded view
|

Re: Tagging Flink classes with InterfaceAudience and InterfaceStability

Robert Metzger
Since nobody disagreed with my proposal to merge the PR for "flink-core",
I'll add it to master now.

We can still make changes to the annotations when its merged ;)


On Wed, Jan 6, 2016 at 4:00 PM, Robert Metzger <[hidden email]> wrote:

> Fabian took another look into the interface stability PR for flink-core.
>
> There are the following issues which we need to decide upon:
>
> 1) The "ExecutionMode" class name is misleading because its for batch-jobs
> only.
> I suggest to make the  ExecutionMode class an "@Experimental" interface
> (and the ExecutionConfig.setExecutionMode/getExecutionMode methods).
> This way, we can still rename the ExecutionMode if necessary.
>
>
> 2) Remove the "JobExecutionResult.getIntCounterResult()" method.
> I suggest to make the method @Experimental and @Deprecated
>
>
> 3) Does it make sense to extend the Partitioner.partition(K key, int
> numPartition) method by a parameter for the current partitionId?
> I remember as well that there was a user asking about this. How much
> effort would it be to add the parameter?
>
> 4) We need to think about whether we change the RichGroupReduceFunction to
> implement the regular combine function (combining into one value) rather
> than the group combine function. While API breaking, it allows for more
> efficient pre-aggregations.
>
>
> Fabian suggested to introduce an additional annotation @Internal for
> marking internal developer APIs within @Public classes. Without the
> additional annotation, we would need to mark stable internal APIs as
> @Experimental even though they are stable within Flink.
>
> On Tue, Dec 15, 2015 at 3:46 PM, Maximilian Michels <[hidden email]>
> wrote:
>
>> I've also had a look. +1 to merge from my side.
>>
>> On Tue, Dec 15, 2015 at 3:27 PM, Robert Metzger <[hidden email]>
>> wrote:
>> > The PR has been open for a while, Stephan, Aljoscha and Henry looked
>> over
>> > it and I've addressed all comments by them:
>> > https://github.com/apache/flink/pull/1427
>> >
>> > I would like to merge the pull request adding the "flink-annotations"
>> > module and annotating all classes in "flink-core" soon (say, next 24-48
>> > hours).
>> > We can still make changes once its merged, but I would like to get this
>> one
>> > done and start discussing the remaining modules and merge them as well
>> > afterwards.
>> >
>> >
>> >
>> > On Tue, Dec 1, 2015 at 2:20 PM, Robert Metzger <[hidden email]>
>> wrote:
>> >
>> >> I left out the classes in memory except for the Input/Output views.
>> >>
>> >> Or course, we should try to minimize the stable classes, but user
>> programs
>> >> get a lot of extension points in Flink ;)
>> >>
>> >> I've opened a pull request with my current suggestion:
>> >> https://github.com/apache/flink/pull/1426
>> >>
>> >>
>> >> On Tue, Dec 1, 2015 at 2:13 PM, Maximilian Michels <[hidden email]>
>> wrote:
>> >>
>> >>> Thanks for the explanation. I was just wondering how you approached
>> >>> the problem. Going through class-by-class makes sense.
>> >>>
>> >>> Not sure whether we want to make "org.apache.flink.core.memory" and
>> >>> "org.apache.flink.api.common.distributions" stable interfaces. I would
>> >>> think these qualify more as experimental public.
>> >>>
>> >>> Shouldn't we try to minimize the number of classes we declare stable?
>> >>> Of course it is nice to offer a stable interface to developers but it
>> >>> might also come in the way of the Flink developers..
>> >>>
>> >>>
>> >>> On Mon, Nov 30, 2015 at 10:31 AM, Robert Metzger <[hidden email]
>> >
>> >>> wrote:
>> >>> > Hi Max,
>> >>> >
>> >>> > classes in flink-scala are annotated as well, and its also in the
>> list
>> >>> :)
>> >>> >
>> >>> > I considered classes in flink-core, flink-runtime, flink-scala,
>> >>> > flink-streaming-java, flink-streaming-scala,
>> >>> > flink-connector-kafka, flink-connector-filesystem, flink-avro and
>> >>> > flink-hadoop-compatibility.
>> >>> > I think there is no clear definition for a public interface, that's
>> why
>> >>> I
>> >>> > just decided on a class-by-class basis.
>> >>> >
>> >>> > Classes I left out / I was uncertain with:
>> >>> >
>> >>> >    - org.apache.flink.api.common.distributions
>> >>> >    - only some Input/output classes in
>> org.apache.flink.api.common.io
>> >>> >    - org.apache.flink.api.common.operators
>> >>> >    - only the TypeInformation in
>> org.apache.flink.api.common.typeinfo
>> >>> (not
>> >>> >    the Atomic, basic, integer, .. type infos)
>> >>> >    - most in org.apache.flink.core.memory (except Input/output view)
>> >>> >    - I didn’t add the parsers in org.apache.flink.types.parser
>> >>> >
>> >>> >
>> >>> >
>> >>> >
>> >>> > On Mon, Nov 30, 2015 at 10:19 AM, Maximilian Michels <
>> [hidden email]>
>> >>> wrote:
>> >>> >
>> >>> >> Thank for your getting us started on annotating the API. The list
>> >>> >> looks good so far. I have the feeling it could even be extended a
>> bit.
>> >>> >> Just curious, how did you choose which classes you annotate? Did
>> you
>> >>> >> go through all the classes in flink-core, flink-java, and
>> >>> >> flink-clients Maven projects?
>> >>> >>
>> >>> >> What about flink-scala? Shouldn't it be annotated as well?
>> >>> >>
>> >>> >> On Fri, Nov 27, 2015 at 4:47 PM, Robert Metzger <
>> [hidden email]>
>> >>> >> wrote:
>> >>> >> > Okay, I'll introduce an annotation for experimental interfaces
>> and
>> >>> I'll
>> >>> >> > make everything we have deprecated experimental.
>> >>> >> >
>> >>> >> > On Fri, Nov 27, 2015 at 10:40 AM, Aljoscha Krettek <
>> >>> [hidden email]>
>> >>> >> > wrote:
>> >>> >> >
>> >>> >> >> I still think we also need an annotation to mark public
>> interfaces
>> >>> as
>> >>> >> >> experimental. For example for the windowing/triggers I would
>> like
>> >>> to use
>> >>> >> >> that.
>> >>> >> >> > On 25 Nov 2015, at 01:23, Robert Metzger <[hidden email]
>> >
>> >>> wrote:
>> >>> >> >> >
>> >>> >> >> > Thank you Nick. I'll look into the check_compatiblilty.sh
>> script
>> >>> to
>> >>> >> see
>> >>> >> >> > which tools are used.
>> >>> >> >> > I think finding breaking API changes immediately is a better
>> >>> process
>> >>> >> then
>> >>> >> >> > reworking the APIs before a release.
>> >>> >> >> >
>> >>> >> >> > As you can see from my email response times (2 days since your
>> >>> email),
>> >>> >> >> I'm
>> >>> >> >> > probably too overloaded right now to participate in the Yetus
>> >>> project
>> >>> >> ...
>> >>> >> >> > Sadly.
>> >>> >> >> >
>> >>> >> >> >
>> >>> >> >> > @others: I know its not the most interesting thing to go
>> through
>> >>> my
>> >>> >> list
>> >>> >> >> of
>> >>> >> >> > stable interfaces, but keep in mind that we have to maintain
>> the
>> >>> stuff
>> >>> >> >> for
>> >>> >> >> > probably quite some time, so it would be good to have more
>> than
>> >>> one
>> >>> >> pair
>> >>> >> >> of
>> >>> >> >> > eyes looking at it :)
>> >>> >> >> >
>> >>> >> >> >
>> >>> >> >> > On Mon, Nov 23, 2015 at 6:20 PM, Nick Dimiduk <
>> [hidden email]
>> >>> >
>> >>> >> >> wrote:
>> >>> >> >> >
>> >>> >> >> >>>
>> >>> >> >> >>> Do you know if Hadoop/HBase is also using a maven plugin to
>> >>> fail a
>> >>> >> >> build
>> >>> >> >> >> on
>> >>> >> >> >>> breaking API changes? I would really like to have such a
>> >>> >> functionality
>> >>> >> >> in
>> >>> >> >> >>> Flink, because we can spot breaking changes very early.
>> >>> >> >> >>
>> >>> >> >> >>
>> >>> >> >> >> I don't think we have maven integration for this as of yet.
>> We
>> >>> >> release
>> >>> >> >> >> managers run a script
>> $HBASE/dev-support/check_compatibility.sh
>> >>> that
>> >>> >> >> >> generates a source and binary compatibility report. Issues
>> are
>> >>> then
>> >>> >> >> >> resolved during the period leading up to the release
>> candidate.
>> >>> >> >> >>
>> >>> >> >> >> I think Hadoop is relying on a "QA bot" which reads patches
>> from
>> >>> JIRA
>> >>> >> >> and
>> >>> >> >> >>> then does these
>> >>> >> >> >>> checks?
>> >>> >> >> >>>
>> >>> >> >> >>
>> >>> >> >> >> The "QA bot" is just a collection of shell scripts used
>> during
>> >>> "Patch
>> >>> >> >> >> Available" status when a patch has been attached to JIRA or
>> when
>> >>> a PR
>> >>> >> >> has
>> >>> >> >> >> been submitted through github. The check_compatibility script
>> >>> could
>> >>> >> be
>> >>> >> >> >> included in that automation, I don't see why not. Maybe you'd
>> >>> like to
>> >>> >> >> open
>> >>> >> >> >> a YETUS ticket :)
>> >>> >> >> >>
>> >>> >> >> >> I've pushed a branch to my own GitHub account with all
>> classes I
>> >>> >> would
>> >>> >> >> make
>> >>> >> >> >>> public annotated:
>> >>> >> >> >>>
>> >>> >> >> >>>
>> >>> >> >> >>
>> >>> >> >>
>> >>> >>
>> >>>
>> https://github.com/apache/flink/compare/master...rmetzger:interface_stability_revapi?expand=1
>> >>> >> >> >>> Since this is really hard to read, I (half-automated)
>> generated
>> >>> the
>> >>> >> >> >>> following list of annotated classes:
>> >>> >> >> >>>
>> >>> >> >> >>>
>> >>> >> >> >>
>> >>> >> >>
>> >>> >>
>> >>>
>> https://github.com/rmetzger/flink/blob/interface_stability_revapi/annotations.md
>> >>> >> >> >>>
>> >>> >> >> >>> Please let me know if you would like to include or exclude
>> >>> classes
>> >>> >> from
>> >>> >> >> >>> that list.
>> >>> >> >> >>> Also, let me know which methods (in stable classes) you
>> would
>> >>> mark
>> >>> >> as
>> >>> >> >> >>> experimental.
>> >>> >> >> >>>
>> >>> >> >> >>>
>> >>> >> >> >>>
>> >>> >> >> >>>
>> >>> >> >> >>> On Wed, Nov 11, 2015 at 10:17 AM, Aljoscha Krettek <
>> >>> >> >> [hidden email]>
>> >>> >> >> >>> wrote:
>> >>> >> >> >>>
>> >>> >> >> >>>> +1 for some way of declaring public interfaces as
>> experimental.
>> >>> >> >> >>>>
>> >>> >> >> >>>>> On 10 Nov 2015, at 22:24, Stephan Ewen <[hidden email]>
>> >>> wrote:
>> >>> >> >> >>>>>
>> >>> >> >> >>>>> I think we need anyways an annotation
>> "@PublicExperimental".
>> >>> >> >> >>>>>
>> >>> >> >> >>>>> We can make this annotation such that it can be added to
>> >>> methods
>> >>> >> and
>> >>> >> >> >>> can
>> >>> >> >> >>>>> use that to declare
>> >>> >> >> >>>>> Methods in an otherwise public class (such as DataSet) as
>> >>> >> >> >> experimental.
>> >>> >> >> >>>>>
>> >>> >> >> >>>>> On Tue, Nov 10, 2015 at 10:19 PM, Fabian Hueske <
>> >>> >> [hidden email]>
>> >>> >> >> >>>> wrote:
>> >>> >> >> >>>>>
>> >>> >> >> >>>>>> I am not sure if we always should declare complete
>> classes as
>> >>> >> >> >>>>>> @PublicInterface.
>> >>> >> >> >>>>>> This does definitely make sense for interfaces and
>> abstract
>> >>> >> classes
>> >>> >> >> >>>> such as
>> >>> >> >> >>>>>> MapFunction or InputFormat but not necessarily for
>> classes
>> >>> such
>> >>> >> as
>> >>> >> >> >>>> DataSet
>> >>> >> >> >>>>>> that we might want to extend by methods which should not
>> >>> >> immediately
>> >>> >> >> >>> be
>> >>> >> >> >>>>>> considered as stable.
>> >>> >> >> >>>>>>
>> >>> >> >> >>>>>>
>> >>> >> >> >>>>>> 2015-11-10 21:36 GMT+01:00 Vasiliki Kalavri <
>> >>> >> >> >>> [hidden email]
>> >>> >> >> >>>>> :
>> >>> >> >> >>>>>>
>> >>> >> >> >>>>>>> Yes, my opinion is that we shouldn't declare the Gelly
>> API
>> >>> >> frozen
>> >>> >> >> >>> yet.
>> >>> >> >> >>>>>>> We can reconsider when we're closer to the 1.0 release,
>> but
>> >>> if
>> >>> >> >> >>>> possible,
>> >>> >> >> >>>>>> I
>> >>> >> >> >>>>>>> would give it some more time.
>> >>> >> >> >>>>>>>
>> >>> >> >> >>>>>>> -V.
>> >>> >> >> >>>>>>>
>> >>> >> >> >>>>>>> On 10 November 2015 at 21:06, Stephan Ewen <
>> >>> [hidden email]>
>> >>> >> >> >> wrote:
>> >>> >> >> >>>>>>>
>> >>> >> >> >>>>>>>> I think no component should be forced to be stable. It
>> >>> should
>> >>> >> be
>> >>> >> >> >> an
>> >>> >> >> >>>>>>>> individual decision for each component, and in some
>> cases
>> >>> even
>> >>> >> for
>> >>> >> >> >>>>>>>> individual classes.
>> >>> >> >> >>>>>>>>
>> >>> >> >> >>>>>>>> @Vasia If you think Gelly should not be declared
>> >>> >> interface-frozen,
>> >>> >> >> >>>> then
>> >>> >> >> >>>>>>>> this is a good point to raise and this should
>> definitely be
>> >>> >> >> >>> reflected.
>> >>> >> >> >>>>>>>> There is no point in declaring certain APIs as frozen
>> when
>> >>> we
>> >>> >> are
>> >>> >> >> >>> not
>> >>> >> >> >>>>>> yet
>> >>> >> >> >>>>>>>> confident they have converged.
>> >>> >> >> >>>>>>>>
>> >>> >> >> >>>>>>>>
>> >>> >> >> >>>>>>>>
>> >>> >> >> >>>>>>>> On Tue, Nov 10, 2015 at 8:39 PM, Vasiliki Kalavri <
>> >>> >> >> >>>>>>>> [hidden email]
>> >>> >> >> >>>>>>>>> wrote:
>> >>> >> >> >>>>>>>>
>> >>> >> >> >>>>>>>>> Hi Robert,
>> >>> >> >> >>>>>>>>>
>> >>> >> >> >>>>>>>>> thanks for bringing this up!
>> >>> >> >> >>>>>>>>>
>> >>> >> >> >>>>>>>>> I generally like the idea, but I wouldn't rush to
>> >>> annotate the
>> >>> >> >> >>> Gelly
>> >>> >> >> >>>>>>>>> classes yet. Gelly hasn't had that many users and I'm
>> >>> quite
>> >>> >> sure
>> >>> >> >> >>>>>> we'll
>> >>> >> >> >>>>>>>> find
>> >>> >> >> >>>>>>>>> things to improve as it gets more exposure.
>> >>> >> >> >>>>>>>>> TBH, I think it's quite unfair to force Gelly (also
>> e.g.
>> >>> ML,
>> >>> >> >> >> Table)
>> >>> >> >> >>>>>> to
>> >>> >> >> >>>>>>> a
>> >>> >> >> >>>>>>>>> "1.0" status (from an API stability point of view)
>> since
>> >>> >> they're
>> >>> >> >> >>>>>> really
>> >>> >> >> >>>>>>>>> young compared to the other Flink APIs.
>> >>> >> >> >>>>>>>>>
>> >>> >> >> >>>>>>>>> Cheers,
>> >>> >> >> >>>>>>>>> Vasia.
>> >>> >> >> >>>>>>>>> On Nov 10, 2015 8:04 PM, "Robert Metzger" <
>> >>> >> [hidden email]>
>> >>> >> >> >>>>>> wrote:
>> >>> >> >> >>>>>>>>>
>> >>> >> >> >>>>>>>>>> Hi everyone,
>> >>> >> >> >>>>>>>>>>
>> >>> >> >> >>>>>>>>>> I would like to bring this discussion back to your
>> >>> attention
>> >>> >> as
>> >>> >> >> >> we
>> >>> >> >> >>>>>>> seem
>> >>> >> >> >>>>>>>>> to
>> >>> >> >> >>>>>>>>>> approach the 1.0 release of Flink.
>> >>> >> >> >>>>>>>>>>
>> >>> >> >> >>>>>>>>>> My suggestion back in January was to annotate all
>> >>> classes,
>> >>> >> but I
>> >>> >> >> >>>>>>> think
>> >>> >> >> >>>>>>>>>> it'll be more feasible to just annotate public
>> classes.
>> >>> >> >> >>>>>>>>>> So how about adding an annotation @PublicInterface
>> >>> >> >> >>>>>>>>>>
>> >>> >> >> >>>>>>>>>> For @PublicInterface, I would annotate classes such
>> as:
>> >>> >> DataSet,
>> >>> >> >> >>>>>>>>>> DataStream, ExecutionEnvironment, InputFormat,
>> >>> MapFunction,
>> >>> >> >> >>>>>>> FileSystems
>> >>> >> >> >>>>>>>>> but
>> >>> >> >> >>>>>>>>>> also Gelly for example.
>> >>> >> >> >>>>>>>>>>
>> >>> >> >> >>>>>>>>>> I would not annotate as public components such as ML,
>> >>> Storm
>> >>> >> >> >>>>>>>>> compatibility,
>> >>> >> >> >>>>>>>>>> internals from runtime, yarn, optimizer.
>> >>> >> >> >>>>>>>>>>
>> >>> >> >> >>>>>>>>>>
>> >>> >> >> >>>>>>>>>> From a tooling perspective, I've looked into
>> different
>> >>> maven
>> >>> >> >> >>>>>> plugins
>> >>> >> >> >>>>>>>> and
>> >>> >> >> >>>>>>>>>> java libraries and I found
>> >>> https://github.com/siom79/japicmp
>> >>> >> to
>> >>> >> >> >>> be
>> >>> >> >> >>>>>>> the
>> >>> >> >> >>>>>>>>>> closest to our needs. I actually opened a pull
>> request
>> >>> to the
>> >>> >> >> >>>>>> project
>> >>> >> >> >>>>>>>> to
>> >>> >> >> >>>>>>>>>> allow inclusion/exclusion of classes based on
>> >>> annotations.
>> >>> >> Lets
>> >>> >> >> >>>>>> hope
>> >>> >> >> >>>>>>> it
>> >>> >> >> >>>>>>>>>> gets merged.
>> >>> >> >> >>>>>>>>>>
>> >>> >> >> >>>>>>>>>> Does everybody agree with adding just the
>> >>> @PublicInterface
>> >>> >> >> >>>>>>> annotation?
>> >>> >> >> >>>>>>>>>>
>> >>> >> >> >>>>>>>>>> Note that I'll add the annotation on a class-level,
>> >>> making
>> >>> >> the
>> >>> >> >> >>>>>> entire
>> >>> >> >> >>>>>>>>> class
>> >>> >> >> >>>>>>>>>> either public or private (from a stability point of
>> >>> view).
>> >>> >> If we
>> >>> >> >> >>>>>>> need a
>> >>> >> >> >>>>>>>>>> more fine-grained annotation, we have to add a second
>> >>> >> >> >>>>>>> @PrivateInterface
>> >>> >> >> >>>>>>>>>> annotation which we'll only apply to certain methods.
>> >>> >> >> >>>>>>>>>>
>> >>> >> >> >>>>>>>>>> The next step is that I'm going to open a pull
>> request
>> >>> with
>> >>> >> all
>> >>> >> >> >>>>>>> classes
>> >>> >> >> >>>>>>>>>> annotated that I consider public.
>> >>> >> >> >>>>>>>>>>
>> >>> >> >> >>>>>>>>>>
>> >>> >> >> >>>>>>>>>> On Fri, Jan 30, 2015 at 7:10 PM, Henry Saputra <
>> >>> >> >> >>>>>>>> [hidden email]>
>> >>> >> >> >>>>>>>>>> wrote:
>> >>> >> >> >>>>>>>>>>
>> >>> >> >> >>>>>>>>>>> I like the idea. But would love to have different
>> name
>> >>> for
>> >>> >> the
>> >>> >> >> >>>>>>>>>>> "LimitedPrivate" to make it easier to distinguish.
>> >>> >> >> >>>>>>>>>>> How about "Module" or "Package" ?
>> >>> >> >> >>>>>>>>>>>
>> >>> >> >> >>>>>>>>>>> - Henry
>> >>> >> >> >>>>>>>>>>>
>> >>> >> >> >>>>>>>>>>> On Wed, Jan 28, 2015 at 1:29 AM, Robert Metzger <
>> >>> >> >> >>>>>>> [hidden email]
>> >>> >> >> >>>>>>>>>
>> >>> >> >> >>>>>>>>>>> wrote:
>> >>> >> >> >>>>>>>>>>>> I think in Hadoop they use LimitedPrivate for the
>> >>> different
>> >>> >> >> >>>>>>>>> components
>> >>> >> >> >>>>>>>>>> of
>> >>> >> >> >>>>>>>>>>>> the project.
>> >>> >> >> >>>>>>>>>>>> For example LimitedPrivate("yarn").
>> >>> >> >> >>>>>>>>>>>> Here is a very good documentation on the topic:
>> >>> >> >> >>>>>>>>>>>>
>> >>> >> >> >>>>>>>>>>>
>> >>> >> >> >>>>>>>>>>
>> >>> >> >> >>>>>>>>>
>> >>> >> >> >>>>>>>>
>> >>> >> >> >>>>>>>
>> >>> >> >> >>>>>>
>> >>> >> >> >>>>
>> >>> >> >> >>>
>> >>> >> >> >>
>> >>> >> >>
>> >>> >>
>> >>>
>> https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/InterfaceClassification.html
>> >>> >> >> >>>>>>>>>>>>
>> >>> >> >> >>>>>>>>>>>> On Tue, Jan 27, 2015 at 3:58 PM, Alexander
>> Alexandrov <
>> >>> >> >> >>>>>>>>>>>> [hidden email]> wrote:
>> >>> >> >> >>>>>>>>>>>>
>> >>> >> >> >>>>>>>>>>>>> I don't get the difference between Private and
>> >>> >> >> >> LimitedPrivate,
>> >>> >> >> >>>>>>> but
>> >>> >> >> >>>>>>>>>>>>> otherwise seems like quite a nice idea.
>> >>> >> >> >>>>>>>>>>>>>
>> >>> >> >> >>>>>>>>>>>>> It will be also good if we can agree upon what
>> these
>> >>> tags
>> >>> >> >> >>>>>>> actually
>> >>> >> >> >>>>>>>>>> mean
>> >>> >> >> >>>>>>>>>>> and
>> >>> >> >> >>>>>>>>>>>>> add this meaning to the documentation.
>> >>> >> >> >>>>>>>>>>>>>
>> >>> >> >> >>>>>>>>>>>>> 2015-01-27 15:46 GMT+01:00 Robert Metzger <
>> >>> >> >> >>>>>> [hidden email]
>> >>> >> >> >>>>>>>> :
>> >>> >> >> >>>>>>>>>>>>>
>> >>> >> >> >>>>>>>>>>>>>> Hi,
>> >>> >> >> >>>>>>>>>>>>>>
>> >>> >> >> >>>>>>>>>>>>>> Hadoop has annotations for tagging the stability
>> and
>> >>> >> >> >>>>>> audience
>> >>> >> >> >>>>>>> of
>> >>> >> >> >>>>>>>>>>> classes
>> >>> >> >> >>>>>>>>>>>>>> and methods.
>> >>> >> >> >>>>>>>>>>>>>>
>> >>> >> >> >>>>>>>>>>>>>> Through that, you can have
>> @InterfaceAudience.Public,
>> >>> >> >> >>>>>> Private,
>> >>> >> >> >>>>>>>>>>>>>> LimitedPrivate
>> >>> >> >> >>>>>>>>>>>>>> and also @InterfaceStability.Evolving, Unstable,
>> and
>> >>> >> Stable.
>> >>> >> >> >>>>>>>>>>>>>>
>> >>> >> >> >>>>>>>>>>>>>> I guess there are tools which allow to
>> automatically
>> >>> >> check
>> >>> >> >> >>>>>> if
>> >>> >> >> >>>>>>>>>>> interfaces,
>> >>> >> >> >>>>>>>>>>>>>> which are marked as Stable have been changed
>> between
>> >>> >> >> >>>>>> releases
>> >>> >> >> >>>>>>>> (or
>> >>> >> >> >>>>>>>>> by
>> >>> >> >> >>>>>>>>>>> pull
>> >>> >> >> >>>>>>>>>>>>>> requests).
>> >>> >> >> >>>>>>>>>>>>>>
>> >>> >> >> >>>>>>>>>>>>>> I think such annotations are crucial if we want
>> to
>> >>> >> guarantee
>> >>> >> >> >>>>>>>>>> interface
>> >>> >> >> >>>>>>>>>>>>>> stability between releases.
>> >>> >> >> >>>>>>>>>>>>>>
>> >>> >> >> >>>>>>>>>>>>>> What do you think? Should we add those
>> annotations?
>> >>> Which
>> >>> >> >> >>>>>> one
>> >>> >> >> >>>>>>>>> would
>> >>> >> >> >>>>>>>>>>> you
>> >>> >> >> >>>>>>>>>>>>>> like to add?
>> >>> >> >> >>>>>>>>>>>>>>
>> >>> >> >> >>>>>>>>>>>>>>
>> >>> >> >> >>>>>>>>>>>>>> Robert
>> >>> >> >> >>>>>>>>>>>>>>
>> >>> >> >> >>>>>>>>>>>>>
>> >>> >> >> >>>>>>>>>>>
>> >>> >> >> >>>>>>>>>>
>> >>> >> >> >>>>>>>>>
>> >>> >> >> >>>>>>>>
>> >>> >> >> >>>>>>>
>> >>> >> >> >>>>>>
>> >>> >> >> >>>>
>> >>> >> >> >>>>
>> >>> >> >> >>>
>> >>> >> >> >>
>> >>> >> >>
>> >>> >> >>
>> >>> >>
>> >>>
>> >>
>> >>
>>
>
>
12