(DEPRECATED) Apache Flink Mailing List archive.

[DISCUSS] Discourage using the same class names even though in different packages

Henry Saputra

[DISCUSS] Discourage using the same class names even though in different packages

Hi All,

I am seeing some identical class names, albeit in different packages,
that could confuse new contributors. One of the attractions of Spark
is that its code structure is simpler to follow than Hadoop's (or
Hive's, for that matter).

For example, we have IntermediateResultPartition in both the partition
and executiongraph packages, both of which are under the runtime parent
package. To make it more difficult, some of these duplicate classes have
no Javadoc or comment explaining why the class exists and how it relates
to other existing code, so one has to trace the code to figure out where
it is used and how it affects or differs from the other existing
classes.

I would like to propose adding a "no duplicate class names if possible"
rule (which I know is possible) to the how-to-contribute code guide.
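
A rule like this could in principle be checked mechanically. The
following is only a minimal sketch, not part of the Flink build: the
class DuplicateClassNameFinder and its findDuplicates method are
hypothetical names, and the package names in main are abbreviated
placeholders based on the example above rather than the exact Flink
package paths. It groups fully qualified class names by their simple
name and reports every simple name that appears in more than one
package.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, for illustration only.
public class DuplicateClassNameFinder {

    /**
     * Groups fully qualified class names by simple name and keeps only
     * the simple names that occur more than once.
     */
    public static Map<String, List<String>> findDuplicates(List<String> qualifiedNames) {
        Map<String, List<String>> bySimpleName = new TreeMap<>();
        for (String qualifiedName : qualifiedNames) {
            // Everything after the last '.' is the simple class name.
            String simpleName = qualifiedName.substring(qualifiedName.lastIndexOf('.') + 1);
            bySimpleName.computeIfAbsent(simpleName, k -> new ArrayList<>()).add(qualifiedName);
        }
        // Drop simple names that are unique across all packages.
        bySimpleName.values().removeIf(names -> names.size() < 2);
        return bySimpleName;
    }

    public static void main(String[] args) {
        // Placeholder package names (assumption), not the real Flink paths.
        List<String> classes = List.of(
                "runtime.partition.IntermediateResultPartition",
                "runtime.executiongraph.IntermediateResultPartition",
                "runtime.executiongraph.ExecutionGraph");
        findDuplicates(classes).forEach((simpleName, names) ->
                System.out.println(simpleName + " is declared in: " + names));
    }
}
```

In practice the input list would have to come from scanning the source
tree or the compiled classes, which this sketch leaves out.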

- Henry

Henry Saputra

Re: [DISCUSS] Discourage using the same class names even though in different packages

Just to be clear, I was not advocating that Flink simplify the code
just for the sake of clarity :)

Flink has a lot to offer by providing simple APIs that hide the
complexity needed to achieve performance, which I think is one of its
key differentiators compared to other general distributed processing
platforms.

My suggestion was meant to help contributors and committers easily
follow and keep up with changes that impact the kernel, or guts, of
Flink.

Thoughts and comments are welcome :)

Stephan Ewen

Re: [DISCUSS] Discourage using the same class names even though in different packages

That is a good comment, Henry.

Let's try and follow this rule...

Kostas Tzoumas-2

Re: [DISCUSS] Discourage using the same class names even though in different packages

I agree, at least for all non-user-facing classes (e.g., the examples in
Scala/Java/Streaming etc. may have the same names).

Kostas

Till Rohrmann

Re: [DISCUSS] Discourage using the same class names even though in different packages

+1 for Henry's proposal.

mxm

Re: [DISCUSS] Discourage using the same class names even though in different packages

Totally agree with you, Henry. Duplicate class names just add
confusion. However, the actual problem is the lack of documentation
for a lot of classes. It would be great if we could have a
documentation sprint in the near future to add at least a doc string
to every class. This might be some work but, in the long run, it will
make it much easier to contribute to the Flink project.

Best regards,
Max

Henry Saputra

Re: [DISCUSS] Discourage using the same class names even though in different packages

Thanks for the response, all.

@Max, yes, I agree that the duplicate class names, at least the ones not
on client-facing APIs, add more confusion, and that the lack of code
documentation in some of the classes does not help, or even makes it
worse, when trying to figure out how they work together.

Agreed, we can and will do better. I will file a Jira issue and update
the coding guidelines to follow up.

- HS
