[CODE-STYLE] Builder pattern

classic Classic list List threaded Threaded
10 messages Options
Reply | Threaded
Open this post in threaded view
|

[CODE-STYLE] Builder pattern

Gyula Fóra
Hi All!

I would like to start a code-style related discussion regarding how we
implement the builder pattern in the Flink project.

It would be the best to have a common approach, there are some aspects of
the pattern that come to my mind please feel free to add anything I missed:

1. Creating the builder objects:

a) Always create using static method in "built" class:
           Here we should have naming guidelines: .builder(..) or
.xyzBuilder(...)
b) Always use builder class constructor
c) Mix: Again we should have some guidelines when to use which

I personally prefer option a) to always have a static method to create the
builder with static method names that end in builder.

2. Setting properties on the builder:

 a) withSomething(...)
 b) setSomething(...)
 c) other

I don't really have a preference but either a or b for consistency.


3. Implementing the builder object:

 a) Immutable -> Creates a new builder object after setting a property
 b) Mutable -> Returns (this) after setting the property

I personally prefer the mutable version as it keeps the builder
implementation much simpler and it seems to be a very common way of doing
it.

What do you all think?

Regards,
Gyula
Reply | Threaded
Open this post in threaded view
|

Re: [CODE-STYLE] Builder pattern

dwysakowicz
Hi Gyula,

A few comments from my side.

Ad. 1 Personally I also prefer a static method in the "built" class. Not
sure if I would be that strict about the "Builder" suffix, though. It is
usually quite easy to realize the method returns a builder rather than
the object itself. In my opinion the suffix might be redundant at times.

Ad. 2 Here I also don't have a strict preference. I like the b) approach
the least though. Whenever I see setXXX I expect this is an old style
setter without return type. For that reason for a generic name I always
prefer withXXX. I see though lots of benefits for the option c), as this
might be the most descriptive option. Some examples that I like the
usage of option c) are:

* org.apache.flink.api.common.state.StateTtlConfig.Builder#useProcessingTime

*
org.apache.flink.api.common.state.StateTtlConfig.Builder#cleanupInBackground

*
org.apache.flink.api.common.state.StateTtlConfig.Builder#cleanupInRocksdbCompactFilter()

To sum up on this topic I would not enforce a strict policy on this
topic. But I am ok with one, if the community prefers it.

Ad. 3 I agree that mutable Builders are way more easier to implement and
usually it does not harm to have them mutable as long as they are not
passed around.


Some other topics that I think are worth considering:

4. Setting the default values:

a) The static method/constructor should accept all the required
parameters (that don't have a reasonable defaults). So that
newBuilder(...).build() construct a valid object.

b) Validation in the build() method. newBuilder().build() might throw an
Exception if some values were not set.

Personally I think the option a) should be strongly preferred. However I
believe there are cases were b) could be acceptable.

5. Building the end object:

a) Always with build() method

b) Allow building object with arbitrary methods

I think the option a) is the most common approach. I think though option
b) should also be allowed if there is a good reason for that. What I can
imagine is when we need some additional information from the build
method (e.g. generic type) or if the method modifies the internal state
of the Builder in a way that it is unsafe to continue setting values on
the Builder.

An overall comment. I think it is good to share opinions on this topic,
but I am afraid there is too many sides to the issue to standardize it
very strictly. I might be wrong though. Really looking forward to the
outcome of this discussion.

Best,

Dawid

On 26/08/2019 14:18, Gyula Fóra wrote:

> Hi All!
>
> I would like to start a code-style related discussion regarding how we
> implement the builder pattern in the Flink project.
>
> It would be the best to have a common approach, there are some aspects of
> the pattern that come to my mind please feel free to add anything I missed:
>
> 1. Creating the builder objects:
>
> a) Always create using static method in "built" class:
>            Here we should have naming guidelines: .builder(..) or
> .xyzBuilder(...)
> b) Always use builder class constructor
> c) Mix: Again we should have some guidelines when to use which
>
> I personally prefer option a) to always have a static method to create the
> builder with static method names that end in builder.
>
> 2. Setting properties on the builder:
>
>  a) withSomething(...)
>  b) setSomething(...)
>  c) other
>
> I don't really have a preference but either a or b for consistency.
>
>
> 3. Implementing the builder object:
>
>  a) Immutable -> Creates a new builder object after setting a property
>  b) Mutable -> Returns (this) after setting the property
>
> I personally prefer the mutable version as it keeps the builder
> implementation much simpler and it seems to be a very common way of doing
> it.
>
> What do you all think?
>
> Regards,
> Gyula
>


signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [CODE-STYLE] Builder pattern

Jark Wu-2
In reply to this post by Gyula Fóra
Hi Gyula,

Thanks for bringing this. I think it would be nice if we have a common
approach to create builder pattern.
Currently, we have a lot of builders but with different tastes.

 > 1. Creating the builder objects:
I prefer option a) too. It would be easier for users to get the builder
instance.

> 2. Setting properties on the builder:
I don't have a preference for it. But I think there is another option might
be more concise, i.e. "something()" without `with` or `set` prefix.
For example:

CsvTableSource source = new CsvTableSource.builder()
    .path("/path/to/your/file.csv")
    .field("myfield", Types.STRING)
    .field("myfield2", Types.INT)
    .build();

This pattern is heavily used in flink-table, e.g. `TableSchema`,
`TypeInference`, `BuiltInFunctionDefinition`.

> 3. Implementing the builder object:
I prefer  b) Mutable approach which is simpler from the implementation part.


Besides that, I think maybe we can add some other aspects:

4. Constructor of the main class.
 a) private constructor
 b) public constructor

5. setXXX methods of the main class
 a) setXXX methods are not allowed
 b) setXXX methods are allowed.

I prefer both option a). Because I think one of the reason to have the
builder is that we don't want the constructor public.
A public constructor makes it hard to maintain and evolve compatibly when
adding new parameters, FlinkKafkaProducer is a good example.
For set methods, I think in most cases, we want users to set the fields
eagerly (through the builder) and `setXXX` methods on the main class
is duplicate with the methods on the builder. We should avoid that.


Regards,
Jark


On Mon, 26 Aug 2019 at 20:18, Gyula Fóra <[hidden email]> wrote:

> Hi All!
>
> I would like to start a code-style related discussion regarding how we
> implement the builder pattern in the Flink project.
>
> It would be the best to have a common approach, there are some aspects of
> the pattern that come to my mind please feel free to add anything I missed:
>
> 1. Creating the builder objects:
>
> a) Always create using static method in "built" class:
>            Here we should have naming guidelines: .builder(..) or
> .xyzBuilder(...)
> b) Always use builder class constructor
> c) Mix: Again we should have some guidelines when to use which
>
> I personally prefer option a) to always have a static method to create the
> builder with static method names that end in builder.
>
> 2. Setting properties on the builder:
>
>  a) withSomething(...)
>  b) setSomething(...)
>  c) other
>
> I don't really have a preference but either a or b for consistency.
>
>
> 3. Implementing the builder object:
>
>  a) Immutable -> Creates a new builder object after setting a property
>  b) Mutable -> Returns (this) after setting the property
>
> I personally prefer the mutable version as it keeps the builder
> implementation much simpler and it seems to be a very common way of doing
> it.
>
> What do you all think?
>
> Regards,
> Gyula
>
Reply | Threaded
Open this post in threaded view
|

Re: [CODE-STYLE] Builder pattern

Piotr Nowojski-3
Hi,

I agree with Dawid, modulo that I don’t have any preference about point 2 - I’m ok even with not enforcing this.

One side note about point 4. There are use cases where passing obligatory parameters in the build method itself might make sense:

I. - when those parameters can not be or can not be easily passed via the constructor. Good example of that is “builder” pattern for the StreamOperators (StreamOperatorFactory), where factory is constructed on the API level in the client, then it’s being serialised and sent over the network and reconstructed on the TaskManager, where StreamOperator is finally constructed. The issue is that some of the obligatory parameters are only available on the TaskManager, so they can not be passed on a DataStream level in the client.
II. - when builder might be used to create multiple instances of the object with different values.

Piotrek

> On 26 Aug 2019, at 15:12, Jark Wu <[hidden email]> wrote:
>
> Hi Gyula,
>
> Thanks for bringing this. I think it would be nice if we have a common
> approach to create builder pattern.
> Currently, we have a lot of builders but with different tastes.
>
>> 1. Creating the builder objects:
> I prefer option a) too. It would be easier for users to get the builder
> instance.
>
>> 2. Setting properties on the builder:
> I don't have a preference for it. But I think there is another option might
> be more concise, i.e. "something()" without `with` or `set` prefix.
> For example:
>
> CsvTableSource source = new CsvTableSource.builder()
>    .path("/path/to/your/file.csv")
>    .field("myfield", Types.STRING)
>    .field("myfield2", Types.INT)
>    .build();
>
> This pattern is heavily used in flink-table, e.g. `TableSchema`,
> `TypeInference`, `BuiltInFunctionDefinition`.
>
>> 3. Implementing the builder object:
> I prefer  b) Mutable approach which is simpler from the implementation part.
>
>
> Besides that, I think maybe we can add some other aspects:
>
> 4. Constructor of the main class.
> a) private constructor
> b) public constructor
>
> 5. setXXX methods of the main class
> a) setXXX methods are not allowed
> b) setXXX methods are allowed.
>
> I prefer both option a). Because I think one of the reason to have the
> builder is that we don't want the constructor public.
> A public constructor makes it hard to maintain and evolve compatibly when
> adding new parameters, FlinkKafkaProducer is a good example.
> For set methods, I think in most cases, we want users to set the fields
> eagerly (through the builder) and `setXXX` methods on the main class
> is duplicate with the methods on the builder. We should avoid that.
>
>
> Regards,
> Jark
>
>
> On Mon, 26 Aug 2019 at 20:18, Gyula Fóra <[hidden email]> wrote:
>
>> Hi All!
>>
>> I would like to start a code-style related discussion regarding how we
>> implement the builder pattern in the Flink project.
>>
>> It would be the best to have a common approach, there are some aspects of
>> the pattern that come to my mind please feel free to add anything I missed:
>>
>> 1. Creating the builder objects:
>>
>> a) Always create using static method in "built" class:
>>           Here we should have naming guidelines: .builder(..) or
>> .xyzBuilder(...)
>> b) Always use builder class constructor
>> c) Mix: Again we should have some guidelines when to use which
>>
>> I personally prefer option a) to always have a static method to create the
>> builder with static method names that end in builder.
>>
>> 2. Setting properties on the builder:
>>
>> a) withSomething(...)
>> b) setSomething(...)
>> c) other
>>
>> I don't really have a preference but either a or b for consistency.
>>
>>
>> 3. Implementing the builder object:
>>
>> a) Immutable -> Creates a new builder object after setting a property
>> b) Mutable -> Returns (this) after setting the property
>>
>> I personally prefer the mutable version as it keeps the builder
>> implementation much simpler and it seems to be a very common way of doing
>> it.
>>
>> What do you all think?
>>
>> Regards,
>> Gyula
>>

Reply | Threaded
Open this post in threaded view
|

Re: [CODE-STYLE] Builder pattern

Arvid Heise
Hi all,

I'd like to differentiate between API level builder usage and "internal"
builder usage (for example, test harness).

For API level builder, in general everything goes, as long as it aligns
with user expectations. API level usages are also much more discussed, such
that I'd expect them to be consistent within one API. This freedom is
especially required when considering APIs for non-java languages.

Now for "internal" usages (which may or may not align with Java Datastream
etc. usage). I'd like to get a style that is well supported by the
primarily used IDEs. I don't want to write a new builder from scratch and
by the looks of it, we will get many more builders.

Furthermore, I'd like to emphasize that the primary use case for using
builders for me is to mitigate the lack of named arguments in Java, which
is especially painful for immutable types. In an ideal world, I'd like to
have all classes immutable and use builders to create new instances if we
have
a) too many parameters to be passed (which should become many more once we
commit even more to DI),
b) we have meaningful default values, such that we can omit a significant
amount parameters by using a buidler, or
c) we have a good amount of optional (=nullable) parameters.
Obviously, we deviate from that whenever performance considerations demand
it.

With that my votes for the questions:
1. static method only, hide builder ctor
2. Intellij and avro use setX() for property X, so I'd go with that. Lombok
just uses X(), so I wouldn't mind it.
3. Mutable approach. Immutable doesn't make much sense to me. Then I can
directly go with a Wither pattern on the immutable class without builder.
4. Private ctor. If it has a builder, it should be used. Exception:
migration support for some intermediate versions (if we added a builder to
a class, keep the ctor deprecated public for 1,2 versions).
5. no setX in general, we want to have an immutable class. Exceptions where
due (should be mostly for performance reasons).

Best,

Arvid

On Mon, Aug 26, 2019 at 4:40 PM Piotr Nowojski <[hidden email]> wrote:

> Hi,
>
> I agree with Dawid, modulo that I don’t have any preference about point 2
> - I’m ok even with not enforcing this.
>
> One side note about point 4. There are use cases where passing obligatory
> parameters in the build method itself might make sense:
>
> I. - when those parameters can not be or can not be easily passed via the
> constructor. Good example of that is “builder” pattern for the
> StreamOperators (StreamOperatorFactory), where factory is constructed on
> the API level in the client, then it’s being serialised and sent over the
> network and reconstructed on the TaskManager, where StreamOperator is
> finally constructed. The issue is that some of the obligatory parameters
> are only available on the TaskManager, so they can not be passed on a
> DataStream level in the client.
> II. - when builder might be used to create multiple instances of the
> object with different values.
>
> Piotrek
>
> > On 26 Aug 2019, at 15:12, Jark Wu <[hidden email]> wrote:
> >
> > Hi Gyula,
> >
> > Thanks for bringing this. I think it would be nice if we have a common
> > approach to create builder pattern.
> > Currently, we have a lot of builders but with different tastes.
> >
> >> 1. Creating the builder objects:
> > I prefer option a) too. It would be easier for users to get the builder
> > instance.
> >
> >> 2. Setting properties on the builder:
> > I don't have a preference for it. But I think there is another option
> might
> > be more concise, i.e. "something()" without `with` or `set` prefix.
> > For example:
> >
> > CsvTableSource source = new CsvTableSource.builder()
> >    .path("/path/to/your/file.csv")
> >    .field("myfield", Types.STRING)
> >    .field("myfield2", Types.INT)
> >    .build();
> >
> > This pattern is heavily used in flink-table, e.g. `TableSchema`,
> > `TypeInference`, `BuiltInFunctionDefinition`.
> >
> >> 3. Implementing the builder object:
> > I prefer  b) Mutable approach which is simpler from the implementation
> part.
> >
> >
> > Besides that, I think maybe we can add some other aspects:
> >
> > 4. Constructor of the main class.
> > a) private constructor
> > b) public constructor
> >
> > 5. setXXX methods of the main class
> > a) setXXX methods are not allowed
> > b) setXXX methods are allowed.
> >
> > I prefer both option a). Because I think one of the reason to have the
> > builder is that we don't want the constructor public.
> > A public constructor makes it hard to maintain and evolve compatibly when
> > adding new parameters, FlinkKafkaProducer is a good example.
> > For set methods, I think in most cases, we want users to set the fields
> > eagerly (through the builder) and `setXXX` methods on the main class
> > is duplicate with the methods on the builder. We should avoid that.
> >
> >
> > Regards,
> > Jark
> >
> >
> > On Mon, 26 Aug 2019 at 20:18, Gyula Fóra <[hidden email]> wrote:
> >
> >> Hi All!
> >>
> >> I would like to start a code-style related discussion regarding how we
> >> implement the builder pattern in the Flink project.
> >>
> >> It would be the best to have a common approach, there are some aspects
> of
> >> the pattern that come to my mind please feel free to add anything I
> missed:
> >>
> >> 1. Creating the builder objects:
> >>
> >> a) Always create using static method in "built" class:
> >>           Here we should have naming guidelines: .builder(..) or
> >> .xyzBuilder(...)
> >> b) Always use builder class constructor
> >> c) Mix: Again we should have some guidelines when to use which
> >>
> >> I personally prefer option a) to always have a static method to create
> the
> >> builder with static method names that end in builder.
> >>
> >> 2. Setting properties on the builder:
> >>
> >> a) withSomething(...)
> >> b) setSomething(...)
> >> c) other
> >>
> >> I don't really have a preference but either a or b for consistency.
> >>
> >>
> >> 3. Implementing the builder object:
> >>
> >> a) Immutable -> Creates a new builder object after setting a property
> >> b) Mutable -> Returns (this) after setting the property
> >>
> >> I personally prefer the mutable version as it keeps the builder
> >> implementation much simpler and it seems to be a very common way of
> doing
> >> it.
> >>
> >> What do you all think?
> >>
> >> Regards,
> >> Gyula
> >>
>
>

--

Arvid Heise | Senior Software Engineer

<https://www.ververica.com/>

Follow us @VervericaData

--

Join Flink Forward <https://flink-forward.org/> - The Apache Flink
Conference

Stream Processing | Event Driven | Real Time

--

Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany

--
Ververica GmbH
Registered at Amtsgericht Charlottenburg: HRB 158244 B
Managing Directors: Dr. Kostas Tzoumas, Dr. Stephan Ewen
Reply | Threaded
Open this post in threaded view
|

Re: [CODE-STYLE] Builder pattern

Kostas Kloudas-4
Hi all,

I agree with Arvid, although for point 2 I would be less strict.

@Piotr, for the side note you mentioned, and from the description you
mention in the mail for example I,
it seems that the need to pass parameters in the build() is not an
inherent need of the build pattern but it
can be mitigated by just creating sth like a StreamOperatorConfig (and
not the operator builder itself) on the
client, serialize it, and then at the TM, use the actual
StreamOperator builder with that config to create the
operator. There you can have all the needed parameters and also
perform the validation that
Dawid mention.

Again, this is just from the summary you provided, not from looking at
the code, so I may be missing something.

On the validation, I think that it should happen at the build(), as
this is the only place where
we know all the parameters.

Cheers,
Kostas

On Tue, Aug 27, 2019 at 9:47 AM Arvid Heise <[hidden email]> wrote:

>
> Hi all,
>
> I'd like to differentiate between API level builder usage and "internal"
> builder usage (for example, test harness).
>
> For API level builder, in general everything goes, as long as it aligns
> with user expectations. API level usages are also much more discussed, such
> that I'd expect them to be consistent within one API. This freedom is
> especially required when considering APIs for non-java languages.
>
> Now for "internal" usages (which may or may not align with Java Datastream
> etc. usage). I'd like to get a style that is well supported by the
> primarily used IDEs. I don't want to write a new builder from scratch and
> by the looks of it, we will get many more builders.
>
> Furthermore, I'd like to emphasize that the primary use case for using
> builders for me is to mitigate the lack of named arguments in Java, which
> is especially painful for immutable types. In an ideal world, I'd like to
> have all classes immutable and use builders to create new instances if we
> have
> a) too many parameters to be passed (which should become many more once we
> commit even more to DI),
> b) we have meaningful default values, such that we can omit a significant
> amount parameters by using a buidler, or
> c) we have a good amount of optional (=nullable) parameters.
> Obviously, we deviate from that whenever performance considerations demand
> it.
>
> With that my votes for the questions:
> 1. static method only, hide builder ctor
> 2. Intellij and avro use setX() for property X, so I'd go with that. Lombok
> just uses X(), so I wouldn't mind it.
> 3. Mutable approach. Immutable doesn't make much sense to me. Then I can
> directly go with a Wither pattern on the immutable class without builder.
> 4. Private ctor. If it has a builder, it should be used. Exception:
> migration support for some intermediate versions (if we added a builder to
> a class, keep the ctor deprecated public for 1,2 versions).
> 5. no setX in general, we want to have an immutable class. Exceptions where
> due (should be mostly for performance reasons).
>
> Best,
>
> Arvid
>
> On Mon, Aug 26, 2019 at 4:40 PM Piotr Nowojski <[hidden email]> wrote:
>
> > Hi,
> >
> > I agree with Dawid, modulo that I don’t have any preference about point 2
> > - I’m ok even with not enforcing this.
> >
> > One side note about point 4. There are use cases where passing obligatory
> > parameters in the build method itself might make sense:
> >
> > I. - when those parameters can not be or can not be easily passed via the
> > constructor. Good example of that is “builder” pattern for the
> > StreamOperators (StreamOperatorFactory), where factory is constructed on
> > the API level in the client, then it’s being serialised and sent over the
> > network and reconstructed on the TaskManager, where StreamOperator is
> > finally constructed. The issue is that some of the obligatory parameters
> > are only available on the TaskManager, so they can not be passed on a
> > DataStream level in the client.
> > II. - when builder might be used to create multiple instances of the
> > object with different values.
> >
> > Piotrek
> >
> > > On 26 Aug 2019, at 15:12, Jark Wu <[hidden email]> wrote:
> > >
> > > Hi Gyula,
> > >
> > > Thanks for bringing this. I think it would be nice if we have a common
> > > approach to create builder pattern.
> > > Currently, we have a lot of builders but with different tastes.
> > >
> > >> 1. Creating the builder objects:
> > > I prefer option a) too. It would be easier for users to get the builder
> > > instance.
> > >
> > >> 2. Setting properties on the builder:
> > > I don't have a preference for it. But I think there is another option
> > might
> > > be more concise, i.e. "something()" without `with` or `set` prefix.
> > > For example:
> > >
> > > CsvTableSource source = new CsvTableSource.builder()
> > >    .path("/path/to/your/file.csv")
> > >    .field("myfield", Types.STRING)
> > >    .field("myfield2", Types.INT)
> > >    .build();
> > >
> > > This pattern is heavily used in flink-table, e.g. `TableSchema`,
> > > `TypeInference`, `BuiltInFunctionDefinition`.
> > >
> > >> 3. Implementing the builder object:
> > > I prefer  b) Mutable approach which is simpler from the implementation
> > part.
> > >
> > >
> > > Besides that, I think maybe we can add some other aspects:
> > >
> > > 4. Constructor of the main class.
> > > a) private constructor
> > > b) public constructor
> > >
> > > 5. setXXX methods of the main class
> > > a) setXXX methods are not allowed
> > > b) setXXX methods are allowed.
> > >
> > > I prefer both option a). Because I think one of the reason to have the
> > > builder is that we don't want the constructor public.
> > > A public constructor makes it hard to maintain and evolve compatibly when
> > > adding new parameters, FlinkKafkaProducer is a good example.
> > > For set methods, I think in most cases, we want users to set the fields
> > > eagerly (through the builder) and `setXXX` methods on the main class
> > > is duplicate with the methods on the builder. We should avoid that.
> > >
> > >
> > > Regards,
> > > Jark
> > >
> > >
> > > On Mon, 26 Aug 2019 at 20:18, Gyula Fóra <[hidden email]> wrote:
> > >
> > >> Hi All!
> > >>
> > >> I would like to start a code-style related discussion regarding how we
> > >> implement the builder pattern in the Flink project.
> > >>
> > >> It would be the best to have a common approach, there are some aspects
> > of
> > >> the pattern that come to my mind please feel free to add anything I
> > missed:
> > >>
> > >> 1. Creating the builder objects:
> > >>
> > >> a) Always create using static method in "built" class:
> > >>           Here we should have naming guidelines: .builder(..) or
> > >> .xyzBuilder(...)
> > >> b) Always use builder class constructor
> > >> c) Mix: Again we should have some guidelines when to use which
> > >>
> > >> I personally prefer option a) to always have a static method to create
> > the
> > >> builder with static method names that end in builder.
> > >>
> > >> 2. Setting properties on the builder:
> > >>
> > >> a) withSomething(...)
> > >> b) setSomething(...)
> > >> c) other
> > >>
> > >> I don't really have a preference but either a or b for consistency.
> > >>
> > >>
> > >> 3. Implementing the builder object:
> > >>
> > >> a) Immutable -> Creates a new builder object after setting a property
> > >> b) Mutable -> Returns (this) after setting the property
> > >>
> > >> I personally prefer the mutable version as it keeps the builder
> > >> implementation much simpler and it seems to be a very common way of
> > doing
> > >> it.
> > >>
> > >> What do you all think?
> > >>
> > >> Regards,
> > >> Gyula
> > >>
> >
> >
>
> --
>
> Arvid Heise | Senior Software Engineer
>
> <https://www.ververica.com/>
>
> Follow us @VervericaData
>
> --
>
> Join Flink Forward <https://flink-forward.org/> - The Apache Flink
> Conference
>
> Stream Processing | Event Driven | Real Time
>
> --
>
> Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany
>
> --
> Ververica GmbH
> Registered at Amtsgericht Charlottenburg: HRB 158244 B
> Managing Directors: Dr. Kostas Tzoumas, Dr. Stephan Ewen
Reply | Threaded
Open this post in threaded view
|

Re: [CODE-STYLE] Builder pattern

Timo Walther-2
Hi all,

great to put this code style discussion on the mailing list because I
also have found this style inconsistent in the past.

Regarding Gyula's suggestions:
1. a static method `builder()` I think IDEs are also hightlight methods
with this name
2. I would vote for a more declarative `propertyA(...).propertyB(...)`
approach instead of setters because most methods should be setters.
However, if implementers want to add methods such as `addField(..)`,
`useProcessingTime()` this sounds also fine to me.
3. mutable
Regarding Dawid's suggestions:
4. regarding required and optional parameters, I would allow both options
5. always end with `build()`

Thanks,
Timo

On 27.08.19 10:04, Kostas Kloudas wrote:

> Hi all,
>
> I agree with Arvid, although for point 2 I would be less strict.
>
> @Piotr, for the side note you mentioned, and from the description you
> mention in the mail for example I,
> it seems that the need to pass parameters in the build() is not an
> inherent need of the build pattern but it
> can be mitigated by just creating sth like a StreamOperatorConfig (and
> not the operator builder itself) on the
> client, serialize it, and then at the TM, use the actual
> StreamOperator builder with that config to create the
> operator. There you can have all the needed parameters and also
> perform the validation that
> Dawid mention.
>
> Again, this is just from the summary you provided, not from looking at
> the code, so I may be missing something.
>
> On the validation, I think that it should happen at the build(), as
> this is the only place where
> we know all the parameters.
>
> Cheers,
> Kostas
>
> On Tue, Aug 27, 2019 at 9:47 AM Arvid Heise <[hidden email]> wrote:
>> Hi all,
>>
>> I'd like to differentiate between API level builder usage and "internal"
>> builder usage (for example, test harness).
>>
>> For API level builder, in general everything goes, as long as it aligns
>> with user expectations. API level usages are also much more discussed, such
>> that I'd expect them to be consistent within one API. This freedom is
>> especially required when considering APIs for non-java languages.
>>
>> Now for "internal" usages (which may or may not align with Java Datastream
>> etc. usage). I'd like to get a style that is well supported by the
>> primarily used IDEs. I don't want to write a new builder from scratch and
>> by the looks of it, we will get many more builders.
>>
>> Furthermore, I'd like to emphasize that the primary use case for using
>> builders for me is to mitigate the lack of named arguments in Java, which
>> is especially painful for immutable types. In an ideal world, I'd like to
>> have all classes immutable and use builders to create new instances if we
>> have
>> a) too many parameters to be passed (which should become many more once we
>> commit even more to DI),
>> b) we have meaningful default values, such that we can omit a significant
>> amount parameters by using a buidler, or
>> c) we have a good amount of optional (=nullable) parameters.
>> Obviously, we deviate from that whenever performance considerations demand
>> it.
>>
>> With that my votes for the questions:
>> 1. static method only, hide builder ctor
>> 2. Intellij and avro use setX() for property X, so I'd go with that. Lombok
>> just uses X(), so I wouldn't mind it.
>> 3. Mutable approach. Immutable doesn't make much sense to me. Then I can
>> directly go with a Wither pattern on the immutable class without builder.
>> 4. Private ctor. If it has a builder, it should be used. Exception:
>> migration support for some intermediate versions (if we added a builder to
>> a class, keep the ctor deprecated public for 1,2 versions).
>> 5. no setX in general, we want to have an immutable class. Exceptions where
>> due (should be mostly for performance reasons).
>>
>> Best,
>>
>> Arvid
>>
>> On Mon, Aug 26, 2019 at 4:40 PM Piotr Nowojski <[hidden email]> wrote:
>>
>>> Hi,
>>>
>>> I agree with Dawid, modulo that I don’t have any preference about point 2
>>> - I’m ok even with not enforcing this.
>>>
>>> One side note about point 4. There are use cases where passing obligatory
>>> parameters in the build method itself might make sense:
>>>
>>> I. - when those parameters can not be or can not be easily passed via the
>>> constructor. Good example of that is “builder” pattern for the
>>> StreamOperators (StreamOperatorFactory), where factory is constructed on
>>> the API level in the client, then it’s being serialised and sent over the
>>> network and reconstructed on the TaskManager, where StreamOperator is
>>> finally constructed. The issue is that some of the obligatory parameters
>>> are only available on the TaskManager, so they can not be passed on a
>>> DataStream level in the client.
>>> II. - when builder might be used to create multiple instances of the
>>> object with different values.
>>>
>>> Piotrek
>>>
>>>> On 26 Aug 2019, at 15:12, Jark Wu <[hidden email]> wrote:
>>>>
>>>> Hi Gyula,
>>>>
>>>> Thanks for bringing this. I think it would be nice if we have a common
>>>> approach to create builder pattern.
>>>> Currently, we have a lot of builders but with different tastes.
>>>>
>>>>> 1. Creating the builder objects:
>>>> I prefer option a) too. It would be easier for users to get the builder
>>>> instance.
>>>>
>>>>> 2. Setting properties on the builder:
>>>> I don't have a preference for it. But I think there is another option
>>> might
>>>> be more concise, i.e. "something()" without `with` or `set` prefix.
>>>> For example:
>>>>
>>>> CsvTableSource source = new CsvTableSource.builder()
>>>>     .path("/path/to/your/file.csv")
>>>>     .field("myfield", Types.STRING)
>>>>     .field("myfield2", Types.INT)
>>>>     .build();
>>>>
>>>> This pattern is heavily used in flink-table, e.g. `TableSchema`,
>>>> `TypeInference`, `BuiltInFunctionDefinition`.
>>>>
>>>>> 3. Implementing the builder object:
>>>> I prefer  b) Mutable approach which is simpler from the implementation
>>> part.
>>>>
>>>> Besides that, I think maybe we can add some other aspects:
>>>>
>>>> 4. Constructor of the main class.
>>>> a) private constructor
>>>> b) public constructor
>>>>
>>>> 5. setXXX methods of the main class
>>>> a) setXXX methods are not allowed
>>>> b) setXXX methods are allowed.
>>>>
>>>> I prefer both option a). Because I think one of the reason to have the
>>>> builder is that we don't want the constructor public.
>>>> A public constructor makes it hard to maintain and evolve compatibly when
>>>> adding new parameters, FlinkKafkaProducer is a good example.
>>>> For set methods, I think in most cases, we want users to set the fields
>>>> eagerly (through the builder) and `setXXX` methods on the main class
>>>> is duplicate with the methods on the builder. We should avoid that.
>>>>
>>>>
>>>> Regards,
>>>> Jark
>>>>
>>>>
>>>> On Mon, 26 Aug 2019 at 20:18, Gyula Fóra <[hidden email]> wrote:
>>>>
>>>>> Hi All!
>>>>>
>>>>> I would like to start a code-style related discussion regarding how we
>>>>> implement the builder pattern in the Flink project.
>>>>>
>>>>> It would be the best to have a common approach, there are some aspects
>>> of
>>>>> the pattern that come to my mind please feel free to add anything I
>>> missed:
>>>>> 1. Creating the builder objects:
>>>>>
>>>>> a) Always create using static method in "built" class:
>>>>>            Here we should have naming guidelines: .builder(..) or
>>>>> .xyzBuilder(...)
>>>>> b) Always use builder class constructor
>>>>> c) Mix: Again we should have some guidelines when to use which
>>>>>
>>>>> I personally prefer option a) to always have a static method to create
>>> the
>>>>> builder with static method names that end in builder.
>>>>>
>>>>> 2. Setting properties on the builder:
>>>>>
>>>>> a) withSomething(...)
>>>>> b) setSomething(...)
>>>>> c) other
>>>>>
>>>>> I don't really have a preference but either a or b for consistency.
>>>>>
>>>>>
>>>>> 3. Implementing the builder object:
>>>>>
>>>>> a) Immutable -> Creates a new builder object after setting a property
>>>>> b) Mutable -> Returns (this) after setting the property
>>>>>
>>>>> I personally prefer the mutable version as it keeps the builder
>>>>> implementation much simpler and it seems to be a very common way of
>>> doing
>>>>> it.
>>>>>
>>>>> What do you all think?
>>>>>
>>>>> Regards,
>>>>> Gyula
>>>>>
>>>
>> --
>>
>> Arvid Heise | Senior Software Engineer
>>
>> <https://www.ververica.com/>
>>
>> Follow us @VervericaData
>>
>> --
>>
>> Join Flink Forward <https://flink-forward.org/> - The Apache Flink
>> Conference
>>
>> Stream Processing | Event Driven | Real Time
>>
>> --
>>
>> Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany
>>
>> --
>> Ververica GmbH
>> Registered at Amtsgericht Charlottenburg: HRB 158244 B
>> Managing Directors: Dr. Kostas Tzoumas, Dr. Stephan Ewen


Reply | Threaded
Open this post in threaded view
|

Re: [CODE-STYLE] Builder pattern

Till Rohrmann
Hi all,

I would be in favour of the following convention

1. a) static method for builder creation
2. No strict rule because especially for boolean flags it might make sense
to have something like `enableX()` or `withY()` where one doesn't specify a
concrete value.
3. Mutable builders but if there is a good reason to follow the immutable
approach then I wouldn't forbid it.
4. Private constructor if there is a builder and no backwards compatibility
constraint
5. No setX methods on the instance to create

For Piotr's comment 4. I. I agree with Klou that this sounds rather like a
problem of the builder's usage than a builder problem. I think such a
scenario can easily be solved by introducing a transfer object.

Cheers,
Till

On Tue, Aug 27, 2019 at 1:46 PM Timo Walther <[hidden email]> wrote:

> Hi all,
>
> great to put this code style discussion on the mailing list because I
> also have found this style inconsistent in the past.
>
> Regarding Gyula's suggestions:
> 1. a static method `builder()` I think IDEs are also hightlight methods
> with this name
> 2. I would vote for a more declarative `propertyA(...).propertyB(...)`
> approach instead of setters because most methods should be setters.
> However, if implementers want to add methods such as `addField(..)`,
> `useProcessingTime()` this sounds also fine to me.
> 3. mutable
> Regarding Dawid's suggestions:
> 4. regarding required and optional parameters, I would allow both options
> 5. always end with `build()`
>
> Thanks,
> Timo
>
> On 27.08.19 10:04, Kostas Kloudas wrote:
> > Hi all,
> >
> > I agree with Arvid, although for point 2 I would be less strict.
> >
> > @Piotr, for the side note you mentioned, and from the description you
> > mention in the mail for example I,
> > it seems that the need to pass parameters in the build() is not an
> > inherent need of the build pattern but it
> > can be mitigated by just creating sth like a StreamOperatorConfig (and
> > not the operator builder itself) on the
> > client, serialize it, and then at the TM, use the actual
> > StreamOperator builder with that config to create the
> > operator. There you can have all the needed parameters and also
> > perform the validation that
> > Dawid mention.
> >
> > Again, this is just from the summary you provided, not from looking at
> > the code, so I may be missing something.
> >
> > On the validation, I think that it should happen at the build(), as
> > this is the only place where
> > we know all the parameters.
> >
> > Cheers,
> > Kostas
> >
> > On Tue, Aug 27, 2019 at 9:47 AM Arvid Heise <[hidden email]>
> wrote:
> >> Hi all,
> >>
> >> I'd like to differentiate between API level builder usage and "internal"
> >> builder usage (for example, test harness).
> >>
> >> For API level builder, in general everything goes, as long as it aligns
> >> with user expectations. API level usages are also much more discussed,
> such
> >> that I'd expect them to be consistent within one API. This freedom is
> >> especially required when considering APIs for non-java languages.
> >>
> >> Now for "internal" usages (which may or may not align with Java
> Datastream
> >> etc. usage). I'd like to get a style that is well supported by the
> >> primarily used IDEs. I don't want to write a new builder from scratch
> and
> >> by the looks of it, we will get many more builders.
> >>
> >> Furthermore, I'd like to emphasize that the primary use case for using
> >> builders for me is to mitigate the lack of named arguments in Java,
> which
> >> is especially painful for immutable types. In an ideal world, I'd like
> to
> >> have all classes immutable and use builders to create new instances if
> we
> >> have
> >> a) too many parameters to be passed (which should become many more once
> we
> >> commit even more to DI),
> >> b) we have meaningful default values, such that we can omit a
> significant
> >> amount parameters by using a buidler, or
> >> c) we have a good amount of optional (=nullable) parameters.
> >> Obviously, we deviate from that whenever performance considerations
> demand
> >> it.
> >>
> >> With that my votes for the questions:
> >> 1. static method only, hide builder ctor
> >> 2. Intellij and avro use setX() for property X, so I'd go with that.
> Lombok
> >> just uses X(), so I wouldn't mind it.
> >> 3. Mutable approach. Immutable doesn't make much sense to me. Then I can
> >> directly go with a Wither pattern on the immutable class without
> builder.
> >> 4. Private ctor. If it has a builder, it should be used. Exception:
> >> migration support for some intermediate versions (if we added a builder
> to
> >> a class, keep the ctor deprecated public for 1,2 versions).
> >> 5. no setX in general, we want to have an immutable class. Exceptions
> where
> >> due (should be mostly for performance reasons).
> >>
> >> Best,
> >>
> >> Arvid
> >>
> >> On Mon, Aug 26, 2019 at 4:40 PM Piotr Nowojski <[hidden email]>
> wrote:
> >>
> >>> Hi,
> >>>
> >>> I agree with Dawid, modulo that I don’t have any preference about
> point 2
> >>> - I’m ok even with not enforcing this.
> >>>
> >>> One side note about point 4. There are use cases where passing
> obligatory
> >>> parameters in the build method itself might make sense:
> >>>
> >>> I. - when those parameters can not be or can not be easily passed via
> the
> >>> constructor. Good example of that is “builder” pattern for the
> >>> StreamOperators (StreamOperatorFactory), where factory is constructed
> on
> >>> the API level in the client, then it’s being serialised and sent over
> the
> >>> network and reconstructed on the TaskManager, where StreamOperator is
> >>> finally constructed. The issue is that some of the obligatory
> parameters
> >>> are only available on the TaskManager, so they can not be passed on a
> >>> DataStream level in the client.
> >>> II. - when builder might be used to create multiple instances of the
> >>> object with different values.
> >>>
> >>> Piotrek
> >>>
> >>>> On 26 Aug 2019, at 15:12, Jark Wu <[hidden email]> wrote:
> >>>>
> >>>> Hi Gyula,
> >>>>
> >>>> Thanks for bringing this. I think it would be nice if we have a common
> >>>> approach to create builder pattern.
> >>>> Currently, we have a lot of builders but with different tastes.
> >>>>
> >>>>> 1. Creating the builder objects:
> >>>> I prefer option a) too. It would be easier for users to get the
> builder
> >>>> instance.
> >>>>
> >>>>> 2. Setting properties on the builder:
> >>>> I don't have a preference for it. But I think there is another option
> >>> might
> >>>> be more concise, i.e. "something()" without `with` or `set` prefix.
> >>>> For example:
> >>>>
> >>>> CsvTableSource source = new CsvTableSource.builder()
> >>>>     .path("/path/to/your/file.csv")
> >>>>     .field("myfield", Types.STRING)
> >>>>     .field("myfield2", Types.INT)
> >>>>     .build();
> >>>>
> >>>> This pattern is heavily used in flink-table, e.g. `TableSchema`,
> >>>> `TypeInference`, `BuiltInFunctionDefinition`.
> >>>>
> >>>>> 3. Implementing the builder object:
> >>>> I prefer  b) Mutable approach which is simpler from the implementation
> >>> part.
> >>>>
> >>>> Besides that, I think maybe we can add some other aspects:
> >>>>
> >>>> 4. Constructor of the main class.
> >>>> a) private constructor
> >>>> b) public constructor
> >>>>
> >>>> 5. setXXX methods of the main class
> >>>> a) setXXX methods are not allowed
> >>>> b) setXXX methods are allowed.
> >>>>
> >>>> I prefer both option a). Because I think one of the reason to have the
> >>>> builder is that we don't want the constructor public.
> >>>> A public constructor makes it hard to maintain and evolve compatibly
> when
> >>>> adding new parameters, FlinkKafkaProducer is a good example.
> >>>> For set methods, I think in most cases, we want users to set the
> fields
> >>>> eagerly (through the builder) and `setXXX` methods on the main class
> >>>> is duplicate with the methods on the builder. We should avoid that.
> >>>>
> >>>>
> >>>> Regards,
> >>>> Jark
> >>>>
> >>>>
> >>>> On Mon, 26 Aug 2019 at 20:18, Gyula Fóra <[hidden email]>
> wrote:
> >>>>
> >>>>> Hi All!
> >>>>>
> >>>>> I would like to start a code-style related discussion regarding how
> we
> >>>>> implement the builder pattern in the Flink project.
> >>>>>
> >>>>> It would be the best to have a common approach, there are some
> aspects
> >>> of
> >>>>> the pattern that come to my mind please feel free to add anything I
> >>> missed:
> >>>>> 1. Creating the builder objects:
> >>>>>
> >>>>> a) Always create using static method in "built" class:
> >>>>>            Here we should have naming guidelines: .builder(..) or
> >>>>> .xyzBuilder(...)
> >>>>> b) Always use builder class constructor
> >>>>> c) Mix: Again we should have some guidelines when to use which
> >>>>>
> >>>>> I personally prefer option a) to always have a static method to
> create
> >>> the
> >>>>> builder with static method names that end in builder.
> >>>>>
> >>>>> 2. Setting properties on the builder:
> >>>>>
> >>>>> a) withSomething(...)
> >>>>> b) setSomething(...)
> >>>>> c) other
> >>>>>
> >>>>> I don't really have a preference but either a or b for consistency.
> >>>>>
> >>>>>
> >>>>> 3. Implementing the builder object:
> >>>>>
> >>>>> a) Immutable -> Creates a new builder object after setting a property
> >>>>> b) Mutable -> Returns (this) after setting the property
> >>>>>
> >>>>> I personally prefer the mutable version as it keeps the builder
> >>>>> implementation much simpler and it seems to be a very common way of
> >>> doing
> >>>>> it.
> >>>>>
> >>>>> What do you all think?
> >>>>>
> >>>>> Regards,
> >>>>> Gyula
> >>>>>
> >>>
> >> --
> >>
> >> Arvid Heise | Senior Software Engineer
> >>
> >> <https://www.ververica.com/>
> >>
> >> Follow us @VervericaData
> >>
> >> --
> >>
> >> Join Flink Forward <https://flink-forward.org/> - The Apache Flink
> >> Conference
> >>
> >> Stream Processing | Event Driven | Real Time
> >>
> >> --
> >>
> >> Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany
> >>
> >> --
> >> Ververica GmbH
> >> Registered at Amtsgericht Charlottenburg: HRB 158244 B
> >> Managing Directors: Dr. Kostas Tzoumas, Dr. Stephan Ewen
>
>
>
Reply | Threaded
Open this post in threaded view
|

Re: [CODE-STYLE] Builder pattern

Piotr Nowojski-3
> For Piotr's comment 4. I. I agree with Klou that this sounds rather like a
> problem of the builder's usage than a builder problem. I think such a
> scenario can easily be solved by introducing a transfer object.

It could be solved, but that would mean we have some kind of builder/factory/descriptor that creates a factory/builder that creates a final object. I think in this specific example I would prefer to add missing runtime parameters in the build/create() method instead of asking people implementing an operator to wrap it into two layers of indirection.

However let’s not clutter this discussion, I might be missing something. Feel free to open a Jira tickets/start a new design discussion about this pattern (it’s currently partially implemented in the form of `StreamOperatorFactory` [1]).

Piotrek

[1] https://issues.apache.org/jira/browse/FLINK-11974 <https://issues.apache.org/jira/browse/FLINK-11974>

> On 27 Aug 2019, at 14:43, Till Rohrmann <[hidden email]> wrote:
>
> Hi all,
>
> I would be in favour of the following convention
>
> 1. a) static method for builder creation
> 2. No strict rule because especially for boolean flags it might make sense
> to have something like `enableX()` or `withY()` where one doesn't specify a
> concrete value.
> 3. Mutable builders but if there is a good reason to follow the immutable
> approach then I wouldn't forbid it.
> 4. Private constructor if there is a builder and no backwards compatibility
> constraint
> 5. No setX methods on the instance to create
>
> For Piotr's comment 4. I. I agree with Klou that this sounds rather like a
> problem of the builder's usage than a builder problem. I think such a
> scenario can easily be solved by introducing a transfer object.
>
> Cheers,
> Till
>
> On Tue, Aug 27, 2019 at 1:46 PM Timo Walther <[hidden email]> wrote:
>
>> Hi all,
>>
>> great to put this code style discussion on the mailing list because I
>> also have found this style inconsistent in the past.
>>
>> Regarding Gyula's suggestions:
>> 1. a static method `builder()` I think IDEs are also hightlight methods
>> with this name
>> 2. I would vote for a more declarative `propertyA(...).propertyB(...)`
>> approach instead of setters because most methods should be setters.
>> However, if implementers want to add methods such as `addField(..)`,
>> `useProcessingTime()` this sounds also fine to me.
>> 3. mutable
>> Regarding Dawid's suggestions:
>> 4. regarding required and optional parameters, I would allow both options
>> 5. always end with `build()`
>>
>> Thanks,
>> Timo
>>
>> On 27.08.19 10:04, Kostas Kloudas wrote:
>>> Hi all,
>>>
>>> I agree with Arvid, although for point 2 I would be less strict.
>>>
>>> @Piotr, for the side note you mentioned, and from the description you
>>> mention in the mail for example I,
>>> it seems that the need to pass parameters in the build() is not an
>>> inherent need of the build pattern but it
>>> can be mitigated by just creating sth like a StreamOperatorConfig (and
>>> not the operator builder itself) on the
>>> client, serialize it, and then at the TM, use the actual
>>> StreamOperator builder with that config to create the
>>> operator. There you can have all the needed parameters and also
>>> perform the validation that
>>> Dawid mention.
>>>
>>> Again, this is just from the summary you provided, not from looking at
>>> the code, so I may be missing something.
>>>
>>> On the validation, I think that it should happen at the build(), as
>>> this is the only place where
>>> we know all the parameters.
>>>
>>> Cheers,
>>> Kostas
>>>
>>> On Tue, Aug 27, 2019 at 9:47 AM Arvid Heise <[hidden email]>
>> wrote:
>>>> Hi all,
>>>>
>>>> I'd like to differentiate between API level builder usage and "internal"
>>>> builder usage (for example, test harness).
>>>>
>>>> For API level builder, in general everything goes, as long as it aligns
>>>> with user expectations. API level usages are also much more discussed,
>> such
>>>> that I'd expect them to be consistent within one API. This freedom is
>>>> especially required when considering APIs for non-java languages.
>>>>
>>>> Now for "internal" usages (which may or may not align with Java
>> Datastream
>>>> etc. usage). I'd like to get a style that is well supported by the
>>>> primarily used IDEs. I don't want to write a new builder from scratch
>> and
>>>> by the looks of it, we will get many more builders.
>>>>
>>>> Furthermore, I'd like to emphasize that the primary use case for using
>>>> builders for me is to mitigate the lack of named arguments in Java,
>> which
>>>> is especially painful for immutable types. In an ideal world, I'd like
>> to
>>>> have all classes immutable and use builders to create new instances if
>> we
>>>> have
>>>> a) too many parameters to be passed (which should become many more once
>> we
>>>> commit even more to DI),
>>>> b) we have meaningful default values, such that we can omit a
>> significant
>>>> amount parameters by using a buidler, or
>>>> c) we have a good amount of optional (=nullable) parameters.
>>>> Obviously, we deviate from that whenever performance considerations
>> demand
>>>> it.
>>>>
>>>> With that my votes for the questions:
>>>> 1. static method only, hide builder ctor
>>>> 2. Intellij and avro use setX() for property X, so I'd go with that.
>> Lombok
>>>> just uses X(), so I wouldn't mind it.
>>>> 3. Mutable approach. Immutable doesn't make much sense to me. Then I can
>>>> directly go with a Wither pattern on the immutable class without
>> builder.
>>>> 4. Private ctor. If it has a builder, it should be used. Exception:
>>>> migration support for some intermediate versions (if we added a builder
>> to
>>>> a class, keep the ctor deprecated public for 1,2 versions).
>>>> 5. no setX in general, we want to have an immutable class. Exceptions
>> where
>>>> due (should be mostly for performance reasons).
>>>>
>>>> Best,
>>>>
>>>> Arvid
>>>>
>>>> On Mon, Aug 26, 2019 at 4:40 PM Piotr Nowojski <[hidden email]>
>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I agree with Dawid, modulo that I don’t have any preference about
>> point 2
>>>>> - I’m ok even with not enforcing this.
>>>>>
>>>>> One side note about point 4. There are use cases where passing
>> obligatory
>>>>> parameters in the build method itself might make sense:
>>>>>
>>>>> I. - when those parameters can not be or can not be easily passed via
>> the
>>>>> constructor. Good example of that is “builder” pattern for the
>>>>> StreamOperators (StreamOperatorFactory), where factory is constructed
>> on
>>>>> the API level in the client, then it’s being serialised and sent over
>> the
>>>>> network and reconstructed on the TaskManager, where StreamOperator is
>>>>> finally constructed. The issue is that some of the obligatory
>> parameters
>>>>> are only available on the TaskManager, so they can not be passed on a
>>>>> DataStream level in the client.
>>>>> II. - when builder might be used to create multiple instances of the
>>>>> object with different values.
>>>>>
>>>>> Piotrek
>>>>>
>>>>>> On 26 Aug 2019, at 15:12, Jark Wu <[hidden email]> wrote:
>>>>>>
>>>>>> Hi Gyula,
>>>>>>
>>>>>> Thanks for bringing this. I think it would be nice if we have a common
>>>>>> approach to create builder pattern.
>>>>>> Currently, we have a lot of builders but with different tastes.
>>>>>>
>>>>>>> 1. Creating the builder objects:
>>>>>> I prefer option a) too. It would be easier for users to get the
>> builder
>>>>>> instance.
>>>>>>
>>>>>>> 2. Setting properties on the builder:
>>>>>> I don't have a preference for it. But I think there is another option
>>>>> might
>>>>>> be more concise, i.e. "something()" without `with` or `set` prefix.
>>>>>> For example:
>>>>>>
>>>>>> CsvTableSource source = new CsvTableSource.builder()
>>>>>>    .path("/path/to/your/file.csv")
>>>>>>    .field("myfield", Types.STRING)
>>>>>>    .field("myfield2", Types.INT)
>>>>>>    .build();
>>>>>>
>>>>>> This pattern is heavily used in flink-table, e.g. `TableSchema`,
>>>>>> `TypeInference`, `BuiltInFunctionDefinition`.
>>>>>>
>>>>>>> 3. Implementing the builder object:
>>>>>> I prefer  b) Mutable approach which is simpler from the implementation
>>>>> part.
>>>>>>
>>>>>> Besides that, I think maybe we can add some other aspects:
>>>>>>
>>>>>> 4. Constructor of the main class.
>>>>>> a) private constructor
>>>>>> b) public constructor
>>>>>>
>>>>>> 5. setXXX methods of the main class
>>>>>> a) setXXX methods are not allowed
>>>>>> b) setXXX methods are allowed.
>>>>>>
>>>>>> I prefer both option a). Because I think one of the reason to have the
>>>>>> builder is that we don't want the constructor public.
>>>>>> A public constructor makes it hard to maintain and evolve compatibly
>> when
>>>>>> adding new parameters, FlinkKafkaProducer is a good example.
>>>>>> For set methods, I think in most cases, we want users to set the
>> fields
>>>>>> eagerly (through the builder) and `setXXX` methods on the main class
>>>>>> is duplicate with the methods on the builder. We should avoid that.
>>>>>>
>>>>>>
>>>>>> Regards,
>>>>>> Jark
>>>>>>
>>>>>>
>>>>>> On Mon, 26 Aug 2019 at 20:18, Gyula Fóra <[hidden email]>
>> wrote:
>>>>>>
>>>>>>> Hi All!
>>>>>>>
>>>>>>> I would like to start a code-style related discussion regarding how
>> we
>>>>>>> implement the builder pattern in the Flink project.
>>>>>>>
>>>>>>> It would be the best to have a common approach, there are some
>> aspects
>>>>> of
>>>>>>> the pattern that come to my mind please feel free to add anything I
>>>>> missed:
>>>>>>> 1. Creating the builder objects:
>>>>>>>
>>>>>>> a) Always create using static method in "built" class:
>>>>>>>           Here we should have naming guidelines: .builder(..) or
>>>>>>> .xyzBuilder(...)
>>>>>>> b) Always use builder class constructor
>>>>>>> c) Mix: Again we should have some guidelines when to use which
>>>>>>>
>>>>>>> I personally prefer option a) to always have a static method to
>> create
>>>>> the
>>>>>>> builder with static method names that end in builder.
>>>>>>>
>>>>>>> 2. Setting properties on the builder:
>>>>>>>
>>>>>>> a) withSomething(...)
>>>>>>> b) setSomething(...)
>>>>>>> c) other
>>>>>>>
>>>>>>> I don't really have a preference but either a or b for consistency.
>>>>>>>
>>>>>>>
>>>>>>> 3. Implementing the builder object:
>>>>>>>
>>>>>>> a) Immutable -> Creates a new builder object after setting a property
>>>>>>> b) Mutable -> Returns (this) after setting the property
>>>>>>>
>>>>>>> I personally prefer the mutable version as it keeps the builder
>>>>>>> implementation much simpler and it seems to be a very common way of
>>>>> doing
>>>>>>> it.
>>>>>>>
>>>>>>> What do you all think?
>>>>>>>
>>>>>>> Regards,
>>>>>>> Gyula
>>>>>>>
>>>>>
>>>> --
>>>>
>>>> Arvid Heise | Senior Software Engineer
>>>>
>>>> <https://www.ververica.com/>
>>>>
>>>> Follow us @VervericaData
>>>>
>>>> --
>>>>
>>>> Join Flink Forward <https://flink-forward.org/> - The Apache Flink
>>>> Conference
>>>>
>>>> Stream Processing | Event Driven | Real Time
>>>>
>>>> --
>>>>
>>>> Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany
>>>>
>>>> --
>>>> Ververica GmbH
>>>> Registered at Amtsgericht Charlottenburg: HRB 158244 B
>>>> Managing Directors: Dr. Kostas Tzoumas, Dr. Stephan Ewen
>>
>>
>>

Reply | Threaded
Open this post in threaded view
|

Re: [CODE-STYLE] Builder pattern

Gyula Fóra
Hi all,

Thank you all for the valuable feedback, let me try to summarize the rough
agreement:

If there is a builder for class A, then A should:
 -  Have only private ctor -> force usage of the builder
 - Be immutable (no setters)
 - Have a public static .builder(required params if not too many) method
that creates a new builder instance.

The builder:
 - Should follow a mutable pattern (return this after setting property) to
simplify code
 - Argument setter methods should follow some reasonable convention (setX,
withX, x...) whatever makes most sense in the context
 - There should be a single .build() method that constructs the instance of
the target class

These guidelines mostly applicable for internal builders. User API builders
should also follow this in most cases but we should allow deviations when
it makes code more readable or the API more user friendly.

If you have more comments don't hesitate to add them here, if everyone
agrees I will open a PR in the next few days to add this to the code-style
guide!

Cheers,
Gyula

On Wed, Aug 28, 2019 at 1:37 PM Piotr Nowojski <[hidden email]> wrote:

> > For Piotr's comment 4. I. I agree with Klou that this sounds rather like
> a
> > problem of the builder's usage than a builder problem. I think such a
> > scenario can easily be solved by introducing a transfer object.
>
> It could be solved, but that would mean we have some kind of
> builder/factory/descriptor that creates a factory/builder that creates a
> final object. I think in this specific example I would prefer to add
> missing runtime parameters in the build/create() method instead of asking
> people implementing an operator to wrap it into two layers of indirection.
>
> However let’s not clutter this discussion, I might be missing something.
> Feel free to open a Jira tickets/start a new design discussion about this
> pattern (it’s currently partially implemented in the form of
> `StreamOperatorFactory` [1]).
>
> Piotrek
>
> [1] https://issues.apache.org/jira/browse/FLINK-11974 <
> https://issues.apache.org/jira/browse/FLINK-11974>
>
> > On 27 Aug 2019, at 14:43, Till Rohrmann <[hidden email]> wrote:
> >
> > Hi all,
> >
> > I would be in favour of the following convention
> >
> > 1. a) static method for builder creation
> > 2. No strict rule because especially for boolean flags it might make
> sense
> > to have something like `enableX()` or `withY()` where one doesn't
> specify a
> > concrete value.
> > 3. Mutable builders but if there is a good reason to follow the immutable
> > approach then I wouldn't forbid it.
> > 4. Private constructor if there is a builder and no backwards
> compatibility
> > constraint
> > 5. No setX methods on the instance to create
> >
> > For Piotr's comment 4. I. I agree with Klou that this sounds rather like
> a
> > problem of the builder's usage than a builder problem. I think such a
> > scenario can easily be solved by introducing a transfer object.
> >
> > Cheers,
> > Till
> >
> > On Tue, Aug 27, 2019 at 1:46 PM Timo Walther <[hidden email]> wrote:
> >
> >> Hi all,
> >>
> >> great to put this code style discussion on the mailing list because I
> >> also have found this style inconsistent in the past.
> >>
> >> Regarding Gyula's suggestions:
> >> 1. a static method `builder()` I think IDEs are also hightlight methods
> >> with this name
> >> 2. I would vote for a more declarative `propertyA(...).propertyB(...)`
> >> approach instead of setters because most methods should be setters.
> >> However, if implementers want to add methods such as `addField(..)`,
> >> `useProcessingTime()` this sounds also fine to me.
> >> 3. mutable
> >> Regarding Dawid's suggestions:
> >> 4. regarding required and optional parameters, I would allow both
> options
> >> 5. always end with `build()`
> >>
> >> Thanks,
> >> Timo
> >>
> >> On 27.08.19 10:04, Kostas Kloudas wrote:
> >>> Hi all,
> >>>
> >>> I agree with Arvid, although for point 2 I would be less strict.
> >>>
> >>> @Piotr, for the side note you mentioned, and from the description you
> >>> mention in the mail for example I,
> >>> it seems that the need to pass parameters in the build() is not an
> >>> inherent need of the build pattern but it
> >>> can be mitigated by just creating sth like a StreamOperatorConfig (and
> >>> not the operator builder itself) on the
> >>> client, serialize it, and then at the TM, use the actual
> >>> StreamOperator builder with that config to create the
> >>> operator. There you can have all the needed parameters and also
> >>> perform the validation that
> >>> Dawid mention.
> >>>
> >>> Again, this is just from the summary you provided, not from looking at
> >>> the code, so I may be missing something.
> >>>
> >>> On the validation, I think that it should happen at the build(), as
> >>> this is the only place where
> >>> we know all the parameters.
> >>>
> >>> Cheers,
> >>> Kostas
> >>>
> >>> On Tue, Aug 27, 2019 at 9:47 AM Arvid Heise <[hidden email]>
> >> wrote:
> >>>> Hi all,
> >>>>
> >>>> I'd like to differentiate between API level builder usage and
> "internal"
> >>>> builder usage (for example, test harness).
> >>>>
> >>>> For API level builder, in general everything goes, as long as it
> aligns
> >>>> with user expectations. API level usages are also much more discussed,
> >> such
> >>>> that I'd expect them to be consistent within one API. This freedom is
> >>>> especially required when considering APIs for non-java languages.
> >>>>
> >>>> Now for "internal" usages (which may or may not align with Java
> >> Datastream
> >>>> etc. usage). I'd like to get a style that is well supported by the
> >>>> primarily used IDEs. I don't want to write a new builder from scratch
> >> and
> >>>> by the looks of it, we will get many more builders.
> >>>>
> >>>> Furthermore, I'd like to emphasize that the primary use case for using
> >>>> builders for me is to mitigate the lack of named arguments in Java,
> >> which
> >>>> is especially painful for immutable types. In an ideal world, I'd like
> >> to
> >>>> have all classes immutable and use builders to create new instances if
> >> we
> >>>> have
> >>>> a) too many parameters to be passed (which should become many more
> once
> >> we
> >>>> commit even more to DI),
> >>>> b) we have meaningful default values, such that we can omit a
> >> significant
> >>>> amount parameters by using a buidler, or
> >>>> c) we have a good amount of optional (=nullable) parameters.
> >>>> Obviously, we deviate from that whenever performance considerations
> >> demand
> >>>> it.
> >>>>
> >>>> With that my votes for the questions:
> >>>> 1. static method only, hide builder ctor
> >>>> 2. Intellij and avro use setX() for property X, so I'd go with that.
> >> Lombok
> >>>> just uses X(), so I wouldn't mind it.
> >>>> 3. Mutable approach. Immutable doesn't make much sense to me. Then I
> can
> >>>> directly go with a Wither pattern on the immutable class without
> >> builder.
> >>>> 4. Private ctor. If it has a builder, it should be used. Exception:
> >>>> migration support for some intermediate versions (if we added a
> builder
> >> to
> >>>> a class, keep the ctor deprecated public for 1,2 versions).
> >>>> 5. no setX in general, we want to have an immutable class. Exceptions
> >> where
> >>>> due (should be mostly for performance reasons).
> >>>>
> >>>> Best,
> >>>>
> >>>> Arvid
> >>>>
> >>>> On Mon, Aug 26, 2019 at 4:40 PM Piotr Nowojski <[hidden email]>
> >> wrote:
> >>>>
> >>>>> Hi,
> >>>>>
> >>>>> I agree with Dawid, modulo that I don’t have any preference about
> >> point 2
> >>>>> - I’m ok even with not enforcing this.
> >>>>>
> >>>>> One side note about point 4. There are use cases where passing
> >> obligatory
> >>>>> parameters in the build method itself might make sense:
> >>>>>
> >>>>> I. - when those parameters can not be or can not be easily passed via
> >> the
> >>>>> constructor. Good example of that is “builder” pattern for the
> >>>>> StreamOperators (StreamOperatorFactory), where factory is constructed
> >> on
> >>>>> the API level in the client, then it’s being serialised and sent over
> >> the
> >>>>> network and reconstructed on the TaskManager, where StreamOperator is
> >>>>> finally constructed. The issue is that some of the obligatory
> >> parameters
> >>>>> are only available on the TaskManager, so they can not be passed on a
> >>>>> DataStream level in the client.
> >>>>> II. - when builder might be used to create multiple instances of the
> >>>>> object with different values.
> >>>>>
> >>>>> Piotrek
> >>>>>
> >>>>>> On 26 Aug 2019, at 15:12, Jark Wu <[hidden email]> wrote:
> >>>>>>
> >>>>>> Hi Gyula,
> >>>>>>
> >>>>>> Thanks for bringing this. I think it would be nice if we have a
> common
> >>>>>> approach to create builder pattern.
> >>>>>> Currently, we have a lot of builders but with different tastes.
> >>>>>>
> >>>>>>> 1. Creating the builder objects:
> >>>>>> I prefer option a) too. It would be easier for users to get the
> >> builder
> >>>>>> instance.
> >>>>>>
> >>>>>>> 2. Setting properties on the builder:
> >>>>>> I don't have a preference for it. But I think there is another
> option
> >>>>> might
> >>>>>> be more concise, i.e. "something()" without `with` or `set` prefix.
> >>>>>> For example:
> >>>>>>
> >>>>>> CsvTableSource source = new CsvTableSource.builder()
> >>>>>>    .path("/path/to/your/file.csv")
> >>>>>>    .field("myfield", Types.STRING)
> >>>>>>    .field("myfield2", Types.INT)
> >>>>>>    .build();
> >>>>>>
> >>>>>> This pattern is heavily used in flink-table, e.g. `TableSchema`,
> >>>>>> `TypeInference`, `BuiltInFunctionDefinition`.
> >>>>>>
> >>>>>>> 3. Implementing the builder object:
> >>>>>> I prefer  b) Mutable approach which is simpler from the
> implementation
> >>>>> part.
> >>>>>>
> >>>>>> Besides that, I think maybe we can add some other aspects:
> >>>>>>
> >>>>>> 4. Constructor of the main class.
> >>>>>> a) private constructor
> >>>>>> b) public constructor
> >>>>>>
> >>>>>> 5. setXXX methods of the main class
> >>>>>> a) setXXX methods are not allowed
> >>>>>> b) setXXX methods are allowed.
> >>>>>>
> >>>>>> I prefer both option a). Because I think one of the reason to have
> the
> >>>>>> builder is that we don't want the constructor public.
> >>>>>> A public constructor makes it hard to maintain and evolve compatibly
> >> when
> >>>>>> adding new parameters, FlinkKafkaProducer is a good example.
> >>>>>> For set methods, I think in most cases, we want users to set the
> >> fields
> >>>>>> eagerly (through the builder) and `setXXX` methods on the main class
> >>>>>> is duplicate with the methods on the builder. We should avoid that.
> >>>>>>
> >>>>>>
> >>>>>> Regards,
> >>>>>> Jark
> >>>>>>
> >>>>>>
> >>>>>> On Mon, 26 Aug 2019 at 20:18, Gyula Fóra <[hidden email]>
> >> wrote:
> >>>>>>
> >>>>>>> Hi All!
> >>>>>>>
> >>>>>>> I would like to start a code-style related discussion regarding how
> >> we
> >>>>>>> implement the builder pattern in the Flink project.
> >>>>>>>
> >>>>>>> It would be the best to have a common approach, there are some
> >> aspects
> >>>>> of
> >>>>>>> the pattern that come to my mind please feel free to add anything I
> >>>>> missed:
> >>>>>>> 1. Creating the builder objects:
> >>>>>>>
> >>>>>>> a) Always create using static method in "built" class:
> >>>>>>>           Here we should have naming guidelines: .builder(..) or
> >>>>>>> .xyzBuilder(...)
> >>>>>>> b) Always use builder class constructor
> >>>>>>> c) Mix: Again we should have some guidelines when to use which
> >>>>>>>
> >>>>>>> I personally prefer option a) to always have a static method to
> >> create
> >>>>> the
> >>>>>>> builder with static method names that end in builder.
> >>>>>>>
> >>>>>>> 2. Setting properties on the builder:
> >>>>>>>
> >>>>>>> a) withSomething(...)
> >>>>>>> b) setSomething(...)
> >>>>>>> c) other
> >>>>>>>
> >>>>>>> I don't really have a preference but either a or b for consistency.
> >>>>>>>
> >>>>>>>
> >>>>>>> 3. Implementing the builder object:
> >>>>>>>
> >>>>>>> a) Immutable -> Creates a new builder object after setting a
> property
> >>>>>>> b) Mutable -> Returns (this) after setting the property
> >>>>>>>
> >>>>>>> I personally prefer the mutable version as it keeps the builder
> >>>>>>> implementation much simpler and it seems to be a very common way of
> >>>>> doing
> >>>>>>> it.
> >>>>>>>
> >>>>>>> What do you all think?
> >>>>>>>
> >>>>>>> Regards,
> >>>>>>> Gyula
> >>>>>>>
> >>>>>
> >>>> --
> >>>>
> >>>> Arvid Heise | Senior Software Engineer
> >>>>
> >>>> <https://www.ververica.com/>
> >>>>
> >>>> Follow us @VervericaData
> >>>>
> >>>> --
> >>>>
> >>>> Join Flink Forward <https://flink-forward.org/> - The Apache Flink
> >>>> Conference
> >>>>
> >>>> Stream Processing | Event Driven | Real Time
> >>>>
> >>>> --
> >>>>
> >>>> Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany
> >>>>
> >>>> --
> >>>> Ververica GmbH
> >>>> Registered at Amtsgericht Charlottenburg: HRB 158244 B
> >>>> Managing Directors: Dr. Kostas Tzoumas, Dr. Stephan Ewen
> >>
> >>
> >>
>
>