[DISCUSS] FLIP-55: Introduction of a Table API Java Expression DSL

Timo Walther-2
Hi everyone,

some of you might remember the discussion I started at the end of March [1]
about introducing a new Java DSL for the Table API that is not embedded in a
string.

In particular, it solves the following issues:

- No possibility of deprecating functions

- Missing documentation for users

- Missing auto-completion for users

- Need to port the ExpressionParser from Scala to Java

- Scala symbols are deprecated! A Java DSL can also serve as the basis for
the Scala DSL.

Due to a shift in priorities, we could not work on it for Flink 1.9, but the
feedback at that time was positive and we should aim for 1.10 to
simplify the API with this change.

We propose the following FLIP-55:

https://docs.google.com/document/d/1CfaaD3j8APJDKwzIT4YsX7QD2huKTB4xlA3vnMUFJmA/edit?usp=sharing 
<https://docs.google.com/document/d/1CfaaD3j8APJDKwzIT4YsX7QD2huKTB4xlA3vnMUFJmA/edit#heading=h.jn04bfolpim0>

Thanks for any feedback,

Timo

[1]
https://lists.apache.org/thread.html/e6f31d7fa53890b91be0991c2da64556a91ef0fc9ab3ffa889dacc23@%3Cdev.flink.apache.org%3E

Re: [DISCUSS] FLIP-55: Introduction of a Table API Java Expression DSL

David Anderson-2
In general I'm in favor of anything that is going to make the Table
API easier to learn and more predictable in its behavior. This
proposal kind of falls in the middle. As someone who has spent hours
in the crevices between the various flavors of the current
implementations, I certainly view keeping the various APIs and DSLs
more in sync, and making them less buggy, as highly desirable.

On the other hand, some of the details in the proposal do make the
resulting user code less pretty and less approachable than the current
Java DSL. In a training context it will be easy to teach, but I wonder
if we can find a way to make it look less alien at first glance.

David

On Wed, Aug 21, 2019 at 1:33 PM Timo Walther <[hidden email]> wrote:

> [...]

Re: [DISCUSS] FLIP-55: Introduction of a Table API Java Expression DSL

Timo Walther-2
Hi David,

thanks for your feedback. With the current design, the DSL would be free
of any ambiguity, but it is definitely more verbose, especially around
defining values.

I would be happy about further suggestions that make the DSL more
readable. I'm also not sure whether we should go for `$()` and `v()`
instead of the more readable `ref()` and `val()`. This could maybe make it
look less "alien", what do you think?

Some people suggested overloading certain methods so that they accept
either values or column names, e.g. `$("field").isEqual("str")`, but then
string values could be confused with column names.
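
To make the ambiguity concrete, here is a minimal sketch in plain Java. It is a toy model, not the FLIP-55 API: the `Expr` class and the way expressions are rendered are assumptions made only for illustration, and `$()` and `v()` stand in for the method names under discussion. Because `isEqual` accepts only an expression, the caller has to state explicitly whether `"str"` is a column or a literal:

```java
public class DslAmbiguitySketch {

    /** Toy expression node for illustration; not Flink's real expression type. */
    static final class Expr {
        final String rendered;

        Expr(String rendered) {
            this.rendered = rendered;
        }

        /** Accepts only an expression, so a bare String cannot be passed by accident. */
        Expr isEqual(Expr other) {
            return new Expr("(" + rendered + " = " + other.rendered + ")");
        }

        @Override
        public String toString() {
            return rendered;
        }
    }

    /** Column reference (hypothetical helper). */
    static Expr $(String column) {
        return new Expr(column);
    }

    /** String literal (hypothetical helper); rendered in single quotes. */
    static Expr v(String value) {
        return new Expr("'" + value + "'");
    }

    public static void main(String[] args) {
        // Compare the column "field" with the literal string 'str'.
        System.out.println($("field").isEqual(v("str")));
        // Compare the column "field" with the column "str".
        System.out.println($("field").isEqual($("str")));
    }
}
```

With an additional `isEqual(String)` overload, both readings would be spelled `$("field").isEqual("str")`, and the API would have to pick one of them silently.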

Thanks,
Timo

On 27.08.19 17:34, David Anderson wrote:

> [...]

Re: [DISCUSS] FLIP-55: Introduction of a Table API Java Expression DSL

David Anderson-2
Timo,

While it's not exactly pretty, I don't mind the $("field") construct.
It's not particularly surprising. The v() method troubles me more; it
looks mysterious. I think we would do better to have something more
explicit. val() isn't much better -- val("foo") could be interpreted
to mean the value of the "foo" column, or a literal string.

David

On Tue, Aug 27, 2019 at 5:45 PM Timo Walther <[hidden email]> wrote:

> [...]

Re: [DISCUSS] FLIP-55: Introduction of a Table API Java Expression DSL

Timo Walther-2
Hi David,

thanks for your feedback. I was also skeptical about one-character method
names, so I restored the `val()` method for now. Literature such as
Wikipedia [1] says: "literal is a notation for representing a fixed
value in source code. Almost all programming languages have notations
for atomic values". So they are also talking about "values".

Alternatively, we could use `lit(12)` or `l(12)`, but I'm not convinced
that this is better.

Regards,
Timo

[1] https://en.wikipedia.org/wiki/Literal_(computer_programming)

On 27.08.19 22:10, David Anderson wrote:

> [...]

Re: [DISCUSS] FLIP-55: Introduction of a Table API Java Expression DSL

Seth Wiesman-4
I would prefer ‘lit()’ over ‘val()’, since val is a keyword in Scala, assuming the intention is to make the DSL ergonomic for Scala developers.

Seth

> On Aug 28, 2019, at 7:58 AM, Timo Walther <[hidden email]> wrote:
> [...]

Re: [DISCUSS] FLIP-55: Introduction of a Table API Java Expression DSL

Aljoscha Krettek-2
Overall, this is a very nice development that should also simplify the code base once we deprecate the expression parser!

Regarding method names, I agree with Seth that values/literals should use something like “lit()”. I also think that for column references we could use “col()” to make it clear that it is a column reference. What do you think?

Aljoscha

> On 28. Aug 2019, at 15:59, Seth Wiesman <[hidden email]> wrote:
> [...]

Re: [DISCUSS] FLIP-55: Introduction of a Table API Java Expression DSL

Timo Walther-2
I'm fine with `lit()`. Regarding `col()`, I initially suggested `ref()`
but I think Fabian and Dawid liked single char methods for the most
commonly used expressions.

Btw, what is your opinion on the names of commonly used methods such as
`isEqual` and `isGreaterOrEqual`? Are we fine with the current naming?
In theory, we could make them shorter, like `equals()` and `greaterOrEqual()`,
or even shorter, like `eq`, `gt`, and `gte`.
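
For comparison, here is a minimal sketch of how the long and short spellings would read side by side. It is a toy model in plain Java, not the FLIP-55 API: the `Expr` class and its string rendering are assumptions for illustration only, and the short aliases `eq` and `gte` are the hypothetical variants mentioned above.

```java
public class ComparisonNamingSketch {

    /** Toy expression node for illustration; not Flink's real expression type. */
    static final class Expr {
        final String rendered;

        Expr(String rendered) {
            this.rendered = rendered;
        }

        // Descriptive names, as in the current proposal.
        Expr isEqual(Expr other) {
            return binary("=", other);
        }

        Expr isGreaterOrEqual(Expr other) {
            return binary(">=", other);
        }

        // Hypothetical short aliases that simply delegate to the long names.
        Expr eq(Expr other) {
            return isEqual(other);
        }

        Expr gte(Expr other) {
            return isGreaterOrEqual(other);
        }

        private Expr binary(String op, Expr other) {
            return new Expr("(" + rendered + " " + op + " " + other.rendered + ")");
        }

        @Override
        public String toString() {
            return rendered;
        }
    }

    /** Column reference (hypothetical helper). */
    static Expr $(String column) {
        return new Expr(column);
    }

    /** Integer literal (hypothetical helper). */
    static Expr lit(int value) {
        return new Expr(Integer.toString(value));
    }

    public static void main(String[] args) {
        // Both spellings build the same expression.
        System.out.println($("amount").isGreaterOrEqual(lit(12)));
        System.out.println($("amount").gte(lit(12)));
    }
}
```

One point that may matter for the choice: in Java, a DSL method named `equals` that takes an expression would be an overload of `Object.equals(Object)`, which could be confusing to read, so `isEqual` or `eq` might be the safer spellings.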

Thanks,
Timo


On 29.08.19 11:51, Aljoscha Krettek wrote:

> [...]

Re: [DISCUSS] FLIP-55: Introduction of a Table API Java Expression DSL

Fabian Hueske-2
Hi,

IMO, we should define what we would like to optimize for:
1) easy-to-get-started experience or
2) productivity and ease-of-use

While 1) is certainly important, I think we should put more emphasis on
goal 2).
That's why I favor names that are as short as possible for commonly used
methods like column references and literals/values.
These are used *many* times in *every* query.
Every user who uses the API for more than 30 mins will know what $() or v()
(or whatever method names we come up with) are used for and everybody who
doesn't know can have a look at the JavaDocs or regular documentation.
Shorter method names are not only about increasing the speed of writing a
query, but also about reducing the clutter that needs to be parsed to
understand an expression / query.

I'm OK with descriptive names for other expressions like call(),
isEqualTo() (although these could be the commonly used eq(), gte(), etc.),
and so on but column references (and literals) should be as lightweight as
possible, IMO.

Cheers,
Fabian

On Thu, Aug 29, 2019 at 12:15 PM Timo Walther <[hidden email]> wrote:

> I'm fine with `lit()`. Regarding `col()`, I initially suggested `ref()`
> but I think Fabian and Dawid liked single char methods for the most
> commonly used expressions.
>
> Btw, what is your opinion on the names of commonly used methods such as
> `isEqual`, `isGreaterOrEqual`? Are we fine with the current naming.
> In theory we could make them shorter like `equals(), greaterOrEqual()`
> or even shorter to `eq`, `gt`, `gte`?
>
> Thanks,
> Timo
>
>
> On 29.08.19 11:51, Aljoscha Krettek wrote:
> > Overall, this is a very nice development that should also simplify the
> code base once we deprecate the expression parser!
> >
> > Regarding method names, I agree with Seth that values/literals should
> use something like “lit()”. I also think that for column references we
> could use “col()” to make it clear that it is a column reference. What do
> you think?
> >
> > Aljoscha
> >
> >> On 28. Aug 2019, at 15:59, Seth Wiesman <[hidden email]> wrote:
> >>
> >> I would prefer ‘lit()’ over  ‘val()’ since val is a keyword in Scala.
> Assuming the intention is to make the dsl ergonomic for Scala developers.
> >>
> >> Seth
> >>
> >>> On Aug 28, 2019, at 7:58 AM, Timo Walther <[hidden email]> wrote:
> >>>
> >>> Hi David,
> >>>
> >>> thanks for your feedback. I was also skeptical about 1 char method
> names, I restored the `val()` method for now. If you read literature such
> as Wikipedia [1]: "literal is a notation for representing a fixed value in
> source code. Almost all programming languages have notations for atomic
> values". So they are also talking about "values".
> >>>
> >>> Alteratively we could use `lit(12)` or `l(12)` but I'm not convinced
> that this is better.
> >>>
> >>> Regards,
> >>> Timo
> >>>
> >>> [1] https://en.wikipedia.org/wiki/Literal_(computer_programming)
> >>>
> >>>> On 27.08.19 22:10, David Anderson wrote:
> >>>> TImo,
> >>>>
> >>>> While it's not exactly pretty, I don't mind the $("field") construct.
> >>>> It's not particularly surprising. The v() method troubles me more; it
> >>>> looks mysterious. I think we would do better to have something more
> >>>> explicit. val() isn't much better -- val("foo") could be interpreted
> >>>> to mean the value of the "foo" column, or a literal string.
> >>>>
> >>>> David
> >>>>
> >>>>> On Tue, Aug 27, 2019 at 5:45 PM Timo Walther <[hidden email]>
> wrote:
> >>>>> Hi David,
> >>>>>
> >>>>> thanks for your feedback. With the current design, the DSL would be
> free
> >>>>> of any ambiguity but it is definitely more verbose esp. around
> defining
> >>>>> values.
> >>>>>
> >>>>> I would be happy about further suggestions that make the DSL more
> >>>>> readable. I'm also not sure if we go for `$()` and `v()` instead of
> more
> >>>>> readable `ref()` and `val()`. This could maybe make it look less
> >>>>> "alien", what do you think?
> >>>>>
> >>>>> Some people mentioned to overload certain methods for accepting
> values
> >>>>> or column names. E.g. `$("field").isEqual("str")` but then string
> values
> >>>>> could be confused with column names.
> >>>>>
> >>>>> Thanks,
> >>>>> Timo
> >>>>>
> >>>>>> On 27.08.19 17:34, David Anderson wrote:
> >>>>>> In general I'm in favor of anything that is going to make the Table
> >>>>>> API easier to learn and more predictable in its behavior. This
> >>>>>> proposal kind of falls in the middle. As someone who has spent hours
> >>>>>> in the crevices between the various flavors of the current
> >>>>>> implementations, I certainly view keeping the various APIs and DSLs
> >>>>>> more in sync, and making them less buggy, as highly desirable.
> >>>>>>
> >>>>>> On the other hand, some of the details in the proposal do make the
> >>>>>> resulting user code less pretty and less approachable than the current
> >>>>>> Java DSL. In a training context it will be easy to teach, but I wonder
> >>>>>> if we can find a way to make it look less alien at first glance.
> >>>>>>
> >>>>>> David
> >>>>>>
> >>>>>>> On Wed, Aug 21, 2019 at 1:33 PM Timo Walther <[hidden email]> wrote:
> >>>>>>> Hi everyone,
> >>>>>>>
> >>>>>>> some of you might remember the discussion I started end of March [1]
> >>>>>>> about introducing a new Java DSL for Table API that is not embedded in a
> >>>>>>> string.
> >>>>>>>
> >>>>>>> In particular, it solves the following issues:
> >>>>>>>
> >>>>>>> - No possibility of deprecating functions
> >>>>>>>
> >>>>>>> - Missing documentation for users
> >>>>>>>
> >>>>>>> - Missing auto-completion for users
> >>>>>>>
> >>>>>>> - Need to port the ExpressionParser from Scala to Java
> >>>>>>>
> >>>>>>> - Scala symbols are deprecated! A Java DSL can also enable the Scala DSL
> >>>>>>> one.
> >>>>>>>
> >>>>>>> Due to shift of priorities, we could not work on it in Flink 1.9 but the
> >>>>>>> feedback at that time was positive and we should aim for 1.10 to
> >>>>>>> simplify the API with this change.
> >>>>>>>
> >>>>>>> We propose the following FLIP-55:
> >>>>>>>
> >>>>>>> https://docs.google.com/document/d/1CfaaD3j8APJDKwzIT4YsX7QD2huKTB4xlA3vnMUFJmA/edit?usp=sharing
> >>>>>>> <https://docs.google.com/document/d/1CfaaD3j8APJDKwzIT4YsX7QD2huKTB4xlA3vnMUFJmA/edit#heading=h.jn04bfolpim0>
> >>>>>>>
> >>>>>>> Thanks for any feedback,
> >>>>>>>
> >>>>>>> Timo
> >>>>>>>
> >>>>>>> [1]
> >>>>>>> https://lists.apache.org/thread.html/e6f31d7fa53890b91be0991c2da64556a91ef0fc9ab3ffa889dacc23@%3Cdev.flink.apache.org%3E
> >>>>>>>
>
>

Re: [DISCUSS] FLIP-55: Introduction of a Table API Java Expression DSL

Timo Walther-2
Hi all,

I see that a majority favors `lit(12)`, so let's adopt that in the FLIP.
The `$("field")` syntax takes Fabian's concerns into account, so I would
vote for keeping it as it is.
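
For readers skimming the thread, here is a minimal sketch of how the `$("field")` and `lit(12)` spellings could read in plain Java. It is a toy under stated assumptions, not Flink's implementation: `ExprSketch`, `Expr`, and the string rendering are hypothetical and exist only to show why a separate literal method keeps string values apart from column names.

```java
// Toy sketch only: NOT Flink's implementation. ExprSketch and Expr are hypothetical names.
public final class ExprSketch {

    /** Minimal expression node that only records a textual rendering. */
    public static final class Expr {
        private final String text;

        private Expr(String text) {
            this.text = text;
        }

        /** Comparison takes another expression, so a bare String cannot be misread as a column. */
        public Expr isEqual(Expr other) {
            return new Expr("(" + text + " = " + other.text + ")");
        }

        @Override
        public String toString() {
            return text;
        }
    }

    /** Column reference, mirroring the `$("field")` spelling from the thread. */
    public static Expr $(String name) {
        return new Expr(name);
    }

    /** Literal value, mirroring `lit(12)`; strings are quoted so they differ from column names. */
    public static Expr lit(Object value) {
        String rendered = (value instanceof String) ? "'" + value + "'" : String.valueOf(value);
        return new Expr(rendered);
    }

    public static void main(String[] args) {
        System.out.println($("amount").isEqual(lit(12)));  // column compared with a number
        System.out.println($("name").isEqual(lit("str"))); // column compared with a string literal
        System.out.println($("name").isEqual($("str")));   // column compared with another column
    }
}
```

With an overload such as `isEqual(String)`, the last two calls would be indistinguishable at the call site, which is the ambiguity raised earlier in the thread.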

One more question for native English speakers: is it acceptable to have
`isEqual` instead of `isEqualTo` and `isGreater` instead of `isGreaterThan`?

If there are no more concerns, I will start a voting thread soon.

Thanks,
Timo


On 29.08.19 12:24, Fabian Hueske wrote:

> Hi,
>
> IMO, we should define what we would like to optimize for:
> 1) easy-to-get-started experience or
> 2) productivity and ease-of-use
>
> While 1) is certainly important, I think we should put more emphasis on
> goal 2).
> That's why I favor names that are as short as possible for commonly used
> methods like column references and literals/values.
> These are used *many* times in *every* query.
> Every user who uses the API for more than 30 mins will know what $() or v()
> (or whatever method names we come up with) are used for and everybody who
> doesn't know can have a look at the JavaDocs or regular documentation.
> Shorter method names are not only about increasing the speed to write a
> query, but also reducing clutter that needs to be parsed to understand an
> expression / query.
>
> I'm OK with descriptive names for other expressions like call(),
> isEqualTo() (although these could be the commonly used eq(), gte(), etc.),
> and so on but column references (and literals) should be as lightweight as
> possible, IMO.
>
> Cheers,
> Fabian
>
> On Thu, Aug 29, 2019 at 12:15, Timo Walther <[hidden email]> wrote:
>> I'm fine with `lit()`. Regarding `col()`, I initially suggested `ref()`
>> but I think Fabian and Dawid liked single char methods for the most
>> commonly used expressions.
>>
>> Btw, what is your opinion on the names of commonly used methods such as
>> `isEqual`, `isGreaterOrEqual`? Are we fine with the current naming?
>> In theory we could make them shorter like `equals(), greaterOrEqual()`
>> or even shorter to `eq`, `gt`, `gte`?
>>
>> Thanks,
>> Timo
>>
>>
>> On 29.08.19 11:51, Aljoscha Krettek wrote:
>>> Overall, this is a very nice development that should also simplify the
>>> code base once we deprecate the expression parser!
>>>
>>> Regarding method names, I agree with Seth that values/literals should
>>> use something like “lit()”. I also think that for column references we
>>> could use “col()” to make it clear that it is a column reference. What do
>>> you think?
>>>
>>> Aljoscha
>>>
>>>> On 28. Aug 2019, at 15:59, Seth Wiesman <[hidden email]> wrote:
>>>>
>>>> I would prefer ‘lit()’ over ‘val()’ since val is a keyword in Scala,
>>>> assuming the intention is to make the DSL ergonomic for Scala developers.
>>>>
>>>> Seth
>>>>
>>>>> On Aug 28, 2019, at 7:58 AM, Timo Walther <[hidden email]> wrote:
>>>>>
>>>>> Hi David,
>>>>>
>>>>> thanks for your feedback. I was also skeptical about 1 char method
>>>>> names, so I restored the `val()` method for now. If you read literature such
>>>>> as Wikipedia [1]: "literal is a notation for representing a fixed value in
>>>>> source code. Almost all programming languages have notations for atomic
>>>>> values". So they are also talking about "values".
>>>>>
>>>>> Alternatively we could use `lit(12)` or `l(12)` but I'm not convinced
>>>>> that this is better.
>>>>>
>>>>> Regards,
>>>>> Timo


Re: [DISCUSS] FLIP-55: Introduction of a Table API Java Expression DSL

Rong Rong
Thanks for putting together the proposal @Timo and sorry for joining the
discussion thread late.

I share Fabian's view on the ease-of-use front. However, I was wondering
whether we need to start the expression design with the short method names.
One option I can think of: could we support "alias" methods later on in
the Expression, once we have collected enough feedback from users?

IMO, it is always easier to expand an API later than to reduce it.
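
To illustrate the "alias later" idea, a small sketch under stated assumptions: the class `AliasSketch` and its nested `Expr` are hypothetical, and only the method names `isGreaterOrEqual` and `gte` come from this thread. It shows how a short name could be added in a later release as a pure delegate, without touching code written against the descriptive name.

```java
// Toy sketch only: hypothetical classes, not Flink's API.
public final class AliasSketch {

    /** Minimal expression node that only records a textual rendering. */
    public static final class Expr {
        private final String text;

        public Expr(String text) {
            this.text = text;
        }

        /** Descriptive name that would ship in the first version. */
        public Expr isGreaterOrEqual(Expr other) {
            return new Expr("(" + text + " >= " + other.text + ")");
        }

        /** Short alias that could be added later; it only delegates to the long name. */
        public Expr gte(Expr other) {
            return isGreaterOrEqual(other);
        }

        @Override
        public String toString() {
            return text;
        }
    }

    public static void main(String[] args) {
        Expr a = new Expr("a");
        Expr b = new Expr("b");
        // Both spellings build the same expression, so existing user code keeps compiling.
        System.out.println(a.isGreaterOrEqual(b));
        System.out.println(a.gte(b));
    }
}
```

Adding `gte` afterwards is a compatible change, whereas removing a method that users already call is not, which is the asymmetry behind starting with the smaller API.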

Cheers,
Rong

On Mon, Sep 2, 2019 at 2:37 AM Timo Walther <[hidden email]> wrote:

> Hi all,
>
> I see that a majority favors `lit(12)`, so let's adopt that in the FLIP.
> The `$("field")` syntax takes Fabian's concerns into account, so I would
> vote for keeping it as it is.
>
> One more question for native English speakers: is it acceptable to have
> `isEqual` instead of `isEqualTo` and `isGreater` instead of
> `isGreaterThan`?
>
> If there are no more concerns, I will start a voting thread soon.
>
> Thanks,
> Timo

Re: [DISCUSS] FLIP-55: Introduction of a Table API Java Expression DSL

Timo Walther-2
Thanks for your feedback, Rong. You are right, we can still add shorter
names if user feedback demands it. Adding additional shorter
method names is always possible. So let's stick to lit() for now.

I converted the Google document into a wiki page:

https://cwiki.apache.org/confluence/display/FLINK/FLIP-55%3A+Introduction+of+a+Table+API+Java+Expression+DSL

If there are no objections, I will start a voting thread tomorrow.

Thanks,
Timo


On 04.09.19 02:52, Rong Rong wrote:

> Thanks for putting together the proposal @Timo and sorry for joining the
> discussion thread late.
>
> I share Fabian's view on the ease-of-use front. However, I was wondering
> whether we need to start the expression design with the short method names.
> One option I can think of: could we support "alias" methods later on in
> the Expression, once we have collected enough feedback from users?
>
> IMO, it is always easier to expand an API later than to reduce it.
>
> Cheers,
> Rong