[DISCUSS] FLIP-57 - Rework FunctionCatalog

classic Classic list List threaded Threaded
73 messages Options
1234
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-57 - Rework FunctionCatalog

Dawid Wysakowicz
Hi,
Another idea to consider on top of Timo's suggestion. How about we have a
special namespace (catalog + database) for built-in objects? This catalog
would be invisible for users as Xuefu was suggesting.

Then users could still override built-in functions, if they fully qualify
object with the built-in namespace, but by default the common logic of
current dB & cat would be used.

CREATE TEMPORARY FUNCTION func ...
registers temporary function in current cat & dB

CREATE TEMPORARY FUNCTION cat.db.func ...
registers temporary function in cat db

CREATE TEMPORARY FUNCTION system.system.func ...
Overrides built-in function with temporary function

The built-in/system namespace would not be writable for permanent objects.
WDYT?

This way I think we can have benefits of both solutions.

Best,
Dawid


On Tue, 17 Sep 2019, 07:24 Timo Walther, <[hidden email]> wrote:

> Hi Bowen,
>
> I understand the potential benefit of overriding certain built-in
> functions. I'm open to such a feature if many people agree. However, it
> would be great to still support overriding catalog functions with
> temporary functions in order to prototype a query even though a
> catalog/database might not be available currently or should not be
> modified yet. How about we support both cases?
>
> CREATE TEMPORARY FUNCTION abs
> -> creates/overrides a built-in function and never consideres current
> catalog and database; inconsistent with other DDL but acceptable for
> functions I guess.
> CREATE TEMPORARY FUNCTION cat.db.fun
> -> creates/overrides a catalog function
>
> Regarding "Flink don't have any other built-in objects (tables, views)
> except functions", this might change in the near future. Take
> https://issues.apache.org/jira/browse/FLINK-13900 as an example.
>
> Thanks,
> Timo
>
> On 14.09.19 01:40, Bowen Li wrote:
> > Hi Fabian,
> >
> > Yes, I agree 1-part/no-override is the least favorable thus I didn't
> > include that as a voting option, and the discussion is mainly between
> > 1-part/override builtin and 3-part/not override builtin.
> >
> > Re > However, it means that temp functions are differently treated than
> > other db objects.
> > IMO, the treatment difference results from the fact that functions are a
> > bit different from other objects - Flink don't have any other built-in
> > objects (tables, views) except functions.
> >
> > Cheers,
> > Bowen
> >
>
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-57 - Rework FunctionCatalog

Xuefu Z
Thanks to Tmo and Dawid for sharing thoughts.

It seems to me that there is a general consensus on having temp functions
that have no namespaces and overwrite built-in functions. (As a side note
for comparability, the current user defined functions are all temporary and
having no namespaces.)

Nevertheless, I can also see the merit of having namespaced temp functions
that can overwrite functions defined in a specific cat/db. However,  this
idea appears orthogonal to the former and can be added incrementally.

How about we first implement non-namespaced temp functions now and leave
the door open for namespaced ones for later releases as the requirement
might become more crystal? This also helps shorten the debate and allow us
to make some progress along this direction.

As to Dawid's idea of having a dedicated cat/db to host the temporary temp
functions that don't have namespaces, my only concern is the special
treatment for a cat/db, which makes code less clean, as evident in treating
the built-in catalog currently.

Thanks,
Xuefiu

On Mon, Sep 16, 2019 at 5:07 PM Dawid Wysakowicz <[hidden email]>
wrote:

> Hi,
> Another idea to consider on top of Timo's suggestion. How about we have a
> special namespace (catalog + database) for built-in objects? This catalog
> would be invisible for users as Xuefu was suggesting.
>
> Then users could still override built-in functions, if they fully qualify
> object with the built-in namespace, but by default the common logic of
> current dB & cat would be used.
>
> CREATE TEMPORARY FUNCTION func ...
> registers temporary function in current cat & dB
>
> CREATE TEMPORARY FUNCTION cat.db.func ...
> registers temporary function in cat db
>
> CREATE TEMPORARY FUNCTION system.system.func ...
> Overrides built-in function with temporary function
>
> The built-in/system namespace would not be writable for permanent objects.
> WDYT?
>
> This way I think we can have benefits of both solutions.
>
> Best,
> Dawid
>
>
> On Tue, 17 Sep 2019, 07:24 Timo Walther, <[hidden email]> wrote:
>
> > Hi Bowen,
> >
> > I understand the potential benefit of overriding certain built-in
> > functions. I'm open to such a feature if many people agree. However, it
> > would be great to still support overriding catalog functions with
> > temporary functions in order to prototype a query even though a
> > catalog/database might not be available currently or should not be
> > modified yet. How about we support both cases?
> >
> > CREATE TEMPORARY FUNCTION abs
> > -> creates/overrides a built-in function and never consideres current
> > catalog and database; inconsistent with other DDL but acceptable for
> > functions I guess.
> > CREATE TEMPORARY FUNCTION cat.db.fun
> > -> creates/overrides a catalog function
> >
> > Regarding "Flink don't have any other built-in objects (tables, views)
> > except functions", this might change in the near future. Take
> > https://issues.apache.org/jira/browse/FLINK-13900 as an example.
> >
> > Thanks,
> > Timo
> >
> > On 14.09.19 01:40, Bowen Li wrote:
> > > Hi Fabian,
> > >
> > > Yes, I agree 1-part/no-override is the least favorable thus I didn't
> > > include that as a voting option, and the discussion is mainly between
> > > 1-part/override builtin and 3-part/not override builtin.
> > >
> > > Re > However, it means that temp functions are differently treated than
> > > other db objects.
> > > IMO, the treatment difference results from the fact that functions are
> a
> > > bit different from other objects - Flink don't have any other built-in
> > > objects (tables, views) except functions.
> > >
> > > Cheers,
> > > Bowen
> > >
> >
> >
>


--
Xuefu Zhang

"In Honey We Trust!"
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-57 - Rework FunctionCatalog

kai wang
hi, everyone
I think this flip is very meaningful. it supports functions that can be
shared by different catalogs and dbs, reducing the duplication of functions.

Our group based on flink's sql parser module implements create function
feature, stores the parsed function metadata and schema into mysql, and
also customizes the catalog, customizes sql-client to support custom
schemas and functions. Loaded, but the function is currently global, and is
not subdivided according to catalog and db.

In addition, I very much hope to participate in the development of this
flip, I have been paying attention to the community, but found it is more
difficult to join.
 thank you.

Xuefu Z <[hidden email]> 于2019年9月17日周二 上午11:19写道:

> Thanks to Tmo and Dawid for sharing thoughts.
>
> It seems to me that there is a general consensus on having temp functions
> that have no namespaces and overwrite built-in functions. (As a side note
> for comparability, the current user defined functions are all temporary and
> having no namespaces.)
>
> Nevertheless, I can also see the merit of having namespaced temp functions
> that can overwrite functions defined in a specific cat/db. However,  this
> idea appears orthogonal to the former and can be added incrementally.
>
> How about we first implement non-namespaced temp functions now and leave
> the door open for namespaced ones for later releases as the requirement
> might become more crystal? This also helps shorten the debate and allow us
> to make some progress along this direction.
>
> As to Dawid's idea of having a dedicated cat/db to host the temporary temp
> functions that don't have namespaces, my only concern is the special
> treatment for a cat/db, which makes code less clean, as evident in treating
> the built-in catalog currently.
>
> Thanks,
> Xuefiu
>
> On Mon, Sep 16, 2019 at 5:07 PM Dawid Wysakowicz <
> [hidden email]>
> wrote:
>
> > Hi,
> > Another idea to consider on top of Timo's suggestion. How about we have a
> > special namespace (catalog + database) for built-in objects? This catalog
> > would be invisible for users as Xuefu was suggesting.
> >
> > Then users could still override built-in functions, if they fully qualify
> > object with the built-in namespace, but by default the common logic of
> > current dB & cat would be used.
> >
> > CREATE TEMPORARY FUNCTION func ...
> > registers temporary function in current cat & dB
> >
> > CREATE TEMPORARY FUNCTION cat.db.func ...
> > registers temporary function in cat db
> >
> > CREATE TEMPORARY FUNCTION system.system.func ...
> > Overrides built-in function with temporary function
> >
> > The built-in/system namespace would not be writable for permanent
> objects.
> > WDYT?
> >
> > This way I think we can have benefits of both solutions.
> >
> > Best,
> > Dawid
> >
> >
> > On Tue, 17 Sep 2019, 07:24 Timo Walther, <[hidden email]> wrote:
> >
> > > Hi Bowen,
> > >
> > > I understand the potential benefit of overriding certain built-in
> > > functions. I'm open to such a feature if many people agree. However, it
> > > would be great to still support overriding catalog functions with
> > > temporary functions in order to prototype a query even though a
> > > catalog/database might not be available currently or should not be
> > > modified yet. How about we support both cases?
> > >
> > > CREATE TEMPORARY FUNCTION abs
> > > -> creates/overrides a built-in function and never consideres current
> > > catalog and database; inconsistent with other DDL but acceptable for
> > > functions I guess.
> > > CREATE TEMPORARY FUNCTION cat.db.fun
> > > -> creates/overrides a catalog function
> > >
> > > Regarding "Flink don't have any other built-in objects (tables, views)
> > > except functions", this might change in the near future. Take
> > > https://issues.apache.org/jira/browse/FLINK-13900 as an example.
> > >
> > > Thanks,
> > > Timo
> > >
> > > On 14.09.19 01:40, Bowen Li wrote:
> > > > Hi Fabian,
> > > >
> > > > Yes, I agree 1-part/no-override is the least favorable thus I didn't
> > > > include that as a voting option, and the discussion is mainly between
> > > > 1-part/override builtin and 3-part/not override builtin.
> > > >
> > > > Re > However, it means that temp functions are differently treated
> than
> > > > other db objects.
> > > > IMO, the treatment difference results from the fact that functions
> are
> > a
> > > > bit different from other objects - Flink don't have any other
> built-in
> > > > objects (tables, views) except functions.
> > > >
> > > > Cheers,
> > > > Bowen
> > > >
> > >
> > >
> >
>
>
> --
> Xuefu Zhang
>
> "In Honey We Trust!"
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-57 - Rework FunctionCatalog

Timo Walther-2
Hi everyone,

@Xuefu: I would like to avoid adding too many things incrementally.
Users should be able to override all catalog objects consistently
according to FLIP-64 (Support for Temporary Objects in Table module). If
functions are treated completely different, we need more code and
special cases. From an implementation perspective, this topic only
affects the lookup logic which is rather low implementation effort which
is why I would like to clarify the remaining items. As you said, we have
a slight consenus on overriding built-in functions; we should also
strive for reaching consensus on the remaining topics.

@Dawid: I like your idea as it ensures registering catalog objects
consistent and the overriding of built-in functions more explicit.

Thanks,
Timo


On 17.09.19 11:59, kai wang wrote:

> hi, everyone
> I think this flip is very meaningful. it supports functions that can be
> shared by different catalogs and dbs, reducing the duplication of functions.
>
> Our group based on flink's sql parser module implements create function
> feature, stores the parsed function metadata and schema into mysql, and
> also customizes the catalog, customizes sql-client to support custom
> schemas and functions. Loaded, but the function is currently global, and is
> not subdivided according to catalog and db.
>
> In addition, I very much hope to participate in the development of this
> flip, I have been paying attention to the community, but found it is more
> difficult to join.
>   thank you.
>
> Xuefu Z <[hidden email]> 于2019年9月17日周二 上午11:19写道:
>
>> Thanks to Tmo and Dawid for sharing thoughts.
>>
>> It seems to me that there is a general consensus on having temp functions
>> that have no namespaces and overwrite built-in functions. (As a side note
>> for comparability, the current user defined functions are all temporary and
>> having no namespaces.)
>>
>> Nevertheless, I can also see the merit of having namespaced temp functions
>> that can overwrite functions defined in a specific cat/db. However,  this
>> idea appears orthogonal to the former and can be added incrementally.
>>
>> How about we first implement non-namespaced temp functions now and leave
>> the door open for namespaced ones for later releases as the requirement
>> might become more crystal? This also helps shorten the debate and allow us
>> to make some progress along this direction.
>>
>> As to Dawid's idea of having a dedicated cat/db to host the temporary temp
>> functions that don't have namespaces, my only concern is the special
>> treatment for a cat/db, which makes code less clean, as evident in treating
>> the built-in catalog currently.
>>
>> Thanks,
>> Xuefiu
>>
>> On Mon, Sep 16, 2019 at 5:07 PM Dawid Wysakowicz <
>> [hidden email]>
>> wrote:
>>
>>> Hi,
>>> Another idea to consider on top of Timo's suggestion. How about we have a
>>> special namespace (catalog + database) for built-in objects? This catalog
>>> would be invisible for users as Xuefu was suggesting.
>>>
>>> Then users could still override built-in functions, if they fully qualify
>>> object with the built-in namespace, but by default the common logic of
>>> current dB & cat would be used.
>>>
>>> CREATE TEMPORARY FUNCTION func ...
>>> registers temporary function in current cat & dB
>>>
>>> CREATE TEMPORARY FUNCTION cat.db.func ...
>>> registers temporary function in cat db
>>>
>>> CREATE TEMPORARY FUNCTION system.system.func ...
>>> Overrides built-in function with temporary function
>>>
>>> The built-in/system namespace would not be writable for permanent
>> objects.
>>> WDYT?
>>>
>>> This way I think we can have benefits of both solutions.
>>>
>>> Best,
>>> Dawid
>>>
>>>
>>> On Tue, 17 Sep 2019, 07:24 Timo Walther, <[hidden email]> wrote:
>>>
>>>> Hi Bowen,
>>>>
>>>> I understand the potential benefit of overriding certain built-in
>>>> functions. I'm open to such a feature if many people agree. However, it
>>>> would be great to still support overriding catalog functions with
>>>> temporary functions in order to prototype a query even though a
>>>> catalog/database might not be available currently or should not be
>>>> modified yet. How about we support both cases?
>>>>
>>>> CREATE TEMPORARY FUNCTION abs
>>>> -> creates/overrides a built-in function and never consideres current
>>>> catalog and database; inconsistent with other DDL but acceptable for
>>>> functions I guess.
>>>> CREATE TEMPORARY FUNCTION cat.db.fun
>>>> -> creates/overrides a catalog function
>>>>
>>>> Regarding "Flink don't have any other built-in objects (tables, views)
>>>> except functions", this might change in the near future. Take
>>>> https://issues.apache.org/jira/browse/FLINK-13900 as an example.
>>>>
>>>> Thanks,
>>>> Timo
>>>>
>>>> On 14.09.19 01:40, Bowen Li wrote:
>>>>> Hi Fabian,
>>>>>
>>>>> Yes, I agree 1-part/no-override is the least favorable thus I didn't
>>>>> include that as a voting option, and the discussion is mainly between
>>>>> 1-part/override builtin and 3-part/not override builtin.
>>>>>
>>>>> Re > However, it means that temp functions are differently treated
>> than
>>>>> other db objects.
>>>>> IMO, the treatment difference results from the fact that functions
>> are
>>> a
>>>>> bit different from other objects - Flink don't have any other
>> built-in
>>>>> objects (tables, views) except functions.
>>>>>
>>>>> Cheers,
>>>>> Bowen
>>>>>
>>>>
>>
>> --
>> Xuefu Zhang
>>
>> "In Honey We Trust!"
>>

Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-57 - Rework FunctionCatalog

Jark Wu-2
Hi,

+1 to strive for reaching consensus on the remaining topics. We are close to the truth. It will waste a lot of time if we resume the topic some time later.

+1 to “1-part/override” and I’m also fine with Timo’s “cat.db.fun” way to override a catalog function.

I’m not sure about “system.system.fun”, it introduces a nonexistent cat & db? And we still need to do special treatment for the dedicated system.system cat & db?

Best,
Jark


> 在 2019年9月18日,06:54,Timo Walther <[hidden email]> 写道:
>
> Hi everyone,
>
> @Xuefu: I would like to avoid adding too many things incrementally. Users should be able to override all catalog objects consistently according to FLIP-64 (Support for Temporary Objects in Table module). If functions are treated completely different, we need more code and special cases. From an implementation perspective, this topic only affects the lookup logic which is rather low implementation effort which is why I would like to clarify the remaining items. As you said, we have a slight consenus on overriding built-in functions; we should also strive for reaching consensus on the remaining topics.
>
> @Dawid: I like your idea as it ensures registering catalog objects consistent and the overriding of built-in functions more explicit.
>
> Thanks,
> Timo
>
>
> On 17.09.19 11:59, kai wang wrote:
>> hi, everyone
>> I think this flip is very meaningful. it supports functions that can be
>> shared by different catalogs and dbs, reducing the duplication of functions.
>>
>> Our group based on flink's sql parser module implements create function
>> feature, stores the parsed function metadata and schema into mysql, and
>> also customizes the catalog, customizes sql-client to support custom
>> schemas and functions. Loaded, but the function is currently global, and is
>> not subdivided according to catalog and db.
>>
>> In addition, I very much hope to participate in the development of this
>> flip, I have been paying attention to the community, but found it is more
>> difficult to join.
>>  thank you.
>>
>> Xuefu Z <[hidden email]> 于2019年9月17日周二 上午11:19写道:
>>
>>> Thanks to Tmo and Dawid for sharing thoughts.
>>>
>>> It seems to me that there is a general consensus on having temp functions
>>> that have no namespaces and overwrite built-in functions. (As a side note
>>> for comparability, the current user defined functions are all temporary and
>>> having no namespaces.)
>>>
>>> Nevertheless, I can also see the merit of having namespaced temp functions
>>> that can overwrite functions defined in a specific cat/db. However,  this
>>> idea appears orthogonal to the former and can be added incrementally.
>>>
>>> How about we first implement non-namespaced temp functions now and leave
>>> the door open for namespaced ones for later releases as the requirement
>>> might become more crystal? This also helps shorten the debate and allow us
>>> to make some progress along this direction.
>>>
>>> As to Dawid's idea of having a dedicated cat/db to host the temporary temp
>>> functions that don't have namespaces, my only concern is the special
>>> treatment for a cat/db, which makes code less clean, as evident in treating
>>> the built-in catalog currently.
>>>
>>> Thanks,
>>> Xuefiu
>>>
>>> On Mon, Sep 16, 2019 at 5:07 PM Dawid Wysakowicz <
>>> [hidden email]>
>>> wrote:
>>>
>>>> Hi,
>>>> Another idea to consider on top of Timo's suggestion. How about we have a
>>>> special namespace (catalog + database) for built-in objects? This catalog
>>>> would be invisible for users as Xuefu was suggesting.
>>>>
>>>> Then users could still override built-in functions, if they fully qualify
>>>> object with the built-in namespace, but by default the common logic of
>>>> current dB & cat would be used.
>>>>
>>>> CREATE TEMPORARY FUNCTION func ...
>>>> registers temporary function in current cat & dB
>>>>
>>>> CREATE TEMPORARY FUNCTION cat.db.func ...
>>>> registers temporary function in cat db
>>>>
>>>> CREATE TEMPORARY FUNCTION system.system.func ...
>>>> Overrides built-in function with temporary function
>>>>
>>>> The built-in/system namespace would not be writable for permanent
>>> objects.
>>>> WDYT?
>>>>
>>>> This way I think we can have benefits of both solutions.
>>>>
>>>> Best,
>>>> Dawid
>>>>
>>>>
>>>> On Tue, 17 Sep 2019, 07:24 Timo Walther, <[hidden email]> wrote:
>>>>
>>>>> Hi Bowen,
>>>>>
>>>>> I understand the potential benefit of overriding certain built-in
>>>>> functions. I'm open to such a feature if many people agree. However, it
>>>>> would be great to still support overriding catalog functions with
>>>>> temporary functions in order to prototype a query even though a
>>>>> catalog/database might not be available currently or should not be
>>>>> modified yet. How about we support both cases?
>>>>>
>>>>> CREATE TEMPORARY FUNCTION abs
>>>>> -> creates/overrides a built-in function and never consideres current
>>>>> catalog and database; inconsistent with other DDL but acceptable for
>>>>> functions I guess.
>>>>> CREATE TEMPORARY FUNCTION cat.db.fun
>>>>> -> creates/overrides a catalog function
>>>>>
>>>>> Regarding "Flink don't have any other built-in objects (tables, views)
>>>>> except functions", this might change in the near future. Take
>>>>> https://issues.apache.org/jira/browse/FLINK-13900 as an example.
>>>>>
>>>>> Thanks,
>>>>> Timo
>>>>>
>>>>> On 14.09.19 01:40, Bowen Li wrote:
>>>>>> Hi Fabian,
>>>>>>
>>>>>> Yes, I agree 1-part/no-override is the least favorable thus I didn't
>>>>>> include that as a voting option, and the discussion is mainly between
>>>>>> 1-part/override builtin and 3-part/not override builtin.
>>>>>>
>>>>>> Re > However, it means that temp functions are differently treated
>>> than
>>>>>> other db objects.
>>>>>> IMO, the treatment difference results from the fact that functions
>>> are
>>>> a
>>>>>> bit different from other objects - Flink don't have any other
>>> built-in
>>>>>> objects (tables, views) except functions.
>>>>>>
>>>>>> Cheers,
>>>>>> Bowen
>>>>>>
>>>>>
>>>
>>> --
>>> Xuefu Zhang
>>>
>>> "In Honey We Trust!"
>>>
>

Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-57 - Rework FunctionCatalog

Aljoscha Krettek-2
Hi,

I think this discussion and the one for FLIP-64 are very connected. To resolve the differences, think we have to think about the basic principles and find consensus there. The basic questions I see are:

 - Do we want to support overriding builtin functions?
 - Do we want to support overriding catalog functions?
 - And then later: should temporary functions be tied to a catalog/database?

I don’t have much to say about these, except that we should somewhat stick to what the industry does. But I also understand that the industry is already very divided on this.

Best,
Aljoscha

> On 18. Sep 2019, at 11:41, Jark Wu <[hidden email]> wrote:
>
> Hi,
>
> +1 to strive for reaching consensus on the remaining topics. We are close to the truth. It will waste a lot of time if we resume the topic some time later.
>
> +1 to “1-part/override” and I’m also fine with Timo’s “cat.db.fun” way to override a catalog function.
>
> I’m not sure about “system.system.fun”, it introduces a nonexistent cat & db? And we still need to do special treatment for the dedicated system.system cat & db?
>
> Best,
> Jark
>
>
>> 在 2019年9月18日,06:54,Timo Walther <[hidden email]> 写道:
>>
>> Hi everyone,
>>
>> @Xuefu: I would like to avoid adding too many things incrementally. Users should be able to override all catalog objects consistently according to FLIP-64 (Support for Temporary Objects in Table module). If functions are treated completely different, we need more code and special cases. From an implementation perspective, this topic only affects the lookup logic which is rather low implementation effort which is why I would like to clarify the remaining items. As you said, we have a slight consenus on overriding built-in functions; we should also strive for reaching consensus on the remaining topics.
>>
>> @Dawid: I like your idea as it ensures registering catalog objects consistent and the overriding of built-in functions more explicit.
>>
>> Thanks,
>> Timo
>>
>>
>> On 17.09.19 11:59, kai wang wrote:
>>> hi, everyone
>>> I think this flip is very meaningful. it supports functions that can be
>>> shared by different catalogs and dbs, reducing the duplication of functions.
>>>
>>> Our group based on flink's sql parser module implements create function
>>> feature, stores the parsed function metadata and schema into mysql, and
>>> also customizes the catalog, customizes sql-client to support custom
>>> schemas and functions. Loaded, but the function is currently global, and is
>>> not subdivided according to catalog and db.
>>>
>>> In addition, I very much hope to participate in the development of this
>>> flip, I have been paying attention to the community, but found it is more
>>> difficult to join.
>>> thank you.
>>>
>>> Xuefu Z <[hidden email]> 于2019年9月17日周二 上午11:19写道:
>>>
>>>> Thanks to Tmo and Dawid for sharing thoughts.
>>>>
>>>> It seems to me that there is a general consensus on having temp functions
>>>> that have no namespaces and overwrite built-in functions. (As a side note
>>>> for comparability, the current user defined functions are all temporary and
>>>> having no namespaces.)
>>>>
>>>> Nevertheless, I can also see the merit of having namespaced temp functions
>>>> that can overwrite functions defined in a specific cat/db. However,  this
>>>> idea appears orthogonal to the former and can be added incrementally.
>>>>
>>>> How about we first implement non-namespaced temp functions now and leave
>>>> the door open for namespaced ones for later releases as the requirement
>>>> might become more crystal? This also helps shorten the debate and allow us
>>>> to make some progress along this direction.
>>>>
>>>> As to Dawid's idea of having a dedicated cat/db to host the temporary temp
>>>> functions that don't have namespaces, my only concern is the special
>>>> treatment for a cat/db, which makes code less clean, as evident in treating
>>>> the built-in catalog currently.
>>>>
>>>> Thanks,
>>>> Xuefiu
>>>>
>>>> On Mon, Sep 16, 2019 at 5:07 PM Dawid Wysakowicz <
>>>> [hidden email]>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>> Another idea to consider on top of Timo's suggestion. How about we have a
>>>>> special namespace (catalog + database) for built-in objects? This catalog
>>>>> would be invisible for users as Xuefu was suggesting.
>>>>>
>>>>> Then users could still override built-in functions, if they fully qualify
>>>>> object with the built-in namespace, but by default the common logic of
>>>>> current dB & cat would be used.
>>>>>
>>>>> CREATE TEMPORARY FUNCTION func ...
>>>>> registers temporary function in current cat & dB
>>>>>
>>>>> CREATE TEMPORARY FUNCTION cat.db.func ...
>>>>> registers temporary function in cat db
>>>>>
>>>>> CREATE TEMPORARY FUNCTION system.system.func ...
>>>>> Overrides built-in function with temporary function
>>>>>
>>>>> The built-in/system namespace would not be writable for permanent
>>>> objects.
>>>>> WDYT?
>>>>>
>>>>> This way I think we can have benefits of both solutions.
>>>>>
>>>>> Best,
>>>>> Dawid
>>>>>
>>>>>
>>>>> On Tue, 17 Sep 2019, 07:24 Timo Walther, <[hidden email]> wrote:
>>>>>
>>>>>> Hi Bowen,
>>>>>>
>>>>>> I understand the potential benefit of overriding certain built-in
>>>>>> functions. I'm open to such a feature if many people agree. However, it
>>>>>> would be great to still support overriding catalog functions with
>>>>>> temporary functions in order to prototype a query even though a
>>>>>> catalog/database might not be available currently or should not be
>>>>>> modified yet. How about we support both cases?
>>>>>>
>>>>>> CREATE TEMPORARY FUNCTION abs
>>>>>> -> creates/overrides a built-in function and never consideres current
>>>>>> catalog and database; inconsistent with other DDL but acceptable for
>>>>>> functions I guess.
>>>>>> CREATE TEMPORARY FUNCTION cat.db.fun
>>>>>> -> creates/overrides a catalog function
>>>>>>
>>>>>> Regarding "Flink don't have any other built-in objects (tables, views)
>>>>>> except functions", this might change in the near future. Take
>>>>>> https://issues.apache.org/jira/browse/FLINK-13900 as an example.
>>>>>>
>>>>>> Thanks,
>>>>>> Timo
>>>>>>
>>>>>> On 14.09.19 01:40, Bowen Li wrote:
>>>>>>> Hi Fabian,
>>>>>>>
>>>>>>> Yes, I agree 1-part/no-override is the least favorable thus I didn't
>>>>>>> include that as a voting option, and the discussion is mainly between
>>>>>>> 1-part/override builtin and 3-part/not override builtin.
>>>>>>>
>>>>>>> Re > However, it means that temp functions are differently treated
>>>> than
>>>>>>> other db objects.
>>>>>>> IMO, the treatment difference results from the fact that functions
>>>> are
>>>>> a
>>>>>>> bit different from other objects - Flink don't have any other
>>>> built-in
>>>>>>> objects (tables, views) except functions.
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Bowen
>>>>>>>
>>>>>>
>>>>
>>>> --
>>>> Xuefu Zhang
>>>>
>>>> "In Honey We Trust!"
>>>>
>>
>

Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-57 - Rework FunctionCatalog

Xuefu Z
Hi Aljoscha,

Thanks for the summary and these are great questions to be answered. The
answer to your first question is clear: there is a general agreement to
override built-in functions with temp functions.

However, your second and third questions are sort of related, as a function
reference can be either just function name (like "func") or in the form or
"cat.db.func". When a reference is just function name, it can mean either a
built-in function or a function defined in the current cat/db. If we
support overriding a built-in function with a temp function, such
overriding can also cover a function in the current cat/db.

I think what Timo referred as "overriding a catalog function" means a temp
function defined as "cat.db.func" overrides a catalog function "func" in
cat/db even if cat/db is not current. To support this, temp function has to
be tied to a cat/db. What's why I said above that the 2nd and 3rd questions
are related. The problem with such support is the ambiguity when user
defines a function w/o namespace, "CREATE TEMPORARY FUNCTION func ...".
Here "func" can means a global temp function, or a temp function in current
cat/db. If we can assume the former, this creates an inconsistency because
"CREATE FUNCTION func" actually means a function in current cat/db. If we
assume the latter, then there is no way for user to create a global temp
function.

Giving a special namespace for built-in functions may solve the ambiguity
problem above, but it also introduces artificial catalog/database that
needs special treatment and pollutes the cleanness of  the code. I would
rather introduce a syntax in DDL to solve the problem, like "CREATE
[GLOBAL] TEMPORARY FUNCTION func".

Thus, I'd like to summarize a few candidate proposals for voting purposes:

1. Support only global, temporary functions without namespace. Such temp
functions overrides built-in functions and catalog functions in current
cat/db. The resolution order is: temp functions -> built-in functions ->
catalog functions. (Partially or fully qualified functions has no
ambiguity!)

2. In addition to #1, support creating and referencing temporary functions
associated with a cat/db with "GLOBAL" qualifier in DDL for global temp
functions. The resolution order is: global temp functions -> built-in
functions -> temp functions in current cat/db -> catalog function.
(Resolution for partially or fully qualified function reference is: temp
functions -> persistent functions.)

3. In addition to #1, support creating and referencing temporary functions
associated with a cat/db with a special namespace for built-in functions
and global temp functions. The resolution is the same as #2, except that
the special namespace might be prefixed to a reference to a built-in
function or global temp function. (In absence of the special namespace, the
resolution order is the same as in #2.)

My personal preference is #1, given the unknown use case and introduced
complexity for #2 and #3. However, #2 is an acceptable alternative. Thus,
my votes are:

+1 for #1
+0 for #2
-1 for #3

Everyone, please cast your vote (in above format please!), or let me know
if you have more questions or other candidates.

Thanks,
Xuefu







On Wed, Sep 18, 2019 at 6:42 AM Aljoscha Krettek <[hidden email]>
wrote:

> Hi,
>
> I think this discussion and the one for FLIP-64 are very connected. To
> resolve the differences, think we have to think about the basic principles
> and find consensus there. The basic questions I see are:
>
>  - Do we want to support overriding builtin functions?
>  - Do we want to support overriding catalog functions?
>  - And then later: should temporary functions be tied to a
> catalog/database?
>
> I don’t have much to say about these, except that we should somewhat stick
> to what the industry does. But I also understand that the industry is
> already very divided on this.
>
> Best,
> Aljoscha
>
> > On 18. Sep 2019, at 11:41, Jark Wu <[hidden email]> wrote:
> >
> > Hi,
> >
> > +1 to strive for reaching consensus on the remaining topics. We are
> close to the truth. It will waste a lot of time if we resume the topic some
> time later.
> >
> > +1 to “1-part/override” and I’m also fine with Timo’s “cat.db.fun” way
> to override a catalog function.
> >
> > I’m not sure about “system.system.fun”, it introduces a nonexistent cat
> & db? And we still need to do special treatment for the dedicated
> system.system cat & db?
> >
> > Best,
> > Jark
> >
> >
> >> 在 2019年9月18日,06:54,Timo Walther <[hidden email]> 写道:
> >>
> >> Hi everyone,
> >>
> >> @Xuefu: I would like to avoid adding too many things incrementally.
> Users should be able to override all catalog objects consistently according
> to FLIP-64 (Support for Temporary Objects in Table module). If functions
> are treated completely different, we need more code and special cases. From
> an implementation perspective, this topic only affects the lookup logic
> which is rather low implementation effort which is why I would like to
> clarify the remaining items. As you said, we have a slight consenus on
> overriding built-in functions; we should also strive for reaching consensus
> on the remaining topics.
> >>
> >> @Dawid: I like your idea as it ensures registering catalog objects
> consistent and the overriding of built-in functions more explicit.
> >>
> >> Thanks,
> >> Timo
> >>
> >>
> >> On 17.09.19 11:59, kai wang wrote:
> >>> hi, everyone
> >>> I think this flip is very meaningful. it supports functions that can be
> >>> shared by different catalogs and dbs, reducing the duplication of
> functions.
> >>>
> >>> Our group based on flink's sql parser module implements create function
> >>> feature, stores the parsed function metadata and schema into mysql, and
> >>> also customizes the catalog, customizes sql-client to support custom
> >>> schemas and functions. Loaded, but the function is currently global,
> and is
> >>> not subdivided according to catalog and db.
> >>>
> >>> In addition, I very much hope to participate in the development of this
> >>> flip, I have been paying attention to the community, but found it is
> more
> >>> difficult to join.
> >>> thank you.
> >>>
> >>> Xuefu Z <[hidden email]> 于2019年9月17日周二 上午11:19写道:
> >>>
> >>>> Thanks to Tmo and Dawid for sharing thoughts.
> >>>>
> >>>> It seems to me that there is a general consensus on having temp
> functions
> >>>> that have no namespaces and overwrite built-in functions. (As a side
> note
> >>>> for comparability, the current user defined functions are all
> temporary and
> >>>> having no namespaces.)
> >>>>
> >>>> Nevertheless, I can also see the merit of having namespaced temp
> functions
> >>>> that can overwrite functions defined in a specific cat/db. However,
> this
> >>>> idea appears orthogonal to the former and can be added incrementally.
> >>>>
> >>>> How about we first implement non-namespaced temp functions now and
> leave
> >>>> the door open for namespaced ones for later releases as the
> requirement
> >>>> might become more crystal? This also helps shorten the debate and
> allow us
> >>>> to make some progress along this direction.
> >>>>
> >>>> As to Dawid's idea of having a dedicated cat/db to host the temporary
> temp
> >>>> functions that don't have namespaces, my only concern is the special
> >>>> treatment for a cat/db, which makes code less clean, as evident in
> treating
> >>>> the built-in catalog currently.
> >>>>
> >>>> Thanks,
> >>>> Xuefiu
> >>>>
> >>>> On Mon, Sep 16, 2019 at 5:07 PM Dawid Wysakowicz <
> >>>> [hidden email]>
> >>>> wrote:
> >>>>
> >>>>> Hi,
> >>>>> Another idea to consider on top of Timo's suggestion. How about we
> have a
> >>>>> special namespace (catalog + database) for built-in objects? This
> catalog
> >>>>> would be invisible for users as Xuefu was suggesting.
> >>>>>
> >>>>> Then users could still override built-in functions, if they fully
> qualify
> >>>>> object with the built-in namespace, but by default the common logic
> of
> >>>>> current dB & cat would be used.
> >>>>>
> >>>>> CREATE TEMPORARY FUNCTION func ...
> >>>>> registers temporary function in current cat & dB
> >>>>>
> >>>>> CREATE TEMPORARY FUNCTION cat.db.func ...
> >>>>> registers temporary function in cat db
> >>>>>
> >>>>> CREATE TEMPORARY FUNCTION system.system.func ...
> >>>>> Overrides built-in function with temporary function
> >>>>>
> >>>>> The built-in/system namespace would not be writable for permanent
> >>>> objects.
> >>>>> WDYT?
> >>>>>
> >>>>> This way I think we can have benefits of both solutions.
> >>>>>
> >>>>> Best,
> >>>>> Dawid
> >>>>>
> >>>>>
> >>>>> On Tue, 17 Sep 2019, 07:24 Timo Walther, <[hidden email]> wrote:
> >>>>>
> >>>>>> Hi Bowen,
> >>>>>>
> >>>>>> I understand the potential benefit of overriding certain built-in
> >>>>>> functions. I'm open to such a feature if many people agree.
> However, it
> >>>>>> would be great to still support overriding catalog functions with
> >>>>>> temporary functions in order to prototype a query even though a
> >>>>>> catalog/database might not be available currently or should not be
> >>>>>> modified yet. How about we support both cases?
> >>>>>>
> >>>>>> CREATE TEMPORARY FUNCTION abs
> >>>>>> -> creates/overrides a built-in function and never consideres
> current
> >>>>>> catalog and database; inconsistent with other DDL but acceptable for
> >>>>>> functions I guess.
> >>>>>> CREATE TEMPORARY FUNCTION cat.db.fun
> >>>>>> -> creates/overrides a catalog function
> >>>>>>
> >>>>>> Regarding "Flink don't have any other built-in objects (tables,
> views)
> >>>>>> except functions", this might change in the near future. Take
> >>>>>> https://issues.apache.org/jira/browse/FLINK-13900 as an example.
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Timo
> >>>>>>
> >>>>>> On 14.09.19 01:40, Bowen Li wrote:
> >>>>>>> Hi Fabian,
> >>>>>>>
> >>>>>>> Yes, I agree 1-part/no-override is the least favorable thus I
> didn't
> >>>>>>> include that as a voting option, and the discussion is mainly
> between
> >>>>>>> 1-part/override builtin and 3-part/not override builtin.
> >>>>>>>
> >>>>>>> Re > However, it means that temp functions are differently treated
> >>>> than
> >>>>>>> other db objects.
> >>>>>>> IMO, the treatment difference results from the fact that functions
> >>>> are
> >>>>> a
> >>>>>>> bit different from other objects - Flink don't have any other
> >>>> built-in
> >>>>>>> objects (tables, views) except functions.
> >>>>>>>
> >>>>>>> Cheers,
> >>>>>>> Bowen
> >>>>>>>
> >>>>>>
> >>>>
> >>>> --
> >>>> Xuefu Zhang
> >>>>
> >>>> "In Honey We Trust!"
> >>>>
> >>
> >
>
>

--
Xuefu Zhang

"In Honey We Trust!"
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-57 - Rework FunctionCatalog

Dawid Wysakowicz
Hi,
I think it makes sense to start voting at this point.

Option 1: Only 1-part identifiers
PROS:
- allows shadowing built-in functions
CONS:
- incosistent with all the other objects, both permanent & temporary
- does not allow shadowing catalog functions

Option 2: Special keyword for built-in function
I think this is quite similar to the special catalog/db. The thing I am
strongly against in this proposal is the GLOBAL keyword. This keyword has a
meaning in rdbms systems and means a function that is present for a
lifetime of a session in which it was created, but available in all other
sessions. Therefore I really don't want to use this keyword in a different
context.

Option 3: Special catalog/db

PROS:
- allows shadowing built-in functions
- allows shadowing catalog functions
- consistent with other objects
CONS:
- we introduce a special namespace for built-in functions

I don't see a problem with introducing the special namespace. In the end it
is very similar to the keyword approach. In this case the catalog/db
combination would be the "keyword"

Therefore my votes:
Option 1: -0
Option 2: -1 (I might change to +0 if we can come up with a better keyword)
Option 3: +1

Best,
Dawid


On Thu, 19 Sep 2019, 05:12 Xuefu Z, <[hidden email]> wrote:

> Hi Aljoscha,
>
> Thanks for the summary and these are great questions to be answered. The
> answer to your first question is clear: there is a general agreement to
> override built-in functions with temp functions.
>
> However, your second and third questions are sort of related, as a function
> reference can be either just function name (like "func") or in the form or
> "cat.db.func". When a reference is just function name, it can mean either a
> built-in function or a function defined in the current cat/db. If we
> support overriding a built-in function with a temp function, such
> overriding can also cover a function in the current cat/db.
>
> I think what Timo referred as "overriding a catalog function" means a temp
> function defined as "cat.db.func" overrides a catalog function "func" in
> cat/db even if cat/db is not current. To support this, temp function has to
> be tied to a cat/db. What's why I said above that the 2nd and 3rd questions
> are related. The problem with such support is the ambiguity when user
> defines a function w/o namespace, "CREATE TEMPORARY FUNCTION func ...".
> Here "func" can means a global temp function, or a temp function in current
> cat/db. If we can assume the former, this creates an inconsistency because
> "CREATE FUNCTION func" actually means a function in current cat/db. If we
> assume the latter, then there is no way for user to create a global temp
> function.
>
> Giving a special namespace for built-in functions may solve the ambiguity
> problem above, but it also introduces artificial catalog/database that
> needs special treatment and pollutes the cleanness of  the code. I would
> rather introduce a syntax in DDL to solve the problem, like "CREATE
> [GLOBAL] TEMPORARY FUNCTION func".
>
> Thus, I'd like to summarize a few candidate proposals for voting purposes:
>
> 1. Support only global, temporary functions without namespace. Such temp
> functions overrides built-in functions and catalog functions in current
> cat/db. The resolution order is: temp functions -> built-in functions ->
> catalog functions. (Partially or fully qualified functions has no
> ambiguity!)
>
> 2. In addition to #1, support creating and referencing temporary functions
> associated with a cat/db with "GLOBAL" qualifier in DDL for global temp
> functions. The resolution order is: global temp functions -> built-in
> functions -> temp functions in current cat/db -> catalog function.
> (Resolution for partially or fully qualified function reference is: temp
> functions -> persistent functions.)
>
> 3. In addition to #1, support creating and referencing temporary functions
> associated with a cat/db with a special namespace for built-in functions
> and global temp functions. The resolution is the same as #2, except that
> the special namespace might be prefixed to a reference to a built-in
> function or global temp function. (In absence of the special namespace, the
> resolution order is the same as in #2.)
>
> My personal preference is #1, given the unknown use case and introduced
> complexity for #2 and #3. However, #2 is an acceptable alternative. Thus,
> my votes are:
>
> +1 for #1
> +0 for #2
> -1 for #3
>
> Everyone, please cast your vote (in above format please!), or let me know
> if you have more questions or other candidates.
>
> Thanks,
> Xuefu
>
>
>
>
>
>
>
> On Wed, Sep 18, 2019 at 6:42 AM Aljoscha Krettek <[hidden email]>
> wrote:
>
> > Hi,
> >
> > I think this discussion and the one for FLIP-64 are very connected. To
> > resolve the differences, think we have to think about the basic
> principles
> > and find consensus there. The basic questions I see are:
> >
> >  - Do we want to support overriding builtin functions?
> >  - Do we want to support overriding catalog functions?
> >  - And then later: should temporary functions be tied to a
> > catalog/database?
> >
> > I don’t have much to say about these, except that we should somewhat
> stick
> > to what the industry does. But I also understand that the industry is
> > already very divided on this.
> >
> > Best,
> > Aljoscha
> >
> > > On 18. Sep 2019, at 11:41, Jark Wu <[hidden email]> wrote:
> > >
> > > Hi,
> > >
> > > +1 to strive for reaching consensus on the remaining topics. We are
> > close to the truth. It will waste a lot of time if we resume the topic
> some
> > time later.
> > >
> > > +1 to “1-part/override” and I’m also fine with Timo’s “cat.db.fun” way
> > to override a catalog function.
> > >
> > > I’m not sure about “system.system.fun”, it introduces a nonexistent cat
> > & db? And we still need to do special treatment for the dedicated
> > system.system cat & db?
> > >
> > > Best,
> > > Jark
> > >
> > >
> > >> 在 2019年9月18日,06:54,Timo Walther <[hidden email]> 写道:
> > >>
> > >> Hi everyone,
> > >>
> > >> @Xuefu: I would like to avoid adding too many things incrementally.
> > Users should be able to override all catalog objects consistently
> according
> > to FLIP-64 (Support for Temporary Objects in Table module). If functions
> > are treated completely different, we need more code and special cases.
> From
> > an implementation perspective, this topic only affects the lookup logic
> > which is rather low implementation effort which is why I would like to
> > clarify the remaining items. As you said, we have a slight consenus on
> > overriding built-in functions; we should also strive for reaching
> consensus
> > on the remaining topics.
> > >>
> > >> @Dawid: I like your idea as it ensures registering catalog objects
> > consistent and the overriding of built-in functions more explicit.
> > >>
> > >> Thanks,
> > >> Timo
> > >>
> > >>
> > >> On 17.09.19 11:59, kai wang wrote:
> > >>> hi, everyone
> > >>> I think this flip is very meaningful. it supports functions that can
> be
> > >>> shared by different catalogs and dbs, reducing the duplication of
> > functions.
> > >>>
> > >>> Our group based on flink's sql parser module implements create
> function
> > >>> feature, stores the parsed function metadata and schema into mysql,
> and
> > >>> also customizes the catalog, customizes sql-client to support custom
> > >>> schemas and functions. Loaded, but the function is currently global,
> > and is
> > >>> not subdivided according to catalog and db.
> > >>>
> > >>> In addition, I very much hope to participate in the development of
> this
> > >>> flip, I have been paying attention to the community, but found it is
> > more
> > >>> difficult to join.
> > >>> thank you.
> > >>>
> > >>> Xuefu Z <[hidden email]> 于2019年9月17日周二 上午11:19写道:
> > >>>
> > >>>> Thanks to Tmo and Dawid for sharing thoughts.
> > >>>>
> > >>>> It seems to me that there is a general consensus on having temp
> > functions
> > >>>> that have no namespaces and overwrite built-in functions. (As a side
> > note
> > >>>> for comparability, the current user defined functions are all
> > temporary and
> > >>>> having no namespaces.)
> > >>>>
> > >>>> Nevertheless, I can also see the merit of having namespaced temp
> > functions
> > >>>> that can overwrite functions defined in a specific cat/db. However,
> > this
> > >>>> idea appears orthogonal to the former and can be added
> incrementally.
> > >>>>
> > >>>> How about we first implement non-namespaced temp functions now and
> > leave
> > >>>> the door open for namespaced ones for later releases as the
> > requirement
> > >>>> might become more crystal? This also helps shorten the debate and
> > allow us
> > >>>> to make some progress along this direction.
> > >>>>
> > >>>> As to Dawid's idea of having a dedicated cat/db to host the
> temporary
> > temp
> > >>>> functions that don't have namespaces, my only concern is the special
> > >>>> treatment for a cat/db, which makes code less clean, as evident in
> > treating
> > >>>> the built-in catalog currently.
> > >>>>
> > >>>> Thanks,
> > >>>> Xuefiu
> > >>>>
> > >>>> On Mon, Sep 16, 2019 at 5:07 PM Dawid Wysakowicz <
> > >>>> [hidden email]>
> > >>>> wrote:
> > >>>>
> > >>>>> Hi,
> > >>>>> Another idea to consider on top of Timo's suggestion. How about we
> > have a
> > >>>>> special namespace (catalog + database) for built-in objects? This
> > catalog
> > >>>>> would be invisible for users as Xuefu was suggesting.
> > >>>>>
> > >>>>> Then users could still override built-in functions, if they fully
> > qualify
> > >>>>> object with the built-in namespace, but by default the common logic
> > of
> > >>>>> current dB & cat would be used.
> > >>>>>
> > >>>>> CREATE TEMPORARY FUNCTION func ...
> > >>>>> registers temporary function in current cat & dB
> > >>>>>
> > >>>>> CREATE TEMPORARY FUNCTION cat.db.func ...
> > >>>>> registers temporary function in cat db
> > >>>>>
> > >>>>> CREATE TEMPORARY FUNCTION system.system.func ...
> > >>>>> Overrides built-in function with temporary function
> > >>>>>
> > >>>>> The built-in/system namespace would not be writable for permanent
> > >>>> objects.
> > >>>>> WDYT?
> > >>>>>
> > >>>>> This way I think we can have benefits of both solutions.
> > >>>>>
> > >>>>> Best,
> > >>>>> Dawid
> > >>>>>
> > >>>>>
> > >>>>> On Tue, 17 Sep 2019, 07:24 Timo Walther, <[hidden email]>
> wrote:
> > >>>>>
> > >>>>>> Hi Bowen,
> > >>>>>>
> > >>>>>> I understand the potential benefit of overriding certain built-in
> > >>>>>> functions. I'm open to such a feature if many people agree.
> > However, it
> > >>>>>> would be great to still support overriding catalog functions with
> > >>>>>> temporary functions in order to prototype a query even though a
> > >>>>>> catalog/database might not be available currently or should not be
> > >>>>>> modified yet. How about we support both cases?
> > >>>>>>
> > >>>>>> CREATE TEMPORARY FUNCTION abs
> > >>>>>> -> creates/overrides a built-in function and never consideres
> > current
> > >>>>>> catalog and database; inconsistent with other DDL but acceptable
> for
> > >>>>>> functions I guess.
> > >>>>>> CREATE TEMPORARY FUNCTION cat.db.fun
> > >>>>>> -> creates/overrides a catalog function
> > >>>>>>
> > >>>>>> Regarding "Flink don't have any other built-in objects (tables,
> > views)
> > >>>>>> except functions", this might change in the near future. Take
> > >>>>>> https://issues.apache.org/jira/browse/FLINK-13900 as an example.
> > >>>>>>
> > >>>>>> Thanks,
> > >>>>>> Timo
> > >>>>>>
> > >>>>>> On 14.09.19 01:40, Bowen Li wrote:
> > >>>>>>> Hi Fabian,
> > >>>>>>>
> > >>>>>>> Yes, I agree 1-part/no-override is the least favorable thus I
> > didn't
> > >>>>>>> include that as a voting option, and the discussion is mainly
> > between
> > >>>>>>> 1-part/override builtin and 3-part/not override builtin.
> > >>>>>>>
> > >>>>>>> Re > However, it means that temp functions are differently
> treated
> > >>>> than
> > >>>>>>> other db objects.
> > >>>>>>> IMO, the treatment difference results from the fact that
> functions
> > >>>> are
> > >>>>> a
> > >>>>>>> bit different from other objects - Flink don't have any other
> > >>>> built-in
> > >>>>>>> objects (tables, views) except functions.
> > >>>>>>>
> > >>>>>>> Cheers,
> > >>>>>>> Bowen
> > >>>>>>>
> > >>>>>>
> > >>>>
> > >>>> --
> > >>>> Xuefu Zhang
> > >>>>
> > >>>> "In Honey We Trust!"
> > >>>>
> > >>
> > >
> >
> >
>
> --
> Xuefu Zhang
>
> "In Honey We Trust!"
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-57 - Rework FunctionCatalog

Dawid Wysakowicz
Last additional comment on Option 2. The reason why I prefer option 3 is
that in option 3 all objects internally are identified with 3 parts. This
makes it easier to handle at different locations e.g. while persisting
views, as all objects have uniform representation.

On Thu, 19 Sep 2019, 07:31 Dawid Wysakowicz, <[hidden email]>
wrote:

> Hi,
> I think it makes sense to start voting at this point.
>
> Option 1: Only 1-part identifiers
> PROS:
> - allows shadowing built-in functions
> CONS:
> - incosistent with all the other objects, both permanent & temporary
> - does not allow shadowing catalog functions
>
> Option 2: Special keyword for built-in function
> I think this is quite similar to the special catalog/db. The thing I am
> strongly against in this proposal is the GLOBAL keyword. This keyword has a
> meaning in rdbms systems and means a function that is present for a
> lifetime of a session in which it was created, but available in all other
> sessions. Therefore I really don't want to use this keyword in a different
> context.
>
> Option 3: Special catalog/db
>
> PROS:
> - allows shadowing built-in functions
> - allows shadowing catalog functions
> - consistent with other objects
> CONS:
> - we introduce a special namespace for built-in functions
>
> I don't see a problem with introducing the special namespace. In the end
> it is very similar to the keyword approach. In this case the catalog/db
> combination would be the "keyword"
>
> Therefore my votes:
> Option 1: -0
> Option 2: -1 (I might change to +0 if we can come up with a better keyword)
> Option 3: +1
>
> Best,
> Dawid
>
>
> On Thu, 19 Sep 2019, 05:12 Xuefu Z, <[hidden email]> wrote:
>
>> Hi Aljoscha,
>>
>> Thanks for the summary and these are great questions to be answered. The
>> answer to your first question is clear: there is a general agreement to
>> override built-in functions with temp functions.
>>
>> However, your second and third questions are sort of related, as a
>> function
>> reference can be either just function name (like "func") or in the form or
>> "cat.db.func". When a reference is just function name, it can mean either
>> a
>> built-in function or a function defined in the current cat/db. If we
>> support overriding a built-in function with a temp function, such
>> overriding can also cover a function in the current cat/db.
>>
>> I think what Timo referred as "overriding a catalog function" means a temp
>> function defined as "cat.db.func" overrides a catalog function "func" in
>> cat/db even if cat/db is not current. To support this, temp function has
>> to
>> be tied to a cat/db. What's why I said above that the 2nd and 3rd
>> questions
>> are related. The problem with such support is the ambiguity when user
>> defines a function w/o namespace, "CREATE TEMPORARY FUNCTION func ...".
>> Here "func" can means a global temp function, or a temp function in
>> current
>> cat/db. If we can assume the former, this creates an inconsistency because
>> "CREATE FUNCTION func" actually means a function in current cat/db. If we
>> assume the latter, then there is no way for user to create a global temp
>> function.
>>
>> Giving a special namespace for built-in functions may solve the ambiguity
>> problem above, but it also introduces artificial catalog/database that
>> needs special treatment and pollutes the cleanness of  the code. I would
>> rather introduce a syntax in DDL to solve the problem, like "CREATE
>> [GLOBAL] TEMPORARY FUNCTION func".
>>
>> Thus, I'd like to summarize a few candidate proposals for voting purposes:
>>
>> 1. Support only global, temporary functions without namespace. Such temp
>> functions overrides built-in functions and catalog functions in current
>> cat/db. The resolution order is: temp functions -> built-in functions ->
>> catalog functions. (Partially or fully qualified functions has no
>> ambiguity!)
>>
>> 2. In addition to #1, support creating and referencing temporary functions
>> associated with a cat/db with "GLOBAL" qualifier in DDL for global temp
>> functions. The resolution order is: global temp functions -> built-in
>> functions -> temp functions in current cat/db -> catalog function.
>> (Resolution for partially or fully qualified function reference is: temp
>> functions -> persistent functions.)
>>
>> 3. In addition to #1, support creating and referencing temporary functions
>> associated with a cat/db with a special namespace for built-in functions
>> and global temp functions. The resolution is the same as #2, except that
>> the special namespace might be prefixed to a reference to a built-in
>> function or global temp function. (In absence of the special namespace,
>> the
>> resolution order is the same as in #2.)
>>
>> My personal preference is #1, given the unknown use case and introduced
>> complexity for #2 and #3. However, #2 is an acceptable alternative. Thus,
>> my votes are:
>>
>> +1 for #1
>> +0 for #2
>> -1 for #3
>>
>> Everyone, please cast your vote (in above format please!), or let me know
>> if you have more questions or other candidates.
>>
>> Thanks,
>> Xuefu
>>
>>
>>
>>
>>
>>
>>
>> On Wed, Sep 18, 2019 at 6:42 AM Aljoscha Krettek <[hidden email]>
>> wrote:
>>
>> > Hi,
>> >
>> > I think this discussion and the one for FLIP-64 are very connected. To
>> > resolve the differences, think we have to think about the basic
>> principles
>> > and find consensus there. The basic questions I see are:
>> >
>> >  - Do we want to support overriding builtin functions?
>> >  - Do we want to support overriding catalog functions?
>> >  - And then later: should temporary functions be tied to a
>> > catalog/database?
>> >
>> > I don’t have much to say about these, except that we should somewhat
>> stick
>> > to what the industry does. But I also understand that the industry is
>> > already very divided on this.
>> >
>> > Best,
>> > Aljoscha
>> >
>> > > On 18. Sep 2019, at 11:41, Jark Wu <[hidden email]> wrote:
>> > >
>> > > Hi,
>> > >
>> > > +1 to strive for reaching consensus on the remaining topics. We are
>> > close to the truth. It will waste a lot of time if we resume the topic
>> some
>> > time later.
>> > >
>> > > +1 to “1-part/override” and I’m also fine with Timo’s “cat.db.fun” way
>> > to override a catalog function.
>> > >
>> > > I’m not sure about “system.system.fun”, it introduces a nonexistent
>> cat
>> > & db? And we still need to do special treatment for the dedicated
>> > system.system cat & db?
>> > >
>> > > Best,
>> > > Jark
>> > >
>> > >
>> > >> 在 2019年9月18日,06:54,Timo Walther <[hidden email]> 写道:
>> > >>
>> > >> Hi everyone,
>> > >>
>> > >> @Xuefu: I would like to avoid adding too many things incrementally.
>> > Users should be able to override all catalog objects consistently
>> according
>> > to FLIP-64 (Support for Temporary Objects in Table module). If functions
>> > are treated completely different, we need more code and special cases.
>> From
>> > an implementation perspective, this topic only affects the lookup logic
>> > which is rather low implementation effort which is why I would like to
>> > clarify the remaining items. As you said, we have a slight consenus on
>> > overriding built-in functions; we should also strive for reaching
>> consensus
>> > on the remaining topics.
>> > >>
>> > >> @Dawid: I like your idea as it ensures registering catalog objects
>> > consistent and the overriding of built-in functions more explicit.
>> > >>
>> > >> Thanks,
>> > >> Timo
>> > >>
>> > >>
>> > >> On 17.09.19 11:59, kai wang wrote:
>> > >>> hi, everyone
>> > >>> I think this flip is very meaningful. it supports functions that
>> can be
>> > >>> shared by different catalogs and dbs, reducing the duplication of
>> > functions.
>> > >>>
>> > >>> Our group based on flink's sql parser module implements create
>> function
>> > >>> feature, stores the parsed function metadata and schema into mysql,
>> and
>> > >>> also customizes the catalog, customizes sql-client to support custom
>> > >>> schemas and functions. Loaded, but the function is currently global,
>> > and is
>> > >>> not subdivided according to catalog and db.
>> > >>>
>> > >>> In addition, I very much hope to participate in the development of
>> this
>> > >>> flip, I have been paying attention to the community, but found it is
>> > more
>> > >>> difficult to join.
>> > >>> thank you.
>> > >>>
>> > >>> Xuefu Z <[hidden email]> 于2019年9月17日周二 上午11:19写道:
>> > >>>
>> > >>>> Thanks to Tmo and Dawid for sharing thoughts.
>> > >>>>
>> > >>>> It seems to me that there is a general consensus on having temp
>> > functions
>> > >>>> that have no namespaces and overwrite built-in functions. (As a
>> side
>> > note
>> > >>>> for comparability, the current user defined functions are all
>> > temporary and
>> > >>>> having no namespaces.)
>> > >>>>
>> > >>>> Nevertheless, I can also see the merit of having namespaced temp
>> > functions
>> > >>>> that can overwrite functions defined in a specific cat/db. However,
>> > this
>> > >>>> idea appears orthogonal to the former and can be added
>> incrementally.
>> > >>>>
>> > >>>> How about we first implement non-namespaced temp functions now and
>> > leave
>> > >>>> the door open for namespaced ones for later releases as the
>> > requirement
>> > >>>> might become more crystal? This also helps shorten the debate and
>> > allow us
>> > >>>> to make some progress along this direction.
>> > >>>>
>> > >>>> As to Dawid's idea of having a dedicated cat/db to host the
>> temporary
>> > temp
>> > >>>> functions that don't have namespaces, my only concern is the
>> special
>> > >>>> treatment for a cat/db, which makes code less clean, as evident in
>> > treating
>> > >>>> the built-in catalog currently.
>> > >>>>
>> > >>>> Thanks,
>> > >>>> Xuefiu
>> > >>>>
>> > >>>> On Mon, Sep 16, 2019 at 5:07 PM Dawid Wysakowicz <
>> > >>>> [hidden email]>
>> > >>>> wrote:
>> > >>>>
>> > >>>>> Hi,
>> > >>>>> Another idea to consider on top of Timo's suggestion. How about we
>> > have a
>> > >>>>> special namespace (catalog + database) for built-in objects? This
>> > catalog
>> > >>>>> would be invisible for users as Xuefu was suggesting.
>> > >>>>>
>> > >>>>> Then users could still override built-in functions, if they fully
>> > qualify
>> > >>>>> object with the built-in namespace, but by default the common
>> logic
>> > of
>> > >>>>> current dB & cat would be used.
>> > >>>>>
>> > >>>>> CREATE TEMPORARY FUNCTION func ...
>> > >>>>> registers temporary function in current cat & dB
>> > >>>>>
>> > >>>>> CREATE TEMPORARY FUNCTION cat.db.func ...
>> > >>>>> registers temporary function in cat db
>> > >>>>>
>> > >>>>> CREATE TEMPORARY FUNCTION system.system.func ...
>> > >>>>> Overrides built-in function with temporary function
>> > >>>>>
>> > >>>>> The built-in/system namespace would not be writable for permanent
>> > >>>> objects.
>> > >>>>> WDYT?
>> > >>>>>
>> > >>>>> This way I think we can have benefits of both solutions.
>> > >>>>>
>> > >>>>> Best,
>> > >>>>> Dawid
>> > >>>>>
>> > >>>>>
>> > >>>>> On Tue, 17 Sep 2019, 07:24 Timo Walther, <[hidden email]>
>> wrote:
>> > >>>>>
>> > >>>>>> Hi Bowen,
>> > >>>>>>
>> > >>>>>> I understand the potential benefit of overriding certain built-in
>> > >>>>>> functions. I'm open to such a feature if many people agree.
>> > However, it
>> > >>>>>> would be great to still support overriding catalog functions with
>> > >>>>>> temporary functions in order to prototype a query even though a
>> > >>>>>> catalog/database might not be available currently or should not
>> be
>> > >>>>>> modified yet. How about we support both cases?
>> > >>>>>>
>> > >>>>>> CREATE TEMPORARY FUNCTION abs
>> > >>>>>> -> creates/overrides a built-in function and never consideres
>> > current
>> > >>>>>> catalog and database; inconsistent with other DDL but acceptable
>> for
>> > >>>>>> functions I guess.
>> > >>>>>> CREATE TEMPORARY FUNCTION cat.db.fun
>> > >>>>>> -> creates/overrides a catalog function
>> > >>>>>>
>> > >>>>>> Regarding "Flink don't have any other built-in objects (tables,
>> > views)
>> > >>>>>> except functions", this might change in the near future. Take
>> > >>>>>> https://issues.apache.org/jira/browse/FLINK-13900 as an example.
>> > >>>>>>
>> > >>>>>> Thanks,
>> > >>>>>> Timo
>> > >>>>>>
>> > >>>>>> On 14.09.19 01:40, Bowen Li wrote:
>> > >>>>>>> Hi Fabian,
>> > >>>>>>>
>> > >>>>>>> Yes, I agree 1-part/no-override is the least favorable thus I
>> > didn't
>> > >>>>>>> include that as a voting option, and the discussion is mainly
>> > between
>> > >>>>>>> 1-part/override builtin and 3-part/not override builtin.
>> > >>>>>>>
>> > >>>>>>> Re > However, it means that temp functions are differently
>> treated
>> > >>>> than
>> > >>>>>>> other db objects.
>> > >>>>>>> IMO, the treatment difference results from the fact that
>> functions
>> > >>>> are
>> > >>>>> a
>> > >>>>>>> bit different from other objects - Flink don't have any other
>> > >>>> built-in
>> > >>>>>>> objects (tables, views) except functions.
>> > >>>>>>>
>> > >>>>>>> Cheers,
>> > >>>>>>> Bowen
>> > >>>>>>>
>> > >>>>>>
>> > >>>>
>> > >>>> --
>> > >>>> Xuefu Zhang
>> > >>>>
>> > >>>> "In Honey We Trust!"
>> > >>>>
>> > >>
>> > >
>> >
>> >
>>
>> --
>> Xuefu Zhang
>>
>> "In Honey We Trust!"
>>
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-57 - Rework FunctionCatalog

bowen.li
Hi,

For #2, as Xuefu and I discussed offline, the key point is to introduce a
keyword to SQL DDL to distinguish temp function that override built-in
functions v.s. temp functions that override catalog functions. It can be
something else than "GLOBAL", like "BUILTIN" (e.g. "CREATE BUILTIN TEMP
FUNCTION") given there's no SQL standard for temp functions. This option is
more about whether we want to have such a new keyword in SQL for the
proposed functionality, and I personally think it's acceptable.

For #3, besides the drawbacks mentioned by Xuefu and Jark, another con is:
we already have a special catalog and db as "default_catalog" and
"default_database", though they are not used very correctly at the moment
(another story...), they are at least physically present. Introducing yet
another virtual special catalog/db as something like "system"."system" that
never physically exist would further confuse users, and
hurt understandability and usability.

Thus my vote is:

+1 for #1
0 for #2
-1 for #3

Thanks,
Bowen

On Wed, Sep 18, 2019 at 4:55 PM Dawid Wysakowicz <[hidden email]>
wrote:

> Last additional comment on Option 2. The reason why I prefer option 3 is
> that in option 3 all objects internally are identified with 3 parts. This
> makes it easier to handle at different locations e.g. while persisting
> views, as all objects have uniform representation.
>
> On Thu, 19 Sep 2019, 07:31 Dawid Wysakowicz, <[hidden email]>
> wrote:
>
> > Hi,
> > I think it makes sense to start voting at this point.
> >
> > Option 1: Only 1-part identifiers
> > PROS:
> > - allows shadowing built-in functions
> > CONS:
> > - incosistent with all the other objects, both permanent & temporary
> > - does not allow shadowing catalog functions
> >
> > Option 2: Special keyword for built-in function
> > I think this is quite similar to the special catalog/db. The thing I am
> > strongly against in this proposal is the GLOBAL keyword. This keyword
> has a
> > meaning in rdbms systems and means a function that is present for a
> > lifetime of a session in which it was created, but available in all other
> > sessions. Therefore I really don't want to use this keyword in a
> different
> > context.
> >
> > Option 3: Special catalog/db
> >
> > PROS:
> > - allows shadowing built-in functions
> > - allows shadowing catalog functions
> > - consistent with other objects
> > CONS:
> > - we introduce a special namespace for built-in functions
> >
> > I don't see a problem with introducing the special namespace. In the end
> > it is very similar to the keyword approach. In this case the catalog/db
> > combination would be the "keyword"
> >
> > Therefore my votes:
> > Option 1: -0
> > Option 2: -1 (I might change to +0 if we can come up with a better
> keyword)
> > Option 3: +1
> >
> > Best,
> > Dawid
> >
> >
> > On Thu, 19 Sep 2019, 05:12 Xuefu Z, <[hidden email]> wrote:
> >
> >> Hi Aljoscha,
> >>
> >> Thanks for the summary and these are great questions to be answered. The
> >> answer to your first question is clear: there is a general agreement to
> >> override built-in functions with temp functions.
> >>
> >> However, your second and third questions are sort of related, as a
> >> function
> >> reference can be either just function name (like "func") or in the form
> or
> >> "cat.db.func". When a reference is just function name, it can mean
> either
> >> a
> >> built-in function or a function defined in the current cat/db. If we
> >> support overriding a built-in function with a temp function, such
> >> overriding can also cover a function in the current cat/db.
> >>
> >> I think what Timo referred as "overriding a catalog function" means a
> temp
> >> function defined as "cat.db.func" overrides a catalog function "func" in
> >> cat/db even if cat/db is not current. To support this, temp function has
> >> to
> >> be tied to a cat/db. What's why I said above that the 2nd and 3rd
> >> questions
> >> are related. The problem with such support is the ambiguity when user
> >> defines a function w/o namespace, "CREATE TEMPORARY FUNCTION func ...".
> >> Here "func" can means a global temp function, or a temp function in
> >> current
> >> cat/db. If we can assume the former, this creates an inconsistency
> because
> >> "CREATE FUNCTION func" actually means a function in current cat/db. If
> we
> >> assume the latter, then there is no way for user to create a global temp
> >> function.
> >>
> >> Giving a special namespace for built-in functions may solve the
> ambiguity
> >> problem above, but it also introduces artificial catalog/database that
> >> needs special treatment and pollutes the cleanness of  the code. I would
> >> rather introduce a syntax in DDL to solve the problem, like "CREATE
> >> [GLOBAL] TEMPORARY FUNCTION func".
> >>
> >> Thus, I'd like to summarize a few candidate proposals for voting
> purposes:
> >>
> >> 1. Support only global, temporary functions without namespace. Such temp
> >> functions overrides built-in functions and catalog functions in current
> >> cat/db. The resolution order is: temp functions -> built-in functions ->
> >> catalog functions. (Partially or fully qualified functions has no
> >> ambiguity!)
> >>
> >> 2. In addition to #1, support creating and referencing temporary
> functions
> >> associated with a cat/db with "GLOBAL" qualifier in DDL for global temp
> >> functions. The resolution order is: global temp functions -> built-in
> >> functions -> temp functions in current cat/db -> catalog function.
> >> (Resolution for partially or fully qualified function reference is: temp
> >> functions -> persistent functions.)
> >>
> >> 3. In addition to #1, support creating and referencing temporary
> functions
> >> associated with a cat/db with a special namespace for built-in functions
> >> and global temp functions. The resolution is the same as #2, except that
> >> the special namespace might be prefixed to a reference to a built-in
> >> function or global temp function. (In absence of the special namespace,
> >> the
> >> resolution order is the same as in #2.)
> >>
> >> My personal preference is #1, given the unknown use case and introduced
> >> complexity for #2 and #3. However, #2 is an acceptable alternative.
> Thus,
> >> my votes are:
> >>
> >> +1 for #1
> >> +0 for #2
> >> -1 for #3
> >>
> >> Everyone, please cast your vote (in above format please!), or let me
> know
> >> if you have more questions or other candidates.
> >>
> >> Thanks,
> >> Xuefu
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >> On Wed, Sep 18, 2019 at 6:42 AM Aljoscha Krettek <[hidden email]>
> >> wrote:
> >>
> >> > Hi,
> >> >
> >> > I think this discussion and the one for FLIP-64 are very connected. To
> >> > resolve the differences, think we have to think about the basic
> >> principles
> >> > and find consensus there. The basic questions I see are:
> >> >
> >> >  - Do we want to support overriding builtin functions?
> >> >  - Do we want to support overriding catalog functions?
> >> >  - And then later: should temporary functions be tied to a
> >> > catalog/database?
> >> >
> >> > I don’t have much to say about these, except that we should somewhat
> >> stick
> >> > to what the industry does. But I also understand that the industry is
> >> > already very divided on this.
> >> >
> >> > Best,
> >> > Aljoscha
> >> >
> >> > > On 18. Sep 2019, at 11:41, Jark Wu <[hidden email]> wrote:
> >> > >
> >> > > Hi,
> >> > >
> >> > > +1 to strive for reaching consensus on the remaining topics. We are
> >> > close to the truth. It will waste a lot of time if we resume the topic
> >> some
> >> > time later.
> >> > >
> >> > > +1 to “1-part/override” and I’m also fine with Timo’s “cat.db.fun”
> way
> >> > to override a catalog function.
> >> > >
> >> > > I’m not sure about “system.system.fun”, it introduces a nonexistent
> >> cat
> >> > & db? And we still need to do special treatment for the dedicated
> >> > system.system cat & db?
> >> > >
> >> > > Best,
> >> > > Jark
> >> > >
> >> > >
> >> > >> 在 2019年9月18日,06:54,Timo Walther <[hidden email]> 写道:
> >> > >>
> >> > >> Hi everyone,
> >> > >>
> >> > >> @Xuefu: I would like to avoid adding too many things incrementally.
> >> > Users should be able to override all catalog objects consistently
> >> according
> >> > to FLIP-64 (Support for Temporary Objects in Table module). If
> functions
> >> > are treated completely different, we need more code and special cases.
> >> From
> >> > an implementation perspective, this topic only affects the lookup
> logic
> >> > which is rather low implementation effort which is why I would like to
> >> > clarify the remaining items. As you said, we have a slight consenus on
> >> > overriding built-in functions; we should also strive for reaching
> >> consensus
> >> > on the remaining topics.
> >> > >>
> >> > >> @Dawid: I like your idea as it ensures registering catalog objects
> >> > consistent and the overriding of built-in functions more explicit.
> >> > >>
> >> > >> Thanks,
> >> > >> Timo
> >> > >>
> >> > >>
> >> > >> On 17.09.19 11:59, kai wang wrote:
> >> > >>> hi, everyone
> >> > >>> I think this flip is very meaningful. it supports functions that
> >> can be
> >> > >>> shared by different catalogs and dbs, reducing the duplication of
> >> > functions.
> >> > >>>
> >> > >>> Our group based on flink's sql parser module implements create
> >> function
> >> > >>> feature, stores the parsed function metadata and schema into
> mysql,
> >> and
> >> > >>> also customizes the catalog, customizes sql-client to support
> custom
> >> > >>> schemas and functions. Loaded, but the function is currently
> global,
> >> > and is
> >> > >>> not subdivided according to catalog and db.
> >> > >>>
> >> > >>> In addition, I very much hope to participate in the development of
> >> this
> >> > >>> flip, I have been paying attention to the community, but found it
> is
> >> > more
> >> > >>> difficult to join.
> >> > >>> thank you.
> >> > >>>
> >> > >>> Xuefu Z <[hidden email]> 于2019年9月17日周二 上午11:19写道:
> >> > >>>
> >> > >>>> Thanks to Tmo and Dawid for sharing thoughts.
> >> > >>>>
> >> > >>>> It seems to me that there is a general consensus on having temp
> >> > functions
> >> > >>>> that have no namespaces and overwrite built-in functions. (As a
> >> side
> >> > note
> >> > >>>> for comparability, the current user defined functions are all
> >> > temporary and
> >> > >>>> having no namespaces.)
> >> > >>>>
> >> > >>>> Nevertheless, I can also see the merit of having namespaced temp
> >> > functions
> >> > >>>> that can overwrite functions defined in a specific cat/db.
> However,
> >> > this
> >> > >>>> idea appears orthogonal to the former and can be added
> >> incrementally.
> >> > >>>>
> >> > >>>> How about we first implement non-namespaced temp functions now
> and
> >> > leave
> >> > >>>> the door open for namespaced ones for later releases as the
> >> > requirement
> >> > >>>> might become more crystal? This also helps shorten the debate and
> >> > allow us
> >> > >>>> to make some progress along this direction.
> >> > >>>>
> >> > >>>> As to Dawid's idea of having a dedicated cat/db to host the
> >> temporary
> >> > temp
> >> > >>>> functions that don't have namespaces, my only concern is the
> >> special
> >> > >>>> treatment for a cat/db, which makes code less clean, as evident
> in
> >> > treating
> >> > >>>> the built-in catalog currently.
> >> > >>>>
> >> > >>>> Thanks,
> >> > >>>> Xuefiu
> >> > >>>>
> >> > >>>> On Mon, Sep 16, 2019 at 5:07 PM Dawid Wysakowicz <
> >> > >>>> [hidden email]>
> >> > >>>> wrote:
> >> > >>>>
> >> > >>>>> Hi,
> >> > >>>>> Another idea to consider on top of Timo's suggestion. How about
> we
> >> > have a
> >> > >>>>> special namespace (catalog + database) for built-in objects?
> This
> >> > catalog
> >> > >>>>> would be invisible for users as Xuefu was suggesting.
> >> > >>>>>
> >> > >>>>> Then users could still override built-in functions, if they
> fully
> >> > qualify
> >> > >>>>> object with the built-in namespace, but by default the common
> >> logic
> >> > of
> >> > >>>>> current dB & cat would be used.
> >> > >>>>>
> >> > >>>>> CREATE TEMPORARY FUNCTION func ...
> >> > >>>>> registers temporary function in current cat & dB
> >> > >>>>>
> >> > >>>>> CREATE TEMPORARY FUNCTION cat.db.func ...
> >> > >>>>> registers temporary function in cat db
> >> > >>>>>
> >> > >>>>> CREATE TEMPORARY FUNCTION system.system.func ...
> >> > >>>>> Overrides built-in function with temporary function
> >> > >>>>>
> >> > >>>>> The built-in/system namespace would not be writable for
> permanent
> >> > >>>> objects.
> >> > >>>>> WDYT?
> >> > >>>>>
> >> > >>>>> This way I think we can have benefits of both solutions.
> >> > >>>>>
> >> > >>>>> Best,
> >> > >>>>> Dawid
> >> > >>>>>
> >> > >>>>>
> >> > >>>>> On Tue, 17 Sep 2019, 07:24 Timo Walther, <[hidden email]>
> >> wrote:
> >> > >>>>>
> >> > >>>>>> Hi Bowen,
> >> > >>>>>>
> >> > >>>>>> I understand the potential benefit of overriding certain
> built-in
> >> > >>>>>> functions. I'm open to such a feature if many people agree.
> >> > However, it
> >> > >>>>>> would be great to still support overriding catalog functions
> with
> >> > >>>>>> temporary functions in order to prototype a query even though a
> >> > >>>>>> catalog/database might not be available currently or should not
> >> be
> >> > >>>>>> modified yet. How about we support both cases?
> >> > >>>>>>
> >> > >>>>>> CREATE TEMPORARY FUNCTION abs
> >> > >>>>>> -> creates/overrides a built-in function and never consideres
> >> > current
> >> > >>>>>> catalog and database; inconsistent with other DDL but
> acceptable
> >> for
> >> > >>>>>> functions I guess.
> >> > >>>>>> CREATE TEMPORARY FUNCTION cat.db.fun
> >> > >>>>>> -> creates/overrides a catalog function
> >> > >>>>>>
> >> > >>>>>> Regarding "Flink don't have any other built-in objects (tables,
> >> > views)
> >> > >>>>>> except functions", this might change in the near future. Take
> >> > >>>>>> https://issues.apache.org/jira/browse/FLINK-13900 as an
> example.
> >> > >>>>>>
> >> > >>>>>> Thanks,
> >> > >>>>>> Timo
> >> > >>>>>>
> >> > >>>>>> On 14.09.19 01:40, Bowen Li wrote:
> >> > >>>>>>> Hi Fabian,
> >> > >>>>>>>
> >> > >>>>>>> Yes, I agree 1-part/no-override is the least favorable thus I
> >> > didn't
> >> > >>>>>>> include that as a voting option, and the discussion is mainly
> >> > between
> >> > >>>>>>> 1-part/override builtin and 3-part/not override builtin.
> >> > >>>>>>>
> >> > >>>>>>> Re > However, it means that temp functions are differently
> >> treated
> >> > >>>> than
> >> > >>>>>>> other db objects.
> >> > >>>>>>> IMO, the treatment difference results from the fact that
> >> functions
> >> > >>>> are
> >> > >>>>> a
> >> > >>>>>>> bit different from other objects - Flink don't have any other
> >> > >>>> built-in
> >> > >>>>>>> objects (tables, views) except functions.
> >> > >>>>>>>
> >> > >>>>>>> Cheers,
> >> > >>>>>>> Bowen
> >> > >>>>>>>
> >> > >>>>>>
> >> > >>>>
> >> > >>>> --
> >> > >>>> Xuefu Zhang
> >> > >>>>
> >> > >>>> "In Honey We Trust!"
> >> > >>>>
> >> > >>
> >> > >
> >> >
> >> >
> >>
> >> --
> >> Xuefu Zhang
> >>
> >> "In Honey We Trust!"
> >>
> >
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-57 - Rework FunctionCatalog

Xuefu Z
In reply to this post by Dawid Wysakowicz
Hi Dawid,

"GLOBAL" is a temporary keyword that was given to the approach. It can be
changed to something else for better.

The difference between this and the #3 approach is that we only need the
keyword for this create DDL. For other places (such as function
referencing), no keyword or special namespace is needed.

Thanks,
Xuefu

On Wed, Sep 18, 2019 at 4:32 PM Dawid Wysakowicz <[hidden email]>
wrote:

> Hi,
> I think it makes sense to start voting at this point.
>
> Option 1: Only 1-part identifiers
> PROS:
> - allows shadowing built-in functions
> CONS:
> - incosistent with all the other objects, both permanent & temporary
> - does not allow shadowing catalog functions
>
> Option 2: Special keyword for built-in function
> I think this is quite similar to the special catalog/db. The thing I am
> strongly against in this proposal is the GLOBAL keyword. This keyword has a
> meaning in rdbms systems and means a function that is present for a
> lifetime of a session in which it was created, but available in all other
> sessions. Therefore I really don't want to use this keyword in a different
> context.
>
> Option 3: Special catalog/db
>
> PROS:
> - allows shadowing built-in functions
> - allows shadowing catalog functions
> - consistent with other objects
> CONS:
> - we introduce a special namespace for built-in functions
>
> I don't see a problem with introducing the special namespace. In the end it
> is very similar to the keyword approach. In this case the catalog/db
> combination would be the "keyword"
>
> Therefore my votes:
> Option 1: -0
> Option 2: -1 (I might change to +0 if we can come up with a better keyword)
> Option 3: +1
>
> Best,
> Dawid
>
>
> On Thu, 19 Sep 2019, 05:12 Xuefu Z, <[hidden email]> wrote:
>
> > Hi Aljoscha,
> >
> > Thanks for the summary and these are great questions to be answered. The
> > answer to your first question is clear: there is a general agreement to
> > override built-in functions with temp functions.
> >
> > However, your second and third questions are sort of related, as a
> function
> > reference can be either just function name (like "func") or in the form
> or
> > "cat.db.func". When a reference is just function name, it can mean
> either a
> > built-in function or a function defined in the current cat/db. If we
> > support overriding a built-in function with a temp function, such
> > overriding can also cover a function in the current cat/db.
> >
> > I think what Timo referred as "overriding a catalog function" means a
> temp
> > function defined as "cat.db.func" overrides a catalog function "func" in
> > cat/db even if cat/db is not current. To support this, temp function has
> to
> > be tied to a cat/db. What's why I said above that the 2nd and 3rd
> questions
> > are related. The problem with such support is the ambiguity when user
> > defines a function w/o namespace, "CREATE TEMPORARY FUNCTION func ...".
> > Here "func" can means a global temp function, or a temp function in
> current
> > cat/db. If we can assume the former, this creates an inconsistency
> because
> > "CREATE FUNCTION func" actually means a function in current cat/db. If we
> > assume the latter, then there is no way for user to create a global temp
> > function.
> >
> > Giving a special namespace for built-in functions may solve the ambiguity
> > problem above, but it also introduces artificial catalog/database that
> > needs special treatment and pollutes the cleanness of  the code. I would
> > rather introduce a syntax in DDL to solve the problem, like "CREATE
> > [GLOBAL] TEMPORARY FUNCTION func".
> >
> > Thus, I'd like to summarize a few candidate proposals for voting
> purposes:
> >
> > 1. Support only global, temporary functions without namespace. Such temp
> > functions overrides built-in functions and catalog functions in current
> > cat/db. The resolution order is: temp functions -> built-in functions ->
> > catalog functions. (Partially or fully qualified functions has no
> > ambiguity!)
> >
> > 2. In addition to #1, support creating and referencing temporary
> functions
> > associated with a cat/db with "GLOBAL" qualifier in DDL for global temp
> > functions. The resolution order is: global temp functions -> built-in
> > functions -> temp functions in current cat/db -> catalog function.
> > (Resolution for partially or fully qualified function reference is: temp
> > functions -> persistent functions.)
> >
> > 3. In addition to #1, support creating and referencing temporary
> functions
> > associated with a cat/db with a special namespace for built-in functions
> > and global temp functions. The resolution is the same as #2, except that
> > the special namespace might be prefixed to a reference to a built-in
> > function or global temp function. (In absence of the special namespace,
> the
> > resolution order is the same as in #2.)
> >
> > My personal preference is #1, given the unknown use case and introduced
> > complexity for #2 and #3. However, #2 is an acceptable alternative. Thus,
> > my votes are:
> >
> > +1 for #1
> > +0 for #2
> > -1 for #3
> >
> > Everyone, please cast your vote (in above format please!), or let me know
> > if you have more questions or other candidates.
> >
> > Thanks,
> > Xuefu
> >
> >
> >
> >
> >
> >
> >
> > On Wed, Sep 18, 2019 at 6:42 AM Aljoscha Krettek <[hidden email]>
> > wrote:
> >
> > > Hi,
> > >
> > > I think this discussion and the one for FLIP-64 are very connected. To
> > > resolve the differences, think we have to think about the basic
> > principles
> > > and find consensus there. The basic questions I see are:
> > >
> > >  - Do we want to support overriding builtin functions?
> > >  - Do we want to support overriding catalog functions?
> > >  - And then later: should temporary functions be tied to a
> > > catalog/database?
> > >
> > > I don’t have much to say about these, except that we should somewhat
> > stick
> > > to what the industry does. But I also understand that the industry is
> > > already very divided on this.
> > >
> > > Best,
> > > Aljoscha
> > >
> > > > On 18. Sep 2019, at 11:41, Jark Wu <[hidden email]> wrote:
> > > >
> > > > Hi,
> > > >
> > > > +1 to strive for reaching consensus on the remaining topics. We are
> > > close to the truth. It will waste a lot of time if we resume the topic
> > some
> > > time later.
> > > >
> > > > +1 to “1-part/override” and I’m also fine with Timo’s “cat.db.fun”
> way
> > > to override a catalog function.
> > > >
> > > > I’m not sure about “system.system.fun”, it introduces a nonexistent
> cat
> > > & db? And we still need to do special treatment for the dedicated
> > > system.system cat & db?
> > > >
> > > > Best,
> > > > Jark
> > > >
> > > >
> > > >> 在 2019年9月18日,06:54,Timo Walther <[hidden email]> 写道:
> > > >>
> > > >> Hi everyone,
> > > >>
> > > >> @Xuefu: I would like to avoid adding too many things incrementally.
> > > Users should be able to override all catalog objects consistently
> > according
> > > to FLIP-64 (Support for Temporary Objects in Table module). If
> functions
> > > are treated completely different, we need more code and special cases.
> > From
> > > an implementation perspective, this topic only affects the lookup logic
> > > which is rather low implementation effort which is why I would like to
> > > clarify the remaining items. As you said, we have a slight consenus on
> > > overriding built-in functions; we should also strive for reaching
> > consensus
> > > on the remaining topics.
> > > >>
> > > >> @Dawid: I like your idea as it ensures registering catalog objects
> > > consistent and the overriding of built-in functions more explicit.
> > > >>
> > > >> Thanks,
> > > >> Timo
> > > >>
> > > >>
> > > >> On 17.09.19 11:59, kai wang wrote:
> > > >>> hi, everyone
> > > >>> I think this flip is very meaningful. it supports functions that
> can
> > be
> > > >>> shared by different catalogs and dbs, reducing the duplication of
> > > functions.
> > > >>>
> > > >>> Our group based on flink's sql parser module implements create
> > function
> > > >>> feature, stores the parsed function metadata and schema into mysql,
> > and
> > > >>> also customizes the catalog, customizes sql-client to support
> custom
> > > >>> schemas and functions. Loaded, but the function is currently
> global,
> > > and is
> > > >>> not subdivided according to catalog and db.
> > > >>>
> > > >>> In addition, I very much hope to participate in the development of
> > this
> > > >>> flip, I have been paying attention to the community, but found it
> is
> > > more
> > > >>> difficult to join.
> > > >>> thank you.
> > > >>>
> > > >>> Xuefu Z <[hidden email]> 于2019年9月17日周二 上午11:19写道:
> > > >>>
> > > >>>> Thanks to Tmo and Dawid for sharing thoughts.
> > > >>>>
> > > >>>> It seems to me that there is a general consensus on having temp
> > > functions
> > > >>>> that have no namespaces and overwrite built-in functions. (As a
> side
> > > note
> > > >>>> for comparability, the current user defined functions are all
> > > temporary and
> > > >>>> having no namespaces.)
> > > >>>>
> > > >>>> Nevertheless, I can also see the merit of having namespaced temp
> > > functions
> > > >>>> that can overwrite functions defined in a specific cat/db.
> However,
> > > this
> > > >>>> idea appears orthogonal to the former and can be added
> > incrementally.
> > > >>>>
> > > >>>> How about we first implement non-namespaced temp functions now and
> > > leave
> > > >>>> the door open for namespaced ones for later releases as the
> > > requirement
> > > >>>> might become more crystal? This also helps shorten the debate and
> > > allow us
> > > >>>> to make some progress along this direction.
> > > >>>>
> > > >>>> As to Dawid's idea of having a dedicated cat/db to host the
> > temporary
> > > temp
> > > >>>> functions that don't have namespaces, my only concern is the
> special
> > > >>>> treatment for a cat/db, which makes code less clean, as evident in
> > > treating
> > > >>>> the built-in catalog currently.
> > > >>>>
> > > >>>> Thanks,
> > > >>>> Xuefiu
> > > >>>>
> > > >>>> On Mon, Sep 16, 2019 at 5:07 PM Dawid Wysakowicz <
> > > >>>> [hidden email]>
> > > >>>> wrote:
> > > >>>>
> > > >>>>> Hi,
> > > >>>>> Another idea to consider on top of Timo's suggestion. How about
> we
> > > have a
> > > >>>>> special namespace (catalog + database) for built-in objects? This
> > > catalog
> > > >>>>> would be invisible for users as Xuefu was suggesting.
> > > >>>>>
> > > >>>>> Then users could still override built-in functions, if they fully
> > > qualify
> > > >>>>> object with the built-in namespace, but by default the common
> logic
> > > of
> > > >>>>> current dB & cat would be used.
> > > >>>>>
> > > >>>>> CREATE TEMPORARY FUNCTION func ...
> > > >>>>> registers temporary function in current cat & dB
> > > >>>>>
> > > >>>>> CREATE TEMPORARY FUNCTION cat.db.func ...
> > > >>>>> registers temporary function in cat db
> > > >>>>>
> > > >>>>> CREATE TEMPORARY FUNCTION system.system.func ...
> > > >>>>> Overrides built-in function with temporary function
> > > >>>>>
> > > >>>>> The built-in/system namespace would not be writable for permanent
> > > >>>> objects.
> > > >>>>> WDYT?
> > > >>>>>
> > > >>>>> This way I think we can have benefits of both solutions.
> > > >>>>>
> > > >>>>> Best,
> > > >>>>> Dawid
> > > >>>>>
> > > >>>>>
> > > >>>>> On Tue, 17 Sep 2019, 07:24 Timo Walther, <[hidden email]>
> > wrote:
> > > >>>>>
> > > >>>>>> Hi Bowen,
> > > >>>>>>
> > > >>>>>> I understand the potential benefit of overriding certain
> built-in
> > > >>>>>> functions. I'm open to such a feature if many people agree.
> > > However, it
> > > >>>>>> would be great to still support overriding catalog functions
> with
> > > >>>>>> temporary functions in order to prototype a query even though a
> > > >>>>>> catalog/database might not be available currently or should not
> be
> > > >>>>>> modified yet. How about we support both cases?
> > > >>>>>>
> > > >>>>>> CREATE TEMPORARY FUNCTION abs
> > > >>>>>> -> creates/overrides a built-in function and never consideres
> > > current
> > > >>>>>> catalog and database; inconsistent with other DDL but acceptable
> > for
> > > >>>>>> functions I guess.
> > > >>>>>> CREATE TEMPORARY FUNCTION cat.db.fun
> > > >>>>>> -> creates/overrides a catalog function
> > > >>>>>>
> > > >>>>>> Regarding "Flink don't have any other built-in objects (tables,
> > > views)
> > > >>>>>> except functions", this might change in the near future. Take
> > > >>>>>> https://issues.apache.org/jira/browse/FLINK-13900 as an
> example.
> > > >>>>>>
> > > >>>>>> Thanks,
> > > >>>>>> Timo
> > > >>>>>>
> > > >>>>>> On 14.09.19 01:40, Bowen Li wrote:
> > > >>>>>>> Hi Fabian,
> > > >>>>>>>
> > > >>>>>>> Yes, I agree 1-part/no-override is the least favorable thus I
> > > didn't
> > > >>>>>>> include that as a voting option, and the discussion is mainly
> > > between
> > > >>>>>>> 1-part/override builtin and 3-part/not override builtin.
> > > >>>>>>>
> > > >>>>>>> Re > However, it means that temp functions are differently
> > treated
> > > >>>> than
> > > >>>>>>> other db objects.
> > > >>>>>>> IMO, the treatment difference results from the fact that
> > functions
> > > >>>> are
> > > >>>>> a
> > > >>>>>>> bit different from other objects - Flink don't have any other
> > > >>>> built-in
> > > >>>>>>> objects (tables, views) except functions.
> > > >>>>>>>
> > > >>>>>>> Cheers,
> > > >>>>>>> Bowen
> > > >>>>>>>
> > > >>>>>>
> > > >>>>
> > > >>>> --
> > > >>>> Xuefu Zhang
> > > >>>>
> > > >>>> "In Honey We Trust!"
> > > >>>>
> > > >>
> > > >
> > >
> > >
> >
> > --
> > Xuefu Zhang
> >
> > "In Honey We Trust!"
> >
>


--
Xuefu Zhang

"In Honey We Trust!"
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-57 - Rework FunctionCatalog

Dawid Wysakowicz
@Bowen I am not suggesting introducing additional catalog. I think we need
to get rid of the current built-in catalog.

@Xuefu in option #3 we also don't need additional referencing the special
catalog anywhere else besides in the CREATE statement. The resolution
behaviour is exactly the same in both options.

On Thu, 19 Sep 2019, 08:17 Xuefu Z, <[hidden email]> wrote:

> Hi Dawid,
>
> "GLOBAL" is a temporary keyword that was given to the approach. It can be
> changed to something else for better.
>
> The difference between this and the #3 approach is that we only need the
> keyword for this create DDL. For other places (such as function
> referencing), no keyword or special namespace is needed.
>
> Thanks,
> Xuefu
>
> On Wed, Sep 18, 2019 at 4:32 PM Dawid Wysakowicz <
> [hidden email]>
> wrote:
>
> > Hi,
> > I think it makes sense to start voting at this point.
> >
> > Option 1: Only 1-part identifiers
> > PROS:
> > - allows shadowing built-in functions
> > CONS:
> > - incosistent with all the other objects, both permanent & temporary
> > - does not allow shadowing catalog functions
> >
> > Option 2: Special keyword for built-in function
> > I think this is quite similar to the special catalog/db. The thing I am
> > strongly against in this proposal is the GLOBAL keyword. This keyword
> has a
> > meaning in rdbms systems and means a function that is present for a
> > lifetime of a session in which it was created, but available in all other
> > sessions. Therefore I really don't want to use this keyword in a
> different
> > context.
> >
> > Option 3: Special catalog/db
> >
> > PROS:
> > - allows shadowing built-in functions
> > - allows shadowing catalog functions
> > - consistent with other objects
> > CONS:
> > - we introduce a special namespace for built-in functions
> >
> > I don't see a problem with introducing the special namespace. In the end
> it
> > is very similar to the keyword approach. In this case the catalog/db
> > combination would be the "keyword"
> >
> > Therefore my votes:
> > Option 1: -0
> > Option 2: -1 (I might change to +0 if we can come up with a better
> keyword)
> > Option 3: +1
> >
> > Best,
> > Dawid
> >
> >
> > On Thu, 19 Sep 2019, 05:12 Xuefu Z, <[hidden email]> wrote:
> >
> > > Hi Aljoscha,
> > >
> > > Thanks for the summary and these are great questions to be answered.
> The
> > > answer to your first question is clear: there is a general agreement to
> > > override built-in functions with temp functions.
> > >
> > > However, your second and third questions are sort of related, as a
> > function
> > > reference can be either just function name (like "func") or in the form
> > or
> > > "cat.db.func". When a reference is just function name, it can mean
> > either a
> > > built-in function or a function defined in the current cat/db. If we
> > > support overriding a built-in function with a temp function, such
> > > overriding can also cover a function in the current cat/db.
> > >
> > > I think what Timo referred as "overriding a catalog function" means a
> > temp
> > > function defined as "cat.db.func" overrides a catalog function "func"
> in
> > > cat/db even if cat/db is not current. To support this, temp function
> has
> > to
> > > be tied to a cat/db. What's why I said above that the 2nd and 3rd
> > questions
> > > are related. The problem with such support is the ambiguity when user
> > > defines a function w/o namespace, "CREATE TEMPORARY FUNCTION func ...".
> > > Here "func" can means a global temp function, or a temp function in
> > current
> > > cat/db. If we can assume the former, this creates an inconsistency
> > because
> > > "CREATE FUNCTION func" actually means a function in current cat/db. If
> we
> > > assume the latter, then there is no way for user to create a global
> temp
> > > function.
> > >
> > > Giving a special namespace for built-in functions may solve the
> ambiguity
> > > problem above, but it also introduces artificial catalog/database that
> > > needs special treatment and pollutes the cleanness of  the code. I
> would
> > > rather introduce a syntax in DDL to solve the problem, like "CREATE
> > > [GLOBAL] TEMPORARY FUNCTION func".
> > >
> > > Thus, I'd like to summarize a few candidate proposals for voting
> > purposes:
> > >
> > > 1. Support only global, temporary functions without namespace. Such
> temp
> > > functions overrides built-in functions and catalog functions in current
> > > cat/db. The resolution order is: temp functions -> built-in functions
> ->
> > > catalog functions. (Partially or fully qualified functions has no
> > > ambiguity!)
> > >
> > > 2. In addition to #1, support creating and referencing temporary
> > functions
> > > associated with a cat/db with "GLOBAL" qualifier in DDL for global temp
> > > functions. The resolution order is: global temp functions -> built-in
> > > functions -> temp functions in current cat/db -> catalog function.
> > > (Resolution for partially or fully qualified function reference is:
> temp
> > > functions -> persistent functions.)
> > >
> > > 3. In addition to #1, support creating and referencing temporary
> > functions
> > > associated with a cat/db with a special namespace for built-in
> functions
> > > and global temp functions. The resolution is the same as #2, except
> that
> > > the special namespace might be prefixed to a reference to a built-in
> > > function or global temp function. (In absence of the special namespace,
> > the
> > > resolution order is the same as in #2.)
> > >
> > > My personal preference is #1, given the unknown use case and introduced
> > > complexity for #2 and #3. However, #2 is an acceptable alternative.
> Thus,
> > > my votes are:
> > >
> > > +1 for #1
> > > +0 for #2
> > > -1 for #3
> > >
> > > Everyone, please cast your vote (in above format please!), or let me
> know
> > > if you have more questions or other candidates.
> > >
> > > Thanks,
> > > Xuefu
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > > On Wed, Sep 18, 2019 at 6:42 AM Aljoscha Krettek <[hidden email]>
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > I think this discussion and the one for FLIP-64 are very connected.
> To
> > > > resolve the differences, think we have to think about the basic
> > > principles
> > > > and find consensus there. The basic questions I see are:
> > > >
> > > >  - Do we want to support overriding builtin functions?
> > > >  - Do we want to support overriding catalog functions?
> > > >  - And then later: should temporary functions be tied to a
> > > > catalog/database?
> > > >
> > > > I don’t have much to say about these, except that we should somewhat
> > > stick
> > > > to what the industry does. But I also understand that the industry is
> > > > already very divided on this.
> > > >
> > > > Best,
> > > > Aljoscha
> > > >
> > > > > On 18. Sep 2019, at 11:41, Jark Wu <[hidden email]> wrote:
> > > > >
> > > > > Hi,
> > > > >
> > > > > +1 to strive for reaching consensus on the remaining topics. We are
> > > > close to the truth. It will waste a lot of time if we resume the
> topic
> > > some
> > > > time later.
> > > > >
> > > > > +1 to “1-part/override” and I’m also fine with Timo’s “cat.db.fun”
> > way
> > > > to override a catalog function.
> > > > >
> > > > > I’m not sure about “system.system.fun”, it introduces a nonexistent
> > cat
> > > > & db? And we still need to do special treatment for the dedicated
> > > > system.system cat & db?
> > > > >
> > > > > Best,
> > > > > Jark
> > > > >
> > > > >
> > > > >> 在 2019年9月18日,06:54,Timo Walther <[hidden email]> 写道:
> > > > >>
> > > > >> Hi everyone,
> > > > >>
> > > > >> @Xuefu: I would like to avoid adding too many things
> incrementally.
> > > > Users should be able to override all catalog objects consistently
> > > according
> > > > to FLIP-64 (Support for Temporary Objects in Table module). If
> > functions
> > > > are treated completely different, we need more code and special
> cases.
> > > From
> > > > an implementation perspective, this topic only affects the lookup
> logic
> > > > which is rather low implementation effort which is why I would like
> to
> > > > clarify the remaining items. As you said, we have a slight consenus
> on
> > > > overriding built-in functions; we should also strive for reaching
> > > consensus
> > > > on the remaining topics.
> > > > >>
> > > > >> @Dawid: I like your idea as it ensures registering catalog objects
> > > > consistent and the overriding of built-in functions more explicit.
> > > > >>
> > > > >> Thanks,
> > > > >> Timo
> > > > >>
> > > > >>
> > > > >> On 17.09.19 11:59, kai wang wrote:
> > > > >>> hi, everyone
> > > > >>> I think this flip is very meaningful. it supports functions that
> > can
> > > be
> > > > >>> shared by different catalogs and dbs, reducing the duplication of
> > > > functions.
> > > > >>>
> > > > >>> Our group based on flink's sql parser module implements create
> > > function
> > > > >>> feature, stores the parsed function metadata and schema into
> mysql,
> > > and
> > > > >>> also customizes the catalog, customizes sql-client to support
> > custom
> > > > >>> schemas and functions. Loaded, but the function is currently
> > global,
> > > > and is
> > > > >>> not subdivided according to catalog and db.
> > > > >>>
> > > > >>> In addition, I very much hope to participate in the development
> of
> > > this
> > > > >>> flip, I have been paying attention to the community, but found it
> > is
> > > > more
> > > > >>> difficult to join.
> > > > >>> thank you.
> > > > >>>
> > > > >>> Xuefu Z <[hidden email]> 于2019年9月17日周二 上午11:19写道:
> > > > >>>
> > > > >>>> Thanks to Tmo and Dawid for sharing thoughts.
> > > > >>>>
> > > > >>>> It seems to me that there is a general consensus on having temp
> > > > functions
> > > > >>>> that have no namespaces and overwrite built-in functions. (As a
> > side
> > > > note
> > > > >>>> for comparability, the current user defined functions are all
> > > > temporary and
> > > > >>>> having no namespaces.)
> > > > >>>>
> > > > >>>> Nevertheless, I can also see the merit of having namespaced temp
> > > > functions
> > > > >>>> that can overwrite functions defined in a specific cat/db.
> > However,
> > > > this
> > > > >>>> idea appears orthogonal to the former and can be added
> > > incrementally.
> > > > >>>>
> > > > >>>> How about we first implement non-namespaced temp functions now
> and
> > > > leave
> > > > >>>> the door open for namespaced ones for later releases as the
> > > > requirement
> > > > >>>> might become more crystal? This also helps shorten the debate
> and
> > > > allow us
> > > > >>>> to make some progress along this direction.
> > > > >>>>
> > > > >>>> As to Dawid's idea of having a dedicated cat/db to host the
> > > temporary
> > > > temp
> > > > >>>> functions that don't have namespaces, my only concern is the
> > special
> > > > >>>> treatment for a cat/db, which makes code less clean, as evident
> in
> > > > treating
> > > > >>>> the built-in catalog currently.
> > > > >>>>
> > > > >>>> Thanks,
> > > > >>>> Xuefiu
> > > > >>>>
> > > > >>>> On Mon, Sep 16, 2019 at 5:07 PM Dawid Wysakowicz <
> > > > >>>> [hidden email]>
> > > > >>>> wrote:
> > > > >>>>
> > > > >>>>> Hi,
> > > > >>>>> Another idea to consider on top of Timo's suggestion. How about
> > we
> > > > have a
> > > > >>>>> special namespace (catalog + database) for built-in objects?
> This
> > > > catalog
> > > > >>>>> would be invisible for users as Xuefu was suggesting.
> > > > >>>>>
> > > > >>>>> Then users could still override built-in functions, if they
> fully
> > > > qualify
> > > > >>>>> object with the built-in namespace, but by default the common
> > logic
> > > > of
> > > > >>>>> current dB & cat would be used.
> > > > >>>>>
> > > > >>>>> CREATE TEMPORARY FUNCTION func ...
> > > > >>>>> registers temporary function in current cat & dB
> > > > >>>>>
> > > > >>>>> CREATE TEMPORARY FUNCTION cat.db.func ...
> > > > >>>>> registers temporary function in cat db
> > > > >>>>>
> > > > >>>>> CREATE TEMPORARY FUNCTION system.system.func ...
> > > > >>>>> Overrides built-in function with temporary function
> > > > >>>>>
> > > > >>>>> The built-in/system namespace would not be writable for
> permanent
> > > > >>>> objects.
> > > > >>>>> WDYT?
> > > > >>>>>
> > > > >>>>> This way I think we can have benefits of both solutions.
> > > > >>>>>
> > > > >>>>> Best,
> > > > >>>>> Dawid
> > > > >>>>>
> > > > >>>>>
> > > > >>>>> On Tue, 17 Sep 2019, 07:24 Timo Walther, <[hidden email]>
> > > wrote:
> > > > >>>>>
> > > > >>>>>> Hi Bowen,
> > > > >>>>>>
> > > > >>>>>> I understand the potential benefit of overriding certain
> > built-in
> > > > >>>>>> functions. I'm open to such a feature if many people agree.
> > > > However, it
> > > > >>>>>> would be great to still support overriding catalog functions
> > with
> > > > >>>>>> temporary functions in order to prototype a query even though
> a
> > > > >>>>>> catalog/database might not be available currently or should
> not
> > be
> > > > >>>>>> modified yet. How about we support both cases?
> > > > >>>>>>
> > > > >>>>>> CREATE TEMPORARY FUNCTION abs
> > > > >>>>>> -> creates/overrides a built-in function and never consideres
> > > > current
> > > > >>>>>> catalog and database; inconsistent with other DDL but
> acceptable
> > > for
> > > > >>>>>> functions I guess.
> > > > >>>>>> CREATE TEMPORARY FUNCTION cat.db.fun
> > > > >>>>>> -> creates/overrides a catalog function
> > > > >>>>>>
> > > > >>>>>> Regarding "Flink don't have any other built-in objects
> (tables,
> > > > views)
> > > > >>>>>> except functions", this might change in the near future. Take
> > > > >>>>>> https://issues.apache.org/jira/browse/FLINK-13900 as an
> > example.
> > > > >>>>>>
> > > > >>>>>> Thanks,
> > > > >>>>>> Timo
> > > > >>>>>>
> > > > >>>>>> On 14.09.19 01:40, Bowen Li wrote:
> > > > >>>>>>> Hi Fabian,
> > > > >>>>>>>
> > > > >>>>>>> Yes, I agree 1-part/no-override is the least favorable thus I
> > > > didn't
> > > > >>>>>>> include that as a voting option, and the discussion is mainly
> > > > between
> > > > >>>>>>> 1-part/override builtin and 3-part/not override builtin.
> > > > >>>>>>>
> > > > >>>>>>> Re > However, it means that temp functions are differently
> > > treated
> > > > >>>> than
> > > > >>>>>>> other db objects.
> > > > >>>>>>> IMO, the treatment difference results from the fact that
> > > functions
> > > > >>>> are
> > > > >>>>> a
> > > > >>>>>>> bit different from other objects - Flink don't have any other
> > > > >>>> built-in
> > > > >>>>>>> objects (tables, views) except functions.
> > > > >>>>>>>
> > > > >>>>>>> Cheers,
> > > > >>>>>>> Bowen
> > > > >>>>>>>
> > > > >>>>>>
> > > > >>>>
> > > > >>>> --
> > > > >>>> Xuefu Zhang
> > > > >>>>
> > > > >>>> "In Honey We Trust!"
> > > > >>>>
> > > > >>
> > > > >
> > > >
> > > >
> > >
> > > --
> > > Xuefu Zhang
> > >
> > > "In Honey We Trust!"
> > >
> >
>
>
> --
> Xuefu Zhang
>
> "In Honey We Trust!"
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-57 - Rework FunctionCatalog

Xuefu Z
In reply to this post by Dawid Wysakowicz
Re: The reason why I prefer option 3 is that in option 3 all objects
internally are identified with 3 parts.

True, but the problem we have is not about how to differentiate each type
objects internally. Rather, it's rather about how a user referencing an
object unambiguously and consistently.

Thanks,
Xuefu

On Wed, Sep 18, 2019 at 4:55 PM Dawid Wysakowicz <[hidden email]>
wrote:

> Last additional comment on Option 2. The reason why I prefer option 3 is
> that in option 3 all objects internally are identified with 3 parts. This
> makes it easier to handle at different locations e.g. while persisting
> views, as all objects have uniform representation.
>
> On Thu, 19 Sep 2019, 07:31 Dawid Wysakowicz, <[hidden email]>
> wrote:
>
> > Hi,
> > I think it makes sense to start voting at this point.
> >
> > Option 1: Only 1-part identifiers
> > PROS:
> > - allows shadowing built-in functions
> > CONS:
> > - incosistent with all the other objects, both permanent & temporary
> > - does not allow shadowing catalog functions
> >
> > Option 2: Special keyword for built-in function
> > I think this is quite similar to the special catalog/db. The thing I am
> > strongly against in this proposal is the GLOBAL keyword. This keyword
> has a
> > meaning in rdbms systems and means a function that is present for a
> > lifetime of a session in which it was created, but available in all other
> > sessions. Therefore I really don't want to use this keyword in a
> different
> > context.
> >
> > Option 3: Special catalog/db
> >
> > PROS:
> > - allows shadowing built-in functions
> > - allows shadowing catalog functions
> > - consistent with other objects
> > CONS:
> > - we introduce a special namespace for built-in functions
> >
> > I don't see a problem with introducing the special namespace. In the end
> > it is very similar to the keyword approach. In this case the catalog/db
> > combination would be the "keyword"
> >
> > Therefore my votes:
> > Option 1: -0
> > Option 2: -1 (I might change to +0 if we can come up with a better
> keyword)
> > Option 3: +1
> >
> > Best,
> > Dawid
> >
> >
> > On Thu, 19 Sep 2019, 05:12 Xuefu Z, <[hidden email]> wrote:
> >
> >> Hi Aljoscha,
> >>
> >> Thanks for the summary and these are great questions to be answered. The
> >> answer to your first question is clear: there is a general agreement to
> >> override built-in functions with temp functions.
> >>
> >> However, your second and third questions are sort of related, as a
> >> function
> >> reference can be either just function name (like "func") or in the form
> or
> >> "cat.db.func". When a reference is just function name, it can mean
> either
> >> a
> >> built-in function or a function defined in the current cat/db. If we
> >> support overriding a built-in function with a temp function, such
> >> overriding can also cover a function in the current cat/db.
> >>
> >> I think what Timo referred as "overriding a catalog function" means a
> temp
> >> function defined as "cat.db.func" overrides a catalog function "func" in
> >> cat/db even if cat/db is not current. To support this, temp function has
> >> to
> >> be tied to a cat/db. What's why I said above that the 2nd and 3rd
> >> questions
> >> are related. The problem with such support is the ambiguity when user
> >> defines a function w/o namespace, "CREATE TEMPORARY FUNCTION func ...".
> >> Here "func" can means a global temp function, or a temp function in
> >> current
> >> cat/db. If we can assume the former, this creates an inconsistency
> because
> >> "CREATE FUNCTION func" actually means a function in current cat/db. If
> we
> >> assume the latter, then there is no way for user to create a global temp
> >> function.
> >>
> >> Giving a special namespace for built-in functions may solve the
> ambiguity
> >> problem above, but it also introduces artificial catalog/database that
> >> needs special treatment and pollutes the cleanness of  the code. I would
> >> rather introduce a syntax in DDL to solve the problem, like "CREATE
> >> [GLOBAL] TEMPORARY FUNCTION func".
> >>
> >> Thus, I'd like to summarize a few candidate proposals for voting
> purposes:
> >>
> >> 1. Support only global, temporary functions without namespace. Such temp
> >> functions overrides built-in functions and catalog functions in current
> >> cat/db. The resolution order is: temp functions -> built-in functions ->
> >> catalog functions. (Partially or fully qualified functions has no
> >> ambiguity!)
> >>
> >> 2. In addition to #1, support creating and referencing temporary
> functions
> >> associated with a cat/db with "GLOBAL" qualifier in DDL for global temp
> >> functions. The resolution order is: global temp functions -> built-in
> >> functions -> temp functions in current cat/db -> catalog function.
> >> (Resolution for partially or fully qualified function reference is: temp
> >> functions -> persistent functions.)
> >>
> >> 3. In addition to #1, support creating and referencing temporary
> functions
> >> associated with a cat/db with a special namespace for built-in functions
> >> and global temp functions. The resolution is the same as #2, except that
> >> the special namespace might be prefixed to a reference to a built-in
> >> function or global temp function. (In absence of the special namespace,
> >> the
> >> resolution order is the same as in #2.)
> >>
> >> My personal preference is #1, given the unknown use case and introduced
> >> complexity for #2 and #3. However, #2 is an acceptable alternative.
> Thus,
> >> my votes are:
> >>
> >> +1 for #1
> >> +0 for #2
> >> -1 for #3
> >>
> >> Everyone, please cast your vote (in above format please!), or let me
> know
> >> if you have more questions or other candidates.
> >>
> >> Thanks,
> >> Xuefu
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >> On Wed, Sep 18, 2019 at 6:42 AM Aljoscha Krettek <[hidden email]>
> >> wrote:
> >>
> >> > Hi,
> >> >
> >> > I think this discussion and the one for FLIP-64 are very connected. To
> >> > resolve the differences, think we have to think about the basic
> >> principles
> >> > and find consensus there. The basic questions I see are:
> >> >
> >> >  - Do we want to support overriding builtin functions?
> >> >  - Do we want to support overriding catalog functions?
> >> >  - And then later: should temporary functions be tied to a
> >> > catalog/database?
> >> >
> >> > I don’t have much to say about these, except that we should somewhat
> >> stick
> >> > to what the industry does. But I also understand that the industry is
> >> > already very divided on this.
> >> >
> >> > Best,
> >> > Aljoscha
> >> >
> >> > > On 18. Sep 2019, at 11:41, Jark Wu <[hidden email]> wrote:
> >> > >
> >> > > Hi,
> >> > >
> >> > > +1 to strive for reaching consensus on the remaining topics. We are
> >> > close to the truth. It will waste a lot of time if we resume the topic
> >> some
> >> > time later.
> >> > >
> >> > > +1 to “1-part/override” and I’m also fine with Timo’s “cat.db.fun”
> way
> >> > to override a catalog function.
> >> > >
> >> > > I’m not sure about “system.system.fun”, it introduces a nonexistent
> >> cat
> >> > & db? And we still need to do special treatment for the dedicated
> >> > system.system cat & db?
> >> > >
> >> > > Best,
> >> > > Jark
> >> > >
> >> > >
> >> > >> 在 2019年9月18日,06:54,Timo Walther <[hidden email]> 写道:
> >> > >>
> >> > >> Hi everyone,
> >> > >>
> >> > >> @Xuefu: I would like to avoid adding too many things incrementally.
> >> > Users should be able to override all catalog objects consistently
> >> according
> >> > to FLIP-64 (Support for Temporary Objects in Table module). If
> functions
> >> > are treated completely different, we need more code and special cases.
> >> From
> >> > an implementation perspective, this topic only affects the lookup
> logic
> >> > which is rather low implementation effort which is why I would like to
> >> > clarify the remaining items. As you said, we have a slight consenus on
> >> > overriding built-in functions; we should also strive for reaching
> >> consensus
> >> > on the remaining topics.
> >> > >>
> >> > >> @Dawid: I like your idea as it ensures registering catalog objects
> >> > consistent and the overriding of built-in functions more explicit.
> >> > >>
> >> > >> Thanks,
> >> > >> Timo
> >> > >>
> >> > >>
> >> > >> On 17.09.19 11:59, kai wang wrote:
> >> > >>> hi, everyone
> >> > >>> I think this flip is very meaningful. it supports functions that
> >> can be
> >> > >>> shared by different catalogs and dbs, reducing the duplication of
> >> > functions.
> >> > >>>
> >> > >>> Our group based on flink's sql parser module implements create
> >> function
> >> > >>> feature, stores the parsed function metadata and schema into
> mysql,
> >> and
> >> > >>> also customizes the catalog, customizes sql-client to support
> custom
> >> > >>> schemas and functions. Loaded, but the function is currently
> global,
> >> > and is
> >> > >>> not subdivided according to catalog and db.
> >> > >>>
> >> > >>> In addition, I very much hope to participate in the development of
> >> this
> >> > >>> flip, I have been paying attention to the community, but found it
> is
> >> > more
> >> > >>> difficult to join.
> >> > >>> thank you.
> >> > >>>
> >> > >>> Xuefu Z <[hidden email]> 于2019年9月17日周二 上午11:19写道:
> >> > >>>
> >> > >>>> Thanks to Tmo and Dawid for sharing thoughts.
> >> > >>>>
> >> > >>>> It seems to me that there is a general consensus on having temp
> >> > functions
> >> > >>>> that have no namespaces and overwrite built-in functions. (As a
> >> side
> >> > note
> >> > >>>> for comparability, the current user defined functions are all
> >> > temporary and
> >> > >>>> having no namespaces.)
> >> > >>>>
> >> > >>>> Nevertheless, I can also see the merit of having namespaced temp
> >> > functions
> >> > >>>> that can overwrite functions defined in a specific cat/db.
> However,
> >> > this
> >> > >>>> idea appears orthogonal to the former and can be added
> >> incrementally.
> >> > >>>>
> >> > >>>> How about we first implement non-namespaced temp functions now
> and
> >> > leave
> >> > >>>> the door open for namespaced ones for later releases as the
> >> > requirement
> >> > >>>> might become more crystal? This also helps shorten the debate and
> >> > allow us
> >> > >>>> to make some progress along this direction.
> >> > >>>>
> >> > >>>> As to Dawid's idea of having a dedicated cat/db to host the
> >> temporary
> >> > temp
> >> > >>>> functions that don't have namespaces, my only concern is the
> >> special
> >> > >>>> treatment for a cat/db, which makes code less clean, as evident
> in
> >> > treating
> >> > >>>> the built-in catalog currently.
> >> > >>>>
> >> > >>>> Thanks,
> >> > >>>> Xuefiu
> >> > >>>>
> >> > >>>> On Mon, Sep 16, 2019 at 5:07 PM Dawid Wysakowicz <
> >> > >>>> [hidden email]>
> >> > >>>> wrote:
> >> > >>>>
> >> > >>>>> Hi,
> >> > >>>>> Another idea to consider on top of Timo's suggestion. How about
> we
> >> > have a
> >> > >>>>> special namespace (catalog + database) for built-in objects?
> This
> >> > catalog
> >> > >>>>> would be invisible for users as Xuefu was suggesting.
> >> > >>>>>
> >> > >>>>> Then users could still override built-in functions, if they
> fully
> >> > qualify
> >> > >>>>> object with the built-in namespace, but by default the common
> >> logic
> >> > of
> >> > >>>>> current dB & cat would be used.
> >> > >>>>>
> >> > >>>>> CREATE TEMPORARY FUNCTION func ...
> >> > >>>>> registers temporary function in current cat & dB
> >> > >>>>>
> >> > >>>>> CREATE TEMPORARY FUNCTION cat.db.func ...
> >> > >>>>> registers temporary function in cat db
> >> > >>>>>
> >> > >>>>> CREATE TEMPORARY FUNCTION system.system.func ...
> >> > >>>>> Overrides built-in function with temporary function
> >> > >>>>>
> >> > >>>>> The built-in/system namespace would not be writable for
> permanent
> >> > >>>> objects.
> >> > >>>>> WDYT?
> >> > >>>>>
> >> > >>>>> This way I think we can have benefits of both solutions.
> >> > >>>>>
> >> > >>>>> Best,
> >> > >>>>> Dawid
> >> > >>>>>
> >> > >>>>>
> >> > >>>>> On Tue, 17 Sep 2019, 07:24 Timo Walther, <[hidden email]>
> >> wrote:
> >> > >>>>>
> >> > >>>>>> Hi Bowen,
> >> > >>>>>>
> >> > >>>>>> I understand the potential benefit of overriding certain
> built-in
> >> > >>>>>> functions. I'm open to such a feature if many people agree.
> >> > However, it
> >> > >>>>>> would be great to still support overriding catalog functions
> with
> >> > >>>>>> temporary functions in order to prototype a query even though a
> >> > >>>>>> catalog/database might not be available currently or should not
> >> be
> >> > >>>>>> modified yet. How about we support both cases?
> >> > >>>>>>
> >> > >>>>>> CREATE TEMPORARY FUNCTION abs
> >> > >>>>>> -> creates/overrides a built-in function and never consideres
> >> > current
> >> > >>>>>> catalog and database; inconsistent with other DDL but
> acceptable
> >> for
> >> > >>>>>> functions I guess.
> >> > >>>>>> CREATE TEMPORARY FUNCTION cat.db.fun
> >> > >>>>>> -> creates/overrides a catalog function
> >> > >>>>>>
> >> > >>>>>> Regarding "Flink don't have any other built-in objects (tables,
> >> > views)
> >> > >>>>>> except functions", this might change in the near future. Take
> >> > >>>>>> https://issues.apache.org/jira/browse/FLINK-13900 as an
> example.
> >> > >>>>>>
> >> > >>>>>> Thanks,
> >> > >>>>>> Timo
> >> > >>>>>>
> >> > >>>>>> On 14.09.19 01:40, Bowen Li wrote:
> >> > >>>>>>> Hi Fabian,
> >> > >>>>>>>
> >> > >>>>>>> Yes, I agree 1-part/no-override is the least favorable thus I
> >> > didn't
> >> > >>>>>>> include that as a voting option, and the discussion is mainly
> >> > between
> >> > >>>>>>> 1-part/override builtin and 3-part/not override builtin.
> >> > >>>>>>>
> >> > >>>>>>> Re > However, it means that temp functions are differently
> >> treated
> >> > >>>> than
> >> > >>>>>>> other db objects.
> >> > >>>>>>> IMO, the treatment difference results from the fact that
> >> functions
> >> > >>>> are
> >> > >>>>> a
> >> > >>>>>>> bit different from other objects - Flink don't have any other
> >> > >>>> built-in
> >> > >>>>>>> objects (tables, views) except functions.
> >> > >>>>>>>
> >> > >>>>>>> Cheers,
> >> > >>>>>>> Bowen
> >> > >>>>>>>
> >> > >>>>>>
> >> > >>>>
> >> > >>>> --
> >> > >>>> Xuefu Zhang
> >> > >>>>
> >> > >>>> "In Honey We Trust!"
> >> > >>>>
> >> > >>
> >> > >
> >> >
> >> >
> >>
> >> --
> >> Xuefu Zhang
> >>
> >> "In Honey We Trust!"
> >>
> >
>


--
Xuefu Zhang

"In Honey We Trust!"
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-57 - Rework FunctionCatalog

Xuefu Z
In reply to this post by Dawid Wysakowicz
@Dawid, Re: we also don't need additional referencing the specialcatalog
anywhere.

True. But once we allow such reference, then user can do so in any possible
place where a function name is expected, for which we have to handle.
That's a big difference, I think.

Thanks,
Xuefu

On Wed, Sep 18, 2019 at 5:25 PM Dawid Wysakowicz <[hidden email]>
wrote:

> @Bowen I am not suggesting introducing additional catalog. I think we need
> to get rid of the current built-in catalog.
>
> @Xuefu in option #3 we also don't need additional referencing the special
> catalog anywhere else besides in the CREATE statement. The resolution
> behaviour is exactly the same in both options.
>
> On Thu, 19 Sep 2019, 08:17 Xuefu Z, <[hidden email]> wrote:
>
> > Hi Dawid,
> >
> > "GLOBAL" is a temporary keyword that was given to the approach. It can be
> > changed to something else for better.
> >
> > The difference between this and the #3 approach is that we only need the
> > keyword for this create DDL. For other places (such as function
> > referencing), no keyword or special namespace is needed.
> >
> > Thanks,
> > Xuefu
> >
> > On Wed, Sep 18, 2019 at 4:32 PM Dawid Wysakowicz <
> > [hidden email]>
> > wrote:
> >
> > > Hi,
> > > I think it makes sense to start voting at this point.
> > >
> > > Option 1: Only 1-part identifiers
> > > PROS:
> > > - allows shadowing built-in functions
> > > CONS:
> > > - incosistent with all the other objects, both permanent & temporary
> > > - does not allow shadowing catalog functions
> > >
> > > Option 2: Special keyword for built-in function
> > > I think this is quite similar to the special catalog/db. The thing I am
> > > strongly against in this proposal is the GLOBAL keyword. This keyword
> > has a
> > > meaning in rdbms systems and means a function that is present for a
> > > lifetime of a session in which it was created, but available in all
> other
> > > sessions. Therefore I really don't want to use this keyword in a
> > different
> > > context.
> > >
> > > Option 3: Special catalog/db
> > >
> > > PROS:
> > > - allows shadowing built-in functions
> > > - allows shadowing catalog functions
> > > - consistent with other objects
> > > CONS:
> > > - we introduce a special namespace for built-in functions
> > >
> > > I don't see a problem with introducing the special namespace. In the
> end
> > it
> > > is very similar to the keyword approach. In this case the catalog/db
> > > combination would be the "keyword"
> > >
> > > Therefore my votes:
> > > Option 1: -0
> > > Option 2: -1 (I might change to +0 if we can come up with a better
> > keyword)
> > > Option 3: +1
> > >
> > > Best,
> > > Dawid
> > >
> > >
> > > On Thu, 19 Sep 2019, 05:12 Xuefu Z, <[hidden email]> wrote:
> > >
> > > > Hi Aljoscha,
> > > >
> > > > Thanks for the summary and these are great questions to be answered.
> > The
> > > > answer to your first question is clear: there is a general agreement
> to
> > > > override built-in functions with temp functions.
> > > >
> > > > However, your second and third questions are sort of related, as a
> > > function
> > > > reference can be either just function name (like "func") or in the
> form
> > > or
> > > > "cat.db.func". When a reference is just function name, it can mean
> > > either a
> > > > built-in function or a function defined in the current cat/db. If we
> > > > support overriding a built-in function with a temp function, such
> > > > overriding can also cover a function in the current cat/db.
> > > >
> > > > I think what Timo referred as "overriding a catalog function" means a
> > > temp
> > > > function defined as "cat.db.func" overrides a catalog function "func"
> > in
> > > > cat/db even if cat/db is not current. To support this, temp function
> > has
> > > to
> > > > be tied to a cat/db. What's why I said above that the 2nd and 3rd
> > > questions
> > > > are related. The problem with such support is the ambiguity when user
> > > > defines a function w/o namespace, "CREATE TEMPORARY FUNCTION func
> ...".
> > > > Here "func" can means a global temp function, or a temp function in
> > > current
> > > > cat/db. If we can assume the former, this creates an inconsistency
> > > because
> > > > "CREATE FUNCTION func" actually means a function in current cat/db.
> If
> > we
> > > > assume the latter, then there is no way for user to create a global
> > temp
> > > > function.
> > > >
> > > > Giving a special namespace for built-in functions may solve the
> > ambiguity
> > > > problem above, but it also introduces artificial catalog/database
> that
> > > > needs special treatment and pollutes the cleanness of  the code. I
> > would
> > > > rather introduce a syntax in DDL to solve the problem, like "CREATE
> > > > [GLOBAL] TEMPORARY FUNCTION func".
> > > >
> > > > Thus, I'd like to summarize a few candidate proposals for voting
> > > purposes:
> > > >
> > > > 1. Support only global, temporary functions without namespace. Such
> > temp
> > > > functions overrides built-in functions and catalog functions in
> current
> > > > cat/db. The resolution order is: temp functions -> built-in functions
> > ->
> > > > catalog functions. (Partially or fully qualified functions has no
> > > > ambiguity!)
> > > >
> > > > 2. In addition to #1, support creating and referencing temporary
> > > functions
> > > > associated with a cat/db with "GLOBAL" qualifier in DDL for global
> temp
> > > > functions. The resolution order is: global temp functions -> built-in
> > > > functions -> temp functions in current cat/db -> catalog function.
> > > > (Resolution for partially or fully qualified function reference is:
> > temp
> > > > functions -> persistent functions.)
> > > >
> > > > 3. In addition to #1, support creating and referencing temporary
> > > functions
> > > > associated with a cat/db with a special namespace for built-in
> > functions
> > > > and global temp functions. The resolution is the same as #2, except
> > that
> > > > the special namespace might be prefixed to a reference to a built-in
> > > > function or global temp function. (In absence of the special
> namespace,
> > > the
> > > > resolution order is the same as in #2.)
> > > >
> > > > My personal preference is #1, given the unknown use case and
> introduced
> > > > complexity for #2 and #3. However, #2 is an acceptable alternative.
> > Thus,
> > > > my votes are:
> > > >
> > > > +1 for #1
> > > > +0 for #2
> > > > -1 for #3
> > > >
> > > > Everyone, please cast your vote (in above format please!), or let me
> > know
> > > > if you have more questions or other candidates.
> > > >
> > > > Thanks,
> > > > Xuefu
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > On Wed, Sep 18, 2019 at 6:42 AM Aljoscha Krettek <
> [hidden email]>
> > > > wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > I think this discussion and the one for FLIP-64 are very connected.
> > To
> > > > > resolve the differences, think we have to think about the basic
> > > > principles
> > > > > and find consensus there. The basic questions I see are:
> > > > >
> > > > >  - Do we want to support overriding builtin functions?
> > > > >  - Do we want to support overriding catalog functions?
> > > > >  - And then later: should temporary functions be tied to a
> > > > > catalog/database?
> > > > >
> > > > > I don’t have much to say about these, except that we should
> somewhat
> > > > stick
> > > > > to what the industry does. But I also understand that the industry
> is
> > > > > already very divided on this.
> > > > >
> > > > > Best,
> > > > > Aljoscha
> > > > >
> > > > > > On 18. Sep 2019, at 11:41, Jark Wu <[hidden email]> wrote:
> > > > > >
> > > > > > Hi,
> > > > > >
> > > > > > +1 to strive for reaching consensus on the remaining topics. We
> are
> > > > > close to the truth. It will waste a lot of time if we resume the
> > topic
> > > > some
> > > > > time later.
> > > > > >
> > > > > > +1 to “1-part/override” and I’m also fine with Timo’s
> “cat.db.fun”
> > > way
> > > > > to override a catalog function.
> > > > > >
> > > > > > I’m not sure about “system.system.fun”, it introduces a
> nonexistent
> > > cat
> > > > > & db? And we still need to do special treatment for the dedicated
> > > > > system.system cat & db?
> > > > > >
> > > > > > Best,
> > > > > > Jark
> > > > > >
> > > > > >
> > > > > >> 在 2019年9月18日,06:54,Timo Walther <[hidden email]> 写道:
> > > > > >>
> > > > > >> Hi everyone,
> > > > > >>
> > > > > >> @Xuefu: I would like to avoid adding too many things
> > incrementally.
> > > > > Users should be able to override all catalog objects consistently
> > > > according
> > > > > to FLIP-64 (Support for Temporary Objects in Table module). If
> > > functions
> > > > > are treated completely different, we need more code and special
> > cases.
> > > > From
> > > > > an implementation perspective, this topic only affects the lookup
> > logic
> > > > > which is rather low implementation effort which is why I would like
> > to
> > > > > clarify the remaining items. As you said, we have a slight consenus
> > on
> > > > > overriding built-in functions; we should also strive for reaching
> > > > consensus
> > > > > on the remaining topics.
> > > > > >>
> > > > > >> @Dawid: I like your idea as it ensures registering catalog
> objects
> > > > > consistent and the overriding of built-in functions more explicit.
> > > > > >>
> > > > > >> Thanks,
> > > > > >> Timo
> > > > > >>
> > > > > >>
> > > > > >> On 17.09.19 11:59, kai wang wrote:
> > > > > >>> hi, everyone
> > > > > >>> I think this flip is very meaningful. it supports functions
> that
> > > can
> > > > be
> > > > > >>> shared by different catalogs and dbs, reducing the duplication
> of
> > > > > functions.
> > > > > >>>
> > > > > >>> Our group based on flink's sql parser module implements create
> > > > function
> > > > > >>> feature, stores the parsed function metadata and schema into
> > mysql,
> > > > and
> > > > > >>> also customizes the catalog, customizes sql-client to support
> > > custom
> > > > > >>> schemas and functions. Loaded, but the function is currently
> > > global,
> > > > > and is
> > > > > >>> not subdivided according to catalog and db.
> > > > > >>>
> > > > > >>> In addition, I very much hope to participate in the development
> > of
> > > > this
> > > > > >>> flip, I have been paying attention to the community, but found
> it
> > > is
> > > > > more
> > > > > >>> difficult to join.
> > > > > >>> thank you.
> > > > > >>>
> > > > > >>> Xuefu Z <[hidden email]> 于2019年9月17日周二 上午11:19写道:
> > > > > >>>
> > > > > >>>> Thanks to Tmo and Dawid for sharing thoughts.
> > > > > >>>>
> > > > > >>>> It seems to me that there is a general consensus on having
> temp
> > > > > functions
> > > > > >>>> that have no namespaces and overwrite built-in functions. (As
> a
> > > side
> > > > > note
> > > > > >>>> for comparability, the current user defined functions are all
> > > > > temporary and
> > > > > >>>> having no namespaces.)
> > > > > >>>>
> > > > > >>>> Nevertheless, I can also see the merit of having namespaced
> temp
> > > > > functions
> > > > > >>>> that can overwrite functions defined in a specific cat/db.
> > > However,
> > > > > this
> > > > > >>>> idea appears orthogonal to the former and can be added
> > > > incrementally.
> > > > > >>>>
> > > > > >>>> How about we first implement non-namespaced temp functions now
> > and
> > > > > leave
> > > > > >>>> the door open for namespaced ones for later releases as the
> > > > > requirement
> > > > > >>>> might become more crystal? This also helps shorten the debate
> > and
> > > > > allow us
> > > > > >>>> to make some progress along this direction.
> > > > > >>>>
> > > > > >>>> As to Dawid's idea of having a dedicated cat/db to host the
> > > > temporary
> > > > > temp
> > > > > >>>> functions that don't have namespaces, my only concern is the
> > > special
> > > > > >>>> treatment for a cat/db, which makes code less clean, as
> evident
> > in
> > > > > treating
> > > > > >>>> the built-in catalog currently.
> > > > > >>>>
> > > > > >>>> Thanks,
> > > > > >>>> Xuefiu
> > > > > >>>>
> > > > > >>>> On Mon, Sep 16, 2019 at 5:07 PM Dawid Wysakowicz <
> > > > > >>>> [hidden email]>
> > > > > >>>> wrote:
> > > > > >>>>
> > > > > >>>>> Hi,
> > > > > >>>>> Another idea to consider on top of Timo's suggestion. How
> about
> > > we
> > > > > have a
> > > > > >>>>> special namespace (catalog + database) for built-in objects?
> > This
> > > > > catalog
> > > > > >>>>> would be invisible for users as Xuefu was suggesting.
> > > > > >>>>>
> > > > > >>>>> Then users could still override built-in functions, if they
> > fully
> > > > > qualify
> > > > > >>>>> object with the built-in namespace, but by default the common
> > > logic
> > > > > of
> > > > > >>>>> current dB & cat would be used.
> > > > > >>>>>
> > > > > >>>>> CREATE TEMPORARY FUNCTION func ...
> > > > > >>>>> registers temporary function in current cat & dB
> > > > > >>>>>
> > > > > >>>>> CREATE TEMPORARY FUNCTION cat.db.func ...
> > > > > >>>>> registers temporary function in cat db
> > > > > >>>>>
> > > > > >>>>> CREATE TEMPORARY FUNCTION system.system.func ...
> > > > > >>>>> Overrides built-in function with temporary function
> > > > > >>>>>
> > > > > >>>>> The built-in/system namespace would not be writable for
> > permanent
> > > > > >>>> objects.
> > > > > >>>>> WDYT?
> > > > > >>>>>
> > > > > >>>>> This way I think we can have benefits of both solutions.
> > > > > >>>>>
> > > > > >>>>> Best,
> > > > > >>>>> Dawid
> > > > > >>>>>
> > > > > >>>>>
> > > > > >>>>> On Tue, 17 Sep 2019, 07:24 Timo Walther, <[hidden email]
> >
> > > > wrote:
> > > > > >>>>>
> > > > > >>>>>> Hi Bowen,
> > > > > >>>>>>
> > > > > >>>>>> I understand the potential benefit of overriding certain
> > > built-in
> > > > > >>>>>> functions. I'm open to such a feature if many people agree.
> > > > > However, it
> > > > > >>>>>> would be great to still support overriding catalog functions
> > > with
> > > > > >>>>>> temporary functions in order to prototype a query even
> though
> > a
> > > > > >>>>>> catalog/database might not be available currently or should
> > not
> > > be
> > > > > >>>>>> modified yet. How about we support both cases?
> > > > > >>>>>>
> > > > > >>>>>> CREATE TEMPORARY FUNCTION abs
> > > > > >>>>>> -> creates/overrides a built-in function and never
> consideres
> > > > > current
> > > > > >>>>>> catalog and database; inconsistent with other DDL but
> > acceptable
> > > > for
> > > > > >>>>>> functions I guess.
> > > > > >>>>>> CREATE TEMPORARY FUNCTION cat.db.fun
> > > > > >>>>>> -> creates/overrides a catalog function
> > > > > >>>>>>
> > > > > >>>>>> Regarding "Flink don't have any other built-in objects
> > (tables,
> > > > > views)
> > > > > >>>>>> except functions", this might change in the near future.
> Take
> > > > > >>>>>> https://issues.apache.org/jira/browse/FLINK-13900 as an
> > > example.
> > > > > >>>>>>
> > > > > >>>>>> Thanks,
> > > > > >>>>>> Timo
> > > > > >>>>>>
> > > > > >>>>>> On 14.09.19 01:40, Bowen Li wrote:
> > > > > >>>>>>> Hi Fabian,
> > > > > >>>>>>>
> > > > > >>>>>>> Yes, I agree 1-part/no-override is the least favorable
> thus I
> > > > > didn't
> > > > > >>>>>>> include that as a voting option, and the discussion is
> mainly
> > > > > between
> > > > > >>>>>>> 1-part/override builtin and 3-part/not override builtin.
> > > > > >>>>>>>
> > > > > >>>>>>> Re > However, it means that temp functions are differently
> > > > treated
> > > > > >>>> than
> > > > > >>>>>>> other db objects.
> > > > > >>>>>>> IMO, the treatment difference results from the fact that
> > > > functions
> > > > > >>>> are
> > > > > >>>>> a
> > > > > >>>>>>> bit different from other objects - Flink don't have any
> other
> > > > > >>>> built-in
> > > > > >>>>>>> objects (tables, views) except functions.
> > > > > >>>>>>>
> > > > > >>>>>>> Cheers,
> > > > > >>>>>>> Bowen
> > > > > >>>>>>>
> > > > > >>>>>>
> > > > > >>>>
> > > > > >>>> --
> > > > > >>>> Xuefu Zhang
> > > > > >>>>
> > > > > >>>> "In Honey We Trust!"
> > > > > >>>>
> > > > > >>
> > > > > >
> > > > >
> > > > >
> > > >
> > > > --
> > > > Xuefu Zhang
> > > >
> > > > "In Honey We Trust!"
> > > >
> > >
> >
> >
> > --
> > Xuefu Zhang
> >
> > "In Honey We Trust!"
> >
>


--
Xuefu Zhang

"In Honey We Trust!"
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-57 - Rework FunctionCatalog

Jark Wu-2
I agree with Xuefu that inconsistent handling with all the other objects is
not a big problem.

Regarding to option#3, the special "system.system" namespace may confuse
users.
Users need to know the set of built-in function names to know when to use
"system.system" namespace.
What will happen if user registers a non-builtin function name under the
"system.system" namespace?
Besides, I think it doesn't solve the "explode" problem I mentioned at the
beginning of this thread.

So here is my vote:

+1 for #1
0 for #2
-1 for #3

Best,
Jark


On Thu, 19 Sep 2019 at 08:38, Xuefu Z <[hidden email]> wrote:

> @Dawid, Re: we also don't need additional referencing the specialcatalog
> anywhere.
>
> True. But once we allow such reference, then user can do so in any possible
> place where a function name is expected, for which we have to handle.
> That's a big difference, I think.
>
> Thanks,
> Xuefu
>
> On Wed, Sep 18, 2019 at 5:25 PM Dawid Wysakowicz <
> [hidden email]>
> wrote:
>
> > @Bowen I am not suggesting introducing additional catalog. I think we
> need
> > to get rid of the current built-in catalog.
> >
> > @Xuefu in option #3 we also don't need additional referencing the special
> > catalog anywhere else besides in the CREATE statement. The resolution
> > behaviour is exactly the same in both options.
> >
> > On Thu, 19 Sep 2019, 08:17 Xuefu Z, <[hidden email]> wrote:
> >
> > > Hi Dawid,
> > >
> > > "GLOBAL" is a temporary keyword that was given to the approach. It can
> be
> > > changed to something else for better.
> > >
> > > The difference between this and the #3 approach is that we only need
> the
> > > keyword for this create DDL. For other places (such as function
> > > referencing), no keyword or special namespace is needed.
> > >
> > > Thanks,
> > > Xuefu
> > >
> > > On Wed, Sep 18, 2019 at 4:32 PM Dawid Wysakowicz <
> > > [hidden email]>
> > > wrote:
> > >
> > > > Hi,
> > > > I think it makes sense to start voting at this point.
> > > >
> > > > Option 1: Only 1-part identifiers
> > > > PROS:
> > > > - allows shadowing built-in functions
> > > > CONS:
> > > > - incosistent with all the other objects, both permanent & temporary
> > > > - does not allow shadowing catalog functions
> > > >
> > > > Option 2: Special keyword for built-in function
> > > > I think this is quite similar to the special catalog/db. The thing I
> am
> > > > strongly against in this proposal is the GLOBAL keyword. This keyword
> > > has a
> > > > meaning in rdbms systems and means a function that is present for a
> > > > lifetime of a session in which it was created, but available in all
> > other
> > > > sessions. Therefore I really don't want to use this keyword in a
> > > different
> > > > context.
> > > >
> > > > Option 3: Special catalog/db
> > > >
> > > > PROS:
> > > > - allows shadowing built-in functions
> > > > - allows shadowing catalog functions
> > > > - consistent with other objects
> > > > CONS:
> > > > - we introduce a special namespace for built-in functions
> > > >
> > > > I don't see a problem with introducing the special namespace. In the
> > end
> > > it
> > > > is very similar to the keyword approach. In this case the catalog/db
> > > > combination would be the "keyword"
> > > >
> > > > Therefore my votes:
> > > > Option 1: -0
> > > > Option 2: -1 (I might change to +0 if we can come up with a better
> > > keyword)
> > > > Option 3: +1
> > > >
> > > > Best,
> > > > Dawid
> > > >
> > > >
> > > > On Thu, 19 Sep 2019, 05:12 Xuefu Z, <[hidden email]> wrote:
> > > >
> > > > > Hi Aljoscha,
> > > > >
> > > > > Thanks for the summary and these are great questions to be
> answered.
> > > The
> > > > > answer to your first question is clear: there is a general
> agreement
> > to
> > > > > override built-in functions with temp functions.
> > > > >
> > > > > However, your second and third questions are sort of related, as a
> > > > function
> > > > > reference can be either just function name (like "func") or in the
> > form
> > > > or
> > > > > "cat.db.func". When a reference is just function name, it can mean
> > > > either a
> > > > > built-in function or a function defined in the current cat/db. If
> we
> > > > > support overriding a built-in function with a temp function, such
> > > > > overriding can also cover a function in the current cat/db.
> > > > >
> > > > > I think what Timo referred as "overriding a catalog function"
> means a
> > > > temp
> > > > > function defined as "cat.db.func" overrides a catalog function
> "func"
> > > in
> > > > > cat/db even if cat/db is not current. To support this, temp
> function
> > > has
> > > > to
> > > > > be tied to a cat/db. What's why I said above that the 2nd and 3rd
> > > > questions
> > > > > are related. The problem with such support is the ambiguity when
> user
> > > > > defines a function w/o namespace, "CREATE TEMPORARY FUNCTION func
> > ...".
> > > > > Here "func" can means a global temp function, or a temp function in
> > > > current
> > > > > cat/db. If we can assume the former, this creates an inconsistency
> > > > because
> > > > > "CREATE FUNCTION func" actually means a function in current cat/db.
> > If
> > > we
> > > > > assume the latter, then there is no way for user to create a global
> > > temp
> > > > > function.
> > > > >
> > > > > Giving a special namespace for built-in functions may solve the
> > > ambiguity
> > > > > problem above, but it also introduces artificial catalog/database
> > that
> > > > > needs special treatment and pollutes the cleanness of  the code. I
> > > would
> > > > > rather introduce a syntax in DDL to solve the problem, like "CREATE
> > > > > [GLOBAL] TEMPORARY FUNCTION func".
> > > > >
> > > > > Thus, I'd like to summarize a few candidate proposals for voting
> > > > purposes:
> > > > >
> > > > > 1. Support only global, temporary functions without namespace. Such
> > > temp
> > > > > functions overrides built-in functions and catalog functions in
> > current
> > > > > cat/db. The resolution order is: temp functions -> built-in
> functions
> > > ->
> > > > > catalog functions. (Partially or fully qualified functions has no
> > > > > ambiguity!)
> > > > >
> > > > > 2. In addition to #1, support creating and referencing temporary
> > > > functions
> > > > > associated with a cat/db with "GLOBAL" qualifier in DDL for global
> > temp
> > > > > functions. The resolution order is: global temp functions ->
> built-in
> > > > > functions -> temp functions in current cat/db -> catalog function.
> > > > > (Resolution for partially or fully qualified function reference is:
> > > temp
> > > > > functions -> persistent functions.)
> > > > >
> > > > > 3. In addition to #1, support creating and referencing temporary
> > > > functions
> > > > > associated with a cat/db with a special namespace for built-in
> > > functions
> > > > > and global temp functions. The resolution is the same as #2, except
> > > that
> > > > > the special namespace might be prefixed to a reference to a
> built-in
> > > > > function or global temp function. (In absence of the special
> > namespace,
> > > > the
> > > > > resolution order is the same as in #2.)
> > > > >
> > > > > My personal preference is #1, given the unknown use case and
> > introduced
> > > > > complexity for #2 and #3. However, #2 is an acceptable alternative.
> > > Thus,
> > > > > my votes are:
> > > > >
> > > > > +1 for #1
> > > > > +0 for #2
> > > > > -1 for #3
> > > > >
> > > > > Everyone, please cast your vote (in above format please!), or let
> me
> > > know
> > > > > if you have more questions or other candidates.
> > > > >
> > > > > Thanks,
> > > > > Xuefu
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > On Wed, Sep 18, 2019 at 6:42 AM Aljoscha Krettek <
> > [hidden email]>
> > > > > wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > I think this discussion and the one for FLIP-64 are very
> connected.
> > > To
> > > > > > resolve the differences, think we have to think about the basic
> > > > > principles
> > > > > > and find consensus there. The basic questions I see are:
> > > > > >
> > > > > >  - Do we want to support overriding builtin functions?
> > > > > >  - Do we want to support overriding catalog functions?
> > > > > >  - And then later: should temporary functions be tied to a
> > > > > > catalog/database?
> > > > > >
> > > > > > I don’t have much to say about these, except that we should
> > somewhat
> > > > > stick
> > > > > > to what the industry does. But I also understand that the
> industry
> > is
> > > > > > already very divided on this.
> > > > > >
> > > > > > Best,
> > > > > > Aljoscha
> > > > > >
> > > > > > > On 18. Sep 2019, at 11:41, Jark Wu <[hidden email]> wrote:
> > > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > > +1 to strive for reaching consensus on the remaining topics. We
> > are
> > > > > > close to the truth. It will waste a lot of time if we resume the
> > > topic
> > > > > some
> > > > > > time later.
> > > > > > >
> > > > > > > +1 to “1-part/override” and I’m also fine with Timo’s
> > “cat.db.fun”
> > > > way
> > > > > > to override a catalog function.
> > > > > > >
> > > > > > > I’m not sure about “system.system.fun”, it introduces a
> > nonexistent
> > > > cat
> > > > > > & db? And we still need to do special treatment for the dedicated
> > > > > > system.system cat & db?
> > > > > > >
> > > > > > > Best,
> > > > > > > Jark
> > > > > > >
> > > > > > >
> > > > > > >> 在 2019年9月18日,06:54,Timo Walther <[hidden email]> 写道:
> > > > > > >>
> > > > > > >> Hi everyone,
> > > > > > >>
> > > > > > >> @Xuefu: I would like to avoid adding too many things
> > > incrementally.
> > > > > > Users should be able to override all catalog objects consistently
> > > > > according
> > > > > > to FLIP-64 (Support for Temporary Objects in Table module). If
> > > > functions
> > > > > > are treated completely different, we need more code and special
> > > cases.
> > > > > From
> > > > > > an implementation perspective, this topic only affects the lookup
> > > logic
> > > > > > which is rather low implementation effort which is why I would
> like
> > > to
> > > > > > clarify the remaining items. As you said, we have a slight
> consenus
> > > on
> > > > > > overriding built-in functions; we should also strive for reaching
> > > > > consensus
> > > > > > on the remaining topics.
> > > > > > >>
> > > > > > >> @Dawid: I like your idea as it ensures registering catalog
> > objects
> > > > > > consistent and the overriding of built-in functions more
> explicit.
> > > > > > >>
> > > > > > >> Thanks,
> > > > > > >> Timo
> > > > > > >>
> > > > > > >>
> > > > > > >> On 17.09.19 11:59, kai wang wrote:
> > > > > > >>> hi, everyone
> > > > > > >>> I think this flip is very meaningful. it supports functions
> > that
> > > > can
> > > > > be
> > > > > > >>> shared by different catalogs and dbs, reducing the
> duplication
> > of
> > > > > > functions.
> > > > > > >>>
> > > > > > >>> Our group based on flink's sql parser module implements
> create
> > > > > function
> > > > > > >>> feature, stores the parsed function metadata and schema into
> > > mysql,
> > > > > and
> > > > > > >>> also customizes the catalog, customizes sql-client to support
> > > > custom
> > > > > > >>> schemas and functions. Loaded, but the function is currently
> > > > global,
> > > > > > and is
> > > > > > >>> not subdivided according to catalog and db.
> > > > > > >>>
> > > > > > >>> In addition, I very much hope to participate in the
> development
> > > of
> > > > > this
> > > > > > >>> flip, I have been paying attention to the community, but
> found
> > it
> > > > is
> > > > > > more
> > > > > > >>> difficult to join.
> > > > > > >>> thank you.
> > > > > > >>>
> > > > > > >>> Xuefu Z <[hidden email]> 于2019年9月17日周二 上午11:19写道:
> > > > > > >>>
> > > > > > >>>> Thanks to Tmo and Dawid for sharing thoughts.
> > > > > > >>>>
> > > > > > >>>> It seems to me that there is a general consensus on having
> > temp
> > > > > > functions
> > > > > > >>>> that have no namespaces and overwrite built-in functions.
> (As
> > a
> > > > side
> > > > > > note
> > > > > > >>>> for comparability, the current user defined functions are
> all
> > > > > > temporary and
> > > > > > >>>> having no namespaces.)
> > > > > > >>>>
> > > > > > >>>> Nevertheless, I can also see the merit of having namespaced
> > temp
> > > > > > functions
> > > > > > >>>> that can overwrite functions defined in a specific cat/db.
> > > > However,
> > > > > > this
> > > > > > >>>> idea appears orthogonal to the former and can be added
> > > > > incrementally.
> > > > > > >>>>
> > > > > > >>>> How about we first implement non-namespaced temp functions
> now
> > > and
> > > > > > leave
> > > > > > >>>> the door open for namespaced ones for later releases as the
> > > > > > requirement
> > > > > > >>>> might become more crystal? This also helps shorten the
> debate
> > > and
> > > > > > allow us
> > > > > > >>>> to make some progress along this direction.
> > > > > > >>>>
> > > > > > >>>> As to Dawid's idea of having a dedicated cat/db to host the
> > > > > temporary
> > > > > > temp
> > > > > > >>>> functions that don't have namespaces, my only concern is the
> > > > special
> > > > > > >>>> treatment for a cat/db, which makes code less clean, as
> > evident
> > > in
> > > > > > treating
> > > > > > >>>> the built-in catalog currently.
> > > > > > >>>>
> > > > > > >>>> Thanks,
> > > > > > >>>> Xuefiu
> > > > > > >>>>
> > > > > > >>>> On Mon, Sep 16, 2019 at 5:07 PM Dawid Wysakowicz <
> > > > > > >>>> [hidden email]>
> > > > > > >>>> wrote:
> > > > > > >>>>
> > > > > > >>>>> Hi,
> > > > > > >>>>> Another idea to consider on top of Timo's suggestion. How
> > about
> > > > we
> > > > > > have a
> > > > > > >>>>> special namespace (catalog + database) for built-in
> objects?
> > > This
> > > > > > catalog
> > > > > > >>>>> would be invisible for users as Xuefu was suggesting.
> > > > > > >>>>>
> > > > > > >>>>> Then users could still override built-in functions, if they
> > > fully
> > > > > > qualify
> > > > > > >>>>> object with the built-in namespace, but by default the
> common
> > > > logic
> > > > > > of
> > > > > > >>>>> current dB & cat would be used.
> > > > > > >>>>>
> > > > > > >>>>> CREATE TEMPORARY FUNCTION func ...
> > > > > > >>>>> registers temporary function in current cat & dB
> > > > > > >>>>>
> > > > > > >>>>> CREATE TEMPORARY FUNCTION cat.db.func ...
> > > > > > >>>>> registers temporary function in cat db
> > > > > > >>>>>
> > > > > > >>>>> CREATE TEMPORARY FUNCTION system.system.func ...
> > > > > > >>>>> Overrides built-in function with temporary function
> > > > > > >>>>>
> > > > > > >>>>> The built-in/system namespace would not be writable for
> > > permanent
> > > > > > >>>> objects.
> > > > > > >>>>> WDYT?
> > > > > > >>>>>
> > > > > > >>>>> This way I think we can have benefits of both solutions.
> > > > > > >>>>>
> > > > > > >>>>> Best,
> > > > > > >>>>> Dawid
> > > > > > >>>>>
> > > > > > >>>>>
> > > > > > >>>>> On Tue, 17 Sep 2019, 07:24 Timo Walther, <
> [hidden email]
> > >
> > > > > wrote:
> > > > > > >>>>>
> > > > > > >>>>>> Hi Bowen,
> > > > > > >>>>>>
> > > > > > >>>>>> I understand the potential benefit of overriding certain
> > > > built-in
> > > > > > >>>>>> functions. I'm open to such a feature if many people
> agree.
> > > > > > However, it
> > > > > > >>>>>> would be great to still support overriding catalog
> functions
> > > > with
> > > > > > >>>>>> temporary functions in order to prototype a query even
> > though
> > > a
> > > > > > >>>>>> catalog/database might not be available currently or
> should
> > > not
> > > > be
> > > > > > >>>>>> modified yet. How about we support both cases?
> > > > > > >>>>>>
> > > > > > >>>>>> CREATE TEMPORARY FUNCTION abs
> > > > > > >>>>>> -> creates/overrides a built-in function and never
> > consideres
> > > > > > current
> > > > > > >>>>>> catalog and database; inconsistent with other DDL but
> > > acceptable
> > > > > for
> > > > > > >>>>>> functions I guess.
> > > > > > >>>>>> CREATE TEMPORARY FUNCTION cat.db.fun
> > > > > > >>>>>> -> creates/overrides a catalog function
> > > > > > >>>>>>
> > > > > > >>>>>> Regarding "Flink don't have any other built-in objects
> > > (tables,
> > > > > > views)
> > > > > > >>>>>> except functions", this might change in the near future.
> > Take
> > > > > > >>>>>> https://issues.apache.org/jira/browse/FLINK-13900 as an
> > > > example.
> > > > > > >>>>>>
> > > > > > >>>>>> Thanks,
> > > > > > >>>>>> Timo
> > > > > > >>>>>>
> > > > > > >>>>>> On 14.09.19 01:40, Bowen Li wrote:
> > > > > > >>>>>>> Hi Fabian,
> > > > > > >>>>>>>
> > > > > > >>>>>>> Yes, I agree 1-part/no-override is the least favorable
> > thus I
> > > > > > didn't
> > > > > > >>>>>>> include that as a voting option, and the discussion is
> > mainly
> > > > > > between
> > > > > > >>>>>>> 1-part/override builtin and 3-part/not override builtin.
> > > > > > >>>>>>>
> > > > > > >>>>>>> Re > However, it means that temp functions are
> differently
> > > > > treated
> > > > > > >>>> than
> > > > > > >>>>>>> other db objects.
> > > > > > >>>>>>> IMO, the treatment difference results from the fact that
> > > > > functions
> > > > > > >>>> are
> > > > > > >>>>> a
> > > > > > >>>>>>> bit different from other objects - Flink don't have any
> > other
> > > > > > >>>> built-in
> > > > > > >>>>>>> objects (tables, views) except functions.
> > > > > > >>>>>>>
> > > > > > >>>>>>> Cheers,
> > > > > > >>>>>>> Bowen
> > > > > > >>>>>>>
> > > > > > >>>>>>
> > > > > > >>>>
> > > > > > >>>> --
> > > > > > >>>> Xuefu Zhang
> > > > > > >>>>
> > > > > > >>>> "In Honey We Trust!"
> > > > > > >>>>
> > > > > > >>
> > > > > > >
> > > > > >
> > > > > >
> > > > >
> > > > > --
> > > > > Xuefu Zhang
> > > > >
> > > > > "In Honey We Trust!"
> > > > >
> > > >
> > >
> > >
> > > --
> > > Xuefu Zhang
> > >
> > > "In Honey We Trust!"
> > >
> >
>
>
> --
> Xuefu Zhang
>
> "In Honey We Trust!"
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-57 - Rework FunctionCatalog

Piotr Nowojski-3
Hi,

It is a quite long discussion to follow and I hope I didn’t misunderstand anything. From the proposals presented by Xuefu I would vote:

-1 for #1 and #2
+1 for #3

Besides #3 being IMO more general and more consistent, having qualified names (#3) would help/make easier for someone to use cross databases/catalogs queries (joining multiple data sets/streams). For example with some functions to manipulate/clean up/convert the stored data in different catalogs registered in the respective catalogs.

Piotrek

> On 19 Sep 2019, at 06:35, Jark Wu <[hidden email]> wrote:
>
> I agree with Xuefu that inconsistent handling with all the other objects is
> not a big problem.
>
> Regarding to option#3, the special "system.system" namespace may confuse
> users.
> Users need to know the set of built-in function names to know when to use
> "system.system" namespace.
> What will happen if user registers a non-builtin function name under the
> "system.system" namespace?
> Besides, I think it doesn't solve the "explode" problem I mentioned at the
> beginning of this thread.
>
> So here is my vote:
>
> +1 for #1
> 0 for #2
> -1 for #3
>
> Best,
> Jark
>
>
> On Thu, 19 Sep 2019 at 08:38, Xuefu Z <[hidden email]> wrote:
>
>> @Dawid, Re: we also don't need additional referencing the specialcatalog
>> anywhere.
>>
>> True. But once we allow such reference, then user can do so in any possible
>> place where a function name is expected, for which we have to handle.
>> That's a big difference, I think.
>>
>> Thanks,
>> Xuefu
>>
>> On Wed, Sep 18, 2019 at 5:25 PM Dawid Wysakowicz <
>> [hidden email]>
>> wrote:
>>
>>> @Bowen I am not suggesting introducing additional catalog. I think we
>> need
>>> to get rid of the current built-in catalog.
>>>
>>> @Xuefu in option #3 we also don't need additional referencing the special
>>> catalog anywhere else besides in the CREATE statement. The resolution
>>> behaviour is exactly the same in both options.
>>>
>>> On Thu, 19 Sep 2019, 08:17 Xuefu Z, <[hidden email]> wrote:
>>>
>>>> Hi Dawid,
>>>>
>>>> "GLOBAL" is a temporary keyword that was given to the approach. It can
>> be
>>>> changed to something else for better.
>>>>
>>>> The difference between this and the #3 approach is that we only need
>> the
>>>> keyword for this create DDL. For other places (such as function
>>>> referencing), no keyword or special namespace is needed.
>>>>
>>>> Thanks,
>>>> Xuefu
>>>>
>>>> On Wed, Sep 18, 2019 at 4:32 PM Dawid Wysakowicz <
>>>> [hidden email]>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>> I think it makes sense to start voting at this point.
>>>>>
>>>>> Option 1: Only 1-part identifiers
>>>>> PROS:
>>>>> - allows shadowing built-in functions
>>>>> CONS:
>>>>> - incosistent with all the other objects, both permanent & temporary
>>>>> - does not allow shadowing catalog functions
>>>>>
>>>>> Option 2: Special keyword for built-in function
>>>>> I think this is quite similar to the special catalog/db. The thing I
>> am
>>>>> strongly against in this proposal is the GLOBAL keyword. This keyword
>>>> has a
>>>>> meaning in rdbms systems and means a function that is present for a
>>>>> lifetime of a session in which it was created, but available in all
>>> other
>>>>> sessions. Therefore I really don't want to use this keyword in a
>>>> different
>>>>> context.
>>>>>
>>>>> Option 3: Special catalog/db
>>>>>
>>>>> PROS:
>>>>> - allows shadowing built-in functions
>>>>> - allows shadowing catalog functions
>>>>> - consistent with other objects
>>>>> CONS:
>>>>> - we introduce a special namespace for built-in functions
>>>>>
>>>>> I don't see a problem with introducing the special namespace. In the
>>> end
>>>> it
>>>>> is very similar to the keyword approach. In this case the catalog/db
>>>>> combination would be the "keyword"
>>>>>
>>>>> Therefore my votes:
>>>>> Option 1: -0
>>>>> Option 2: -1 (I might change to +0 if we can come up with a better
>>>> keyword)
>>>>> Option 3: +1
>>>>>
>>>>> Best,
>>>>> Dawid
>>>>>
>>>>>
>>>>> On Thu, 19 Sep 2019, 05:12 Xuefu Z, <[hidden email]> wrote:
>>>>>
>>>>>> Hi Aljoscha,
>>>>>>
>>>>>> Thanks for the summary and these are great questions to be
>> answered.
>>>> The
>>>>>> answer to your first question is clear: there is a general
>> agreement
>>> to
>>>>>> override built-in functions with temp functions.
>>>>>>
>>>>>> However, your second and third questions are sort of related, as a
>>>>> function
>>>>>> reference can be either just function name (like "func") or in the
>>> form
>>>>> or
>>>>>> "cat.db.func". When a reference is just function name, it can mean
>>>>> either a
>>>>>> built-in function or a function defined in the current cat/db. If
>> we
>>>>>> support overriding a built-in function with a temp function, such
>>>>>> overriding can also cover a function in the current cat/db.
>>>>>>
>>>>>> I think what Timo referred as "overriding a catalog function"
>> means a
>>>>> temp
>>>>>> function defined as "cat.db.func" overrides a catalog function
>> "func"
>>>> in
>>>>>> cat/db even if cat/db is not current. To support this, temp
>> function
>>>> has
>>>>> to
>>>>>> be tied to a cat/db. What's why I said above that the 2nd and 3rd
>>>>> questions
>>>>>> are related. The problem with such support is the ambiguity when
>> user
>>>>>> defines a function w/o namespace, "CREATE TEMPORARY FUNCTION func
>>> ...".
>>>>>> Here "func" can means a global temp function, or a temp function in
>>>>> current
>>>>>> cat/db. If we can assume the former, this creates an inconsistency
>>>>> because
>>>>>> "CREATE FUNCTION func" actually means a function in current cat/db.
>>> If
>>>> we
>>>>>> assume the latter, then there is no way for user to create a global
>>>> temp
>>>>>> function.
>>>>>>
>>>>>> Giving a special namespace for built-in functions may solve the
>>>> ambiguity
>>>>>> problem above, but it also introduces artificial catalog/database
>>> that
>>>>>> needs special treatment and pollutes the cleanness of  the code. I
>>>> would
>>>>>> rather introduce a syntax in DDL to solve the problem, like "CREATE
>>>>>> [GLOBAL] TEMPORARY FUNCTION func".
>>>>>>
>>>>>> Thus, I'd like to summarize a few candidate proposals for voting
>>>>> purposes:
>>>>>>
>>>>>> 1. Support only global, temporary functions without namespace. Such
>>>> temp
>>>>>> functions overrides built-in functions and catalog functions in
>>> current
>>>>>> cat/db. The resolution order is: temp functions -> built-in
>> functions
>>>> ->
>>>>>> catalog functions. (Partially or fully qualified functions has no
>>>>>> ambiguity!)
>>>>>>
>>>>>> 2. In addition to #1, support creating and referencing temporary
>>>>> functions
>>>>>> associated with a cat/db with "GLOBAL" qualifier in DDL for global
>>> temp
>>>>>> functions. The resolution order is: global temp functions ->
>> built-in
>>>>>> functions -> temp functions in current cat/db -> catalog function.
>>>>>> (Resolution for partially or fully qualified function reference is:
>>>> temp
>>>>>> functions -> persistent functions.)
>>>>>>
>>>>>> 3. In addition to #1, support creating and referencing temporary
>>>>> functions
>>>>>> associated with a cat/db with a special namespace for built-in
>>>> functions
>>>>>> and global temp functions. The resolution is the same as #2, except
>>>> that
>>>>>> the special namespace might be prefixed to a reference to a
>> built-in
>>>>>> function or global temp function. (In absence of the special
>>> namespace,
>>>>> the
>>>>>> resolution order is the same as in #2.)
>>>>>>
>>>>>> My personal preference is #1, given the unknown use case and
>>> introduced
>>>>>> complexity for #2 and #3. However, #2 is an acceptable alternative.
>>>> Thus,
>>>>>> my votes are:
>>>>>>
>>>>>> +1 for #1
>>>>>> +0 for #2
>>>>>> -1 for #3
>>>>>>
>>>>>> Everyone, please cast your vote (in above format please!), or let
>> me
>>>> know
>>>>>> if you have more questions or other candidates.
>>>>>>
>>>>>> Thanks,
>>>>>> Xuefu
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, Sep 18, 2019 at 6:42 AM Aljoscha Krettek <
>>> [hidden email]>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I think this discussion and the one for FLIP-64 are very
>> connected.
>>>> To
>>>>>>> resolve the differences, think we have to think about the basic
>>>>>> principles
>>>>>>> and find consensus there. The basic questions I see are:
>>>>>>>
>>>>>>> - Do we want to support overriding builtin functions?
>>>>>>> - Do we want to support overriding catalog functions?
>>>>>>> - And then later: should temporary functions be tied to a
>>>>>>> catalog/database?
>>>>>>>
>>>>>>> I don’t have much to say about these, except that we should
>>> somewhat
>>>>>> stick
>>>>>>> to what the industry does. But I also understand that the
>> industry
>>> is
>>>>>>> already very divided on this.
>>>>>>>
>>>>>>> Best,
>>>>>>> Aljoscha
>>>>>>>
>>>>>>>> On 18. Sep 2019, at 11:41, Jark Wu <[hidden email]> wrote:
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> +1 to strive for reaching consensus on the remaining topics. We
>>> are
>>>>>>> close to the truth. It will waste a lot of time if we resume the
>>>> topic
>>>>>> some
>>>>>>> time later.
>>>>>>>>
>>>>>>>> +1 to “1-part/override” and I’m also fine with Timo’s
>>> “cat.db.fun”
>>>>> way
>>>>>>> to override a catalog function.
>>>>>>>>
>>>>>>>> I’m not sure about “system.system.fun”, it introduces a
>>> nonexistent
>>>>> cat
>>>>>>> & db? And we still need to do special treatment for the dedicated
>>>>>>> system.system cat & db?
>>>>>>>>
>>>>>>>> Best,
>>>>>>>> Jark
>>>>>>>>
>>>>>>>>
>>>>>>>>> 在 2019年9月18日,06:54,Timo Walther <[hidden email]> 写道:
>>>>>>>>>
>>>>>>>>> Hi everyone,
>>>>>>>>>
>>>>>>>>> @Xuefu: I would like to avoid adding too many things
>>>> incrementally.
>>>>>>> Users should be able to override all catalog objects consistently
>>>>>> according
>>>>>>> to FLIP-64 (Support for Temporary Objects in Table module). If
>>>>> functions
>>>>>>> are treated completely different, we need more code and special
>>>> cases.
>>>>>> From
>>>>>>> an implementation perspective, this topic only affects the lookup
>>>> logic
>>>>>>> which is rather low implementation effort which is why I would
>> like
>>>> to
>>>>>>> clarify the remaining items. As you said, we have a slight
>> consenus
>>>> on
>>>>>>> overriding built-in functions; we should also strive for reaching
>>>>>> consensus
>>>>>>> on the remaining topics.
>>>>>>>>>
>>>>>>>>> @Dawid: I like your idea as it ensures registering catalog
>>> objects
>>>>>>> consistent and the overriding of built-in functions more
>> explicit.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Timo
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 17.09.19 11:59, kai wang wrote:
>>>>>>>>>> hi, everyone
>>>>>>>>>> I think this flip is very meaningful. it supports functions
>>> that
>>>>> can
>>>>>> be
>>>>>>>>>> shared by different catalogs and dbs, reducing the
>> duplication
>>> of
>>>>>>> functions.
>>>>>>>>>>
>>>>>>>>>> Our group based on flink's sql parser module implements
>> create
>>>>>> function
>>>>>>>>>> feature, stores the parsed function metadata and schema into
>>>> mysql,
>>>>>> and
>>>>>>>>>> also customizes the catalog, customizes sql-client to support
>>>>> custom
>>>>>>>>>> schemas and functions. Loaded, but the function is currently
>>>>> global,
>>>>>>> and is
>>>>>>>>>> not subdivided according to catalog and db.
>>>>>>>>>>
>>>>>>>>>> In addition, I very much hope to participate in the
>> development
>>>> of
>>>>>> this
>>>>>>>>>> flip, I have been paying attention to the community, but
>> found
>>> it
>>>>> is
>>>>>>> more
>>>>>>>>>> difficult to join.
>>>>>>>>>> thank you.
>>>>>>>>>>
>>>>>>>>>> Xuefu Z <[hidden email]> 于2019年9月17日周二 上午11:19写道:
>>>>>>>>>>
>>>>>>>>>>> Thanks to Tmo and Dawid for sharing thoughts.
>>>>>>>>>>>
>>>>>>>>>>> It seems to me that there is a general consensus on having
>>> temp
>>>>>>> functions
>>>>>>>>>>> that have no namespaces and overwrite built-in functions.
>> (As
>>> a
>>>>> side
>>>>>>> note
>>>>>>>>>>> for comparability, the current user defined functions are
>> all
>>>>>>> temporary and
>>>>>>>>>>> having no namespaces.)
>>>>>>>>>>>
>>>>>>>>>>> Nevertheless, I can also see the merit of having namespaced
>>> temp
>>>>>>> functions
>>>>>>>>>>> that can overwrite functions defined in a specific cat/db.
>>>>> However,
>>>>>>> this
>>>>>>>>>>> idea appears orthogonal to the former and can be added
>>>>>> incrementally.
>>>>>>>>>>>
>>>>>>>>>>> How about we first implement non-namespaced temp functions
>> now
>>>> and
>>>>>>> leave
>>>>>>>>>>> the door open for namespaced ones for later releases as the
>>>>>>> requirement
>>>>>>>>>>> might become more crystal? This also helps shorten the
>> debate
>>>> and
>>>>>>> allow us
>>>>>>>>>>> to make some progress along this direction.
>>>>>>>>>>>
>>>>>>>>>>> As to Dawid's idea of having a dedicated cat/db to host the
>>>>>> temporary
>>>>>>> temp
>>>>>>>>>>> functions that don't have namespaces, my only concern is the
>>>>> special
>>>>>>>>>>> treatment for a cat/db, which makes code less clean, as
>>> evident
>>>> in
>>>>>>> treating
>>>>>>>>>>> the built-in catalog currently.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Xuefiu
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Sep 16, 2019 at 5:07 PM Dawid Wysakowicz <
>>>>>>>>>>> [hidden email]>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi,
>>>>>>>>>>>> Another idea to consider on top of Timo's suggestion. How
>>> about
>>>>> we
>>>>>>> have a
>>>>>>>>>>>> special namespace (catalog + database) for built-in
>> objects?
>>>> This
>>>>>>> catalog
>>>>>>>>>>>> would be invisible for users as Xuefu was suggesting.
>>>>>>>>>>>>
>>>>>>>>>>>> Then users could still override built-in functions, if they
>>>> fully
>>>>>>> qualify
>>>>>>>>>>>> object with the built-in namespace, but by default the
>> common
>>>>> logic
>>>>>>> of
>>>>>>>>>>>> current dB & cat would be used.
>>>>>>>>>>>>
>>>>>>>>>>>> CREATE TEMPORARY FUNCTION func ...
>>>>>>>>>>>> registers temporary function in current cat & dB
>>>>>>>>>>>>
>>>>>>>>>>>> CREATE TEMPORARY FUNCTION cat.db.func ...
>>>>>>>>>>>> registers temporary function in cat db
>>>>>>>>>>>>
>>>>>>>>>>>> CREATE TEMPORARY FUNCTION system.system.func ...
>>>>>>>>>>>> Overrides built-in function with temporary function
>>>>>>>>>>>>
>>>>>>>>>>>> The built-in/system namespace would not be writable for
>>>> permanent
>>>>>>>>>>> objects.
>>>>>>>>>>>> WDYT?
>>>>>>>>>>>>
>>>>>>>>>>>> This way I think we can have benefits of both solutions.
>>>>>>>>>>>>
>>>>>>>>>>>> Best,
>>>>>>>>>>>> Dawid
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, 17 Sep 2019, 07:24 Timo Walther, <
>> [hidden email]
>>>>
>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Bowen,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I understand the potential benefit of overriding certain
>>>>> built-in
>>>>>>>>>>>>> functions. I'm open to such a feature if many people
>> agree.
>>>>>>> However, it
>>>>>>>>>>>>> would be great to still support overriding catalog
>> functions
>>>>> with
>>>>>>>>>>>>> temporary functions in order to prototype a query even
>>> though
>>>> a
>>>>>>>>>>>>> catalog/database might not be available currently or
>> should
>>>> not
>>>>> be
>>>>>>>>>>>>> modified yet. How about we support both cases?
>>>>>>>>>>>>>
>>>>>>>>>>>>> CREATE TEMPORARY FUNCTION abs
>>>>>>>>>>>>> -> creates/overrides a built-in function and never
>>> consideres
>>>>>>> current
>>>>>>>>>>>>> catalog and database; inconsistent with other DDL but
>>>> acceptable
>>>>>> for
>>>>>>>>>>>>> functions I guess.
>>>>>>>>>>>>> CREATE TEMPORARY FUNCTION cat.db.fun
>>>>>>>>>>>>> -> creates/overrides a catalog function
>>>>>>>>>>>>>
>>>>>>>>>>>>> Regarding "Flink don't have any other built-in objects
>>>> (tables,
>>>>>>> views)
>>>>>>>>>>>>> except functions", this might change in the near future.
>>> Take
>>>>>>>>>>>>> https://issues.apache.org/jira/browse/FLINK-13900 as an
>>>>> example.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Timo
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 14.09.19 01:40, Bowen Li wrote:
>>>>>>>>>>>>>> Hi Fabian,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Yes, I agree 1-part/no-override is the least favorable
>>> thus I
>>>>>>> didn't
>>>>>>>>>>>>>> include that as a voting option, and the discussion is
>>> mainly
>>>>>>> between
>>>>>>>>>>>>>> 1-part/override builtin and 3-part/not override builtin.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Re > However, it means that temp functions are
>> differently
>>>>>> treated
>>>>>>>>>>> than
>>>>>>>>>>>>>> other db objects.
>>>>>>>>>>>>>> IMO, the treatment difference results from the fact that
>>>>>> functions
>>>>>>>>>>> are
>>>>>>>>>>>> a
>>>>>>>>>>>>>> bit different from other objects - Flink don't have any
>>> other
>>>>>>>>>>> built-in
>>>>>>>>>>>>>> objects (tables, views) except functions.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>> Bowen
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Xuefu Zhang
>>>>>>>>>>>
>>>>>>>>>>> "In Honey We Trust!"
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> --
>>>>>> Xuefu Zhang
>>>>>>
>>>>>> "In Honey We Trust!"
>>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Xuefu Zhang
>>>>
>>>> "In Honey We Trust!"
>>>>
>>>
>>
>>
>> --
>> Xuefu Zhang
>>
>> "In Honey We Trust!"
>>

Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-57 - Rework FunctionCatalog

Kurt Young
Looks like I'm the only person who is willing to +1 to #2 for now :-)
But I would suggest to change the keyword from GLOBAL to
something like BUILTIN.

I think #2 and #3 are almost the same proposal, just with different
format to indicate whether it want to override built-in functions.

My biggest reason to choose it is I want this behavior be consistent
with temporal tables. I will give some examples to show the behavior
and also make sure I'm not misunderstanding anything here.

For most DBs, when user create a temporary table with:

CREATE TEMPORARY TABLE t1

It's actually equivalent with:

CREATE TEMPORARY TABLE `curent_db`.t1

If user change current database, they will not be able to access t1 without
fully qualified name, .i.e db1.t1 (assuming db1 is current database when
this temporary table is created).

Only #2 and #3 followed this behavior and I would vote for this since this
makes such behavior consistent through temporal tables and functions.

Why I'm not voting for #3 is a special catalog and database just looks very
hacky to me. It gave a imply that our built-in functions saved at a special
catalog and database, which is actually not. Introducing a dedicated keyword
like CREATE TEMPORARY BUILTIN FUNCTION looks more clear and
straightforward. One can argue that we should avoid introducing new keyword,
but it's also very rare that a system can overwrite built-in functions.
Since we
decided to support this, introduce a new keyword is not a big deal IMO.

Best,
Kurt


On Thu, Sep 19, 2019 at 3:07 PM Piotr Nowojski <[hidden email]> wrote:

> Hi,
>
> It is a quite long discussion to follow and I hope I didn’t misunderstand
> anything. From the proposals presented by Xuefu I would vote:
>
> -1 for #1 and #2
> +1 for #3
>
> Besides #3 being IMO more general and more consistent, having qualified
> names (#3) would help/make easier for someone to use cross
> databases/catalogs queries (joining multiple data sets/streams). For
> example with some functions to manipulate/clean up/convert the stored data
> in different catalogs registered in the respective catalogs.
>
> Piotrek
>
> > On 19 Sep 2019, at 06:35, Jark Wu <[hidden email]> wrote:
> >
> > I agree with Xuefu that inconsistent handling with all the other objects
> is
> > not a big problem.
> >
> > Regarding to option#3, the special "system.system" namespace may confuse
> > users.
> > Users need to know the set of built-in function names to know when to use
> > "system.system" namespace.
> > What will happen if user registers a non-builtin function name under the
> > "system.system" namespace?
> > Besides, I think it doesn't solve the "explode" problem I mentioned at
> the
> > beginning of this thread.
> >
> > So here is my vote:
> >
> > +1 for #1
> > 0 for #2
> > -1 for #3
> >
> > Best,
> > Jark
> >
> >
> > On Thu, 19 Sep 2019 at 08:38, Xuefu Z <[hidden email]> wrote:
> >
> >> @Dawid, Re: we also don't need additional referencing the specialcatalog
> >> anywhere.
> >>
> >> True. But once we allow such reference, then user can do so in any
> possible
> >> place where a function name is expected, for which we have to handle.
> >> That's a big difference, I think.
> >>
> >> Thanks,
> >> Xuefu
> >>
> >> On Wed, Sep 18, 2019 at 5:25 PM Dawid Wysakowicz <
> >> [hidden email]>
> >> wrote:
> >>
> >>> @Bowen I am not suggesting introducing additional catalog. I think we
> >> need
> >>> to get rid of the current built-in catalog.
> >>>
> >>> @Xuefu in option #3 we also don't need additional referencing the
> special
> >>> catalog anywhere else besides in the CREATE statement. The resolution
> >>> behaviour is exactly the same in both options.
> >>>
> >>> On Thu, 19 Sep 2019, 08:17 Xuefu Z, <[hidden email]> wrote:
> >>>
> >>>> Hi Dawid,
> >>>>
> >>>> "GLOBAL" is a temporary keyword that was given to the approach. It can
> >> be
> >>>> changed to something else for better.
> >>>>
> >>>> The difference between this and the #3 approach is that we only need
> >> the
> >>>> keyword for this create DDL. For other places (such as function
> >>>> referencing), no keyword or special namespace is needed.
> >>>>
> >>>> Thanks,
> >>>> Xuefu
> >>>>
> >>>> On Wed, Sep 18, 2019 at 4:32 PM Dawid Wysakowicz <
> >>>> [hidden email]>
> >>>> wrote:
> >>>>
> >>>>> Hi,
> >>>>> I think it makes sense to start voting at this point.
> >>>>>
> >>>>> Option 1: Only 1-part identifiers
> >>>>> PROS:
> >>>>> - allows shadowing built-in functions
> >>>>> CONS:
> >>>>> - incosistent with all the other objects, both permanent & temporary
> >>>>> - does not allow shadowing catalog functions
> >>>>>
> >>>>> Option 2: Special keyword for built-in function
> >>>>> I think this is quite similar to the special catalog/db. The thing I
> >> am
> >>>>> strongly against in this proposal is the GLOBAL keyword. This keyword
> >>>> has a
> >>>>> meaning in rdbms systems and means a function that is present for a
> >>>>> lifetime of a session in which it was created, but available in all
> >>> other
> >>>>> sessions. Therefore I really don't want to use this keyword in a
> >>>> different
> >>>>> context.
> >>>>>
> >>>>> Option 3: Special catalog/db
> >>>>>
> >>>>> PROS:
> >>>>> - allows shadowing built-in functions
> >>>>> - allows shadowing catalog functions
> >>>>> - consistent with other objects
> >>>>> CONS:
> >>>>> - we introduce a special namespace for built-in functions
> >>>>>
> >>>>> I don't see a problem with introducing the special namespace. In the
> >>> end
> >>>> it
> >>>>> is very similar to the keyword approach. In this case the catalog/db
> >>>>> combination would be the "keyword"
> >>>>>
> >>>>> Therefore my votes:
> >>>>> Option 1: -0
> >>>>> Option 2: -1 (I might change to +0 if we can come up with a better
> >>>> keyword)
> >>>>> Option 3: +1
> >>>>>
> >>>>> Best,
> >>>>> Dawid
> >>>>>
> >>>>>
> >>>>> On Thu, 19 Sep 2019, 05:12 Xuefu Z, <[hidden email]> wrote:
> >>>>>
> >>>>>> Hi Aljoscha,
> >>>>>>
> >>>>>> Thanks for the summary and these are great questions to be
> >> answered.
> >>>> The
> >>>>>> answer to your first question is clear: there is a general
> >> agreement
> >>> to
> >>>>>> override built-in functions with temp functions.
> >>>>>>
> >>>>>> However, your second and third questions are sort of related, as a
> >>>>> function
> >>>>>> reference can be either just function name (like "func") or in the
> >>> form
> >>>>> or
> >>>>>> "cat.db.func". When a reference is just function name, it can mean
> >>>>> either a
> >>>>>> built-in function or a function defined in the current cat/db. If
> >> we
> >>>>>> support overriding a built-in function with a temp function, such
> >>>>>> overriding can also cover a function in the current cat/db.
> >>>>>>
> >>>>>> I think what Timo referred as "overriding a catalog function"
> >> means a
> >>>>> temp
> >>>>>> function defined as "cat.db.func" overrides a catalog function
> >> "func"
> >>>> in
> >>>>>> cat/db even if cat/db is not current. To support this, temp
> >> function
> >>>> has
> >>>>> to
> >>>>>> be tied to a cat/db. What's why I said above that the 2nd and 3rd
> >>>>> questions
> >>>>>> are related. The problem with such support is the ambiguity when
> >> user
> >>>>>> defines a function w/o namespace, "CREATE TEMPORARY FUNCTION func
> >>> ...".
> >>>>>> Here "func" can means a global temp function, or a temp function in
> >>>>> current
> >>>>>> cat/db. If we can assume the former, this creates an inconsistency
> >>>>> because
> >>>>>> "CREATE FUNCTION func" actually means a function in current cat/db.
> >>> If
> >>>> we
> >>>>>> assume the latter, then there is no way for user to create a global
> >>>> temp
> >>>>>> function.
> >>>>>>
> >>>>>> Giving a special namespace for built-in functions may solve the
> >>>> ambiguity
> >>>>>> problem above, but it also introduces artificial catalog/database
> >>> that
> >>>>>> needs special treatment and pollutes the cleanness of  the code. I
> >>>> would
> >>>>>> rather introduce a syntax in DDL to solve the problem, like "CREATE
> >>>>>> [GLOBAL] TEMPORARY FUNCTION func".
> >>>>>>
> >>>>>> Thus, I'd like to summarize a few candidate proposals for voting
> >>>>> purposes:
> >>>>>>
> >>>>>> 1. Support only global, temporary functions without namespace. Such
> >>>> temp
> >>>>>> functions overrides built-in functions and catalog functions in
> >>> current
> >>>>>> cat/db. The resolution order is: temp functions -> built-in
> >> functions
> >>>> ->
> >>>>>> catalog functions. (Partially or fully qualified functions has no
> >>>>>> ambiguity!)
> >>>>>>
> >>>>>> 2. In addition to #1, support creating and referencing temporary
> >>>>> functions
> >>>>>> associated with a cat/db with "GLOBAL" qualifier in DDL for global
> >>> temp
> >>>>>> functions. The resolution order is: global temp functions ->
> >> built-in
> >>>>>> functions -> temp functions in current cat/db -> catalog function.
> >>>>>> (Resolution for partially or fully qualified function reference is:
> >>>> temp
> >>>>>> functions -> persistent functions.)
> >>>>>>
> >>>>>> 3. In addition to #1, support creating and referencing temporary
> >>>>> functions
> >>>>>> associated with a cat/db with a special namespace for built-in
> >>>> functions
> >>>>>> and global temp functions. The resolution is the same as #2, except
> >>>> that
> >>>>>> the special namespace might be prefixed to a reference to a
> >> built-in
> >>>>>> function or global temp function. (In absence of the special
> >>> namespace,
> >>>>> the
> >>>>>> resolution order is the same as in #2.)
> >>>>>>
> >>>>>> My personal preference is #1, given the unknown use case and
> >>> introduced
> >>>>>> complexity for #2 and #3. However, #2 is an acceptable alternative.
> >>>> Thus,
> >>>>>> my votes are:
> >>>>>>
> >>>>>> +1 for #1
> >>>>>> +0 for #2
> >>>>>> -1 for #3
> >>>>>>
> >>>>>> Everyone, please cast your vote (in above format please!), or let
> >> me
> >>>> know
> >>>>>> if you have more questions or other candidates.
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Xuefu
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On Wed, Sep 18, 2019 at 6:42 AM Aljoscha Krettek <
> >>> [hidden email]>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> Hi,
> >>>>>>>
> >>>>>>> I think this discussion and the one for FLIP-64 are very
> >> connected.
> >>>> To
> >>>>>>> resolve the differences, think we have to think about the basic
> >>>>>> principles
> >>>>>>> and find consensus there. The basic questions I see are:
> >>>>>>>
> >>>>>>> - Do we want to support overriding builtin functions?
> >>>>>>> - Do we want to support overriding catalog functions?
> >>>>>>> - And then later: should temporary functions be tied to a
> >>>>>>> catalog/database?
> >>>>>>>
> >>>>>>> I don’t have much to say about these, except that we should
> >>> somewhat
> >>>>>> stick
> >>>>>>> to what the industry does. But I also understand that the
> >> industry
> >>> is
> >>>>>>> already very divided on this.
> >>>>>>>
> >>>>>>> Best,
> >>>>>>> Aljoscha
> >>>>>>>
> >>>>>>>> On 18. Sep 2019, at 11:41, Jark Wu <[hidden email]> wrote:
> >>>>>>>>
> >>>>>>>> Hi,
> >>>>>>>>
> >>>>>>>> +1 to strive for reaching consensus on the remaining topics. We
> >>> are
> >>>>>>> close to the truth. It will waste a lot of time if we resume the
> >>>> topic
> >>>>>> some
> >>>>>>> time later.
> >>>>>>>>
> >>>>>>>> +1 to “1-part/override” and I’m also fine with Timo’s
> >>> “cat.db.fun”
> >>>>> way
> >>>>>>> to override a catalog function.
> >>>>>>>>
> >>>>>>>> I’m not sure about “system.system.fun”, it introduces a
> >>> nonexistent
> >>>>> cat
> >>>>>>> & db? And we still need to do special treatment for the dedicated
> >>>>>>> system.system cat & db?
> >>>>>>>>
> >>>>>>>> Best,
> >>>>>>>> Jark
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>> 在 2019年9月18日,06:54,Timo Walther <[hidden email]> 写道:
> >>>>>>>>>
> >>>>>>>>> Hi everyone,
> >>>>>>>>>
> >>>>>>>>> @Xuefu: I would like to avoid adding too many things
> >>>> incrementally.
> >>>>>>> Users should be able to override all catalog objects consistently
> >>>>>> according
> >>>>>>> to FLIP-64 (Support for Temporary Objects in Table module). If
> >>>>> functions
> >>>>>>> are treated completely different, we need more code and special
> >>>> cases.
> >>>>>> From
> >>>>>>> an implementation perspective, this topic only affects the lookup
> >>>> logic
> >>>>>>> which is rather low implementation effort which is why I would
> >> like
> >>>> to
> >>>>>>> clarify the remaining items. As you said, we have a slight
> >> consenus
> >>>> on
> >>>>>>> overriding built-in functions; we should also strive for reaching
> >>>>>> consensus
> >>>>>>> on the remaining topics.
> >>>>>>>>>
> >>>>>>>>> @Dawid: I like your idea as it ensures registering catalog
> >>> objects
> >>>>>>> consistent and the overriding of built-in functions more
> >> explicit.
> >>>>>>>>>
> >>>>>>>>> Thanks,
> >>>>>>>>> Timo
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On 17.09.19 11:59, kai wang wrote:
> >>>>>>>>>> hi, everyone
> >>>>>>>>>> I think this flip is very meaningful. it supports functions
> >>> that
> >>>>> can
> >>>>>> be
> >>>>>>>>>> shared by different catalogs and dbs, reducing the
> >> duplication
> >>> of
> >>>>>>> functions.
> >>>>>>>>>>
> >>>>>>>>>> Our group based on flink's sql parser module implements
> >> create
> >>>>>> function
> >>>>>>>>>> feature, stores the parsed function metadata and schema into
> >>>> mysql,
> >>>>>> and
> >>>>>>>>>> also customizes the catalog, customizes sql-client to support
> >>>>> custom
> >>>>>>>>>> schemas and functions. Loaded, but the function is currently
> >>>>> global,
> >>>>>>> and is
> >>>>>>>>>> not subdivided according to catalog and db.
> >>>>>>>>>>
> >>>>>>>>>> In addition, I very much hope to participate in the
> >> development
> >>>> of
> >>>>>> this
> >>>>>>>>>> flip, I have been paying attention to the community, but
> >> found
> >>> it
> >>>>> is
> >>>>>>> more
> >>>>>>>>>> difficult to join.
> >>>>>>>>>> thank you.
> >>>>>>>>>>
> >>>>>>>>>> Xuefu Z <[hidden email]> 于2019年9月17日周二 上午11:19写道:
> >>>>>>>>>>
> >>>>>>>>>>> Thanks to Tmo and Dawid for sharing thoughts.
> >>>>>>>>>>>
> >>>>>>>>>>> It seems to me that there is a general consensus on having
> >>> temp
> >>>>>>> functions
> >>>>>>>>>>> that have no namespaces and overwrite built-in functions.
> >> (As
> >>> a
> >>>>> side
> >>>>>>> note
> >>>>>>>>>>> for comparability, the current user defined functions are
> >> all
> >>>>>>> temporary and
> >>>>>>>>>>> having no namespaces.)
> >>>>>>>>>>>
> >>>>>>>>>>> Nevertheless, I can also see the merit of having namespaced
> >>> temp
> >>>>>>> functions
> >>>>>>>>>>> that can overwrite functions defined in a specific cat/db.
> >>>>> However,
> >>>>>>> this
> >>>>>>>>>>> idea appears orthogonal to the former and can be added
> >>>>>> incrementally.
> >>>>>>>>>>>
> >>>>>>>>>>> How about we first implement non-namespaced temp functions
> >> now
> >>>> and
> >>>>>>> leave
> >>>>>>>>>>> the door open for namespaced ones for later releases as the
> >>>>>>> requirement
> >>>>>>>>>>> might become more crystal? This also helps shorten the
> >> debate
> >>>> and
> >>>>>>> allow us
> >>>>>>>>>>> to make some progress along this direction.
> >>>>>>>>>>>
> >>>>>>>>>>> As to Dawid's idea of having a dedicated cat/db to host the
> >>>>>> temporary
> >>>>>>> temp
> >>>>>>>>>>> functions that don't have namespaces, my only concern is the
> >>>>> special
> >>>>>>>>>>> treatment for a cat/db, which makes code less clean, as
> >>> evident
> >>>> in
> >>>>>>> treating
> >>>>>>>>>>> the built-in catalog currently.
> >>>>>>>>>>>
> >>>>>>>>>>> Thanks,
> >>>>>>>>>>> Xuefiu
> >>>>>>>>>>>
> >>>>>>>>>>> On Mon, Sep 16, 2019 at 5:07 PM Dawid Wysakowicz <
> >>>>>>>>>>> [hidden email]>
> >>>>>>>>>>> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> Hi,
> >>>>>>>>>>>> Another idea to consider on top of Timo's suggestion. How
> >>> about
> >>>>> we
> >>>>>>> have a
> >>>>>>>>>>>> special namespace (catalog + database) for built-in
> >> objects?
> >>>> This
> >>>>>>> catalog
> >>>>>>>>>>>> would be invisible for users as Xuefu was suggesting.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Then users could still override built-in functions, if they
> >>>> fully
> >>>>>>> qualify
> >>>>>>>>>>>> object with the built-in namespace, but by default the
> >> common
> >>>>> logic
> >>>>>>> of
> >>>>>>>>>>>> current dB & cat would be used.
> >>>>>>>>>>>>
> >>>>>>>>>>>> CREATE TEMPORARY FUNCTION func ...
> >>>>>>>>>>>> registers temporary function in current cat & dB
> >>>>>>>>>>>>
> >>>>>>>>>>>> CREATE TEMPORARY FUNCTION cat.db.func ...
> >>>>>>>>>>>> registers temporary function in cat db
> >>>>>>>>>>>>
> >>>>>>>>>>>> CREATE TEMPORARY FUNCTION system.system.func ...
> >>>>>>>>>>>> Overrides built-in function with temporary function
> >>>>>>>>>>>>
> >>>>>>>>>>>> The built-in/system namespace would not be writable for
> >>>> permanent
> >>>>>>>>>>> objects.
> >>>>>>>>>>>> WDYT?
> >>>>>>>>>>>>
> >>>>>>>>>>>> This way I think we can have benefits of both solutions.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Best,
> >>>>>>>>>>>> Dawid
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Tue, 17 Sep 2019, 07:24 Timo Walther, <
> >> [hidden email]
> >>>>
> >>>>>> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> Hi Bowen,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I understand the potential benefit of overriding certain
> >>>>> built-in
> >>>>>>>>>>>>> functions. I'm open to such a feature if many people
> >> agree.
> >>>>>>> However, it
> >>>>>>>>>>>>> would be great to still support overriding catalog
> >> functions
> >>>>> with
> >>>>>>>>>>>>> temporary functions in order to prototype a query even
> >>> though
> >>>> a
> >>>>>>>>>>>>> catalog/database might not be available currently or
> >> should
> >>>> not
> >>>>> be
> >>>>>>>>>>>>> modified yet. How about we support both cases?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> CREATE TEMPORARY FUNCTION abs
> >>>>>>>>>>>>> -> creates/overrides a built-in function and never
> >>> consideres
> >>>>>>> current
> >>>>>>>>>>>>> catalog and database; inconsistent with other DDL but
> >>>> acceptable
> >>>>>> for
> >>>>>>>>>>>>> functions I guess.
> >>>>>>>>>>>>> CREATE TEMPORARY FUNCTION cat.db.fun
> >>>>>>>>>>>>> -> creates/overrides a catalog function
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Regarding "Flink don't have any other built-in objects
> >>>> (tables,
> >>>>>>> views)
> >>>>>>>>>>>>> except functions", this might change in the near future.
> >>> Take
> >>>>>>>>>>>>> https://issues.apache.org/jira/browse/FLINK-13900 as an
> >>>>> example.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>> Timo
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On 14.09.19 01:40, Bowen Li wrote:
> >>>>>>>>>>>>>> Hi Fabian,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Yes, I agree 1-part/no-override is the least favorable
> >>> thus I
> >>>>>>> didn't
> >>>>>>>>>>>>>> include that as a voting option, and the discussion is
> >>> mainly
> >>>>>>> between
> >>>>>>>>>>>>>> 1-part/override builtin and 3-part/not override builtin.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Re > However, it means that temp functions are
> >> differently
> >>>>>> treated
> >>>>>>>>>>> than
> >>>>>>>>>>>>>> other db objects.
> >>>>>>>>>>>>>> IMO, the treatment difference results from the fact that
> >>>>>> functions
> >>>>>>>>>>> are
> >>>>>>>>>>>> a
> >>>>>>>>>>>>>> bit different from other objects - Flink don't have any
> >>> other
> >>>>>>>>>>> built-in
> >>>>>>>>>>>>>> objects (tables, views) except functions.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Cheers,
> >>>>>>>>>>>>>> Bowen
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> --
> >>>>>>>>>>> Xuefu Zhang
> >>>>>>>>>>>
> >>>>>>>>>>> "In Honey We Trust!"
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>> --
> >>>>>> Xuefu Zhang
> >>>>>>
> >>>>>> "In Honey We Trust!"
> >>>>>>
> >>>>>
> >>>>
> >>>>
> >>>> --
> >>>> Xuefu Zhang
> >>>>
> >>>> "In Honey We Trust!"
> >>>>
> >>>
> >>
> >>
> >> --
> >> Xuefu Zhang
> >>
> >> "In Honey We Trust!"
> >>
>
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-57 - Rework FunctionCatalog

Kurt Young
And let me make my vote complete:

-1 for #1
+1 for #2 with different keyword
-0 for #3

Best,
Kurt


On Thu, Sep 19, 2019 at 4:40 PM Kurt Young <[hidden email]> wrote:

> Looks like I'm the only person who is willing to +1 to #2 for now :-)
> But I would suggest to change the keyword from GLOBAL to
> something like BUILTIN.
>
> I think #2 and #3 are almost the same proposal, just with different
> format to indicate whether it want to override built-in functions.
>
> My biggest reason to choose it is I want this behavior be consistent
> with temporal tables. I will give some examples to show the behavior
> and also make sure I'm not misunderstanding anything here.
>
> For most DBs, when user create a temporary table with:
>
> CREATE TEMPORARY TABLE t1
>
> It's actually equivalent with:
>
> CREATE TEMPORARY TABLE `curent_db`.t1
>
> If user change current database, they will not be able to access t1 without
> fully qualified name, .i.e db1.t1 (assuming db1 is current database when
> this temporary table is created).
>
> Only #2 and #3 followed this behavior and I would vote for this since this
> makes such behavior consistent through temporal tables and functions.
>
> Why I'm not voting for #3 is a special catalog and database just looks very
> hacky to me. It gave a imply that our built-in functions saved at a
> special
> catalog and database, which is actually not. Introducing a dedicated
> keyword
> like CREATE TEMPORARY BUILTIN FUNCTION looks more clear and
> straightforward. One can argue that we should avoid introducing new
> keyword,
> but it's also very rare that a system can overwrite built-in functions.
> Since we
> decided to support this, introduce a new keyword is not a big deal IMO.
>
> Best,
> Kurt
>
>
> On Thu, Sep 19, 2019 at 3:07 PM Piotr Nowojski <[hidden email]>
> wrote:
>
>> Hi,
>>
>> It is a quite long discussion to follow and I hope I didn’t misunderstand
>> anything. From the proposals presented by Xuefu I would vote:
>>
>> -1 for #1 and #2
>> +1 for #3
>>
>> Besides #3 being IMO more general and more consistent, having qualified
>> names (#3) would help/make easier for someone to use cross
>> databases/catalogs queries (joining multiple data sets/streams). For
>> example with some functions to manipulate/clean up/convert the stored data
>> in different catalogs registered in the respective catalogs.
>>
>> Piotrek
>>
>> > On 19 Sep 2019, at 06:35, Jark Wu <[hidden email]> wrote:
>> >
>> > I agree with Xuefu that inconsistent handling with all the other
>> objects is
>> > not a big problem.
>> >
>> > Regarding to option#3, the special "system.system" namespace may confuse
>> > users.
>> > Users need to know the set of built-in function names to know when to
>> use
>> > "system.system" namespace.
>> > What will happen if user registers a non-builtin function name under the
>> > "system.system" namespace?
>> > Besides, I think it doesn't solve the "explode" problem I mentioned at
>> the
>> > beginning of this thread.
>> >
>> > So here is my vote:
>> >
>> > +1 for #1
>> > 0 for #2
>> > -1 for #3
>> >
>> > Best,
>> > Jark
>> >
>> >
>> > On Thu, 19 Sep 2019 at 08:38, Xuefu Z <[hidden email]> wrote:
>> >
>> >> @Dawid, Re: we also don't need additional referencing the
>> specialcatalog
>> >> anywhere.
>> >>
>> >> True. But once we allow such reference, then user can do so in any
>> possible
>> >> place where a function name is expected, for which we have to handle.
>> >> That's a big difference, I think.
>> >>
>> >> Thanks,
>> >> Xuefu
>> >>
>> >> On Wed, Sep 18, 2019 at 5:25 PM Dawid Wysakowicz <
>> >> [hidden email]>
>> >> wrote:
>> >>
>> >>> @Bowen I am not suggesting introducing additional catalog. I think we
>> >> need
>> >>> to get rid of the current built-in catalog.
>> >>>
>> >>> @Xuefu in option #3 we also don't need additional referencing the
>> special
>> >>> catalog anywhere else besides in the CREATE statement. The resolution
>> >>> behaviour is exactly the same in both options.
>> >>>
>> >>> On Thu, 19 Sep 2019, 08:17 Xuefu Z, <[hidden email]> wrote:
>> >>>
>> >>>> Hi Dawid,
>> >>>>
>> >>>> "GLOBAL" is a temporary keyword that was given to the approach. It
>> can
>> >> be
>> >>>> changed to something else for better.
>> >>>>
>> >>>> The difference between this and the #3 approach is that we only need
>> >> the
>> >>>> keyword for this create DDL. For other places (such as function
>> >>>> referencing), no keyword or special namespace is needed.
>> >>>>
>> >>>> Thanks,
>> >>>> Xuefu
>> >>>>
>> >>>> On Wed, Sep 18, 2019 at 4:32 PM Dawid Wysakowicz <
>> >>>> [hidden email]>
>> >>>> wrote:
>> >>>>
>> >>>>> Hi,
>> >>>>> I think it makes sense to start voting at this point.
>> >>>>>
>> >>>>> Option 1: Only 1-part identifiers
>> >>>>> PROS:
>> >>>>> - allows shadowing built-in functions
>> >>>>> CONS:
>> >>>>> - incosistent with all the other objects, both permanent & temporary
>> >>>>> - does not allow shadowing catalog functions
>> >>>>>
>> >>>>> Option 2: Special keyword for built-in function
>> >>>>> I think this is quite similar to the special catalog/db. The thing I
>> >> am
>> >>>>> strongly against in this proposal is the GLOBAL keyword. This
>> keyword
>> >>>> has a
>> >>>>> meaning in rdbms systems and means a function that is present for a
>> >>>>> lifetime of a session in which it was created, but available in all
>> >>> other
>> >>>>> sessions. Therefore I really don't want to use this keyword in a
>> >>>> different
>> >>>>> context.
>> >>>>>
>> >>>>> Option 3: Special catalog/db
>> >>>>>
>> >>>>> PROS:
>> >>>>> - allows shadowing built-in functions
>> >>>>> - allows shadowing catalog functions
>> >>>>> - consistent with other objects
>> >>>>> CONS:
>> >>>>> - we introduce a special namespace for built-in functions
>> >>>>>
>> >>>>> I don't see a problem with introducing the special namespace. In the
>> >>> end
>> >>>> it
>> >>>>> is very similar to the keyword approach. In this case the catalog/db
>> >>>>> combination would be the "keyword"
>> >>>>>
>> >>>>> Therefore my votes:
>> >>>>> Option 1: -0
>> >>>>> Option 2: -1 (I might change to +0 if we can come up with a better
>> >>>> keyword)
>> >>>>> Option 3: +1
>> >>>>>
>> >>>>> Best,
>> >>>>> Dawid
>> >>>>>
>> >>>>>
>> >>>>> On Thu, 19 Sep 2019, 05:12 Xuefu Z, <[hidden email]> wrote:
>> >>>>>
>> >>>>>> Hi Aljoscha,
>> >>>>>>
>> >>>>>> Thanks for the summary and these are great questions to be
>> >> answered.
>> >>>> The
>> >>>>>> answer to your first question is clear: there is a general
>> >> agreement
>> >>> to
>> >>>>>> override built-in functions with temp functions.
>> >>>>>>
>> >>>>>> However, your second and third questions are sort of related, as a
>> >>>>> function
>> >>>>>> reference can be either just function name (like "func") or in the
>> >>> form
>> >>>>> or
>> >>>>>> "cat.db.func". When a reference is just function name, it can mean
>> >>>>> either a
>> >>>>>> built-in function or a function defined in the current cat/db. If
>> >> we
>> >>>>>> support overriding a built-in function with a temp function, such
>> >>>>>> overriding can also cover a function in the current cat/db.
>> >>>>>>
>> >>>>>> I think what Timo referred as "overriding a catalog function"
>> >> means a
>> >>>>> temp
>> >>>>>> function defined as "cat.db.func" overrides a catalog function
>> >> "func"
>> >>>> in
>> >>>>>> cat/db even if cat/db is not current. To support this, temp
>> >> function
>> >>>> has
>> >>>>> to
>> >>>>>> be tied to a cat/db. What's why I said above that the 2nd and 3rd
>> >>>>> questions
>> >>>>>> are related. The problem with such support is the ambiguity when
>> >> user
>> >>>>>> defines a function w/o namespace, "CREATE TEMPORARY FUNCTION func
>> >>> ...".
>> >>>>>> Here "func" can means a global temp function, or a temp function in
>> >>>>> current
>> >>>>>> cat/db. If we can assume the former, this creates an inconsistency
>> >>>>> because
>> >>>>>> "CREATE FUNCTION func" actually means a function in current cat/db.
>> >>> If
>> >>>> we
>> >>>>>> assume the latter, then there is no way for user to create a global
>> >>>> temp
>> >>>>>> function.
>> >>>>>>
>> >>>>>> Giving a special namespace for built-in functions may solve the
>> >>>> ambiguity
>> >>>>>> problem above, but it also introduces artificial catalog/database
>> >>> that
>> >>>>>> needs special treatment and pollutes the cleanness of  the code. I
>> >>>> would
>> >>>>>> rather introduce a syntax in DDL to solve the problem, like "CREATE
>> >>>>>> [GLOBAL] TEMPORARY FUNCTION func".
>> >>>>>>
>> >>>>>> Thus, I'd like to summarize a few candidate proposals for voting
>> >>>>> purposes:
>> >>>>>>
>> >>>>>> 1. Support only global, temporary functions without namespace. Such
>> >>>> temp
>> >>>>>> functions overrides built-in functions and catalog functions in
>> >>> current
>> >>>>>> cat/db. The resolution order is: temp functions -> built-in
>> >> functions
>> >>>> ->
>> >>>>>> catalog functions. (Partially or fully qualified functions has no
>> >>>>>> ambiguity!)
>> >>>>>>
>> >>>>>> 2. In addition to #1, support creating and referencing temporary
>> >>>>> functions
>> >>>>>> associated with a cat/db with "GLOBAL" qualifier in DDL for global
>> >>> temp
>> >>>>>> functions. The resolution order is: global temp functions ->
>> >> built-in
>> >>>>>> functions -> temp functions in current cat/db -> catalog function.
>> >>>>>> (Resolution for partially or fully qualified function reference is:
>> >>>> temp
>> >>>>>> functions -> persistent functions.)
>> >>>>>>
>> >>>>>> 3. In addition to #1, support creating and referencing temporary
>> >>>>> functions
>> >>>>>> associated with a cat/db with a special namespace for built-in
>> >>>> functions
>> >>>>>> and global temp functions. The resolution is the same as #2, except
>> >>>> that
>> >>>>>> the special namespace might be prefixed to a reference to a
>> >> built-in
>> >>>>>> function or global temp function. (In absence of the special
>> >>> namespace,
>> >>>>> the
>> >>>>>> resolution order is the same as in #2.)
>> >>>>>>
>> >>>>>> My personal preference is #1, given the unknown use case and
>> >>> introduced
>> >>>>>> complexity for #2 and #3. However, #2 is an acceptable alternative.
>> >>>> Thus,
>> >>>>>> my votes are:
>> >>>>>>
>> >>>>>> +1 for #1
>> >>>>>> +0 for #2
>> >>>>>> -1 for #3
>> >>>>>>
>> >>>>>> Everyone, please cast your vote (in above format please!), or let
>> >> me
>> >>>> know
>> >>>>>> if you have more questions or other candidates.
>> >>>>>>
>> >>>>>> Thanks,
>> >>>>>> Xuefu
>> >>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>> On Wed, Sep 18, 2019 at 6:42 AM Aljoscha Krettek <
>> >>> [hidden email]>
>> >>>>>> wrote:
>> >>>>>>
>> >>>>>>> Hi,
>> >>>>>>>
>> >>>>>>> I think this discussion and the one for FLIP-64 are very
>> >> connected.
>> >>>> To
>> >>>>>>> resolve the differences, think we have to think about the basic
>> >>>>>> principles
>> >>>>>>> and find consensus there. The basic questions I see are:
>> >>>>>>>
>> >>>>>>> - Do we want to support overriding builtin functions?
>> >>>>>>> - Do we want to support overriding catalog functions?
>> >>>>>>> - And then later: should temporary functions be tied to a
>> >>>>>>> catalog/database?
>> >>>>>>>
>> >>>>>>> I don’t have much to say about these, except that we should
>> >>> somewhat
>> >>>>>> stick
>> >>>>>>> to what the industry does. But I also understand that the
>> >> industry
>> >>> is
>> >>>>>>> already very divided on this.
>> >>>>>>>
>> >>>>>>> Best,
>> >>>>>>> Aljoscha
>> >>>>>>>
>> >>>>>>>> On 18. Sep 2019, at 11:41, Jark Wu <[hidden email]> wrote:
>> >>>>>>>>
>> >>>>>>>> Hi,
>> >>>>>>>>
>> >>>>>>>> +1 to strive for reaching consensus on the remaining topics. We
>> >>> are
>> >>>>>>> close to the truth. It will waste a lot of time if we resume the
>> >>>> topic
>> >>>>>> some
>> >>>>>>> time later.
>> >>>>>>>>
>> >>>>>>>> +1 to “1-part/override” and I’m also fine with Timo’s
>> >>> “cat.db.fun”
>> >>>>> way
>> >>>>>>> to override a catalog function.
>> >>>>>>>>
>> >>>>>>>> I’m not sure about “system.system.fun”, it introduces a
>> >>> nonexistent
>> >>>>> cat
>> >>>>>>> & db? And we still need to do special treatment for the dedicated
>> >>>>>>> system.system cat & db?
>> >>>>>>>>
>> >>>>>>>> Best,
>> >>>>>>>> Jark
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>>> 在 2019年9月18日,06:54,Timo Walther <[hidden email]> 写道:
>> >>>>>>>>>
>> >>>>>>>>> Hi everyone,
>> >>>>>>>>>
>> >>>>>>>>> @Xuefu: I would like to avoid adding too many things
>> >>>> incrementally.
>> >>>>>>> Users should be able to override all catalog objects consistently
>> >>>>>> according
>> >>>>>>> to FLIP-64 (Support for Temporary Objects in Table module). If
>> >>>>> functions
>> >>>>>>> are treated completely different, we need more code and special
>> >>>> cases.
>> >>>>>> From
>> >>>>>>> an implementation perspective, this topic only affects the lookup
>> >>>> logic
>> >>>>>>> which is rather low implementation effort which is why I would
>> >> like
>> >>>> to
>> >>>>>>> clarify the remaining items. As you said, we have a slight
>> >> consenus
>> >>>> on
>> >>>>>>> overriding built-in functions; we should also strive for reaching
>> >>>>>> consensus
>> >>>>>>> on the remaining topics.
>> >>>>>>>>>
>> >>>>>>>>> @Dawid: I like your idea as it ensures registering catalog
>> >>> objects
>> >>>>>>> consistent and the overriding of built-in functions more
>> >> explicit.
>> >>>>>>>>>
>> >>>>>>>>> Thanks,
>> >>>>>>>>> Timo
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>> On 17.09.19 11:59, kai wang wrote:
>> >>>>>>>>>> hi, everyone
>> >>>>>>>>>> I think this flip is very meaningful. it supports functions
>> >>> that
>> >>>>> can
>> >>>>>> be
>> >>>>>>>>>> shared by different catalogs and dbs, reducing the
>> >> duplication
>> >>> of
>> >>>>>>> functions.
>> >>>>>>>>>>
>> >>>>>>>>>> Our group based on flink's sql parser module implements
>> >> create
>> >>>>>> function
>> >>>>>>>>>> feature, stores the parsed function metadata and schema into
>> >>>> mysql,
>> >>>>>> and
>> >>>>>>>>>> also customizes the catalog, customizes sql-client to support
>> >>>>> custom
>> >>>>>>>>>> schemas and functions. Loaded, but the function is currently
>> >>>>> global,
>> >>>>>>> and is
>> >>>>>>>>>> not subdivided according to catalog and db.
>> >>>>>>>>>>
>> >>>>>>>>>> In addition, I very much hope to participate in the
>> >> development
>> >>>> of
>> >>>>>> this
>> >>>>>>>>>> flip, I have been paying attention to the community, but
>> >> found
>> >>> it
>> >>>>> is
>> >>>>>>> more
>> >>>>>>>>>> difficult to join.
>> >>>>>>>>>> thank you.
>> >>>>>>>>>>
>> >>>>>>>>>> Xuefu Z <[hidden email]> 于2019年9月17日周二 上午11:19写道:
>> >>>>>>>>>>
>> >>>>>>>>>>> Thanks to Tmo and Dawid for sharing thoughts.
>> >>>>>>>>>>>
>> >>>>>>>>>>> It seems to me that there is a general consensus on having
>> >>> temp
>> >>>>>>> functions
>> >>>>>>>>>>> that have no namespaces and overwrite built-in functions.
>> >> (As
>> >>> a
>> >>>>> side
>> >>>>>>> note
>> >>>>>>>>>>> for comparability, the current user defined functions are
>> >> all
>> >>>>>>> temporary and
>> >>>>>>>>>>> having no namespaces.)
>> >>>>>>>>>>>
>> >>>>>>>>>>> Nevertheless, I can also see the merit of having namespaced
>> >>> temp
>> >>>>>>> functions
>> >>>>>>>>>>> that can overwrite functions defined in a specific cat/db.
>> >>>>> However,
>> >>>>>>> this
>> >>>>>>>>>>> idea appears orthogonal to the former and can be added
>> >>>>>> incrementally.
>> >>>>>>>>>>>
>> >>>>>>>>>>> How about we first implement non-namespaced temp functions
>> >> now
>> >>>> and
>> >>>>>>> leave
>> >>>>>>>>>>> the door open for namespaced ones for later releases as the
>> >>>>>>> requirement
>> >>>>>>>>>>> might become more crystal? This also helps shorten the
>> >> debate
>> >>>> and
>> >>>>>>> allow us
>> >>>>>>>>>>> to make some progress along this direction.
>> >>>>>>>>>>>
>> >>>>>>>>>>> As to Dawid's idea of having a dedicated cat/db to host the
>> >>>>>> temporary
>> >>>>>>> temp
>> >>>>>>>>>>> functions that don't have namespaces, my only concern is the
>> >>>>> special
>> >>>>>>>>>>> treatment for a cat/db, which makes code less clean, as
>> >>> evident
>> >>>> in
>> >>>>>>> treating
>> >>>>>>>>>>> the built-in catalog currently.
>> >>>>>>>>>>>
>> >>>>>>>>>>> Thanks,
>> >>>>>>>>>>> Xuefiu
>> >>>>>>>>>>>
>> >>>>>>>>>>> On Mon, Sep 16, 2019 at 5:07 PM Dawid Wysakowicz <
>> >>>>>>>>>>> [hidden email]>
>> >>>>>>>>>>> wrote:
>> >>>>>>>>>>>
>> >>>>>>>>>>>> Hi,
>> >>>>>>>>>>>> Another idea to consider on top of Timo's suggestion. How
>> >>> about
>> >>>>> we
>> >>>>>>> have a
>> >>>>>>>>>>>> special namespace (catalog + database) for built-in
>> >> objects?
>> >>>> This
>> >>>>>>> catalog
>> >>>>>>>>>>>> would be invisible for users as Xuefu was suggesting.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> Then users could still override built-in functions, if they
>> >>>> fully
>> >>>>>>> qualify
>> >>>>>>>>>>>> object with the built-in namespace, but by default the
>> >> common
>> >>>>> logic
>> >>>>>>> of
>> >>>>>>>>>>>> current dB & cat would be used.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> CREATE TEMPORARY FUNCTION func ...
>> >>>>>>>>>>>> registers temporary function in current cat & dB
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> CREATE TEMPORARY FUNCTION cat.db.func ...
>> >>>>>>>>>>>> registers temporary function in cat db
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> CREATE TEMPORARY FUNCTION system.system.func ...
>> >>>>>>>>>>>> Overrides built-in function with temporary function
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> The built-in/system namespace would not be writable for
>> >>>> permanent
>> >>>>>>>>>>> objects.
>> >>>>>>>>>>>> WDYT?
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> This way I think we can have benefits of both solutions.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> Best,
>> >>>>>>>>>>>> Dawid
>> >>>>>>>>>>>>
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> On Tue, 17 Sep 2019, 07:24 Timo Walther, <
>> >> [hidden email]
>> >>>>
>> >>>>>> wrote:
>> >>>>>>>>>>>>
>> >>>>>>>>>>>>> Hi Bowen,
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> I understand the potential benefit of overriding certain
>> >>>>> built-in
>> >>>>>>>>>>>>> functions. I'm open to such a feature if many people
>> >> agree.
>> >>>>>>> However, it
>> >>>>>>>>>>>>> would be great to still support overriding catalog
>> >> functions
>> >>>>> with
>> >>>>>>>>>>>>> temporary functions in order to prototype a query even
>> >>> though
>> >>>> a
>> >>>>>>>>>>>>> catalog/database might not be available currently or
>> >> should
>> >>>> not
>> >>>>> be
>> >>>>>>>>>>>>> modified yet. How about we support both cases?
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> CREATE TEMPORARY FUNCTION abs
>> >>>>>>>>>>>>> -> creates/overrides a built-in function and never
>> >>> consideres
>> >>>>>>> current
>> >>>>>>>>>>>>> catalog and database; inconsistent with other DDL but
>> >>>> acceptable
>> >>>>>> for
>> >>>>>>>>>>>>> functions I guess.
>> >>>>>>>>>>>>> CREATE TEMPORARY FUNCTION cat.db.fun
>> >>>>>>>>>>>>> -> creates/overrides a catalog function
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> Regarding "Flink don't have any other built-in objects
>> >>>> (tables,
>> >>>>>>> views)
>> >>>>>>>>>>>>> except functions", this might change in the near future.
>> >>> Take
>> >>>>>>>>>>>>> https://issues.apache.org/jira/browse/FLINK-13900 as an
>> >>>>> example.
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> Thanks,
>> >>>>>>>>>>>>> Timo
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> On 14.09.19 01:40, Bowen Li wrote:
>> >>>>>>>>>>>>>> Hi Fabian,
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>> Yes, I agree 1-part/no-override is the least favorable
>> >>> thus I
>> >>>>>>> didn't
>> >>>>>>>>>>>>>> include that as a voting option, and the discussion is
>> >>> mainly
>> >>>>>>> between
>> >>>>>>>>>>>>>> 1-part/override builtin and 3-part/not override builtin.
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>> Re > However, it means that temp functions are
>> >> differently
>> >>>>>> treated
>> >>>>>>>>>>> than
>> >>>>>>>>>>>>>> other db objects.
>> >>>>>>>>>>>>>> IMO, the treatment difference results from the fact that
>> >>>>>> functions
>> >>>>>>>>>>> are
>> >>>>>>>>>>>> a
>> >>>>>>>>>>>>>> bit different from other objects - Flink don't have any
>> >>> other
>> >>>>>>>>>>> built-in
>> >>>>>>>>>>>>>> objects (tables, views) except functions.
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>> Cheers,
>> >>>>>>>>>>>>>> Bowen
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> --
>> >>>>>>>>>>> Xuefu Zhang
>> >>>>>>>>>>>
>> >>>>>>>>>>> "In Honey We Trust!"
>> >>>>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>
>> >>>>>> --
>> >>>>>> Xuefu Zhang
>> >>>>>>
>> >>>>>> "In Honey We Trust!"
>> >>>>>>
>> >>>>>
>> >>>>
>> >>>>
>> >>>> --
>> >>>> Xuefu Zhang
>> >>>>
>> >>>> "In Honey We Trust!"
>> >>>>
>> >>>
>> >>
>> >>
>> >> --
>> >> Xuefu Zhang
>> >>
>> >> "In Honey We Trust!"
>> >>
>>
>>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-57 - Rework FunctionCatalog

JingsongLee-2
I know Hive and Spark can shadow built-in functions by temporary function.
Mysql, Oracle, Sql server can not shadow.
User can use full names to access functions instead of shadowing.

So I think it is a completely new thing, and the direct way to deal with new things is to add new grammar. So,
+1 for #2, +0 for #3, -1 for #1

Best,
Jingsong Lee


------------------------------------------------------------------
From:Kurt Young <[hidden email]>
Send Time:2019年9月19日(星期四) 16:43
To:dev <[hidden email]>
Subject:Re: [DISCUSS] FLIP-57 - Rework FunctionCatalog

And let me make my vote complete:

-1 for #1
+1 for #2 with different keyword
-0 for #3

Best,
Kurt


On Thu, Sep 19, 2019 at 4:40 PM Kurt Young <[hidden email]> wrote:

> Looks like I'm the only person who is willing to +1 to #2 for now :-)
> But I would suggest to change the keyword from GLOBAL to
> something like BUILTIN.
>
> I think #2 and #3 are almost the same proposal, just with different
> format to indicate whether it want to override built-in functions.
>
> My biggest reason to choose it is I want this behavior be consistent
> with temporal tables. I will give some examples to show the behavior
> and also make sure I'm not misunderstanding anything here.
>
> For most DBs, when user create a temporary table with:
>
> CREATE TEMPORARY TABLE t1
>
> It's actually equivalent with:
>
> CREATE TEMPORARY TABLE `curent_db`.t1
>
> If user change current database, they will not be able to access t1 without
> fully qualified name, .i.e db1.t1 (assuming db1 is current database when
> this temporary table is created).
>
> Only #2 and #3 followed this behavior and I would vote for this since this
> makes such behavior consistent through temporal tables and functions.
>
> Why I'm not voting for #3 is a special catalog and database just looks very
> hacky to me. It gave a imply that our built-in functions saved at a
> special
> catalog and database, which is actually not. Introducing a dedicated
> keyword
> like CREATE TEMPORARY BUILTIN FUNCTION looks more clear and
> straightforward. One can argue that we should avoid introducing new
> keyword,
> but it's also very rare that a system can overwrite built-in functions.
> Since we
> decided to support this, introduce a new keyword is not a big deal IMO.
>
> Best,
> Kurt
>
>
> On Thu, Sep 19, 2019 at 3:07 PM Piotr Nowojski <[hidden email]>
> wrote:
>
>> Hi,
>>
>> It is a quite long discussion to follow and I hope I didn’t misunderstand
>> anything. From the proposals presented by Xuefu I would vote:
>>
>> -1 for #1 and #2
>> +1 for #3
>>
>> Besides #3 being IMO more general and more consistent, having qualified
>> names (#3) would help/make easier for someone to use cross
>> databases/catalogs queries (joining multiple data sets/streams). For
>> example with some functions to manipulate/clean up/convert the stored data
>> in different catalogs registered in the respective catalogs.
>>
>> Piotrek
>>
>> > On 19 Sep 2019, at 06:35, Jark Wu <[hidden email]> wrote:
>> >
>> > I agree with Xuefu that inconsistent handling with all the other
>> objects is
>> > not a big problem.
>> >
>> > Regarding to option#3, the special "system.system" namespace may confuse
>> > users.
>> > Users need to know the set of built-in function names to know when to
>> use
>> > "system.system" namespace.
>> > What will happen if user registers a non-builtin function name under the
>> > "system.system" namespace?
>> > Besides, I think it doesn't solve the "explode" problem I mentioned at
>> the
>> > beginning of this thread.
>> >
>> > So here is my vote:
>> >
>> > +1 for #1
>> > 0 for #2
>> > -1 for #3
>> >
>> > Best,
>> > Jark
>> >
>> >
>> > On Thu, 19 Sep 2019 at 08:38, Xuefu Z <[hidden email]> wrote:
>> >
>> >> @Dawid, Re: we also don't need additional referencing the
>> specialcatalog
>> >> anywhere.
>> >>
>> >> True. But once we allow such reference, then user can do so in any
>> possible
>> >> place where a function name is expected, for which we have to handle.
>> >> That's a big difference, I think.
>> >>
>> >> Thanks,
>> >> Xuefu
>> >>
>> >> On Wed, Sep 18, 2019 at 5:25 PM Dawid Wysakowicz <
>> >> [hidden email]>
>> >> wrote:
>> >>
>> >>> @Bowen I am not suggesting introducing additional catalog. I think we
>> >> need
>> >>> to get rid of the current built-in catalog.
>> >>>
>> >>> @Xuefu in option #3 we also don't need additional referencing the
>> special
>> >>> catalog anywhere else besides in the CREATE statement. The resolution
>> >>> behaviour is exactly the same in both options.
>> >>>
>> >>> On Thu, 19 Sep 2019, 08:17 Xuefu Z, <[hidden email]> wrote:
>> >>>
>> >>>> Hi Dawid,
>> >>>>
>> >>>> "GLOBAL" is a temporary keyword that was given to the approach. It
>> can
>> >> be
>> >>>> changed to something else for better.
>> >>>>
>> >>>> The difference between this and the #3 approach is that we only need
>> >> the
>> >>>> keyword for this create DDL. For other places (such as function
>> >>>> referencing), no keyword or special namespace is needed.
>> >>>>
>> >>>> Thanks,
>> >>>> Xuefu
>> >>>>
>> >>>> On Wed, Sep 18, 2019 at 4:32 PM Dawid Wysakowicz <
>> >>>> [hidden email]>
>> >>>> wrote:
>> >>>>
>> >>>>> Hi,
>> >>>>> I think it makes sense to start voting at this point.
>> >>>>>
>> >>>>> Option 1: Only 1-part identifiers
>> >>>>> PROS:
>> >>>>> - allows shadowing built-in functions
>> >>>>> CONS:
>> >>>>> - incosistent with all the other objects, both permanent & temporary
>> >>>>> - does not allow shadowing catalog functions
>> >>>>>
>> >>>>> Option 2: Special keyword for built-in function
>> >>>>> I think this is quite similar to the special catalog/db. The thing I
>> >> am
>> >>>>> strongly against in this proposal is the GLOBAL keyword. This
>> keyword
>> >>>> has a
>> >>>>> meaning in rdbms systems and means a function that is present for a
>> >>>>> lifetime of a session in which it was created, but available in all
>> >>> other
>> >>>>> sessions. Therefore I really don't want to use this keyword in a
>> >>>> different
>> >>>>> context.
>> >>>>>
>> >>>>> Option 3: Special catalog/db
>> >>>>>
>> >>>>> PROS:
>> >>>>> - allows shadowing built-in functions
>> >>>>> - allows shadowing catalog functions
>> >>>>> - consistent with other objects
>> >>>>> CONS:
>> >>>>> - we introduce a special namespace for built-in functions
>> >>>>>
>> >>>>> I don't see a problem with introducing the special namespace. In the
>> >>> end
>> >>>> it
>> >>>>> is very similar to the keyword approach. In this case the catalog/db
>> >>>>> combination would be the "keyword"
>> >>>>>
>> >>>>> Therefore my votes:
>> >>>>> Option 1: -0
>> >>>>> Option 2: -1 (I might change to +0 if we can come up with a better
>> >>>> keyword)
>> >>>>> Option 3: +1
>> >>>>>
>> >>>>> Best,
>> >>>>> Dawid
>> >>>>>
>> >>>>>
>> >>>>> On Thu, 19 Sep 2019, 05:12 Xuefu Z, <[hidden email]> wrote:
>> >>>>>
>> >>>>>> Hi Aljoscha,
>> >>>>>>
>> >>>>>> Thanks for the summary and these are great questions to be
>> >> answered.
>> >>>> The
>> >>>>>> answer to your first question is clear: there is a general
>> >> agreement
>> >>> to
>> >>>>>> override built-in functions with temp functions.
>> >>>>>>
>> >>>>>> However, your second and third questions are sort of related, as a
>> >>>>> function
>> >>>>>> reference can be either just function name (like "func") or in the
>> >>> form
>> >>>>> or
>> >>>>>> "cat.db.func". When a reference is just function name, it can mean
>> >>>>> either a
>> >>>>>> built-in function or a function defined in the current cat/db. If
>> >> we
>> >>>>>> support overriding a built-in function with a temp function, such
>> >>>>>> overriding can also cover a function in the current cat/db.
>> >>>>>>
>> >>>>>> I think what Timo referred as "overriding a catalog function"
>> >> means a
>> >>>>> temp
>> >>>>>> function defined as "cat.db.func" overrides a catalog function
>> >> "func"
>> >>>> in
>> >>>>>> cat/db even if cat/db is not current. To support this, temp
>> >> function
>> >>>> has
>> >>>>> to
>> >>>>>> be tied to a cat/db. What's why I said above that the 2nd and 3rd
>> >>>>> questions
>> >>>>>> are related. The problem with such support is the ambiguity when
>> >> user
>> >>>>>> defines a function w/o namespace, "CREATE TEMPORARY FUNCTION func
>> >>> ...".
>> >>>>>> Here "func" can means a global temp function, or a temp function in
>> >>>>> current
>> >>>>>> cat/db. If we can assume the former, this creates an inconsistency
>> >>>>> because
>> >>>>>> "CREATE FUNCTION func" actually means a function in current cat/db.
>> >>> If
>> >>>> we
>> >>>>>> assume the latter, then there is no way for user to create a global
>> >>>> temp
>> >>>>>> function.
>> >>>>>>
>> >>>>>> Giving a special namespace for built-in functions may solve the
>> >>>> ambiguity
>> >>>>>> problem above, but it also introduces artificial catalog/database
>> >>> that
>> >>>>>> needs special treatment and pollutes the cleanness of  the code. I
>> >>>> would
>> >>>>>> rather introduce a syntax in DDL to solve the problem, like "CREATE
>> >>>>>> [GLOBAL] TEMPORARY FUNCTION func".
>> >>>>>>
>> >>>>>> Thus, I'd like to summarize a few candidate proposals for voting
>> >>>>> purposes:
>> >>>>>>
>> >>>>>> 1. Support only global, temporary functions without namespace. Such
>> >>>> temp
>> >>>>>> functions overrides built-in functions and catalog functions in
>> >>> current
>> >>>>>> cat/db. The resolution order is: temp functions -> built-in
>> >> functions
>> >>>> ->
>> >>>>>> catalog functions. (Partially or fully qualified functions has no
>> >>>>>> ambiguity!)
>> >>>>>>
>> >>>>>> 2. In addition to #1, support creating and referencing temporary
>> >>>>> functions
>> >>>>>> associated with a cat/db with "GLOBAL" qualifier in DDL for global
>> >>> temp
>> >>>>>> functions. The resolution order is: global temp functions ->
>> >> built-in
>> >>>>>> functions -> temp functions in current cat/db -> catalog function.
>> >>>>>> (Resolution for partially or fully qualified function reference is:
>> >>>> temp
>> >>>>>> functions -> persistent functions.)
>> >>>>>>
>> >>>>>> 3. In addition to #1, support creating and referencing temporary
>> >>>>> functions
>> >>>>>> associated with a cat/db with a special namespace for built-in
>> >>>> functions
>> >>>>>> and global temp functions. The resolution is the same as #2, except
>> >>>> that
>> >>>>>> the special namespace might be prefixed to a reference to a
>> >> built-in
>> >>>>>> function or global temp function. (In absence of the special
>> >>> namespace,
>> >>>>> the
>> >>>>>> resolution order is the same as in #2.)
>> >>>>>>
>> >>>>>> My personal preference is #1, given the unknown use case and
>> >>> introduced
>> >>>>>> complexity for #2 and #3. However, #2 is an acceptable alternative.
>> >>>> Thus,
>> >>>>>> my votes are:
>> >>>>>>
>> >>>>>> +1 for #1
>> >>>>>> +0 for #2
>> >>>>>> -1 for #3
>> >>>>>>
>> >>>>>> Everyone, please cast your vote (in above format please!), or let
>> >> me
>> >>>> know
>> >>>>>> if you have more questions or other candidates.
>> >>>>>>
>> >>>>>> Thanks,
>> >>>>>> Xuefu
>> >>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>> On Wed, Sep 18, 2019 at 6:42 AM Aljoscha Krettek <
>> >>> [hidden email]>
>> >>>>>> wrote:
>> >>>>>>
>> >>>>>>> Hi,
>> >>>>>>>
>> >>>>>>> I think this discussion and the one for FLIP-64 are very
>> >> connected.
>> >>>> To
>> >>>>>>> resolve the differences, think we have to think about the basic
>> >>>>>> principles
>> >>>>>>> and find consensus there. The basic questions I see are:
>> >>>>>>>
>> >>>>>>> - Do we want to support overriding builtin functions?
>> >>>>>>> - Do we want to support overriding catalog functions?
>> >>>>>>> - And then later: should temporary functions be tied to a
>> >>>>>>> catalog/database?
>> >>>>>>>
>> >>>>>>> I don’t have much to say about these, except that we should
>> >>> somewhat
>> >>>>>> stick
>> >>>>>>> to what the industry does. But I also understand that the
>> >> industry
>> >>> is
>> >>>>>>> already very divided on this.
>> >>>>>>>
>> >>>>>>> Best,
>> >>>>>>> Aljoscha
>> >>>>>>>
>> >>>>>>>> On 18. Sep 2019, at 11:41, Jark Wu <[hidden email]> wrote:
>> >>>>>>>>
>> >>>>>>>> Hi,
>> >>>>>>>>
>> >>>>>>>> +1 to strive for reaching consensus on the remaining topics. We
>> >>> are
>> >>>>>>> close to the truth. It will waste a lot of time if we resume the
>> >>>> topic
>> >>>>>> some
>> >>>>>>> time later.
>> >>>>>>>>
>> >>>>>>>> +1 to “1-part/override” and I’m also fine with Timo’s
>> >>> “cat.db.fun”
>> >>>>> way
>> >>>>>>> to override a catalog function.
>> >>>>>>>>
>> >>>>>>>> I’m not sure about “system.system.fun”, it introduces a
>> >>> nonexistent
>> >>>>> cat
>> >>>>>>> & db? And we still need to do special treatment for the dedicated
>> >>>>>>> system.system cat & db?
>> >>>>>>>>
>> >>>>>>>> Best,
>> >>>>>>>> Jark
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>>> 在 2019年9月18日,06:54,Timo Walther <[hidden email]> 写道:
>> >>>>>>>>>
>> >>>>>>>>> Hi everyone,
>> >>>>>>>>>
>> >>>>>>>>> @Xuefu: I would like to avoid adding too many things
>> >>>> incrementally.
>> >>>>>>> Users should be able to override all catalog objects consistently
>> >>>>>> according
>> >>>>>>> to FLIP-64 (Support for Temporary Objects in Table module). If
>> >>>>> functions
>> >>>>>>> are treated completely different, we need more code and special
>> >>>> cases.
>> >>>>>> From
>> >>>>>>> an implementation perspective, this topic only affects the lookup
>> >>>> logic
>> >>>>>>> which is rather low implementation effort which is why I would
>> >> like
>> >>>> to
>> >>>>>>> clarify the remaining items. As you said, we have a slight
>> >> consenus
>> >>>> on
>> >>>>>>> overriding built-in functions; we should also strive for reaching
>> >>>>>> consensus
>> >>>>>>> on the remaining topics.
>> >>>>>>>>>
>> >>>>>>>>> @Dawid: I like your idea as it ensures registering catalog
>> >>> objects
>> >>>>>>> consistent and the overriding of built-in functions more
>> >> explicit.
>> >>>>>>>>>
>> >>>>>>>>> Thanks,
>> >>>>>>>>> Timo
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>> On 17.09.19 11:59, kai wang wrote:
>> >>>>>>>>>> hi, everyone
>> >>>>>>>>>> I think this flip is very meaningful. it supports functions
>> >>> that
>> >>>>> can
>> >>>>>> be
>> >>>>>>>>>> shared by different catalogs and dbs, reducing the
>> >> duplication
>> >>> of
>> >>>>>>> functions.
>> >>>>>>>>>>
>> >>>>>>>>>> Our group based on flink's sql parser module implements
>> >> create
>> >>>>>> function
>> >>>>>>>>>> feature, stores the parsed function metadata and schema into
>> >>>> mysql,
>> >>>>>> and
>> >>>>>>>>>> also customizes the catalog, customizes sql-client to support
>> >>>>> custom
>> >>>>>>>>>> schemas and functions. Loaded, but the function is currently
>> >>>>> global,
>> >>>>>>> and is
>> >>>>>>>>>> not subdivided according to catalog and db.
>> >>>>>>>>>>
>> >>>>>>>>>> In addition, I very much hope to participate in the
>> >> development
>> >>>> of
>> >>>>>> this
>> >>>>>>>>>> flip, I have been paying attention to the community, but
>> >> found
>> >>> it
>> >>>>> is
>> >>>>>>> more
>> >>>>>>>>>> difficult to join.
>> >>>>>>>>>> thank you.
>> >>>>>>>>>>
>> >>>>>>>>>> Xuefu Z <[hidden email]> 于2019年9月17日周二 上午11:19写道:
>> >>>>>>>>>>
>> >>>>>>>>>>> Thanks to Tmo and Dawid for sharing thoughts.
>> >>>>>>>>>>>
>> >>>>>>>>>>> It seems to me that there is a general consensus on having
>> >>> temp
>> >>>>>>> functions
>> >>>>>>>>>>> that have no namespaces and overwrite built-in functions.
>> >> (As
>> >>> a
>> >>>>> side
>> >>>>>>> note
>> >>>>>>>>>>> for comparability, the current user defined functions are
>> >> all
>> >>>>>>> temporary and
>> >>>>>>>>>>> having no namespaces.)
>> >>>>>>>>>>>
>> >>>>>>>>>>> Nevertheless, I can also see the merit of having namespaced
>> >>> temp
>> >>>>>>> functions
>> >>>>>>>>>>> that can overwrite functions defined in a specific cat/db.
>> >>>>> However,
>> >>>>>>> this
>> >>>>>>>>>>> idea appears orthogonal to the former and can be added
>> >>>>>> incrementally.
>> >>>>>>>>>>>
>> >>>>>>>>>>> How about we first implement non-namespaced temp functions
>> >> now
>> >>>> and
>> >>>>>>> leave
>> >>>>>>>>>>> the door open for namespaced ones for later releases as the
>> >>>>>>> requirement
>> >>>>>>>>>>> might become more crystal? This also helps shorten the
>> >> debate
>> >>>> and
>> >>>>>>> allow us
>> >>>>>>>>>>> to make some progress along this direction.
>> >>>>>>>>>>>
>> >>>>>>>>>>> As to Dawid's idea of having a dedicated cat/db to host the
>> >>>>>> temporary
>> >>>>>>> temp
>> >>>>>>>>>>> functions that don't have namespaces, my only concern is the
>> >>>>> special
>> >>>>>>>>>>> treatment for a cat/db, which makes code less clean, as
>> >>> evident
>> >>>> in
>> >>>>>>> treating
>> >>>>>>>>>>> the built-in catalog currently.
>> >>>>>>>>>>>
>> >>>>>>>>>>> Thanks,
>> >>>>>>>>>>> Xuefiu
>> >>>>>>>>>>>
>> >>>>>>>>>>> On Mon, Sep 16, 2019 at 5:07 PM Dawid Wysakowicz <
>> >>>>>>>>>>> [hidden email]>
>> >>>>>>>>>>> wrote:
>> >>>>>>>>>>>
>> >>>>>>>>>>>> Hi,
>> >>>>>>>>>>>> Another idea to consider on top of Timo's suggestion. How
>> >>> about
>> >>>>> we
>> >>>>>>> have a
>> >>>>>>>>>>>> special namespace (catalog + database) for built-in
>> >> objects?
>> >>>> This
>> >>>>>>> catalog
>> >>>>>>>>>>>> would be invisible for users as Xuefu was suggesting.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> Then users could still override built-in functions, if they
>> >>>> fully
>> >>>>>>> qualify
>> >>>>>>>>>>>> object with the built-in namespace, but by default the
>> >> common
>> >>>>> logic
>> >>>>>>> of
>> >>>>>>>>>>>> current dB & cat would be used.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> CREATE TEMPORARY FUNCTION func ...
>> >>>>>>>>>>>> registers temporary function in current cat & dB
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> CREATE TEMPORARY FUNCTION cat.db.func ...
>> >>>>>>>>>>>> registers temporary function in cat db
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> CREATE TEMPORARY FUNCTION system.system.func ...
>> >>>>>>>>>>>> Overrides built-in function with temporary function
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> The built-in/system namespace would not be writable for
>> >>>> permanent
>> >>>>>>>>>>> objects.
>> >>>>>>>>>>>> WDYT?
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> This way I think we can have benefits of both solutions.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> Best,
>> >>>>>>>>>>>> Dawid
>> >>>>>>>>>>>>
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> On Tue, 17 Sep 2019, 07:24 Timo Walther, <
>> >> [hidden email]
>> >>>>
>> >>>>>> wrote:
>> >>>>>>>>>>>>
>> >>>>>>>>>>>>> Hi Bowen,
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> I understand the potential benefit of overriding certain
>> >>>>> built-in
>> >>>>>>>>>>>>> functions. I'm open to such a feature if many people
>> >> agree.
>> >>>>>>> However, it
>> >>>>>>>>>>>>> would be great to still support overriding catalog
>> >> functions
>> >>>>> with
>> >>>>>>>>>>>>> temporary functions in order to prototype a query even
>> >>> though
>> >>>> a
>> >>>>>>>>>>>>> catalog/database might not be available currently or
>> >> should
>> >>>> not
>> >>>>> be
>> >>>>>>>>>>>>> modified yet. How about we support both cases?
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> CREATE TEMPORARY FUNCTION abs
>> >>>>>>>>>>>>> -> creates/overrides a built-in function and never
>> >>> consideres
>> >>>>>>> current
>> >>>>>>>>>>>>> catalog and database; inconsistent with other DDL but
>> >>>> acceptable
>> >>>>>> for
>> >>>>>>>>>>>>> functions I guess.
>> >>>>>>>>>>>>> CREATE TEMPORARY FUNCTION cat.db.fun
>> >>>>>>>>>>>>> -> creates/overrides a catalog function
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> Regarding "Flink don't have any other built-in objects
>> >>>> (tables,
>> >>>>>>> views)
>> >>>>>>>>>>>>> except functions", this might change in the near future.
>> >>> Take
>> >>>>>>>>>>>>> https://issues.apache.org/jira/browse/FLINK-13900 as an
>> >>>>> example.
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> Thanks,
>> >>>>>>>>>>>>> Timo
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> On 14.09.19 01:40, Bowen Li wrote:
>> >>>>>>>>>>>>>> Hi Fabian,
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>> Yes, I agree 1-part/no-override is the least favorable
>> >>> thus I
>> >>>>>>> didn't
>> >>>>>>>>>>>>>> include that as a voting option, and the discussion is
>> >>> mainly
>> >>>>>>> between
>> >>>>>>>>>>>>>> 1-part/override builtin and 3-part/not override builtin.
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>> Re > However, it means that temp functions are
>> >> differently
>> >>>>>> treated
>> >>>>>>>>>>> than
>> >>>>>>>>>>>>>> other db objects.
>> >>>>>>>>>>>>>> IMO, the treatment difference results from the fact that
>> >>>>>> functions
>> >>>>>>>>>>> are
>> >>>>>>>>>>>> a
>> >>>>>>>>>>>>>> bit different from other objects - Flink don't have any
>> >>> other
>> >>>>>>>>>>> built-in
>> >>>>>>>>>>>>>> objects (tables, views) except functions.
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>> Cheers,
>> >>>>>>>>>>>>>> Bowen
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> --
>> >>>>>>>>>>> Xuefu Zhang
>> >>>>>>>>>>>
>> >>>>>>>>>>> "In Honey We Trust!"
>> >>>>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>
>> >>>>>> --
>> >>>>>> Xuefu Zhang
>> >>>>>>
>> >>>>>> "In Honey We Trust!"
>> >>>>>>
>> >>>>>
>> >>>>
>> >>>>
>> >>>> --
>> >>>> Xuefu Zhang
>> >>>>
>> >>>> "In Honey We Trust!"
>> >>>>
>> >>>
>> >>
>> >>
>> >> --
>> >> Xuefu Zhang
>> >>
>> >> "In Honey We Trust!"
>> >>
>>
>>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-57 - Rework FunctionCatalog

Piotr Nowojski-3
After reading Kurt’s reasoning I’m bumping my vote for #2 from -1 to +0, or even +0.5, so my final vote is:

-1 for #1
+0.5 for #2
+1 for #3

Re confusion about “system_db”. I think quite a lot of DBs are storing some meta tables in some system and often hidden db/schema, so I don’t think that if we do the same with built in functions will be that big of a deal. In the end, both for #2 and #3 user will have to check in the documentation what’s the syntax for overriding built-in functions for the first time he will want to do it.

Piotrek

> On 19 Sep 2019, at 11:03, JingsongLee <[hidden email]> wrote:
>
> I know Hive and Spark can shadow built-in functions by temporary function.
> Mysql, Oracle, Sql server can not shadow.
> User can use full names to access functions instead of shadowing.
>
> So I think it is a completely new thing, and the direct way to deal with new things is to add new grammar. So,
> +1 for #2, +0 for #3, -1 for #1
>
> Best,
> Jingsong Lee
>
>
> ------------------------------------------------------------------
> From:Kurt Young <[hidden email]>
> Send Time:2019年9月19日(星期四) 16:43
> To:dev <[hidden email]>
> Subject:Re: [DISCUSS] FLIP-57 - Rework FunctionCatalog
>
> And let me make my vote complete:
>
> -1 for #1
> +1 for #2 with different keyword
> -0 for #3
>
> Best,
> Kurt
>
>
> On Thu, Sep 19, 2019 at 4:40 PM Kurt Young <[hidden email]> wrote:
>
>> Looks like I'm the only person who is willing to +1 to #2 for now :-)
>> But I would suggest to change the keyword from GLOBAL to
>> something like BUILTIN.
>>
>> I think #2 and #3 are almost the same proposal, just with different
>> format to indicate whether it want to override built-in functions.
>>
>> My biggest reason to choose it is I want this behavior be consistent
>> with temporal tables. I will give some examples to show the behavior
>> and also make sure I'm not misunderstanding anything here.
>>
>> For most DBs, when user create a temporary table with:
>>
>> CREATE TEMPORARY TABLE t1
>>
>> It's actually equivalent with:
>>
>> CREATE TEMPORARY TABLE `curent_db`.t1
>>
>> If user change current database, they will not be able to access t1 without
>> fully qualified name, .i.e db1.t1 (assuming db1 is current database when
>> this temporary table is created).
>>
>> Only #2 and #3 followed this behavior and I would vote for this since this
>> makes such behavior consistent through temporal tables and functions.
>>
>> Why I'm not voting for #3 is a special catalog and database just looks very
>> hacky to me. It gave a imply that our built-in functions saved at a
>> special
>> catalog and database, which is actually not. Introducing a dedicated
>> keyword
>> like CREATE TEMPORARY BUILTIN FUNCTION looks more clear and
>> straightforward. One can argue that we should avoid introducing new
>> keyword,
>> but it's also very rare that a system can overwrite built-in functions.
>> Since we
>> decided to support this, introduce a new keyword is not a big deal IMO.
>>
>> Best,
>> Kurt
>>
>>
>> On Thu, Sep 19, 2019 at 3:07 PM Piotr Nowojski <[hidden email]>
>> wrote:
>>
>>> Hi,
>>>
>>> It is a quite long discussion to follow and I hope I didn’t misunderstand
>>> anything. From the proposals presented by Xuefu I would vote:
>>>
>>> -1 for #1 and #2
>>> +1 for #3
>>>
>>> Besides #3 being IMO more general and more consistent, having qualified
>>> names (#3) would help/make easier for someone to use cross
>>> databases/catalogs queries (joining multiple data sets/streams). For
>>> example with some functions to manipulate/clean up/convert the stored data
>>> in different catalogs registered in the respective catalogs.
>>>
>>> Piotrek
>>>
>>>> On 19 Sep 2019, at 06:35, Jark Wu <[hidden email]> wrote:
>>>>
>>>> I agree with Xuefu that inconsistent handling with all the other
>>> objects is
>>>> not a big problem.
>>>>
>>>> Regarding to option#3, the special "system.system" namespace may confuse
>>>> users.
>>>> Users need to know the set of built-in function names to know when to
>>> use
>>>> "system.system" namespace.
>>>> What will happen if user registers a non-builtin function name under the
>>>> "system.system" namespace?
>>>> Besides, I think it doesn't solve the "explode" problem I mentioned at
>>> the
>>>> beginning of this thread.
>>>>
>>>> So here is my vote:
>>>>
>>>> +1 for #1
>>>> 0 for #2
>>>> -1 for #3
>>>>
>>>> Best,
>>>> Jark
>>>>
>>>>
>>>> On Thu, 19 Sep 2019 at 08:38, Xuefu Z <[hidden email]> wrote:
>>>>
>>>>> @Dawid, Re: we also don't need additional referencing the
>>> specialcatalog
>>>>> anywhere.
>>>>>
>>>>> True. But once we allow such reference, then user can do so in any
>>> possible
>>>>> place where a function name is expected, for which we have to handle.
>>>>> That's a big difference, I think.
>>>>>
>>>>> Thanks,
>>>>> Xuefu
>>>>>
>>>>> On Wed, Sep 18, 2019 at 5:25 PM Dawid Wysakowicz <
>>>>> [hidden email]>
>>>>> wrote:
>>>>>
>>>>>> @Bowen I am not suggesting introducing additional catalog. I think we
>>>>> need
>>>>>> to get rid of the current built-in catalog.
>>>>>>
>>>>>> @Xuefu in option #3 we also don't need additional referencing the
>>> special
>>>>>> catalog anywhere else besides in the CREATE statement. The resolution
>>>>>> behaviour is exactly the same in both options.
>>>>>>
>>>>>> On Thu, 19 Sep 2019, 08:17 Xuefu Z, <[hidden email]> wrote:
>>>>>>
>>>>>>> Hi Dawid,
>>>>>>>
>>>>>>> "GLOBAL" is a temporary keyword that was given to the approach. It
>>> can
>>>>> be
>>>>>>> changed to something else for better.
>>>>>>>
>>>>>>> The difference between this and the #3 approach is that we only need
>>>>> the
>>>>>>> keyword for this create DDL. For other places (such as function
>>>>>>> referencing), no keyword or special namespace is needed.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Xuefu
>>>>>>>
>>>>>>> On Wed, Sep 18, 2019 at 4:32 PM Dawid Wysakowicz <
>>>>>>> [hidden email]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>> I think it makes sense to start voting at this point.
>>>>>>>>
>>>>>>>> Option 1: Only 1-part identifiers
>>>>>>>> PROS:
>>>>>>>> - allows shadowing built-in functions
>>>>>>>> CONS:
>>>>>>>> - incosistent with all the other objects, both permanent & temporary
>>>>>>>> - does not allow shadowing catalog functions
>>>>>>>>
>>>>>>>> Option 2: Special keyword for built-in function
>>>>>>>> I think this is quite similar to the special catalog/db. The thing I
>>>>> am
>>>>>>>> strongly against in this proposal is the GLOBAL keyword. This
>>> keyword
>>>>>>> has a
>>>>>>>> meaning in rdbms systems and means a function that is present for a
>>>>>>>> lifetime of a session in which it was created, but available in all
>>>>>> other
>>>>>>>> sessions. Therefore I really don't want to use this keyword in a
>>>>>>> different
>>>>>>>> context.
>>>>>>>>
>>>>>>>> Option 3: Special catalog/db
>>>>>>>>
>>>>>>>> PROS:
>>>>>>>> - allows shadowing built-in functions
>>>>>>>> - allows shadowing catalog functions
>>>>>>>> - consistent with other objects
>>>>>>>> CONS:
>>>>>>>> - we introduce a special namespace for built-in functions
>>>>>>>>
>>>>>>>> I don't see a problem with introducing the special namespace. In the
>>>>>> end
>>>>>>> it
>>>>>>>> is very similar to the keyword approach. In this case the catalog/db
>>>>>>>> combination would be the "keyword"
>>>>>>>>
>>>>>>>> Therefore my votes:
>>>>>>>> Option 1: -0
>>>>>>>> Option 2: -1 (I might change to +0 if we can come up with a better
>>>>>>> keyword)
>>>>>>>> Option 3: +1
>>>>>>>>
>>>>>>>> Best,
>>>>>>>> Dawid
>>>>>>>>
>>>>>>>>
>>>>>>>> On Thu, 19 Sep 2019, 05:12 Xuefu Z, <[hidden email]> wrote:
>>>>>>>>
>>>>>>>>> Hi Aljoscha,
>>>>>>>>>
>>>>>>>>> Thanks for the summary and these are great questions to be
>>>>> answered.
>>>>>>> The
>>>>>>>>> answer to your first question is clear: there is a general
>>>>> agreement
>>>>>> to
>>>>>>>>> override built-in functions with temp functions.
>>>>>>>>>
>>>>>>>>> However, your second and third questions are sort of related, as a
>>>>>>>> function
>>>>>>>>> reference can be either just function name (like "func") or in the
>>>>>> form
>>>>>>>> or
>>>>>>>>> "cat.db.func". When a reference is just function name, it can mean
>>>>>>>> either a
>>>>>>>>> built-in function or a function defined in the current cat/db. If
>>>>> we
>>>>>>>>> support overriding a built-in function with a temp function, such
>>>>>>>>> overriding can also cover a function in the current cat/db.
>>>>>>>>>
>>>>>>>>> I think what Timo referred as "overriding a catalog function"
>>>>> means a
>>>>>>>> temp
>>>>>>>>> function defined as "cat.db.func" overrides a catalog function
>>>>> "func"
>>>>>>> in
>>>>>>>>> cat/db even if cat/db is not current. To support this, temp
>>>>> function
>>>>>>> has
>>>>>>>> to
>>>>>>>>> be tied to a cat/db. What's why I said above that the 2nd and 3rd
>>>>>>>> questions
>>>>>>>>> are related. The problem with such support is the ambiguity when
>>>>> user
>>>>>>>>> defines a function w/o namespace, "CREATE TEMPORARY FUNCTION func
>>>>>> ...".
>>>>>>>>> Here "func" can means a global temp function, or a temp function in
>>>>>>>> current
>>>>>>>>> cat/db. If we can assume the former, this creates an inconsistency
>>>>>>>> because
>>>>>>>>> "CREATE FUNCTION func" actually means a function in current cat/db.
>>>>>> If
>>>>>>> we
>>>>>>>>> assume the latter, then there is no way for user to create a global
>>>>>>> temp
>>>>>>>>> function.
>>>>>>>>>
>>>>>>>>> Giving a special namespace for built-in functions may solve the
>>>>>>> ambiguity
>>>>>>>>> problem above, but it also introduces artificial catalog/database
>>>>>> that
>>>>>>>>> needs special treatment and pollutes the cleanness of  the code. I
>>>>>>> would
>>>>>>>>> rather introduce a syntax in DDL to solve the problem, like "CREATE
>>>>>>>>> [GLOBAL] TEMPORARY FUNCTION func".
>>>>>>>>>
>>>>>>>>> Thus, I'd like to summarize a few candidate proposals for voting
>>>>>>>> purposes:
>>>>>>>>>
>>>>>>>>> 1. Support only global, temporary functions without namespace. Such
>>>>>>> temp
>>>>>>>>> functions overrides built-in functions and catalog functions in
>>>>>> current
>>>>>>>>> cat/db. The resolution order is: temp functions -> built-in
>>>>> functions
>>>>>>> ->
>>>>>>>>> catalog functions. (Partially or fully qualified functions has no
>>>>>>>>> ambiguity!)
>>>>>>>>>
>>>>>>>>> 2. In addition to #1, support creating and referencing temporary
>>>>>>>> functions
>>>>>>>>> associated with a cat/db with "GLOBAL" qualifier in DDL for global
>>>>>> temp
>>>>>>>>> functions. The resolution order is: global temp functions ->
>>>>> built-in
>>>>>>>>> functions -> temp functions in current cat/db -> catalog function.
>>>>>>>>> (Resolution for partially or fully qualified function reference is:
>>>>>>> temp
>>>>>>>>> functions -> persistent functions.)
>>>>>>>>>
>>>>>>>>> 3. In addition to #1, support creating and referencing temporary
>>>>>>>> functions
>>>>>>>>> associated with a cat/db with a special namespace for built-in
>>>>>>> functions
>>>>>>>>> and global temp functions. The resolution is the same as #2, except
>>>>>>> that
>>>>>>>>> the special namespace might be prefixed to a reference to a
>>>>> built-in
>>>>>>>>> function or global temp function. (In absence of the special
>>>>>> namespace,
>>>>>>>> the
>>>>>>>>> resolution order is the same as in #2.)
>>>>>>>>>
>>>>>>>>> My personal preference is #1, given the unknown use case and
>>>>>> introduced
>>>>>>>>> complexity for #2 and #3. However, #2 is an acceptable alternative.
>>>>>>> Thus,
>>>>>>>>> my votes are:
>>>>>>>>>
>>>>>>>>> +1 for #1
>>>>>>>>> +0 for #2
>>>>>>>>> -1 for #3
>>>>>>>>>
>>>>>>>>> Everyone, please cast your vote (in above format please!), or let
>>>>> me
>>>>>>> know
>>>>>>>>> if you have more questions or other candidates.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Xuefu
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, Sep 18, 2019 at 6:42 AM Aljoscha Krettek <
>>>>>> [hidden email]>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I think this discussion and the one for FLIP-64 are very
>>>>> connected.
>>>>>>> To
>>>>>>>>>> resolve the differences, think we have to think about the basic
>>>>>>>>> principles
>>>>>>>>>> and find consensus there. The basic questions I see are:
>>>>>>>>>>
>>>>>>>>>> - Do we want to support overriding builtin functions?
>>>>>>>>>> - Do we want to support overriding catalog functions?
>>>>>>>>>> - And then later: should temporary functions be tied to a
>>>>>>>>>> catalog/database?
>>>>>>>>>>
>>>>>>>>>> I don’t have much to say about these, except that we should
>>>>>> somewhat
>>>>>>>>> stick
>>>>>>>>>> to what the industry does. But I also understand that the
>>>>> industry
>>>>>> is
>>>>>>>>>> already very divided on this.
>>>>>>>>>>
>>>>>>>>>> Best,
>>>>>>>>>> Aljoscha
>>>>>>>>>>
>>>>>>>>>>> On 18. Sep 2019, at 11:41, Jark Wu <[hidden email]> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> +1 to strive for reaching consensus on the remaining topics. We
>>>>>> are
>>>>>>>>>> close to the truth. It will waste a lot of time if we resume the
>>>>>>> topic
>>>>>>>>> some
>>>>>>>>>> time later.
>>>>>>>>>>>
>>>>>>>>>>> +1 to “1-part/override” and I’m also fine with Timo’s
>>>>>> “cat.db.fun”
>>>>>>>> way
>>>>>>>>>> to override a catalog function.
>>>>>>>>>>>
>>>>>>>>>>> I’m not sure about “system.system.fun”, it introduces a
>>>>>> nonexistent
>>>>>>>> cat
>>>>>>>>>> & db? And we still need to do special treatment for the dedicated
>>>>>>>>>> system.system cat & db?
>>>>>>>>>>>
>>>>>>>>>>> Best,
>>>>>>>>>>> Jark
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> 在 2019年9月18日,06:54,Timo Walther <[hidden email]> 写道:
>>>>>>>>>>>>
>>>>>>>>>>>> Hi everyone,
>>>>>>>>>>>>
>>>>>>>>>>>> @Xuefu: I would like to avoid adding too many things
>>>>>>> incrementally.
>>>>>>>>>> Users should be able to override all catalog objects consistently
>>>>>>>>> according
>>>>>>>>>> to FLIP-64 (Support for Temporary Objects in Table module). If
>>>>>>>> functions
>>>>>>>>>> are treated completely different, we need more code and special
>>>>>>> cases.
>>>>>>>>> From
>>>>>>>>>> an implementation perspective, this topic only affects the lookup
>>>>>>> logic
>>>>>>>>>> which is rather low implementation effort which is why I would
>>>>> like
>>>>>>> to
>>>>>>>>>> clarify the remaining items. As you said, we have a slight
>>>>> consenus
>>>>>>> on
>>>>>>>>>> overriding built-in functions; we should also strive for reaching
>>>>>>>>> consensus
>>>>>>>>>> on the remaining topics.
>>>>>>>>>>>>
>>>>>>>>>>>> @Dawid: I like your idea as it ensures registering catalog
>>>>>> objects
>>>>>>>>>> consistent and the overriding of built-in functions more
>>>>> explicit.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Timo
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 17.09.19 11:59, kai wang wrote:
>>>>>>>>>>>>> hi, everyone
>>>>>>>>>>>>> I think this flip is very meaningful. it supports functions
>>>>>> that
>>>>>>>> can
>>>>>>>>> be
>>>>>>>>>>>>> shared by different catalogs and dbs, reducing the
>>>>> duplication
>>>>>> of
>>>>>>>>>> functions.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Our group based on flink's sql parser module implements
>>>>> create
>>>>>>>>> function
>>>>>>>>>>>>> feature, stores the parsed function metadata and schema into
>>>>>>> mysql,
>>>>>>>>> and
>>>>>>>>>>>>> also customizes the catalog, customizes sql-client to support
>>>>>>>> custom
>>>>>>>>>>>>> schemas and functions. Loaded, but the function is currently
>>>>>>>> global,
>>>>>>>>>> and is
>>>>>>>>>>>>> not subdivided according to catalog and db.
>>>>>>>>>>>>>
>>>>>>>>>>>>> In addition, I very much hope to participate in the
>>>>> development
>>>>>>> of
>>>>>>>>> this
>>>>>>>>>>>>> flip, I have been paying attention to the community, but
>>>>> found
>>>>>> it
>>>>>>>> is
>>>>>>>>>> more
>>>>>>>>>>>>> difficult to join.
>>>>>>>>>>>>> thank you.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Xuefu Z <[hidden email]> 于2019年9月17日周二 上午11:19写道:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks to Tmo and Dawid for sharing thoughts.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It seems to me that there is a general consensus on having
>>>>>> temp
>>>>>>>>>> functions
>>>>>>>>>>>>>> that have no namespaces and overwrite built-in functions.
>>>>> (As
>>>>>> a
>>>>>>>> side
>>>>>>>>>> note
>>>>>>>>>>>>>> for comparability, the current user defined functions are
>>>>> all
>>>>>>>>>> temporary and
>>>>>>>>>>>>>> having no namespaces.)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Nevertheless, I can also see the merit of having namespaced
>>>>>> temp
>>>>>>>>>> functions
>>>>>>>>>>>>>> that can overwrite functions defined in a specific cat/db.
>>>>>>>> However,
>>>>>>>>>> this
>>>>>>>>>>>>>> idea appears orthogonal to the former and can be added
>>>>>>>>> incrementally.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> How about we first implement non-namespaced temp functions
>>>>> now
>>>>>>> and
>>>>>>>>>> leave
>>>>>>>>>>>>>> the door open for namespaced ones for later releases as the
>>>>>>>>>> requirement
>>>>>>>>>>>>>> might become more crystal? This also helps shorten the
>>>>> debate
>>>>>>> and
>>>>>>>>>> allow us
>>>>>>>>>>>>>> to make some progress along this direction.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> As to Dawid's idea of having a dedicated cat/db to host the
>>>>>>>>> temporary
>>>>>>>>>> temp
>>>>>>>>>>>>>> functions that don't have namespaces, my only concern is the
>>>>>>>> special
>>>>>>>>>>>>>> treatment for a cat/db, which makes code less clean, as
>>>>>> evident
>>>>>>> in
>>>>>>>>>> treating
>>>>>>>>>>>>>> the built-in catalog currently.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Xuefiu
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Mon, Sep 16, 2019 at 5:07 PM Dawid Wysakowicz <
>>>>>>>>>>>>>> [hidden email]>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>> Another idea to consider on top of Timo's suggestion. How
>>>>>> about
>>>>>>>> we
>>>>>>>>>> have a
>>>>>>>>>>>>>>> special namespace (catalog + database) for built-in
>>>>> objects?
>>>>>>> This
>>>>>>>>>> catalog
>>>>>>>>>>>>>>> would be invisible for users as Xuefu was suggesting.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Then users could still override built-in functions, if they
>>>>>>> fully
>>>>>>>>>> qualify
>>>>>>>>>>>>>>> object with the built-in namespace, but by default the
>>>>> common
>>>>>>>> logic
>>>>>>>>>> of
>>>>>>>>>>>>>>> current dB & cat would be used.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> CREATE TEMPORARY FUNCTION func ...
>>>>>>>>>>>>>>> registers temporary function in current cat & dB
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> CREATE TEMPORARY FUNCTION cat.db.func ...
>>>>>>>>>>>>>>> registers temporary function in cat db
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> CREATE TEMPORARY FUNCTION system.system.func ...
>>>>>>>>>>>>>>> Overrides built-in function with temporary function
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The built-in/system namespace would not be writable for
>>>>>>> permanent
>>>>>>>>>>>>>> objects.
>>>>>>>>>>>>>>> WDYT?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> This way I think we can have benefits of both solutions.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>> Dawid
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Tue, 17 Sep 2019, 07:24 Timo Walther, <
>>>>> [hidden email]
>>>>>>>
>>>>>>>>> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi Bowen,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I understand the potential benefit of overriding certain
>>>>>>>> built-in
>>>>>>>>>>>>>>>> functions. I'm open to such a feature if many people
>>>>> agree.
>>>>>>>>>> However, it
>>>>>>>>>>>>>>>> would be great to still support overriding catalog
>>>>> functions
>>>>>>>> with
>>>>>>>>>>>>>>>> temporary functions in order to prototype a query even
>>>>>> though
>>>>>>> a
>>>>>>>>>>>>>>>> catalog/database might not be available currently or
>>>>> should
>>>>>>> not
>>>>>>>> be
>>>>>>>>>>>>>>>> modified yet. How about we support both cases?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> CREATE TEMPORARY FUNCTION abs
>>>>>>>>>>>>>>>> -> creates/overrides a built-in function and never
>>>>>> consideres
>>>>>>>>>> current
>>>>>>>>>>>>>>>> catalog and database; inconsistent with other DDL but
>>>>>>> acceptable
>>>>>>>>> for
>>>>>>>>>>>>>>>> functions I guess.
>>>>>>>>>>>>>>>> CREATE TEMPORARY FUNCTION cat.db.fun
>>>>>>>>>>>>>>>> -> creates/overrides a catalog function
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Regarding "Flink don't have any other built-in objects
>>>>>>> (tables,
>>>>>>>>>> views)
>>>>>>>>>>>>>>>> except functions", this might change in the near future.
>>>>>> Take
>>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/FLINK-13900 as an
>>>>>>>> example.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>> Timo
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 14.09.19 01:40, Bowen Li wrote:
>>>>>>>>>>>>>>>>> Hi Fabian,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Yes, I agree 1-part/no-override is the least favorable
>>>>>> thus I
>>>>>>>>>> didn't
>>>>>>>>>>>>>>>>> include that as a voting option, and the discussion is
>>>>>> mainly
>>>>>>>>>> between
>>>>>>>>>>>>>>>>> 1-part/override builtin and 3-part/not override builtin.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Re > However, it means that temp functions are
>>>>> differently
>>>>>>>>> treated
>>>>>>>>>>>>>> than
>>>>>>>>>>>>>>>>> other db objects.
>>>>>>>>>>>>>>>>> IMO, the treatment difference results from the fact that
>>>>>>>>> functions
>>>>>>>>>>>>>> are
>>>>>>>>>>>>>>> a
>>>>>>>>>>>>>>>>> bit different from other objects - Flink don't have any
>>>>>> other
>>>>>>>>>>>>>> built-in
>>>>>>>>>>>>>>>>> objects (tables, views) except functions.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>> Bowen
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> --
>>>>>>>>>>>>>> Xuefu Zhang
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> "In Honey We Trust!"
>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Xuefu Zhang
>>>>>>>>>
>>>>>>>>> "In Honey We Trust!"
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Xuefu Zhang
>>>>>>>
>>>>>>> "In Honey We Trust!"
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Xuefu Zhang
>>>>>
>>>>> "In Honey We Trust!"
>>>>>
>>>
>>>

1234