Hi,
Another idea to consider on top of Timo's suggestion. How about we have a special namespace (catalog + database) for built-in objects? This catalog would be invisible for users as Xuefu was suggesting. Then users could still override built-in functions, if they fully qualify object with the built-in namespace, but by default the common logic of current dB & cat would be used. CREATE TEMPORARY FUNCTION func ... registers temporary function in current cat & dB CREATE TEMPORARY FUNCTION cat.db.func ... registers temporary function in cat db CREATE TEMPORARY FUNCTION system.system.func ... Overrides built-in function with temporary function The built-in/system namespace would not be writable for permanent objects. WDYT? This way I think we can have benefits of both solutions. Best, Dawid On Tue, 17 Sep 2019, 07:24 Timo Walther, <[hidden email]> wrote: > Hi Bowen, > > I understand the potential benefit of overriding certain built-in > functions. I'm open to such a feature if many people agree. However, it > would be great to still support overriding catalog functions with > temporary functions in order to prototype a query even though a > catalog/database might not be available currently or should not be > modified yet. How about we support both cases? > > CREATE TEMPORARY FUNCTION abs > -> creates/overrides a built-in function and never consideres current > catalog and database; inconsistent with other DDL but acceptable for > functions I guess. > CREATE TEMPORARY FUNCTION cat.db.fun > -> creates/overrides a catalog function > > Regarding "Flink don't have any other built-in objects (tables, views) > except functions", this might change in the near future. Take > https://issues.apache.org/jira/browse/FLINK-13900 as an example. > > Thanks, > Timo > > On 14.09.19 01:40, Bowen Li wrote: > > Hi Fabian, > > > > Yes, I agree 1-part/no-override is the least favorable thus I didn't > > include that as a voting option, and the discussion is mainly between > > 1-part/override builtin and 3-part/not override builtin. > > > > Re > However, it means that temp functions are differently treated than > > other db objects. > > IMO, the treatment difference results from the fact that functions are a > > bit different from other objects - Flink don't have any other built-in > > objects (tables, views) except functions. > > > > Cheers, > > Bowen > > > > |
Thanks to Tmo and Dawid for sharing thoughts.
It seems to me that there is a general consensus on having temp functions that have no namespaces and overwrite built-in functions. (As a side note for comparability, the current user defined functions are all temporary and having no namespaces.) Nevertheless, I can also see the merit of having namespaced temp functions that can overwrite functions defined in a specific cat/db. However, this idea appears orthogonal to the former and can be added incrementally. How about we first implement non-namespaced temp functions now and leave the door open for namespaced ones for later releases as the requirement might become more crystal? This also helps shorten the debate and allow us to make some progress along this direction. As to Dawid's idea of having a dedicated cat/db to host the temporary temp functions that don't have namespaces, my only concern is the special treatment for a cat/db, which makes code less clean, as evident in treating the built-in catalog currently. Thanks, Xuefiu On Mon, Sep 16, 2019 at 5:07 PM Dawid Wysakowicz <[hidden email]> wrote: > Hi, > Another idea to consider on top of Timo's suggestion. How about we have a > special namespace (catalog + database) for built-in objects? This catalog > would be invisible for users as Xuefu was suggesting. > > Then users could still override built-in functions, if they fully qualify > object with the built-in namespace, but by default the common logic of > current dB & cat would be used. > > CREATE TEMPORARY FUNCTION func ... > registers temporary function in current cat & dB > > CREATE TEMPORARY FUNCTION cat.db.func ... > registers temporary function in cat db > > CREATE TEMPORARY FUNCTION system.system.func ... > Overrides built-in function with temporary function > > The built-in/system namespace would not be writable for permanent objects. > WDYT? > > This way I think we can have benefits of both solutions. > > Best, > Dawid > > > On Tue, 17 Sep 2019, 07:24 Timo Walther, <[hidden email]> wrote: > > > Hi Bowen, > > > > I understand the potential benefit of overriding certain built-in > > functions. I'm open to such a feature if many people agree. However, it > > would be great to still support overriding catalog functions with > > temporary functions in order to prototype a query even though a > > catalog/database might not be available currently or should not be > > modified yet. How about we support both cases? > > > > CREATE TEMPORARY FUNCTION abs > > -> creates/overrides a built-in function and never consideres current > > catalog and database; inconsistent with other DDL but acceptable for > > functions I guess. > > CREATE TEMPORARY FUNCTION cat.db.fun > > -> creates/overrides a catalog function > > > > Regarding "Flink don't have any other built-in objects (tables, views) > > except functions", this might change in the near future. Take > > https://issues.apache.org/jira/browse/FLINK-13900 as an example. > > > > Thanks, > > Timo > > > > On 14.09.19 01:40, Bowen Li wrote: > > > Hi Fabian, > > > > > > Yes, I agree 1-part/no-override is the least favorable thus I didn't > > > include that as a voting option, and the discussion is mainly between > > > 1-part/override builtin and 3-part/not override builtin. > > > > > > Re > However, it means that temp functions are differently treated than > > > other db objects. > > > IMO, the treatment difference results from the fact that functions are > a > > > bit different from other objects - Flink don't have any other built-in > > > objects (tables, views) except functions. > > > > > > Cheers, > > > Bowen > > > > > > > > -- Xuefu Zhang "In Honey We Trust!" |
hi, everyone
I think this flip is very meaningful. it supports functions that can be shared by different catalogs and dbs, reducing the duplication of functions. Our group based on flink's sql parser module implements create function feature, stores the parsed function metadata and schema into mysql, and also customizes the catalog, customizes sql-client to support custom schemas and functions. Loaded, but the function is currently global, and is not subdivided according to catalog and db. In addition, I very much hope to participate in the development of this flip, I have been paying attention to the community, but found it is more difficult to join. thank you. Xuefu Z <[hidden email]> 于2019年9月17日周二 上午11:19写道: > Thanks to Tmo and Dawid for sharing thoughts. > > It seems to me that there is a general consensus on having temp functions > that have no namespaces and overwrite built-in functions. (As a side note > for comparability, the current user defined functions are all temporary and > having no namespaces.) > > Nevertheless, I can also see the merit of having namespaced temp functions > that can overwrite functions defined in a specific cat/db. However, this > idea appears orthogonal to the former and can be added incrementally. > > How about we first implement non-namespaced temp functions now and leave > the door open for namespaced ones for later releases as the requirement > might become more crystal? This also helps shorten the debate and allow us > to make some progress along this direction. > > As to Dawid's idea of having a dedicated cat/db to host the temporary temp > functions that don't have namespaces, my only concern is the special > treatment for a cat/db, which makes code less clean, as evident in treating > the built-in catalog currently. > > Thanks, > Xuefiu > > On Mon, Sep 16, 2019 at 5:07 PM Dawid Wysakowicz < > [hidden email]> > wrote: > > > Hi, > > Another idea to consider on top of Timo's suggestion. How about we have a > > special namespace (catalog + database) for built-in objects? This catalog > > would be invisible for users as Xuefu was suggesting. > > > > Then users could still override built-in functions, if they fully qualify > > object with the built-in namespace, but by default the common logic of > > current dB & cat would be used. > > > > CREATE TEMPORARY FUNCTION func ... > > registers temporary function in current cat & dB > > > > CREATE TEMPORARY FUNCTION cat.db.func ... > > registers temporary function in cat db > > > > CREATE TEMPORARY FUNCTION system.system.func ... > > Overrides built-in function with temporary function > > > > The built-in/system namespace would not be writable for permanent > objects. > > WDYT? > > > > This way I think we can have benefits of both solutions. > > > > Best, > > Dawid > > > > > > On Tue, 17 Sep 2019, 07:24 Timo Walther, <[hidden email]> wrote: > > > > > Hi Bowen, > > > > > > I understand the potential benefit of overriding certain built-in > > > functions. I'm open to such a feature if many people agree. However, it > > > would be great to still support overriding catalog functions with > > > temporary functions in order to prototype a query even though a > > > catalog/database might not be available currently or should not be > > > modified yet. How about we support both cases? > > > > > > CREATE TEMPORARY FUNCTION abs > > > -> creates/overrides a built-in function and never consideres current > > > catalog and database; inconsistent with other DDL but acceptable for > > > functions I guess. > > > CREATE TEMPORARY FUNCTION cat.db.fun > > > -> creates/overrides a catalog function > > > > > > Regarding "Flink don't have any other built-in objects (tables, views) > > > except functions", this might change in the near future. Take > > > https://issues.apache.org/jira/browse/FLINK-13900 as an example. > > > > > > Thanks, > > > Timo > > > > > > On 14.09.19 01:40, Bowen Li wrote: > > > > Hi Fabian, > > > > > > > > Yes, I agree 1-part/no-override is the least favorable thus I didn't > > > > include that as a voting option, and the discussion is mainly between > > > > 1-part/override builtin and 3-part/not override builtin. > > > > > > > > Re > However, it means that temp functions are differently treated > than > > > > other db objects. > > > > IMO, the treatment difference results from the fact that functions > are > > a > > > > bit different from other objects - Flink don't have any other > built-in > > > > objects (tables, views) except functions. > > > > > > > > Cheers, > > > > Bowen > > > > > > > > > > > > > > > -- > Xuefu Zhang > > "In Honey We Trust!" > |
Hi everyone,
@Xuefu: I would like to avoid adding too many things incrementally. Users should be able to override all catalog objects consistently according to FLIP-64 (Support for Temporary Objects in Table module). If functions are treated completely different, we need more code and special cases. From an implementation perspective, this topic only affects the lookup logic which is rather low implementation effort which is why I would like to clarify the remaining items. As you said, we have a slight consenus on overriding built-in functions; we should also strive for reaching consensus on the remaining topics. @Dawid: I like your idea as it ensures registering catalog objects consistent and the overriding of built-in functions more explicit. Thanks, Timo On 17.09.19 11:59, kai wang wrote: > hi, everyone > I think this flip is very meaningful. it supports functions that can be > shared by different catalogs and dbs, reducing the duplication of functions. > > Our group based on flink's sql parser module implements create function > feature, stores the parsed function metadata and schema into mysql, and > also customizes the catalog, customizes sql-client to support custom > schemas and functions. Loaded, but the function is currently global, and is > not subdivided according to catalog and db. > > In addition, I very much hope to participate in the development of this > flip, I have been paying attention to the community, but found it is more > difficult to join. > thank you. > > Xuefu Z <[hidden email]> 于2019年9月17日周二 上午11:19写道: > >> Thanks to Tmo and Dawid for sharing thoughts. >> >> It seems to me that there is a general consensus on having temp functions >> that have no namespaces and overwrite built-in functions. (As a side note >> for comparability, the current user defined functions are all temporary and >> having no namespaces.) >> >> Nevertheless, I can also see the merit of having namespaced temp functions >> that can overwrite functions defined in a specific cat/db. However, this >> idea appears orthogonal to the former and can be added incrementally. >> >> How about we first implement non-namespaced temp functions now and leave >> the door open for namespaced ones for later releases as the requirement >> might become more crystal? This also helps shorten the debate and allow us >> to make some progress along this direction. >> >> As to Dawid's idea of having a dedicated cat/db to host the temporary temp >> functions that don't have namespaces, my only concern is the special >> treatment for a cat/db, which makes code less clean, as evident in treating >> the built-in catalog currently. >> >> Thanks, >> Xuefiu >> >> On Mon, Sep 16, 2019 at 5:07 PM Dawid Wysakowicz < >> [hidden email]> >> wrote: >> >>> Hi, >>> Another idea to consider on top of Timo's suggestion. How about we have a >>> special namespace (catalog + database) for built-in objects? This catalog >>> would be invisible for users as Xuefu was suggesting. >>> >>> Then users could still override built-in functions, if they fully qualify >>> object with the built-in namespace, but by default the common logic of >>> current dB & cat would be used. >>> >>> CREATE TEMPORARY FUNCTION func ... >>> registers temporary function in current cat & dB >>> >>> CREATE TEMPORARY FUNCTION cat.db.func ... >>> registers temporary function in cat db >>> >>> CREATE TEMPORARY FUNCTION system.system.func ... >>> Overrides built-in function with temporary function >>> >>> The built-in/system namespace would not be writable for permanent >> objects. >>> WDYT? >>> >>> This way I think we can have benefits of both solutions. >>> >>> Best, >>> Dawid >>> >>> >>> On Tue, 17 Sep 2019, 07:24 Timo Walther, <[hidden email]> wrote: >>> >>>> Hi Bowen, >>>> >>>> I understand the potential benefit of overriding certain built-in >>>> functions. I'm open to such a feature if many people agree. However, it >>>> would be great to still support overriding catalog functions with >>>> temporary functions in order to prototype a query even though a >>>> catalog/database might not be available currently or should not be >>>> modified yet. How about we support both cases? >>>> >>>> CREATE TEMPORARY FUNCTION abs >>>> -> creates/overrides a built-in function and never consideres current >>>> catalog and database; inconsistent with other DDL but acceptable for >>>> functions I guess. >>>> CREATE TEMPORARY FUNCTION cat.db.fun >>>> -> creates/overrides a catalog function >>>> >>>> Regarding "Flink don't have any other built-in objects (tables, views) >>>> except functions", this might change in the near future. Take >>>> https://issues.apache.org/jira/browse/FLINK-13900 as an example. >>>> >>>> Thanks, >>>> Timo >>>> >>>> On 14.09.19 01:40, Bowen Li wrote: >>>>> Hi Fabian, >>>>> >>>>> Yes, I agree 1-part/no-override is the least favorable thus I didn't >>>>> include that as a voting option, and the discussion is mainly between >>>>> 1-part/override builtin and 3-part/not override builtin. >>>>> >>>>> Re > However, it means that temp functions are differently treated >> than >>>>> other db objects. >>>>> IMO, the treatment difference results from the fact that functions >> are >>> a >>>>> bit different from other objects - Flink don't have any other >> built-in >>>>> objects (tables, views) except functions. >>>>> >>>>> Cheers, >>>>> Bowen >>>>> >>>> >> >> -- >> Xuefu Zhang >> >> "In Honey We Trust!" >> |
Hi,
+1 to strive for reaching consensus on the remaining topics. We are close to the truth. It will waste a lot of time if we resume the topic some time later. +1 to “1-part/override” and I’m also fine with Timo’s “cat.db.fun” way to override a catalog function. I’m not sure about “system.system.fun”, it introduces a nonexistent cat & db? And we still need to do special treatment for the dedicated system.system cat & db? Best, Jark > 在 2019年9月18日,06:54,Timo Walther <[hidden email]> 写道: > > Hi everyone, > > @Xuefu: I would like to avoid adding too many things incrementally. Users should be able to override all catalog objects consistently according to FLIP-64 (Support for Temporary Objects in Table module). If functions are treated completely different, we need more code and special cases. From an implementation perspective, this topic only affects the lookup logic which is rather low implementation effort which is why I would like to clarify the remaining items. As you said, we have a slight consenus on overriding built-in functions; we should also strive for reaching consensus on the remaining topics. > > @Dawid: I like your idea as it ensures registering catalog objects consistent and the overriding of built-in functions more explicit. > > Thanks, > Timo > > > On 17.09.19 11:59, kai wang wrote: >> hi, everyone >> I think this flip is very meaningful. it supports functions that can be >> shared by different catalogs and dbs, reducing the duplication of functions. >> >> Our group based on flink's sql parser module implements create function >> feature, stores the parsed function metadata and schema into mysql, and >> also customizes the catalog, customizes sql-client to support custom >> schemas and functions. Loaded, but the function is currently global, and is >> not subdivided according to catalog and db. >> >> In addition, I very much hope to participate in the development of this >> flip, I have been paying attention to the community, but found it is more >> difficult to join. >> thank you. >> >> Xuefu Z <[hidden email]> 于2019年9月17日周二 上午11:19写道: >> >>> Thanks to Tmo and Dawid for sharing thoughts. >>> >>> It seems to me that there is a general consensus on having temp functions >>> that have no namespaces and overwrite built-in functions. (As a side note >>> for comparability, the current user defined functions are all temporary and >>> having no namespaces.) >>> >>> Nevertheless, I can also see the merit of having namespaced temp functions >>> that can overwrite functions defined in a specific cat/db. However, this >>> idea appears orthogonal to the former and can be added incrementally. >>> >>> How about we first implement non-namespaced temp functions now and leave >>> the door open for namespaced ones for later releases as the requirement >>> might become more crystal? This also helps shorten the debate and allow us >>> to make some progress along this direction. >>> >>> As to Dawid's idea of having a dedicated cat/db to host the temporary temp >>> functions that don't have namespaces, my only concern is the special >>> treatment for a cat/db, which makes code less clean, as evident in treating >>> the built-in catalog currently. >>> >>> Thanks, >>> Xuefiu >>> >>> On Mon, Sep 16, 2019 at 5:07 PM Dawid Wysakowicz < >>> [hidden email]> >>> wrote: >>> >>>> Hi, >>>> Another idea to consider on top of Timo's suggestion. How about we have a >>>> special namespace (catalog + database) for built-in objects? This catalog >>>> would be invisible for users as Xuefu was suggesting. >>>> >>>> Then users could still override built-in functions, if they fully qualify >>>> object with the built-in namespace, but by default the common logic of >>>> current dB & cat would be used. >>>> >>>> CREATE TEMPORARY FUNCTION func ... >>>> registers temporary function in current cat & dB >>>> >>>> CREATE TEMPORARY FUNCTION cat.db.func ... >>>> registers temporary function in cat db >>>> >>>> CREATE TEMPORARY FUNCTION system.system.func ... >>>> Overrides built-in function with temporary function >>>> >>>> The built-in/system namespace would not be writable for permanent >>> objects. >>>> WDYT? >>>> >>>> This way I think we can have benefits of both solutions. >>>> >>>> Best, >>>> Dawid >>>> >>>> >>>> On Tue, 17 Sep 2019, 07:24 Timo Walther, <[hidden email]> wrote: >>>> >>>>> Hi Bowen, >>>>> >>>>> I understand the potential benefit of overriding certain built-in >>>>> functions. I'm open to such a feature if many people agree. However, it >>>>> would be great to still support overriding catalog functions with >>>>> temporary functions in order to prototype a query even though a >>>>> catalog/database might not be available currently or should not be >>>>> modified yet. How about we support both cases? >>>>> >>>>> CREATE TEMPORARY FUNCTION abs >>>>> -> creates/overrides a built-in function and never consideres current >>>>> catalog and database; inconsistent with other DDL but acceptable for >>>>> functions I guess. >>>>> CREATE TEMPORARY FUNCTION cat.db.fun >>>>> -> creates/overrides a catalog function >>>>> >>>>> Regarding "Flink don't have any other built-in objects (tables, views) >>>>> except functions", this might change in the near future. Take >>>>> https://issues.apache.org/jira/browse/FLINK-13900 as an example. >>>>> >>>>> Thanks, >>>>> Timo >>>>> >>>>> On 14.09.19 01:40, Bowen Li wrote: >>>>>> Hi Fabian, >>>>>> >>>>>> Yes, I agree 1-part/no-override is the least favorable thus I didn't >>>>>> include that as a voting option, and the discussion is mainly between >>>>>> 1-part/override builtin and 3-part/not override builtin. >>>>>> >>>>>> Re > However, it means that temp functions are differently treated >>> than >>>>>> other db objects. >>>>>> IMO, the treatment difference results from the fact that functions >>> are >>>> a >>>>>> bit different from other objects - Flink don't have any other >>> built-in >>>>>> objects (tables, views) except functions. >>>>>> >>>>>> Cheers, >>>>>> Bowen >>>>>> >>>>> >>> >>> -- >>> Xuefu Zhang >>> >>> "In Honey We Trust!" >>> > |
Hi,
I think this discussion and the one for FLIP-64 are very connected. To resolve the differences, think we have to think about the basic principles and find consensus there. The basic questions I see are: - Do we want to support overriding builtin functions? - Do we want to support overriding catalog functions? - And then later: should temporary functions be tied to a catalog/database? I don’t have much to say about these, except that we should somewhat stick to what the industry does. But I also understand that the industry is already very divided on this. Best, Aljoscha > On 18. Sep 2019, at 11:41, Jark Wu <[hidden email]> wrote: > > Hi, > > +1 to strive for reaching consensus on the remaining topics. We are close to the truth. It will waste a lot of time if we resume the topic some time later. > > +1 to “1-part/override” and I’m also fine with Timo’s “cat.db.fun” way to override a catalog function. > > I’m not sure about “system.system.fun”, it introduces a nonexistent cat & db? And we still need to do special treatment for the dedicated system.system cat & db? > > Best, > Jark > > >> 在 2019年9月18日,06:54,Timo Walther <[hidden email]> 写道: >> >> Hi everyone, >> >> @Xuefu: I would like to avoid adding too many things incrementally. Users should be able to override all catalog objects consistently according to FLIP-64 (Support for Temporary Objects in Table module). If functions are treated completely different, we need more code and special cases. From an implementation perspective, this topic only affects the lookup logic which is rather low implementation effort which is why I would like to clarify the remaining items. As you said, we have a slight consenus on overriding built-in functions; we should also strive for reaching consensus on the remaining topics. >> >> @Dawid: I like your idea as it ensures registering catalog objects consistent and the overriding of built-in functions more explicit. >> >> Thanks, >> Timo >> >> >> On 17.09.19 11:59, kai wang wrote: >>> hi, everyone >>> I think this flip is very meaningful. it supports functions that can be >>> shared by different catalogs and dbs, reducing the duplication of functions. >>> >>> Our group based on flink's sql parser module implements create function >>> feature, stores the parsed function metadata and schema into mysql, and >>> also customizes the catalog, customizes sql-client to support custom >>> schemas and functions. Loaded, but the function is currently global, and is >>> not subdivided according to catalog and db. >>> >>> In addition, I very much hope to participate in the development of this >>> flip, I have been paying attention to the community, but found it is more >>> difficult to join. >>> thank you. >>> >>> Xuefu Z <[hidden email]> 于2019年9月17日周二 上午11:19写道: >>> >>>> Thanks to Tmo and Dawid for sharing thoughts. >>>> >>>> It seems to me that there is a general consensus on having temp functions >>>> that have no namespaces and overwrite built-in functions. (As a side note >>>> for comparability, the current user defined functions are all temporary and >>>> having no namespaces.) >>>> >>>> Nevertheless, I can also see the merit of having namespaced temp functions >>>> that can overwrite functions defined in a specific cat/db. However, this >>>> idea appears orthogonal to the former and can be added incrementally. >>>> >>>> How about we first implement non-namespaced temp functions now and leave >>>> the door open for namespaced ones for later releases as the requirement >>>> might become more crystal? This also helps shorten the debate and allow us >>>> to make some progress along this direction. >>>> >>>> As to Dawid's idea of having a dedicated cat/db to host the temporary temp >>>> functions that don't have namespaces, my only concern is the special >>>> treatment for a cat/db, which makes code less clean, as evident in treating >>>> the built-in catalog currently. >>>> >>>> Thanks, >>>> Xuefiu >>>> >>>> On Mon, Sep 16, 2019 at 5:07 PM Dawid Wysakowicz < >>>> [hidden email]> >>>> wrote: >>>> >>>>> Hi, >>>>> Another idea to consider on top of Timo's suggestion. How about we have a >>>>> special namespace (catalog + database) for built-in objects? This catalog >>>>> would be invisible for users as Xuefu was suggesting. >>>>> >>>>> Then users could still override built-in functions, if they fully qualify >>>>> object with the built-in namespace, but by default the common logic of >>>>> current dB & cat would be used. >>>>> >>>>> CREATE TEMPORARY FUNCTION func ... >>>>> registers temporary function in current cat & dB >>>>> >>>>> CREATE TEMPORARY FUNCTION cat.db.func ... >>>>> registers temporary function in cat db >>>>> >>>>> CREATE TEMPORARY FUNCTION system.system.func ... >>>>> Overrides built-in function with temporary function >>>>> >>>>> The built-in/system namespace would not be writable for permanent >>>> objects. >>>>> WDYT? >>>>> >>>>> This way I think we can have benefits of both solutions. >>>>> >>>>> Best, >>>>> Dawid >>>>> >>>>> >>>>> On Tue, 17 Sep 2019, 07:24 Timo Walther, <[hidden email]> wrote: >>>>> >>>>>> Hi Bowen, >>>>>> >>>>>> I understand the potential benefit of overriding certain built-in >>>>>> functions. I'm open to such a feature if many people agree. However, it >>>>>> would be great to still support overriding catalog functions with >>>>>> temporary functions in order to prototype a query even though a >>>>>> catalog/database might not be available currently or should not be >>>>>> modified yet. How about we support both cases? >>>>>> >>>>>> CREATE TEMPORARY FUNCTION abs >>>>>> -> creates/overrides a built-in function and never consideres current >>>>>> catalog and database; inconsistent with other DDL but acceptable for >>>>>> functions I guess. >>>>>> CREATE TEMPORARY FUNCTION cat.db.fun >>>>>> -> creates/overrides a catalog function >>>>>> >>>>>> Regarding "Flink don't have any other built-in objects (tables, views) >>>>>> except functions", this might change in the near future. Take >>>>>> https://issues.apache.org/jira/browse/FLINK-13900 as an example. >>>>>> >>>>>> Thanks, >>>>>> Timo >>>>>> >>>>>> On 14.09.19 01:40, Bowen Li wrote: >>>>>>> Hi Fabian, >>>>>>> >>>>>>> Yes, I agree 1-part/no-override is the least favorable thus I didn't >>>>>>> include that as a voting option, and the discussion is mainly between >>>>>>> 1-part/override builtin and 3-part/not override builtin. >>>>>>> >>>>>>> Re > However, it means that temp functions are differently treated >>>> than >>>>>>> other db objects. >>>>>>> IMO, the treatment difference results from the fact that functions >>>> are >>>>> a >>>>>>> bit different from other objects - Flink don't have any other >>>> built-in >>>>>>> objects (tables, views) except functions. >>>>>>> >>>>>>> Cheers, >>>>>>> Bowen >>>>>>> >>>>>> >>>> >>>> -- >>>> Xuefu Zhang >>>> >>>> "In Honey We Trust!" >>>> >> > |
Hi Aljoscha,
Thanks for the summary and these are great questions to be answered. The answer to your first question is clear: there is a general agreement to override built-in functions with temp functions. However, your second and third questions are sort of related, as a function reference can be either just function name (like "func") or in the form or "cat.db.func". When a reference is just function name, it can mean either a built-in function or a function defined in the current cat/db. If we support overriding a built-in function with a temp function, such overriding can also cover a function in the current cat/db. I think what Timo referred as "overriding a catalog function" means a temp function defined as "cat.db.func" overrides a catalog function "func" in cat/db even if cat/db is not current. To support this, temp function has to be tied to a cat/db. What's why I said above that the 2nd and 3rd questions are related. The problem with such support is the ambiguity when user defines a function w/o namespace, "CREATE TEMPORARY FUNCTION func ...". Here "func" can means a global temp function, or a temp function in current cat/db. If we can assume the former, this creates an inconsistency because "CREATE FUNCTION func" actually means a function in current cat/db. If we assume the latter, then there is no way for user to create a global temp function. Giving a special namespace for built-in functions may solve the ambiguity problem above, but it also introduces artificial catalog/database that needs special treatment and pollutes the cleanness of the code. I would rather introduce a syntax in DDL to solve the problem, like "CREATE [GLOBAL] TEMPORARY FUNCTION func". Thus, I'd like to summarize a few candidate proposals for voting purposes: 1. Support only global, temporary functions without namespace. Such temp functions overrides built-in functions and catalog functions in current cat/db. The resolution order is: temp functions -> built-in functions -> catalog functions. (Partially or fully qualified functions has no ambiguity!) 2. In addition to #1, support creating and referencing temporary functions associated with a cat/db with "GLOBAL" qualifier in DDL for global temp functions. The resolution order is: global temp functions -> built-in functions -> temp functions in current cat/db -> catalog function. (Resolution for partially or fully qualified function reference is: temp functions -> persistent functions.) 3. In addition to #1, support creating and referencing temporary functions associated with a cat/db with a special namespace for built-in functions and global temp functions. The resolution is the same as #2, except that the special namespace might be prefixed to a reference to a built-in function or global temp function. (In absence of the special namespace, the resolution order is the same as in #2.) My personal preference is #1, given the unknown use case and introduced complexity for #2 and #3. However, #2 is an acceptable alternative. Thus, my votes are: +1 for #1 +0 for #2 -1 for #3 Everyone, please cast your vote (in above format please!), or let me know if you have more questions or other candidates. Thanks, Xuefu On Wed, Sep 18, 2019 at 6:42 AM Aljoscha Krettek <[hidden email]> wrote: > Hi, > > I think this discussion and the one for FLIP-64 are very connected. To > resolve the differences, think we have to think about the basic principles > and find consensus there. The basic questions I see are: > > - Do we want to support overriding builtin functions? > - Do we want to support overriding catalog functions? > - And then later: should temporary functions be tied to a > catalog/database? > > I don’t have much to say about these, except that we should somewhat stick > to what the industry does. But I also understand that the industry is > already very divided on this. > > Best, > Aljoscha > > > On 18. Sep 2019, at 11:41, Jark Wu <[hidden email]> wrote: > > > > Hi, > > > > +1 to strive for reaching consensus on the remaining topics. We are > close to the truth. It will waste a lot of time if we resume the topic some > time later. > > > > +1 to “1-part/override” and I’m also fine with Timo’s “cat.db.fun” way > to override a catalog function. > > > > I’m not sure about “system.system.fun”, it introduces a nonexistent cat > & db? And we still need to do special treatment for the dedicated > system.system cat & db? > > > > Best, > > Jark > > > > > >> 在 2019年9月18日,06:54,Timo Walther <[hidden email]> 写道: > >> > >> Hi everyone, > >> > >> @Xuefu: I would like to avoid adding too many things incrementally. > Users should be able to override all catalog objects consistently according > to FLIP-64 (Support for Temporary Objects in Table module). If functions > are treated completely different, we need more code and special cases. From > an implementation perspective, this topic only affects the lookup logic > which is rather low implementation effort which is why I would like to > clarify the remaining items. As you said, we have a slight consenus on > overriding built-in functions; we should also strive for reaching consensus > on the remaining topics. > >> > >> @Dawid: I like your idea as it ensures registering catalog objects > consistent and the overriding of built-in functions more explicit. > >> > >> Thanks, > >> Timo > >> > >> > >> On 17.09.19 11:59, kai wang wrote: > >>> hi, everyone > >>> I think this flip is very meaningful. it supports functions that can be > >>> shared by different catalogs and dbs, reducing the duplication of > functions. > >>> > >>> Our group based on flink's sql parser module implements create function > >>> feature, stores the parsed function metadata and schema into mysql, and > >>> also customizes the catalog, customizes sql-client to support custom > >>> schemas and functions. Loaded, but the function is currently global, > and is > >>> not subdivided according to catalog and db. > >>> > >>> In addition, I very much hope to participate in the development of this > >>> flip, I have been paying attention to the community, but found it is > more > >>> difficult to join. > >>> thank you. > >>> > >>> Xuefu Z <[hidden email]> 于2019年9月17日周二 上午11:19写道: > >>> > >>>> Thanks to Tmo and Dawid for sharing thoughts. > >>>> > >>>> It seems to me that there is a general consensus on having temp > functions > >>>> that have no namespaces and overwrite built-in functions. (As a side > note > >>>> for comparability, the current user defined functions are all > temporary and > >>>> having no namespaces.) > >>>> > >>>> Nevertheless, I can also see the merit of having namespaced temp > functions > >>>> that can overwrite functions defined in a specific cat/db. However, > this > >>>> idea appears orthogonal to the former and can be added incrementally. > >>>> > >>>> How about we first implement non-namespaced temp functions now and > leave > >>>> the door open for namespaced ones for later releases as the > requirement > >>>> might become more crystal? This also helps shorten the debate and > allow us > >>>> to make some progress along this direction. > >>>> > >>>> As to Dawid's idea of having a dedicated cat/db to host the temporary > temp > >>>> functions that don't have namespaces, my only concern is the special > >>>> treatment for a cat/db, which makes code less clean, as evident in > treating > >>>> the built-in catalog currently. > >>>> > >>>> Thanks, > >>>> Xuefiu > >>>> > >>>> On Mon, Sep 16, 2019 at 5:07 PM Dawid Wysakowicz < > >>>> [hidden email]> > >>>> wrote: > >>>> > >>>>> Hi, > >>>>> Another idea to consider on top of Timo's suggestion. How about we > have a > >>>>> special namespace (catalog + database) for built-in objects? This > catalog > >>>>> would be invisible for users as Xuefu was suggesting. > >>>>> > >>>>> Then users could still override built-in functions, if they fully > qualify > >>>>> object with the built-in namespace, but by default the common logic > of > >>>>> current dB & cat would be used. > >>>>> > >>>>> CREATE TEMPORARY FUNCTION func ... > >>>>> registers temporary function in current cat & dB > >>>>> > >>>>> CREATE TEMPORARY FUNCTION cat.db.func ... > >>>>> registers temporary function in cat db > >>>>> > >>>>> CREATE TEMPORARY FUNCTION system.system.func ... > >>>>> Overrides built-in function with temporary function > >>>>> > >>>>> The built-in/system namespace would not be writable for permanent > >>>> objects. > >>>>> WDYT? > >>>>> > >>>>> This way I think we can have benefits of both solutions. > >>>>> > >>>>> Best, > >>>>> Dawid > >>>>> > >>>>> > >>>>> On Tue, 17 Sep 2019, 07:24 Timo Walther, <[hidden email]> wrote: > >>>>> > >>>>>> Hi Bowen, > >>>>>> > >>>>>> I understand the potential benefit of overriding certain built-in > >>>>>> functions. I'm open to such a feature if many people agree. > However, it > >>>>>> would be great to still support overriding catalog functions with > >>>>>> temporary functions in order to prototype a query even though a > >>>>>> catalog/database might not be available currently or should not be > >>>>>> modified yet. How about we support both cases? > >>>>>> > >>>>>> CREATE TEMPORARY FUNCTION abs > >>>>>> -> creates/overrides a built-in function and never consideres > current > >>>>>> catalog and database; inconsistent with other DDL but acceptable for > >>>>>> functions I guess. > >>>>>> CREATE TEMPORARY FUNCTION cat.db.fun > >>>>>> -> creates/overrides a catalog function > >>>>>> > >>>>>> Regarding "Flink don't have any other built-in objects (tables, > views) > >>>>>> except functions", this might change in the near future. Take > >>>>>> https://issues.apache.org/jira/browse/FLINK-13900 as an example. > >>>>>> > >>>>>> Thanks, > >>>>>> Timo > >>>>>> > >>>>>> On 14.09.19 01:40, Bowen Li wrote: > >>>>>>> Hi Fabian, > >>>>>>> > >>>>>>> Yes, I agree 1-part/no-override is the least favorable thus I > didn't > >>>>>>> include that as a voting option, and the discussion is mainly > between > >>>>>>> 1-part/override builtin and 3-part/not override builtin. > >>>>>>> > >>>>>>> Re > However, it means that temp functions are differently treated > >>>> than > >>>>>>> other db objects. > >>>>>>> IMO, the treatment difference results from the fact that functions > >>>> are > >>>>> a > >>>>>>> bit different from other objects - Flink don't have any other > >>>> built-in > >>>>>>> objects (tables, views) except functions. > >>>>>>> > >>>>>>> Cheers, > >>>>>>> Bowen > >>>>>>> > >>>>>> > >>>> > >>>> -- > >>>> Xuefu Zhang > >>>> > >>>> "In Honey We Trust!" > >>>> > >> > > > > -- Xuefu Zhang "In Honey We Trust!" |
Hi,
I think it makes sense to start voting at this point. Option 1: Only 1-part identifiers PROS: - allows shadowing built-in functions CONS: - incosistent with all the other objects, both permanent & temporary - does not allow shadowing catalog functions Option 2: Special keyword for built-in function I think this is quite similar to the special catalog/db. The thing I am strongly against in this proposal is the GLOBAL keyword. This keyword has a meaning in rdbms systems and means a function that is present for a lifetime of a session in which it was created, but available in all other sessions. Therefore I really don't want to use this keyword in a different context. Option 3: Special catalog/db PROS: - allows shadowing built-in functions - allows shadowing catalog functions - consistent with other objects CONS: - we introduce a special namespace for built-in functions I don't see a problem with introducing the special namespace. In the end it is very similar to the keyword approach. In this case the catalog/db combination would be the "keyword" Therefore my votes: Option 1: -0 Option 2: -1 (I might change to +0 if we can come up with a better keyword) Option 3: +1 Best, Dawid On Thu, 19 Sep 2019, 05:12 Xuefu Z, <[hidden email]> wrote: > Hi Aljoscha, > > Thanks for the summary and these are great questions to be answered. The > answer to your first question is clear: there is a general agreement to > override built-in functions with temp functions. > > However, your second and third questions are sort of related, as a function > reference can be either just function name (like "func") or in the form or > "cat.db.func". When a reference is just function name, it can mean either a > built-in function or a function defined in the current cat/db. If we > support overriding a built-in function with a temp function, such > overriding can also cover a function in the current cat/db. > > I think what Timo referred as "overriding a catalog function" means a temp > function defined as "cat.db.func" overrides a catalog function "func" in > cat/db even if cat/db is not current. To support this, temp function has to > be tied to a cat/db. What's why I said above that the 2nd and 3rd questions > are related. The problem with such support is the ambiguity when user > defines a function w/o namespace, "CREATE TEMPORARY FUNCTION func ...". > Here "func" can means a global temp function, or a temp function in current > cat/db. If we can assume the former, this creates an inconsistency because > "CREATE FUNCTION func" actually means a function in current cat/db. If we > assume the latter, then there is no way for user to create a global temp > function. > > Giving a special namespace for built-in functions may solve the ambiguity > problem above, but it also introduces artificial catalog/database that > needs special treatment and pollutes the cleanness of the code. I would > rather introduce a syntax in DDL to solve the problem, like "CREATE > [GLOBAL] TEMPORARY FUNCTION func". > > Thus, I'd like to summarize a few candidate proposals for voting purposes: > > 1. Support only global, temporary functions without namespace. Such temp > functions overrides built-in functions and catalog functions in current > cat/db. The resolution order is: temp functions -> built-in functions -> > catalog functions. (Partially or fully qualified functions has no > ambiguity!) > > 2. In addition to #1, support creating and referencing temporary functions > associated with a cat/db with "GLOBAL" qualifier in DDL for global temp > functions. The resolution order is: global temp functions -> built-in > functions -> temp functions in current cat/db -> catalog function. > (Resolution for partially or fully qualified function reference is: temp > functions -> persistent functions.) > > 3. In addition to #1, support creating and referencing temporary functions > associated with a cat/db with a special namespace for built-in functions > and global temp functions. The resolution is the same as #2, except that > the special namespace might be prefixed to a reference to a built-in > function or global temp function. (In absence of the special namespace, the > resolution order is the same as in #2.) > > My personal preference is #1, given the unknown use case and introduced > complexity for #2 and #3. However, #2 is an acceptable alternative. Thus, > my votes are: > > +1 for #1 > +0 for #2 > -1 for #3 > > Everyone, please cast your vote (in above format please!), or let me know > if you have more questions or other candidates. > > Thanks, > Xuefu > > > > > > > > On Wed, Sep 18, 2019 at 6:42 AM Aljoscha Krettek <[hidden email]> > wrote: > > > Hi, > > > > I think this discussion and the one for FLIP-64 are very connected. To > > resolve the differences, think we have to think about the basic > principles > > and find consensus there. The basic questions I see are: > > > > - Do we want to support overriding builtin functions? > > - Do we want to support overriding catalog functions? > > - And then later: should temporary functions be tied to a > > catalog/database? > > > > I don’t have much to say about these, except that we should somewhat > stick > > to what the industry does. But I also understand that the industry is > > already very divided on this. > > > > Best, > > Aljoscha > > > > > On 18. Sep 2019, at 11:41, Jark Wu <[hidden email]> wrote: > > > > > > Hi, > > > > > > +1 to strive for reaching consensus on the remaining topics. We are > > close to the truth. It will waste a lot of time if we resume the topic > some > > time later. > > > > > > +1 to “1-part/override” and I’m also fine with Timo’s “cat.db.fun” way > > to override a catalog function. > > > > > > I’m not sure about “system.system.fun”, it introduces a nonexistent cat > > & db? And we still need to do special treatment for the dedicated > > system.system cat & db? > > > > > > Best, > > > Jark > > > > > > > > >> 在 2019年9月18日,06:54,Timo Walther <[hidden email]> 写道: > > >> > > >> Hi everyone, > > >> > > >> @Xuefu: I would like to avoid adding too many things incrementally. > > Users should be able to override all catalog objects consistently > according > > to FLIP-64 (Support for Temporary Objects in Table module). If functions > > are treated completely different, we need more code and special cases. > From > > an implementation perspective, this topic only affects the lookup logic > > which is rather low implementation effort which is why I would like to > > clarify the remaining items. As you said, we have a slight consenus on > > overriding built-in functions; we should also strive for reaching > consensus > > on the remaining topics. > > >> > > >> @Dawid: I like your idea as it ensures registering catalog objects > > consistent and the overriding of built-in functions more explicit. > > >> > > >> Thanks, > > >> Timo > > >> > > >> > > >> On 17.09.19 11:59, kai wang wrote: > > >>> hi, everyone > > >>> I think this flip is very meaningful. it supports functions that can > be > > >>> shared by different catalogs and dbs, reducing the duplication of > > functions. > > >>> > > >>> Our group based on flink's sql parser module implements create > function > > >>> feature, stores the parsed function metadata and schema into mysql, > and > > >>> also customizes the catalog, customizes sql-client to support custom > > >>> schemas and functions. Loaded, but the function is currently global, > > and is > > >>> not subdivided according to catalog and db. > > >>> > > >>> In addition, I very much hope to participate in the development of > this > > >>> flip, I have been paying attention to the community, but found it is > > more > > >>> difficult to join. > > >>> thank you. > > >>> > > >>> Xuefu Z <[hidden email]> 于2019年9月17日周二 上午11:19写道: > > >>> > > >>>> Thanks to Tmo and Dawid for sharing thoughts. > > >>>> > > >>>> It seems to me that there is a general consensus on having temp > > functions > > >>>> that have no namespaces and overwrite built-in functions. (As a side > > note > > >>>> for comparability, the current user defined functions are all > > temporary and > > >>>> having no namespaces.) > > >>>> > > >>>> Nevertheless, I can also see the merit of having namespaced temp > > functions > > >>>> that can overwrite functions defined in a specific cat/db. However, > > this > > >>>> idea appears orthogonal to the former and can be added > incrementally. > > >>>> > > >>>> How about we first implement non-namespaced temp functions now and > > leave > > >>>> the door open for namespaced ones for later releases as the > > requirement > > >>>> might become more crystal? This also helps shorten the debate and > > allow us > > >>>> to make some progress along this direction. > > >>>> > > >>>> As to Dawid's idea of having a dedicated cat/db to host the > temporary > > temp > > >>>> functions that don't have namespaces, my only concern is the special > > >>>> treatment for a cat/db, which makes code less clean, as evident in > > treating > > >>>> the built-in catalog currently. > > >>>> > > >>>> Thanks, > > >>>> Xuefiu > > >>>> > > >>>> On Mon, Sep 16, 2019 at 5:07 PM Dawid Wysakowicz < > > >>>> [hidden email]> > > >>>> wrote: > > >>>> > > >>>>> Hi, > > >>>>> Another idea to consider on top of Timo's suggestion. How about we > > have a > > >>>>> special namespace (catalog + database) for built-in objects? This > > catalog > > >>>>> would be invisible for users as Xuefu was suggesting. > > >>>>> > > >>>>> Then users could still override built-in functions, if they fully > > qualify > > >>>>> object with the built-in namespace, but by default the common logic > > of > > >>>>> current dB & cat would be used. > > >>>>> > > >>>>> CREATE TEMPORARY FUNCTION func ... > > >>>>> registers temporary function in current cat & dB > > >>>>> > > >>>>> CREATE TEMPORARY FUNCTION cat.db.func ... > > >>>>> registers temporary function in cat db > > >>>>> > > >>>>> CREATE TEMPORARY FUNCTION system.system.func ... > > >>>>> Overrides built-in function with temporary function > > >>>>> > > >>>>> The built-in/system namespace would not be writable for permanent > > >>>> objects. > > >>>>> WDYT? > > >>>>> > > >>>>> This way I think we can have benefits of both solutions. > > >>>>> > > >>>>> Best, > > >>>>> Dawid > > >>>>> > > >>>>> > > >>>>> On Tue, 17 Sep 2019, 07:24 Timo Walther, <[hidden email]> > wrote: > > >>>>> > > >>>>>> Hi Bowen, > > >>>>>> > > >>>>>> I understand the potential benefit of overriding certain built-in > > >>>>>> functions. I'm open to such a feature if many people agree. > > However, it > > >>>>>> would be great to still support overriding catalog functions with > > >>>>>> temporary functions in order to prototype a query even though a > > >>>>>> catalog/database might not be available currently or should not be > > >>>>>> modified yet. How about we support both cases? > > >>>>>> > > >>>>>> CREATE TEMPORARY FUNCTION abs > > >>>>>> -> creates/overrides a built-in function and never consideres > > current > > >>>>>> catalog and database; inconsistent with other DDL but acceptable > for > > >>>>>> functions I guess. > > >>>>>> CREATE TEMPORARY FUNCTION cat.db.fun > > >>>>>> -> creates/overrides a catalog function > > >>>>>> > > >>>>>> Regarding "Flink don't have any other built-in objects (tables, > > views) > > >>>>>> except functions", this might change in the near future. Take > > >>>>>> https://issues.apache.org/jira/browse/FLINK-13900 as an example. > > >>>>>> > > >>>>>> Thanks, > > >>>>>> Timo > > >>>>>> > > >>>>>> On 14.09.19 01:40, Bowen Li wrote: > > >>>>>>> Hi Fabian, > > >>>>>>> > > >>>>>>> Yes, I agree 1-part/no-override is the least favorable thus I > > didn't > > >>>>>>> include that as a voting option, and the discussion is mainly > > between > > >>>>>>> 1-part/override builtin and 3-part/not override builtin. > > >>>>>>> > > >>>>>>> Re > However, it means that temp functions are differently > treated > > >>>> than > > >>>>>>> other db objects. > > >>>>>>> IMO, the treatment difference results from the fact that > functions > > >>>> are > > >>>>> a > > >>>>>>> bit different from other objects - Flink don't have any other > > >>>> built-in > > >>>>>>> objects (tables, views) except functions. > > >>>>>>> > > >>>>>>> Cheers, > > >>>>>>> Bowen > > >>>>>>> > > >>>>>> > > >>>> > > >>>> -- > > >>>> Xuefu Zhang > > >>>> > > >>>> "In Honey We Trust!" > > >>>> > > >> > > > > > > > > > -- > Xuefu Zhang > > "In Honey We Trust!" > |
Last additional comment on Option 2. The reason why I prefer option 3 is
that in option 3 all objects internally are identified with 3 parts. This makes it easier to handle at different locations e.g. while persisting views, as all objects have uniform representation. On Thu, 19 Sep 2019, 07:31 Dawid Wysakowicz, <[hidden email]> wrote: > Hi, > I think it makes sense to start voting at this point. > > Option 1: Only 1-part identifiers > PROS: > - allows shadowing built-in functions > CONS: > - incosistent with all the other objects, both permanent & temporary > - does not allow shadowing catalog functions > > Option 2: Special keyword for built-in function > I think this is quite similar to the special catalog/db. The thing I am > strongly against in this proposal is the GLOBAL keyword. This keyword has a > meaning in rdbms systems and means a function that is present for a > lifetime of a session in which it was created, but available in all other > sessions. Therefore I really don't want to use this keyword in a different > context. > > Option 3: Special catalog/db > > PROS: > - allows shadowing built-in functions > - allows shadowing catalog functions > - consistent with other objects > CONS: > - we introduce a special namespace for built-in functions > > I don't see a problem with introducing the special namespace. In the end > it is very similar to the keyword approach. In this case the catalog/db > combination would be the "keyword" > > Therefore my votes: > Option 1: -0 > Option 2: -1 (I might change to +0 if we can come up with a better keyword) > Option 3: +1 > > Best, > Dawid > > > On Thu, 19 Sep 2019, 05:12 Xuefu Z, <[hidden email]> wrote: > >> Hi Aljoscha, >> >> Thanks for the summary and these are great questions to be answered. The >> answer to your first question is clear: there is a general agreement to >> override built-in functions with temp functions. >> >> However, your second and third questions are sort of related, as a >> function >> reference can be either just function name (like "func") or in the form or >> "cat.db.func". When a reference is just function name, it can mean either >> a >> built-in function or a function defined in the current cat/db. If we >> support overriding a built-in function with a temp function, such >> overriding can also cover a function in the current cat/db. >> >> I think what Timo referred as "overriding a catalog function" means a temp >> function defined as "cat.db.func" overrides a catalog function "func" in >> cat/db even if cat/db is not current. To support this, temp function has >> to >> be tied to a cat/db. What's why I said above that the 2nd and 3rd >> questions >> are related. The problem with such support is the ambiguity when user >> defines a function w/o namespace, "CREATE TEMPORARY FUNCTION func ...". >> Here "func" can means a global temp function, or a temp function in >> current >> cat/db. If we can assume the former, this creates an inconsistency because >> "CREATE FUNCTION func" actually means a function in current cat/db. If we >> assume the latter, then there is no way for user to create a global temp >> function. >> >> Giving a special namespace for built-in functions may solve the ambiguity >> problem above, but it also introduces artificial catalog/database that >> needs special treatment and pollutes the cleanness of the code. I would >> rather introduce a syntax in DDL to solve the problem, like "CREATE >> [GLOBAL] TEMPORARY FUNCTION func". >> >> Thus, I'd like to summarize a few candidate proposals for voting purposes: >> >> 1. Support only global, temporary functions without namespace. Such temp >> functions overrides built-in functions and catalog functions in current >> cat/db. The resolution order is: temp functions -> built-in functions -> >> catalog functions. (Partially or fully qualified functions has no >> ambiguity!) >> >> 2. In addition to #1, support creating and referencing temporary functions >> associated with a cat/db with "GLOBAL" qualifier in DDL for global temp >> functions. The resolution order is: global temp functions -> built-in >> functions -> temp functions in current cat/db -> catalog function. >> (Resolution for partially or fully qualified function reference is: temp >> functions -> persistent functions.) >> >> 3. In addition to #1, support creating and referencing temporary functions >> associated with a cat/db with a special namespace for built-in functions >> and global temp functions. The resolution is the same as #2, except that >> the special namespace might be prefixed to a reference to a built-in >> function or global temp function. (In absence of the special namespace, >> the >> resolution order is the same as in #2.) >> >> My personal preference is #1, given the unknown use case and introduced >> complexity for #2 and #3. However, #2 is an acceptable alternative. Thus, >> my votes are: >> >> +1 for #1 >> +0 for #2 >> -1 for #3 >> >> Everyone, please cast your vote (in above format please!), or let me know >> if you have more questions or other candidates. >> >> Thanks, >> Xuefu >> >> >> >> >> >> >> >> On Wed, Sep 18, 2019 at 6:42 AM Aljoscha Krettek <[hidden email]> >> wrote: >> >> > Hi, >> > >> > I think this discussion and the one for FLIP-64 are very connected. To >> > resolve the differences, think we have to think about the basic >> principles >> > and find consensus there. The basic questions I see are: >> > >> > - Do we want to support overriding builtin functions? >> > - Do we want to support overriding catalog functions? >> > - And then later: should temporary functions be tied to a >> > catalog/database? >> > >> > I don’t have much to say about these, except that we should somewhat >> stick >> > to what the industry does. But I also understand that the industry is >> > already very divided on this. >> > >> > Best, >> > Aljoscha >> > >> > > On 18. Sep 2019, at 11:41, Jark Wu <[hidden email]> wrote: >> > > >> > > Hi, >> > > >> > > +1 to strive for reaching consensus on the remaining topics. We are >> > close to the truth. It will waste a lot of time if we resume the topic >> some >> > time later. >> > > >> > > +1 to “1-part/override” and I’m also fine with Timo’s “cat.db.fun” way >> > to override a catalog function. >> > > >> > > I’m not sure about “system.system.fun”, it introduces a nonexistent >> cat >> > & db? And we still need to do special treatment for the dedicated >> > system.system cat & db? >> > > >> > > Best, >> > > Jark >> > > >> > > >> > >> 在 2019年9月18日,06:54,Timo Walther <[hidden email]> 写道: >> > >> >> > >> Hi everyone, >> > >> >> > >> @Xuefu: I would like to avoid adding too many things incrementally. >> > Users should be able to override all catalog objects consistently >> according >> > to FLIP-64 (Support for Temporary Objects in Table module). If functions >> > are treated completely different, we need more code and special cases. >> From >> > an implementation perspective, this topic only affects the lookup logic >> > which is rather low implementation effort which is why I would like to >> > clarify the remaining items. As you said, we have a slight consenus on >> > overriding built-in functions; we should also strive for reaching >> consensus >> > on the remaining topics. >> > >> >> > >> @Dawid: I like your idea as it ensures registering catalog objects >> > consistent and the overriding of built-in functions more explicit. >> > >> >> > >> Thanks, >> > >> Timo >> > >> >> > >> >> > >> On 17.09.19 11:59, kai wang wrote: >> > >>> hi, everyone >> > >>> I think this flip is very meaningful. it supports functions that >> can be >> > >>> shared by different catalogs and dbs, reducing the duplication of >> > functions. >> > >>> >> > >>> Our group based on flink's sql parser module implements create >> function >> > >>> feature, stores the parsed function metadata and schema into mysql, >> and >> > >>> also customizes the catalog, customizes sql-client to support custom >> > >>> schemas and functions. Loaded, but the function is currently global, >> > and is >> > >>> not subdivided according to catalog and db. >> > >>> >> > >>> In addition, I very much hope to participate in the development of >> this >> > >>> flip, I have been paying attention to the community, but found it is >> > more >> > >>> difficult to join. >> > >>> thank you. >> > >>> >> > >>> Xuefu Z <[hidden email]> 于2019年9月17日周二 上午11:19写道: >> > >>> >> > >>>> Thanks to Tmo and Dawid for sharing thoughts. >> > >>>> >> > >>>> It seems to me that there is a general consensus on having temp >> > functions >> > >>>> that have no namespaces and overwrite built-in functions. (As a >> side >> > note >> > >>>> for comparability, the current user defined functions are all >> > temporary and >> > >>>> having no namespaces.) >> > >>>> >> > >>>> Nevertheless, I can also see the merit of having namespaced temp >> > functions >> > >>>> that can overwrite functions defined in a specific cat/db. However, >> > this >> > >>>> idea appears orthogonal to the former and can be added >> incrementally. >> > >>>> >> > >>>> How about we first implement non-namespaced temp functions now and >> > leave >> > >>>> the door open for namespaced ones for later releases as the >> > requirement >> > >>>> might become more crystal? This also helps shorten the debate and >> > allow us >> > >>>> to make some progress along this direction. >> > >>>> >> > >>>> As to Dawid's idea of having a dedicated cat/db to host the >> temporary >> > temp >> > >>>> functions that don't have namespaces, my only concern is the >> special >> > >>>> treatment for a cat/db, which makes code less clean, as evident in >> > treating >> > >>>> the built-in catalog currently. >> > >>>> >> > >>>> Thanks, >> > >>>> Xuefiu >> > >>>> >> > >>>> On Mon, Sep 16, 2019 at 5:07 PM Dawid Wysakowicz < >> > >>>> [hidden email]> >> > >>>> wrote: >> > >>>> >> > >>>>> Hi, >> > >>>>> Another idea to consider on top of Timo's suggestion. How about we >> > have a >> > >>>>> special namespace (catalog + database) for built-in objects? This >> > catalog >> > >>>>> would be invisible for users as Xuefu was suggesting. >> > >>>>> >> > >>>>> Then users could still override built-in functions, if they fully >> > qualify >> > >>>>> object with the built-in namespace, but by default the common >> logic >> > of >> > >>>>> current dB & cat would be used. >> > >>>>> >> > >>>>> CREATE TEMPORARY FUNCTION func ... >> > >>>>> registers temporary function in current cat & dB >> > >>>>> >> > >>>>> CREATE TEMPORARY FUNCTION cat.db.func ... >> > >>>>> registers temporary function in cat db >> > >>>>> >> > >>>>> CREATE TEMPORARY FUNCTION system.system.func ... >> > >>>>> Overrides built-in function with temporary function >> > >>>>> >> > >>>>> The built-in/system namespace would not be writable for permanent >> > >>>> objects. >> > >>>>> WDYT? >> > >>>>> >> > >>>>> This way I think we can have benefits of both solutions. >> > >>>>> >> > >>>>> Best, >> > >>>>> Dawid >> > >>>>> >> > >>>>> >> > >>>>> On Tue, 17 Sep 2019, 07:24 Timo Walther, <[hidden email]> >> wrote: >> > >>>>> >> > >>>>>> Hi Bowen, >> > >>>>>> >> > >>>>>> I understand the potential benefit of overriding certain built-in >> > >>>>>> functions. I'm open to such a feature if many people agree. >> > However, it >> > >>>>>> would be great to still support overriding catalog functions with >> > >>>>>> temporary functions in order to prototype a query even though a >> > >>>>>> catalog/database might not be available currently or should not >> be >> > >>>>>> modified yet. How about we support both cases? >> > >>>>>> >> > >>>>>> CREATE TEMPORARY FUNCTION abs >> > >>>>>> -> creates/overrides a built-in function and never consideres >> > current >> > >>>>>> catalog and database; inconsistent with other DDL but acceptable >> for >> > >>>>>> functions I guess. >> > >>>>>> CREATE TEMPORARY FUNCTION cat.db.fun >> > >>>>>> -> creates/overrides a catalog function >> > >>>>>> >> > >>>>>> Regarding "Flink don't have any other built-in objects (tables, >> > views) >> > >>>>>> except functions", this might change in the near future. Take >> > >>>>>> https://issues.apache.org/jira/browse/FLINK-13900 as an example. >> > >>>>>> >> > >>>>>> Thanks, >> > >>>>>> Timo >> > >>>>>> >> > >>>>>> On 14.09.19 01:40, Bowen Li wrote: >> > >>>>>>> Hi Fabian, >> > >>>>>>> >> > >>>>>>> Yes, I agree 1-part/no-override is the least favorable thus I >> > didn't >> > >>>>>>> include that as a voting option, and the discussion is mainly >> > between >> > >>>>>>> 1-part/override builtin and 3-part/not override builtin. >> > >>>>>>> >> > >>>>>>> Re > However, it means that temp functions are differently >> treated >> > >>>> than >> > >>>>>>> other db objects. >> > >>>>>>> IMO, the treatment difference results from the fact that >> functions >> > >>>> are >> > >>>>> a >> > >>>>>>> bit different from other objects - Flink don't have any other >> > >>>> built-in >> > >>>>>>> objects (tables, views) except functions. >> > >>>>>>> >> > >>>>>>> Cheers, >> > >>>>>>> Bowen >> > >>>>>>> >> > >>>>>> >> > >>>> >> > >>>> -- >> > >>>> Xuefu Zhang >> > >>>> >> > >>>> "In Honey We Trust!" >> > >>>> >> > >> >> > > >> > >> > >> >> -- >> Xuefu Zhang >> >> "In Honey We Trust!" >> > |
Hi,
For #2, as Xuefu and I discussed offline, the key point is to introduce a keyword to SQL DDL to distinguish temp function that override built-in functions v.s. temp functions that override catalog functions. It can be something else than "GLOBAL", like "BUILTIN" (e.g. "CREATE BUILTIN TEMP FUNCTION") given there's no SQL standard for temp functions. This option is more about whether we want to have such a new keyword in SQL for the proposed functionality, and I personally think it's acceptable. For #3, besides the drawbacks mentioned by Xuefu and Jark, another con is: we already have a special catalog and db as "default_catalog" and "default_database", though they are not used very correctly at the moment (another story...), they are at least physically present. Introducing yet another virtual special catalog/db as something like "system"."system" that never physically exist would further confuse users, and hurt understandability and usability. Thus my vote is: +1 for #1 0 for #2 -1 for #3 Thanks, Bowen On Wed, Sep 18, 2019 at 4:55 PM Dawid Wysakowicz <[hidden email]> wrote: > Last additional comment on Option 2. The reason why I prefer option 3 is > that in option 3 all objects internally are identified with 3 parts. This > makes it easier to handle at different locations e.g. while persisting > views, as all objects have uniform representation. > > On Thu, 19 Sep 2019, 07:31 Dawid Wysakowicz, <[hidden email]> > wrote: > > > Hi, > > I think it makes sense to start voting at this point. > > > > Option 1: Only 1-part identifiers > > PROS: > > - allows shadowing built-in functions > > CONS: > > - incosistent with all the other objects, both permanent & temporary > > - does not allow shadowing catalog functions > > > > Option 2: Special keyword for built-in function > > I think this is quite similar to the special catalog/db. The thing I am > > strongly against in this proposal is the GLOBAL keyword. This keyword > has a > > meaning in rdbms systems and means a function that is present for a > > lifetime of a session in which it was created, but available in all other > > sessions. Therefore I really don't want to use this keyword in a > different > > context. > > > > Option 3: Special catalog/db > > > > PROS: > > - allows shadowing built-in functions > > - allows shadowing catalog functions > > - consistent with other objects > > CONS: > > - we introduce a special namespace for built-in functions > > > > I don't see a problem with introducing the special namespace. In the end > > it is very similar to the keyword approach. In this case the catalog/db > > combination would be the "keyword" > > > > Therefore my votes: > > Option 1: -0 > > Option 2: -1 (I might change to +0 if we can come up with a better > keyword) > > Option 3: +1 > > > > Best, > > Dawid > > > > > > On Thu, 19 Sep 2019, 05:12 Xuefu Z, <[hidden email]> wrote: > > > >> Hi Aljoscha, > >> > >> Thanks for the summary and these are great questions to be answered. The > >> answer to your first question is clear: there is a general agreement to > >> override built-in functions with temp functions. > >> > >> However, your second and third questions are sort of related, as a > >> function > >> reference can be either just function name (like "func") or in the form > or > >> "cat.db.func". When a reference is just function name, it can mean > either > >> a > >> built-in function or a function defined in the current cat/db. If we > >> support overriding a built-in function with a temp function, such > >> overriding can also cover a function in the current cat/db. > >> > >> I think what Timo referred as "overriding a catalog function" means a > temp > >> function defined as "cat.db.func" overrides a catalog function "func" in > >> cat/db even if cat/db is not current. To support this, temp function has > >> to > >> be tied to a cat/db. What's why I said above that the 2nd and 3rd > >> questions > >> are related. The problem with such support is the ambiguity when user > >> defines a function w/o namespace, "CREATE TEMPORARY FUNCTION func ...". > >> Here "func" can means a global temp function, or a temp function in > >> current > >> cat/db. If we can assume the former, this creates an inconsistency > because > >> "CREATE FUNCTION func" actually means a function in current cat/db. If > we > >> assume the latter, then there is no way for user to create a global temp > >> function. > >> > >> Giving a special namespace for built-in functions may solve the > ambiguity > >> problem above, but it also introduces artificial catalog/database that > >> needs special treatment and pollutes the cleanness of the code. I would > >> rather introduce a syntax in DDL to solve the problem, like "CREATE > >> [GLOBAL] TEMPORARY FUNCTION func". > >> > >> Thus, I'd like to summarize a few candidate proposals for voting > purposes: > >> > >> 1. Support only global, temporary functions without namespace. Such temp > >> functions overrides built-in functions and catalog functions in current > >> cat/db. The resolution order is: temp functions -> built-in functions -> > >> catalog functions. (Partially or fully qualified functions has no > >> ambiguity!) > >> > >> 2. In addition to #1, support creating and referencing temporary > functions > >> associated with a cat/db with "GLOBAL" qualifier in DDL for global temp > >> functions. The resolution order is: global temp functions -> built-in > >> functions -> temp functions in current cat/db -> catalog function. > >> (Resolution for partially or fully qualified function reference is: temp > >> functions -> persistent functions.) > >> > >> 3. In addition to #1, support creating and referencing temporary > functions > >> associated with a cat/db with a special namespace for built-in functions > >> and global temp functions. The resolution is the same as #2, except that > >> the special namespace might be prefixed to a reference to a built-in > >> function or global temp function. (In absence of the special namespace, > >> the > >> resolution order is the same as in #2.) > >> > >> My personal preference is #1, given the unknown use case and introduced > >> complexity for #2 and #3. However, #2 is an acceptable alternative. > Thus, > >> my votes are: > >> > >> +1 for #1 > >> +0 for #2 > >> -1 for #3 > >> > >> Everyone, please cast your vote (in above format please!), or let me > know > >> if you have more questions or other candidates. > >> > >> Thanks, > >> Xuefu > >> > >> > >> > >> > >> > >> > >> > >> On Wed, Sep 18, 2019 at 6:42 AM Aljoscha Krettek <[hidden email]> > >> wrote: > >> > >> > Hi, > >> > > >> > I think this discussion and the one for FLIP-64 are very connected. To > >> > resolve the differences, think we have to think about the basic > >> principles > >> > and find consensus there. The basic questions I see are: > >> > > >> > - Do we want to support overriding builtin functions? > >> > - Do we want to support overriding catalog functions? > >> > - And then later: should temporary functions be tied to a > >> > catalog/database? > >> > > >> > I don’t have much to say about these, except that we should somewhat > >> stick > >> > to what the industry does. But I also understand that the industry is > >> > already very divided on this. > >> > > >> > Best, > >> > Aljoscha > >> > > >> > > On 18. Sep 2019, at 11:41, Jark Wu <[hidden email]> wrote: > >> > > > >> > > Hi, > >> > > > >> > > +1 to strive for reaching consensus on the remaining topics. We are > >> > close to the truth. It will waste a lot of time if we resume the topic > >> some > >> > time later. > >> > > > >> > > +1 to “1-part/override” and I’m also fine with Timo’s “cat.db.fun” > way > >> > to override a catalog function. > >> > > > >> > > I’m not sure about “system.system.fun”, it introduces a nonexistent > >> cat > >> > & db? And we still need to do special treatment for the dedicated > >> > system.system cat & db? > >> > > > >> > > Best, > >> > > Jark > >> > > > >> > > > >> > >> 在 2019年9月18日,06:54,Timo Walther <[hidden email]> 写道: > >> > >> > >> > >> Hi everyone, > >> > >> > >> > >> @Xuefu: I would like to avoid adding too many things incrementally. > >> > Users should be able to override all catalog objects consistently > >> according > >> > to FLIP-64 (Support for Temporary Objects in Table module). If > functions > >> > are treated completely different, we need more code and special cases. > >> From > >> > an implementation perspective, this topic only affects the lookup > logic > >> > which is rather low implementation effort which is why I would like to > >> > clarify the remaining items. As you said, we have a slight consenus on > >> > overriding built-in functions; we should also strive for reaching > >> consensus > >> > on the remaining topics. > >> > >> > >> > >> @Dawid: I like your idea as it ensures registering catalog objects > >> > consistent and the overriding of built-in functions more explicit. > >> > >> > >> > >> Thanks, > >> > >> Timo > >> > >> > >> > >> > >> > >> On 17.09.19 11:59, kai wang wrote: > >> > >>> hi, everyone > >> > >>> I think this flip is very meaningful. it supports functions that > >> can be > >> > >>> shared by different catalogs and dbs, reducing the duplication of > >> > functions. > >> > >>> > >> > >>> Our group based on flink's sql parser module implements create > >> function > >> > >>> feature, stores the parsed function metadata and schema into > mysql, > >> and > >> > >>> also customizes the catalog, customizes sql-client to support > custom > >> > >>> schemas and functions. Loaded, but the function is currently > global, > >> > and is > >> > >>> not subdivided according to catalog and db. > >> > >>> > >> > >>> In addition, I very much hope to participate in the development of > >> this > >> > >>> flip, I have been paying attention to the community, but found it > is > >> > more > >> > >>> difficult to join. > >> > >>> thank you. > >> > >>> > >> > >>> Xuefu Z <[hidden email]> 于2019年9月17日周二 上午11:19写道: > >> > >>> > >> > >>>> Thanks to Tmo and Dawid for sharing thoughts. > >> > >>>> > >> > >>>> It seems to me that there is a general consensus on having temp > >> > functions > >> > >>>> that have no namespaces and overwrite built-in functions. (As a > >> side > >> > note > >> > >>>> for comparability, the current user defined functions are all > >> > temporary and > >> > >>>> having no namespaces.) > >> > >>>> > >> > >>>> Nevertheless, I can also see the merit of having namespaced temp > >> > functions > >> > >>>> that can overwrite functions defined in a specific cat/db. > However, > >> > this > >> > >>>> idea appears orthogonal to the former and can be added > >> incrementally. > >> > >>>> > >> > >>>> How about we first implement non-namespaced temp functions now > and > >> > leave > >> > >>>> the door open for namespaced ones for later releases as the > >> > requirement > >> > >>>> might become more crystal? This also helps shorten the debate and > >> > allow us > >> > >>>> to make some progress along this direction. > >> > >>>> > >> > >>>> As to Dawid's idea of having a dedicated cat/db to host the > >> temporary > >> > temp > >> > >>>> functions that don't have namespaces, my only concern is the > >> special > >> > >>>> treatment for a cat/db, which makes code less clean, as evident > in > >> > treating > >> > >>>> the built-in catalog currently. > >> > >>>> > >> > >>>> Thanks, > >> > >>>> Xuefiu > >> > >>>> > >> > >>>> On Mon, Sep 16, 2019 at 5:07 PM Dawid Wysakowicz < > >> > >>>> [hidden email]> > >> > >>>> wrote: > >> > >>>> > >> > >>>>> Hi, > >> > >>>>> Another idea to consider on top of Timo's suggestion. How about > we > >> > have a > >> > >>>>> special namespace (catalog + database) for built-in objects? > This > >> > catalog > >> > >>>>> would be invisible for users as Xuefu was suggesting. > >> > >>>>> > >> > >>>>> Then users could still override built-in functions, if they > fully > >> > qualify > >> > >>>>> object with the built-in namespace, but by default the common > >> logic > >> > of > >> > >>>>> current dB & cat would be used. > >> > >>>>> > >> > >>>>> CREATE TEMPORARY FUNCTION func ... > >> > >>>>> registers temporary function in current cat & dB > >> > >>>>> > >> > >>>>> CREATE TEMPORARY FUNCTION cat.db.func ... > >> > >>>>> registers temporary function in cat db > >> > >>>>> > >> > >>>>> CREATE TEMPORARY FUNCTION system.system.func ... > >> > >>>>> Overrides built-in function with temporary function > >> > >>>>> > >> > >>>>> The built-in/system namespace would not be writable for > permanent > >> > >>>> objects. > >> > >>>>> WDYT? > >> > >>>>> > >> > >>>>> This way I think we can have benefits of both solutions. > >> > >>>>> > >> > >>>>> Best, > >> > >>>>> Dawid > >> > >>>>> > >> > >>>>> > >> > >>>>> On Tue, 17 Sep 2019, 07:24 Timo Walther, <[hidden email]> > >> wrote: > >> > >>>>> > >> > >>>>>> Hi Bowen, > >> > >>>>>> > >> > >>>>>> I understand the potential benefit of overriding certain > built-in > >> > >>>>>> functions. I'm open to such a feature if many people agree. > >> > However, it > >> > >>>>>> would be great to still support overriding catalog functions > with > >> > >>>>>> temporary functions in order to prototype a query even though a > >> > >>>>>> catalog/database might not be available currently or should not > >> be > >> > >>>>>> modified yet. How about we support both cases? > >> > >>>>>> > >> > >>>>>> CREATE TEMPORARY FUNCTION abs > >> > >>>>>> -> creates/overrides a built-in function and never consideres > >> > current > >> > >>>>>> catalog and database; inconsistent with other DDL but > acceptable > >> for > >> > >>>>>> functions I guess. > >> > >>>>>> CREATE TEMPORARY FUNCTION cat.db.fun > >> > >>>>>> -> creates/overrides a catalog function > >> > >>>>>> > >> > >>>>>> Regarding "Flink don't have any other built-in objects (tables, > >> > views) > >> > >>>>>> except functions", this might change in the near future. Take > >> > >>>>>> https://issues.apache.org/jira/browse/FLINK-13900 as an > example. > >> > >>>>>> > >> > >>>>>> Thanks, > >> > >>>>>> Timo > >> > >>>>>> > >> > >>>>>> On 14.09.19 01:40, Bowen Li wrote: > >> > >>>>>>> Hi Fabian, > >> > >>>>>>> > >> > >>>>>>> Yes, I agree 1-part/no-override is the least favorable thus I > >> > didn't > >> > >>>>>>> include that as a voting option, and the discussion is mainly > >> > between > >> > >>>>>>> 1-part/override builtin and 3-part/not override builtin. > >> > >>>>>>> > >> > >>>>>>> Re > However, it means that temp functions are differently > >> treated > >> > >>>> than > >> > >>>>>>> other db objects. > >> > >>>>>>> IMO, the treatment difference results from the fact that > >> functions > >> > >>>> are > >> > >>>>> a > >> > >>>>>>> bit different from other objects - Flink don't have any other > >> > >>>> built-in > >> > >>>>>>> objects (tables, views) except functions. > >> > >>>>>>> > >> > >>>>>>> Cheers, > >> > >>>>>>> Bowen > >> > >>>>>>> > >> > >>>>>> > >> > >>>> > >> > >>>> -- > >> > >>>> Xuefu Zhang > >> > >>>> > >> > >>>> "In Honey We Trust!" > >> > >>>> > >> > >> > >> > > > >> > > >> > > >> > >> -- > >> Xuefu Zhang > >> > >> "In Honey We Trust!" > >> > > > |
In reply to this post by Dawid Wysakowicz
Hi Dawid,
"GLOBAL" is a temporary keyword that was given to the approach. It can be changed to something else for better. The difference between this and the #3 approach is that we only need the keyword for this create DDL. For other places (such as function referencing), no keyword or special namespace is needed. Thanks, Xuefu On Wed, Sep 18, 2019 at 4:32 PM Dawid Wysakowicz <[hidden email]> wrote: > Hi, > I think it makes sense to start voting at this point. > > Option 1: Only 1-part identifiers > PROS: > - allows shadowing built-in functions > CONS: > - incosistent with all the other objects, both permanent & temporary > - does not allow shadowing catalog functions > > Option 2: Special keyword for built-in function > I think this is quite similar to the special catalog/db. The thing I am > strongly against in this proposal is the GLOBAL keyword. This keyword has a > meaning in rdbms systems and means a function that is present for a > lifetime of a session in which it was created, but available in all other > sessions. Therefore I really don't want to use this keyword in a different > context. > > Option 3: Special catalog/db > > PROS: > - allows shadowing built-in functions > - allows shadowing catalog functions > - consistent with other objects > CONS: > - we introduce a special namespace for built-in functions > > I don't see a problem with introducing the special namespace. In the end it > is very similar to the keyword approach. In this case the catalog/db > combination would be the "keyword" > > Therefore my votes: > Option 1: -0 > Option 2: -1 (I might change to +0 if we can come up with a better keyword) > Option 3: +1 > > Best, > Dawid > > > On Thu, 19 Sep 2019, 05:12 Xuefu Z, <[hidden email]> wrote: > > > Hi Aljoscha, > > > > Thanks for the summary and these are great questions to be answered. The > > answer to your first question is clear: there is a general agreement to > > override built-in functions with temp functions. > > > > However, your second and third questions are sort of related, as a > function > > reference can be either just function name (like "func") or in the form > or > > "cat.db.func". When a reference is just function name, it can mean > either a > > built-in function or a function defined in the current cat/db. If we > > support overriding a built-in function with a temp function, such > > overriding can also cover a function in the current cat/db. > > > > I think what Timo referred as "overriding a catalog function" means a > temp > > function defined as "cat.db.func" overrides a catalog function "func" in > > cat/db even if cat/db is not current. To support this, temp function has > to > > be tied to a cat/db. What's why I said above that the 2nd and 3rd > questions > > are related. The problem with such support is the ambiguity when user > > defines a function w/o namespace, "CREATE TEMPORARY FUNCTION func ...". > > Here "func" can means a global temp function, or a temp function in > current > > cat/db. If we can assume the former, this creates an inconsistency > because > > "CREATE FUNCTION func" actually means a function in current cat/db. If we > > assume the latter, then there is no way for user to create a global temp > > function. > > > > Giving a special namespace for built-in functions may solve the ambiguity > > problem above, but it also introduces artificial catalog/database that > > needs special treatment and pollutes the cleanness of the code. I would > > rather introduce a syntax in DDL to solve the problem, like "CREATE > > [GLOBAL] TEMPORARY FUNCTION func". > > > > Thus, I'd like to summarize a few candidate proposals for voting > purposes: > > > > 1. Support only global, temporary functions without namespace. Such temp > > functions overrides built-in functions and catalog functions in current > > cat/db. The resolution order is: temp functions -> built-in functions -> > > catalog functions. (Partially or fully qualified functions has no > > ambiguity!) > > > > 2. In addition to #1, support creating and referencing temporary > functions > > associated with a cat/db with "GLOBAL" qualifier in DDL for global temp > > functions. The resolution order is: global temp functions -> built-in > > functions -> temp functions in current cat/db -> catalog function. > > (Resolution for partially or fully qualified function reference is: temp > > functions -> persistent functions.) > > > > 3. In addition to #1, support creating and referencing temporary > functions > > associated with a cat/db with a special namespace for built-in functions > > and global temp functions. The resolution is the same as #2, except that > > the special namespace might be prefixed to a reference to a built-in > > function or global temp function. (In absence of the special namespace, > the > > resolution order is the same as in #2.) > > > > My personal preference is #1, given the unknown use case and introduced > > complexity for #2 and #3. However, #2 is an acceptable alternative. Thus, > > my votes are: > > > > +1 for #1 > > +0 for #2 > > -1 for #3 > > > > Everyone, please cast your vote (in above format please!), or let me know > > if you have more questions or other candidates. > > > > Thanks, > > Xuefu > > > > > > > > > > > > > > > > On Wed, Sep 18, 2019 at 6:42 AM Aljoscha Krettek <[hidden email]> > > wrote: > > > > > Hi, > > > > > > I think this discussion and the one for FLIP-64 are very connected. To > > > resolve the differences, think we have to think about the basic > > principles > > > and find consensus there. The basic questions I see are: > > > > > > - Do we want to support overriding builtin functions? > > > - Do we want to support overriding catalog functions? > > > - And then later: should temporary functions be tied to a > > > catalog/database? > > > > > > I don’t have much to say about these, except that we should somewhat > > stick > > > to what the industry does. But I also understand that the industry is > > > already very divided on this. > > > > > > Best, > > > Aljoscha > > > > > > > On 18. Sep 2019, at 11:41, Jark Wu <[hidden email]> wrote: > > > > > > > > Hi, > > > > > > > > +1 to strive for reaching consensus on the remaining topics. We are > > > close to the truth. It will waste a lot of time if we resume the topic > > some > > > time later. > > > > > > > > +1 to “1-part/override” and I’m also fine with Timo’s “cat.db.fun” > way > > > to override a catalog function. > > > > > > > > I’m not sure about “system.system.fun”, it introduces a nonexistent > cat > > > & db? And we still need to do special treatment for the dedicated > > > system.system cat & db? > > > > > > > > Best, > > > > Jark > > > > > > > > > > > >> 在 2019年9月18日,06:54,Timo Walther <[hidden email]> 写道: > > > >> > > > >> Hi everyone, > > > >> > > > >> @Xuefu: I would like to avoid adding too many things incrementally. > > > Users should be able to override all catalog objects consistently > > according > > > to FLIP-64 (Support for Temporary Objects in Table module). If > functions > > > are treated completely different, we need more code and special cases. > > From > > > an implementation perspective, this topic only affects the lookup logic > > > which is rather low implementation effort which is why I would like to > > > clarify the remaining items. As you said, we have a slight consenus on > > > overriding built-in functions; we should also strive for reaching > > consensus > > > on the remaining topics. > > > >> > > > >> @Dawid: I like your idea as it ensures registering catalog objects > > > consistent and the overriding of built-in functions more explicit. > > > >> > > > >> Thanks, > > > >> Timo > > > >> > > > >> > > > >> On 17.09.19 11:59, kai wang wrote: > > > >>> hi, everyone > > > >>> I think this flip is very meaningful. it supports functions that > can > > be > > > >>> shared by different catalogs and dbs, reducing the duplication of > > > functions. > > > >>> > > > >>> Our group based on flink's sql parser module implements create > > function > > > >>> feature, stores the parsed function metadata and schema into mysql, > > and > > > >>> also customizes the catalog, customizes sql-client to support > custom > > > >>> schemas and functions. Loaded, but the function is currently > global, > > > and is > > > >>> not subdivided according to catalog and db. > > > >>> > > > >>> In addition, I very much hope to participate in the development of > > this > > > >>> flip, I have been paying attention to the community, but found it > is > > > more > > > >>> difficult to join. > > > >>> thank you. > > > >>> > > > >>> Xuefu Z <[hidden email]> 于2019年9月17日周二 上午11:19写道: > > > >>> > > > >>>> Thanks to Tmo and Dawid for sharing thoughts. > > > >>>> > > > >>>> It seems to me that there is a general consensus on having temp > > > functions > > > >>>> that have no namespaces and overwrite built-in functions. (As a > side > > > note > > > >>>> for comparability, the current user defined functions are all > > > temporary and > > > >>>> having no namespaces.) > > > >>>> > > > >>>> Nevertheless, I can also see the merit of having namespaced temp > > > functions > > > >>>> that can overwrite functions defined in a specific cat/db. > However, > > > this > > > >>>> idea appears orthogonal to the former and can be added > > incrementally. > > > >>>> > > > >>>> How about we first implement non-namespaced temp functions now and > > > leave > > > >>>> the door open for namespaced ones for later releases as the > > > requirement > > > >>>> might become more crystal? This also helps shorten the debate and > > > allow us > > > >>>> to make some progress along this direction. > > > >>>> > > > >>>> As to Dawid's idea of having a dedicated cat/db to host the > > temporary > > > temp > > > >>>> functions that don't have namespaces, my only concern is the > special > > > >>>> treatment for a cat/db, which makes code less clean, as evident in > > > treating > > > >>>> the built-in catalog currently. > > > >>>> > > > >>>> Thanks, > > > >>>> Xuefiu > > > >>>> > > > >>>> On Mon, Sep 16, 2019 at 5:07 PM Dawid Wysakowicz < > > > >>>> [hidden email]> > > > >>>> wrote: > > > >>>> > > > >>>>> Hi, > > > >>>>> Another idea to consider on top of Timo's suggestion. How about > we > > > have a > > > >>>>> special namespace (catalog + database) for built-in objects? This > > > catalog > > > >>>>> would be invisible for users as Xuefu was suggesting. > > > >>>>> > > > >>>>> Then users could still override built-in functions, if they fully > > > qualify > > > >>>>> object with the built-in namespace, but by default the common > logic > > > of > > > >>>>> current dB & cat would be used. > > > >>>>> > > > >>>>> CREATE TEMPORARY FUNCTION func ... > > > >>>>> registers temporary function in current cat & dB > > > >>>>> > > > >>>>> CREATE TEMPORARY FUNCTION cat.db.func ... > > > >>>>> registers temporary function in cat db > > > >>>>> > > > >>>>> CREATE TEMPORARY FUNCTION system.system.func ... > > > >>>>> Overrides built-in function with temporary function > > > >>>>> > > > >>>>> The built-in/system namespace would not be writable for permanent > > > >>>> objects. > > > >>>>> WDYT? > > > >>>>> > > > >>>>> This way I think we can have benefits of both solutions. > > > >>>>> > > > >>>>> Best, > > > >>>>> Dawid > > > >>>>> > > > >>>>> > > > >>>>> On Tue, 17 Sep 2019, 07:24 Timo Walther, <[hidden email]> > > wrote: > > > >>>>> > > > >>>>>> Hi Bowen, > > > >>>>>> > > > >>>>>> I understand the potential benefit of overriding certain > built-in > > > >>>>>> functions. I'm open to such a feature if many people agree. > > > However, it > > > >>>>>> would be great to still support overriding catalog functions > with > > > >>>>>> temporary functions in order to prototype a query even though a > > > >>>>>> catalog/database might not be available currently or should not > be > > > >>>>>> modified yet. How about we support both cases? > > > >>>>>> > > > >>>>>> CREATE TEMPORARY FUNCTION abs > > > >>>>>> -> creates/overrides a built-in function and never consideres > > > current > > > >>>>>> catalog and database; inconsistent with other DDL but acceptable > > for > > > >>>>>> functions I guess. > > > >>>>>> CREATE TEMPORARY FUNCTION cat.db.fun > > > >>>>>> -> creates/overrides a catalog function > > > >>>>>> > > > >>>>>> Regarding "Flink don't have any other built-in objects (tables, > > > views) > > > >>>>>> except functions", this might change in the near future. Take > > > >>>>>> https://issues.apache.org/jira/browse/FLINK-13900 as an > example. > > > >>>>>> > > > >>>>>> Thanks, > > > >>>>>> Timo > > > >>>>>> > > > >>>>>> On 14.09.19 01:40, Bowen Li wrote: > > > >>>>>>> Hi Fabian, > > > >>>>>>> > > > >>>>>>> Yes, I agree 1-part/no-override is the least favorable thus I > > > didn't > > > >>>>>>> include that as a voting option, and the discussion is mainly > > > between > > > >>>>>>> 1-part/override builtin and 3-part/not override builtin. > > > >>>>>>> > > > >>>>>>> Re > However, it means that temp functions are differently > > treated > > > >>>> than > > > >>>>>>> other db objects. > > > >>>>>>> IMO, the treatment difference results from the fact that > > functions > > > >>>> are > > > >>>>> a > > > >>>>>>> bit different from other objects - Flink don't have any other > > > >>>> built-in > > > >>>>>>> objects (tables, views) except functions. > > > >>>>>>> > > > >>>>>>> Cheers, > > > >>>>>>> Bowen > > > >>>>>>> > > > >>>>>> > > > >>>> > > > >>>> -- > > > >>>> Xuefu Zhang > > > >>>> > > > >>>> "In Honey We Trust!" > > > >>>> > > > >> > > > > > > > > > > > > > > -- > > Xuefu Zhang > > > > "In Honey We Trust!" > > > -- Xuefu Zhang "In Honey We Trust!" |
@Bowen I am not suggesting introducing additional catalog. I think we need
to get rid of the current built-in catalog. @Xuefu in option #3 we also don't need additional referencing the special catalog anywhere else besides in the CREATE statement. The resolution behaviour is exactly the same in both options. On Thu, 19 Sep 2019, 08:17 Xuefu Z, <[hidden email]> wrote: > Hi Dawid, > > "GLOBAL" is a temporary keyword that was given to the approach. It can be > changed to something else for better. > > The difference between this and the #3 approach is that we only need the > keyword for this create DDL. For other places (such as function > referencing), no keyword or special namespace is needed. > > Thanks, > Xuefu > > On Wed, Sep 18, 2019 at 4:32 PM Dawid Wysakowicz < > [hidden email]> > wrote: > > > Hi, > > I think it makes sense to start voting at this point. > > > > Option 1: Only 1-part identifiers > > PROS: > > - allows shadowing built-in functions > > CONS: > > - incosistent with all the other objects, both permanent & temporary > > - does not allow shadowing catalog functions > > > > Option 2: Special keyword for built-in function > > I think this is quite similar to the special catalog/db. The thing I am > > strongly against in this proposal is the GLOBAL keyword. This keyword > has a > > meaning in rdbms systems and means a function that is present for a > > lifetime of a session in which it was created, but available in all other > > sessions. Therefore I really don't want to use this keyword in a > different > > context. > > > > Option 3: Special catalog/db > > > > PROS: > > - allows shadowing built-in functions > > - allows shadowing catalog functions > > - consistent with other objects > > CONS: > > - we introduce a special namespace for built-in functions > > > > I don't see a problem with introducing the special namespace. In the end > it > > is very similar to the keyword approach. In this case the catalog/db > > combination would be the "keyword" > > > > Therefore my votes: > > Option 1: -0 > > Option 2: -1 (I might change to +0 if we can come up with a better > keyword) > > Option 3: +1 > > > > Best, > > Dawid > > > > > > On Thu, 19 Sep 2019, 05:12 Xuefu Z, <[hidden email]> wrote: > > > > > Hi Aljoscha, > > > > > > Thanks for the summary and these are great questions to be answered. > The > > > answer to your first question is clear: there is a general agreement to > > > override built-in functions with temp functions. > > > > > > However, your second and third questions are sort of related, as a > > function > > > reference can be either just function name (like "func") or in the form > > or > > > "cat.db.func". When a reference is just function name, it can mean > > either a > > > built-in function or a function defined in the current cat/db. If we > > > support overriding a built-in function with a temp function, such > > > overriding can also cover a function in the current cat/db. > > > > > > I think what Timo referred as "overriding a catalog function" means a > > temp > > > function defined as "cat.db.func" overrides a catalog function "func" > in > > > cat/db even if cat/db is not current. To support this, temp function > has > > to > > > be tied to a cat/db. What's why I said above that the 2nd and 3rd > > questions > > > are related. The problem with such support is the ambiguity when user > > > defines a function w/o namespace, "CREATE TEMPORARY FUNCTION func ...". > > > Here "func" can means a global temp function, or a temp function in > > current > > > cat/db. If we can assume the former, this creates an inconsistency > > because > > > "CREATE FUNCTION func" actually means a function in current cat/db. If > we > > > assume the latter, then there is no way for user to create a global > temp > > > function. > > > > > > Giving a special namespace for built-in functions may solve the > ambiguity > > > problem above, but it also introduces artificial catalog/database that > > > needs special treatment and pollutes the cleanness of the code. I > would > > > rather introduce a syntax in DDL to solve the problem, like "CREATE > > > [GLOBAL] TEMPORARY FUNCTION func". > > > > > > Thus, I'd like to summarize a few candidate proposals for voting > > purposes: > > > > > > 1. Support only global, temporary functions without namespace. Such > temp > > > functions overrides built-in functions and catalog functions in current > > > cat/db. The resolution order is: temp functions -> built-in functions > -> > > > catalog functions. (Partially or fully qualified functions has no > > > ambiguity!) > > > > > > 2. In addition to #1, support creating and referencing temporary > > functions > > > associated with a cat/db with "GLOBAL" qualifier in DDL for global temp > > > functions. The resolution order is: global temp functions -> built-in > > > functions -> temp functions in current cat/db -> catalog function. > > > (Resolution for partially or fully qualified function reference is: > temp > > > functions -> persistent functions.) > > > > > > 3. In addition to #1, support creating and referencing temporary > > functions > > > associated with a cat/db with a special namespace for built-in > functions > > > and global temp functions. The resolution is the same as #2, except > that > > > the special namespace might be prefixed to a reference to a built-in > > > function or global temp function. (In absence of the special namespace, > > the > > > resolution order is the same as in #2.) > > > > > > My personal preference is #1, given the unknown use case and introduced > > > complexity for #2 and #3. However, #2 is an acceptable alternative. > Thus, > > > my votes are: > > > > > > +1 for #1 > > > +0 for #2 > > > -1 for #3 > > > > > > Everyone, please cast your vote (in above format please!), or let me > know > > > if you have more questions or other candidates. > > > > > > Thanks, > > > Xuefu > > > > > > > > > > > > > > > > > > > > > > > > On Wed, Sep 18, 2019 at 6:42 AM Aljoscha Krettek <[hidden email]> > > > wrote: > > > > > > > Hi, > > > > > > > > I think this discussion and the one for FLIP-64 are very connected. > To > > > > resolve the differences, think we have to think about the basic > > > principles > > > > and find consensus there. The basic questions I see are: > > > > > > > > - Do we want to support overriding builtin functions? > > > > - Do we want to support overriding catalog functions? > > > > - And then later: should temporary functions be tied to a > > > > catalog/database? > > > > > > > > I don’t have much to say about these, except that we should somewhat > > > stick > > > > to what the industry does. But I also understand that the industry is > > > > already very divided on this. > > > > > > > > Best, > > > > Aljoscha > > > > > > > > > On 18. Sep 2019, at 11:41, Jark Wu <[hidden email]> wrote: > > > > > > > > > > Hi, > > > > > > > > > > +1 to strive for reaching consensus on the remaining topics. We are > > > > close to the truth. It will waste a lot of time if we resume the > topic > > > some > > > > time later. > > > > > > > > > > +1 to “1-part/override” and I’m also fine with Timo’s “cat.db.fun” > > way > > > > to override a catalog function. > > > > > > > > > > I’m not sure about “system.system.fun”, it introduces a nonexistent > > cat > > > > & db? And we still need to do special treatment for the dedicated > > > > system.system cat & db? > > > > > > > > > > Best, > > > > > Jark > > > > > > > > > > > > > > >> 在 2019年9月18日,06:54,Timo Walther <[hidden email]> 写道: > > > > >> > > > > >> Hi everyone, > > > > >> > > > > >> @Xuefu: I would like to avoid adding too many things > incrementally. > > > > Users should be able to override all catalog objects consistently > > > according > > > > to FLIP-64 (Support for Temporary Objects in Table module). If > > functions > > > > are treated completely different, we need more code and special > cases. > > > From > > > > an implementation perspective, this topic only affects the lookup > logic > > > > which is rather low implementation effort which is why I would like > to > > > > clarify the remaining items. As you said, we have a slight consenus > on > > > > overriding built-in functions; we should also strive for reaching > > > consensus > > > > on the remaining topics. > > > > >> > > > > >> @Dawid: I like your idea as it ensures registering catalog objects > > > > consistent and the overriding of built-in functions more explicit. > > > > >> > > > > >> Thanks, > > > > >> Timo > > > > >> > > > > >> > > > > >> On 17.09.19 11:59, kai wang wrote: > > > > >>> hi, everyone > > > > >>> I think this flip is very meaningful. it supports functions that > > can > > > be > > > > >>> shared by different catalogs and dbs, reducing the duplication of > > > > functions. > > > > >>> > > > > >>> Our group based on flink's sql parser module implements create > > > function > > > > >>> feature, stores the parsed function metadata and schema into > mysql, > > > and > > > > >>> also customizes the catalog, customizes sql-client to support > > custom > > > > >>> schemas and functions. Loaded, but the function is currently > > global, > > > > and is > > > > >>> not subdivided according to catalog and db. > > > > >>> > > > > >>> In addition, I very much hope to participate in the development > of > > > this > > > > >>> flip, I have been paying attention to the community, but found it > > is > > > > more > > > > >>> difficult to join. > > > > >>> thank you. > > > > >>> > > > > >>> Xuefu Z <[hidden email]> 于2019年9月17日周二 上午11:19写道: > > > > >>> > > > > >>>> Thanks to Tmo and Dawid for sharing thoughts. > > > > >>>> > > > > >>>> It seems to me that there is a general consensus on having temp > > > > functions > > > > >>>> that have no namespaces and overwrite built-in functions. (As a > > side > > > > note > > > > >>>> for comparability, the current user defined functions are all > > > > temporary and > > > > >>>> having no namespaces.) > > > > >>>> > > > > >>>> Nevertheless, I can also see the merit of having namespaced temp > > > > functions > > > > >>>> that can overwrite functions defined in a specific cat/db. > > However, > > > > this > > > > >>>> idea appears orthogonal to the former and can be added > > > incrementally. > > > > >>>> > > > > >>>> How about we first implement non-namespaced temp functions now > and > > > > leave > > > > >>>> the door open for namespaced ones for later releases as the > > > > requirement > > > > >>>> might become more crystal? This also helps shorten the debate > and > > > > allow us > > > > >>>> to make some progress along this direction. > > > > >>>> > > > > >>>> As to Dawid's idea of having a dedicated cat/db to host the > > > temporary > > > > temp > > > > >>>> functions that don't have namespaces, my only concern is the > > special > > > > >>>> treatment for a cat/db, which makes code less clean, as evident > in > > > > treating > > > > >>>> the built-in catalog currently. > > > > >>>> > > > > >>>> Thanks, > > > > >>>> Xuefiu > > > > >>>> > > > > >>>> On Mon, Sep 16, 2019 at 5:07 PM Dawid Wysakowicz < > > > > >>>> [hidden email]> > > > > >>>> wrote: > > > > >>>> > > > > >>>>> Hi, > > > > >>>>> Another idea to consider on top of Timo's suggestion. How about > > we > > > > have a > > > > >>>>> special namespace (catalog + database) for built-in objects? > This > > > > catalog > > > > >>>>> would be invisible for users as Xuefu was suggesting. > > > > >>>>> > > > > >>>>> Then users could still override built-in functions, if they > fully > > > > qualify > > > > >>>>> object with the built-in namespace, but by default the common > > logic > > > > of > > > > >>>>> current dB & cat would be used. > > > > >>>>> > > > > >>>>> CREATE TEMPORARY FUNCTION func ... > > > > >>>>> registers temporary function in current cat & dB > > > > >>>>> > > > > >>>>> CREATE TEMPORARY FUNCTION cat.db.func ... > > > > >>>>> registers temporary function in cat db > > > > >>>>> > > > > >>>>> CREATE TEMPORARY FUNCTION system.system.func ... > > > > >>>>> Overrides built-in function with temporary function > > > > >>>>> > > > > >>>>> The built-in/system namespace would not be writable for > permanent > > > > >>>> objects. > > > > >>>>> WDYT? > > > > >>>>> > > > > >>>>> This way I think we can have benefits of both solutions. > > > > >>>>> > > > > >>>>> Best, > > > > >>>>> Dawid > > > > >>>>> > > > > >>>>> > > > > >>>>> On Tue, 17 Sep 2019, 07:24 Timo Walther, <[hidden email]> > > > wrote: > > > > >>>>> > > > > >>>>>> Hi Bowen, > > > > >>>>>> > > > > >>>>>> I understand the potential benefit of overriding certain > > built-in > > > > >>>>>> functions. I'm open to such a feature if many people agree. > > > > However, it > > > > >>>>>> would be great to still support overriding catalog functions > > with > > > > >>>>>> temporary functions in order to prototype a query even though > a > > > > >>>>>> catalog/database might not be available currently or should > not > > be > > > > >>>>>> modified yet. How about we support both cases? > > > > >>>>>> > > > > >>>>>> CREATE TEMPORARY FUNCTION abs > > > > >>>>>> -> creates/overrides a built-in function and never consideres > > > > current > > > > >>>>>> catalog and database; inconsistent with other DDL but > acceptable > > > for > > > > >>>>>> functions I guess. > > > > >>>>>> CREATE TEMPORARY FUNCTION cat.db.fun > > > > >>>>>> -> creates/overrides a catalog function > > > > >>>>>> > > > > >>>>>> Regarding "Flink don't have any other built-in objects > (tables, > > > > views) > > > > >>>>>> except functions", this might change in the near future. Take > > > > >>>>>> https://issues.apache.org/jira/browse/FLINK-13900 as an > > example. > > > > >>>>>> > > > > >>>>>> Thanks, > > > > >>>>>> Timo > > > > >>>>>> > > > > >>>>>> On 14.09.19 01:40, Bowen Li wrote: > > > > >>>>>>> Hi Fabian, > > > > >>>>>>> > > > > >>>>>>> Yes, I agree 1-part/no-override is the least favorable thus I > > > > didn't > > > > >>>>>>> include that as a voting option, and the discussion is mainly > > > > between > > > > >>>>>>> 1-part/override builtin and 3-part/not override builtin. > > > > >>>>>>> > > > > >>>>>>> Re > However, it means that temp functions are differently > > > treated > > > > >>>> than > > > > >>>>>>> other db objects. > > > > >>>>>>> IMO, the treatment difference results from the fact that > > > functions > > > > >>>> are > > > > >>>>> a > > > > >>>>>>> bit different from other objects - Flink don't have any other > > > > >>>> built-in > > > > >>>>>>> objects (tables, views) except functions. > > > > >>>>>>> > > > > >>>>>>> Cheers, > > > > >>>>>>> Bowen > > > > >>>>>>> > > > > >>>>>> > > > > >>>> > > > > >>>> -- > > > > >>>> Xuefu Zhang > > > > >>>> > > > > >>>> "In Honey We Trust!" > > > > >>>> > > > > >> > > > > > > > > > > > > > > > > > > > -- > > > Xuefu Zhang > > > > > > "In Honey We Trust!" > > > > > > > > -- > Xuefu Zhang > > "In Honey We Trust!" > |
In reply to this post by Dawid Wysakowicz
Re: The reason why I prefer option 3 is that in option 3 all objects
internally are identified with 3 parts. True, but the problem we have is not about how to differentiate each type objects internally. Rather, it's rather about how a user referencing an object unambiguously and consistently. Thanks, Xuefu On Wed, Sep 18, 2019 at 4:55 PM Dawid Wysakowicz <[hidden email]> wrote: > Last additional comment on Option 2. The reason why I prefer option 3 is > that in option 3 all objects internally are identified with 3 parts. This > makes it easier to handle at different locations e.g. while persisting > views, as all objects have uniform representation. > > On Thu, 19 Sep 2019, 07:31 Dawid Wysakowicz, <[hidden email]> > wrote: > > > Hi, > > I think it makes sense to start voting at this point. > > > > Option 1: Only 1-part identifiers > > PROS: > > - allows shadowing built-in functions > > CONS: > > - incosistent with all the other objects, both permanent & temporary > > - does not allow shadowing catalog functions > > > > Option 2: Special keyword for built-in function > > I think this is quite similar to the special catalog/db. The thing I am > > strongly against in this proposal is the GLOBAL keyword. This keyword > has a > > meaning in rdbms systems and means a function that is present for a > > lifetime of a session in which it was created, but available in all other > > sessions. Therefore I really don't want to use this keyword in a > different > > context. > > > > Option 3: Special catalog/db > > > > PROS: > > - allows shadowing built-in functions > > - allows shadowing catalog functions > > - consistent with other objects > > CONS: > > - we introduce a special namespace for built-in functions > > > > I don't see a problem with introducing the special namespace. In the end > > it is very similar to the keyword approach. In this case the catalog/db > > combination would be the "keyword" > > > > Therefore my votes: > > Option 1: -0 > > Option 2: -1 (I might change to +0 if we can come up with a better > keyword) > > Option 3: +1 > > > > Best, > > Dawid > > > > > > On Thu, 19 Sep 2019, 05:12 Xuefu Z, <[hidden email]> wrote: > > > >> Hi Aljoscha, > >> > >> Thanks for the summary and these are great questions to be answered. The > >> answer to your first question is clear: there is a general agreement to > >> override built-in functions with temp functions. > >> > >> However, your second and third questions are sort of related, as a > >> function > >> reference can be either just function name (like "func") or in the form > or > >> "cat.db.func". When a reference is just function name, it can mean > either > >> a > >> built-in function or a function defined in the current cat/db. If we > >> support overriding a built-in function with a temp function, such > >> overriding can also cover a function in the current cat/db. > >> > >> I think what Timo referred as "overriding a catalog function" means a > temp > >> function defined as "cat.db.func" overrides a catalog function "func" in > >> cat/db even if cat/db is not current. To support this, temp function has > >> to > >> be tied to a cat/db. What's why I said above that the 2nd and 3rd > >> questions > >> are related. The problem with such support is the ambiguity when user > >> defines a function w/o namespace, "CREATE TEMPORARY FUNCTION func ...". > >> Here "func" can means a global temp function, or a temp function in > >> current > >> cat/db. If we can assume the former, this creates an inconsistency > because > >> "CREATE FUNCTION func" actually means a function in current cat/db. If > we > >> assume the latter, then there is no way for user to create a global temp > >> function. > >> > >> Giving a special namespace for built-in functions may solve the > ambiguity > >> problem above, but it also introduces artificial catalog/database that > >> needs special treatment and pollutes the cleanness of the code. I would > >> rather introduce a syntax in DDL to solve the problem, like "CREATE > >> [GLOBAL] TEMPORARY FUNCTION func". > >> > >> Thus, I'd like to summarize a few candidate proposals for voting > purposes: > >> > >> 1. Support only global, temporary functions without namespace. Such temp > >> functions overrides built-in functions and catalog functions in current > >> cat/db. The resolution order is: temp functions -> built-in functions -> > >> catalog functions. (Partially or fully qualified functions has no > >> ambiguity!) > >> > >> 2. In addition to #1, support creating and referencing temporary > functions > >> associated with a cat/db with "GLOBAL" qualifier in DDL for global temp > >> functions. The resolution order is: global temp functions -> built-in > >> functions -> temp functions in current cat/db -> catalog function. > >> (Resolution for partially or fully qualified function reference is: temp > >> functions -> persistent functions.) > >> > >> 3. In addition to #1, support creating and referencing temporary > functions > >> associated with a cat/db with a special namespace for built-in functions > >> and global temp functions. The resolution is the same as #2, except that > >> the special namespace might be prefixed to a reference to a built-in > >> function or global temp function. (In absence of the special namespace, > >> the > >> resolution order is the same as in #2.) > >> > >> My personal preference is #1, given the unknown use case and introduced > >> complexity for #2 and #3. However, #2 is an acceptable alternative. > Thus, > >> my votes are: > >> > >> +1 for #1 > >> +0 for #2 > >> -1 for #3 > >> > >> Everyone, please cast your vote (in above format please!), or let me > know > >> if you have more questions or other candidates. > >> > >> Thanks, > >> Xuefu > >> > >> > >> > >> > >> > >> > >> > >> On Wed, Sep 18, 2019 at 6:42 AM Aljoscha Krettek <[hidden email]> > >> wrote: > >> > >> > Hi, > >> > > >> > I think this discussion and the one for FLIP-64 are very connected. To > >> > resolve the differences, think we have to think about the basic > >> principles > >> > and find consensus there. The basic questions I see are: > >> > > >> > - Do we want to support overriding builtin functions? > >> > - Do we want to support overriding catalog functions? > >> > - And then later: should temporary functions be tied to a > >> > catalog/database? > >> > > >> > I don’t have much to say about these, except that we should somewhat > >> stick > >> > to what the industry does. But I also understand that the industry is > >> > already very divided on this. > >> > > >> > Best, > >> > Aljoscha > >> > > >> > > On 18. Sep 2019, at 11:41, Jark Wu <[hidden email]> wrote: > >> > > > >> > > Hi, > >> > > > >> > > +1 to strive for reaching consensus on the remaining topics. We are > >> > close to the truth. It will waste a lot of time if we resume the topic > >> some > >> > time later. > >> > > > >> > > +1 to “1-part/override” and I’m also fine with Timo’s “cat.db.fun” > way > >> > to override a catalog function. > >> > > > >> > > I’m not sure about “system.system.fun”, it introduces a nonexistent > >> cat > >> > & db? And we still need to do special treatment for the dedicated > >> > system.system cat & db? > >> > > > >> > > Best, > >> > > Jark > >> > > > >> > > > >> > >> 在 2019年9月18日,06:54,Timo Walther <[hidden email]> 写道: > >> > >> > >> > >> Hi everyone, > >> > >> > >> > >> @Xuefu: I would like to avoid adding too many things incrementally. > >> > Users should be able to override all catalog objects consistently > >> according > >> > to FLIP-64 (Support for Temporary Objects in Table module). If > functions > >> > are treated completely different, we need more code and special cases. > >> From > >> > an implementation perspective, this topic only affects the lookup > logic > >> > which is rather low implementation effort which is why I would like to > >> > clarify the remaining items. As you said, we have a slight consenus on > >> > overriding built-in functions; we should also strive for reaching > >> consensus > >> > on the remaining topics. > >> > >> > >> > >> @Dawid: I like your idea as it ensures registering catalog objects > >> > consistent and the overriding of built-in functions more explicit. > >> > >> > >> > >> Thanks, > >> > >> Timo > >> > >> > >> > >> > >> > >> On 17.09.19 11:59, kai wang wrote: > >> > >>> hi, everyone > >> > >>> I think this flip is very meaningful. it supports functions that > >> can be > >> > >>> shared by different catalogs and dbs, reducing the duplication of > >> > functions. > >> > >>> > >> > >>> Our group based on flink's sql parser module implements create > >> function > >> > >>> feature, stores the parsed function metadata and schema into > mysql, > >> and > >> > >>> also customizes the catalog, customizes sql-client to support > custom > >> > >>> schemas and functions. Loaded, but the function is currently > global, > >> > and is > >> > >>> not subdivided according to catalog and db. > >> > >>> > >> > >>> In addition, I very much hope to participate in the development of > >> this > >> > >>> flip, I have been paying attention to the community, but found it > is > >> > more > >> > >>> difficult to join. > >> > >>> thank you. > >> > >>> > >> > >>> Xuefu Z <[hidden email]> 于2019年9月17日周二 上午11:19写道: > >> > >>> > >> > >>>> Thanks to Tmo and Dawid for sharing thoughts. > >> > >>>> > >> > >>>> It seems to me that there is a general consensus on having temp > >> > functions > >> > >>>> that have no namespaces and overwrite built-in functions. (As a > >> side > >> > note > >> > >>>> for comparability, the current user defined functions are all > >> > temporary and > >> > >>>> having no namespaces.) > >> > >>>> > >> > >>>> Nevertheless, I can also see the merit of having namespaced temp > >> > functions > >> > >>>> that can overwrite functions defined in a specific cat/db. > However, > >> > this > >> > >>>> idea appears orthogonal to the former and can be added > >> incrementally. > >> > >>>> > >> > >>>> How about we first implement non-namespaced temp functions now > and > >> > leave > >> > >>>> the door open for namespaced ones for later releases as the > >> > requirement > >> > >>>> might become more crystal? This also helps shorten the debate and > >> > allow us > >> > >>>> to make some progress along this direction. > >> > >>>> > >> > >>>> As to Dawid's idea of having a dedicated cat/db to host the > >> temporary > >> > temp > >> > >>>> functions that don't have namespaces, my only concern is the > >> special > >> > >>>> treatment for a cat/db, which makes code less clean, as evident > in > >> > treating > >> > >>>> the built-in catalog currently. > >> > >>>> > >> > >>>> Thanks, > >> > >>>> Xuefiu > >> > >>>> > >> > >>>> On Mon, Sep 16, 2019 at 5:07 PM Dawid Wysakowicz < > >> > >>>> [hidden email]> > >> > >>>> wrote: > >> > >>>> > >> > >>>>> Hi, > >> > >>>>> Another idea to consider on top of Timo's suggestion. How about > we > >> > have a > >> > >>>>> special namespace (catalog + database) for built-in objects? > This > >> > catalog > >> > >>>>> would be invisible for users as Xuefu was suggesting. > >> > >>>>> > >> > >>>>> Then users could still override built-in functions, if they > fully > >> > qualify > >> > >>>>> object with the built-in namespace, but by default the common > >> logic > >> > of > >> > >>>>> current dB & cat would be used. > >> > >>>>> > >> > >>>>> CREATE TEMPORARY FUNCTION func ... > >> > >>>>> registers temporary function in current cat & dB > >> > >>>>> > >> > >>>>> CREATE TEMPORARY FUNCTION cat.db.func ... > >> > >>>>> registers temporary function in cat db > >> > >>>>> > >> > >>>>> CREATE TEMPORARY FUNCTION system.system.func ... > >> > >>>>> Overrides built-in function with temporary function > >> > >>>>> > >> > >>>>> The built-in/system namespace would not be writable for > permanent > >> > >>>> objects. > >> > >>>>> WDYT? > >> > >>>>> > >> > >>>>> This way I think we can have benefits of both solutions. > >> > >>>>> > >> > >>>>> Best, > >> > >>>>> Dawid > >> > >>>>> > >> > >>>>> > >> > >>>>> On Tue, 17 Sep 2019, 07:24 Timo Walther, <[hidden email]> > >> wrote: > >> > >>>>> > >> > >>>>>> Hi Bowen, > >> > >>>>>> > >> > >>>>>> I understand the potential benefit of overriding certain > built-in > >> > >>>>>> functions. I'm open to such a feature if many people agree. > >> > However, it > >> > >>>>>> would be great to still support overriding catalog functions > with > >> > >>>>>> temporary functions in order to prototype a query even though a > >> > >>>>>> catalog/database might not be available currently or should not > >> be > >> > >>>>>> modified yet. How about we support both cases? > >> > >>>>>> > >> > >>>>>> CREATE TEMPORARY FUNCTION abs > >> > >>>>>> -> creates/overrides a built-in function and never consideres > >> > current > >> > >>>>>> catalog and database; inconsistent with other DDL but > acceptable > >> for > >> > >>>>>> functions I guess. > >> > >>>>>> CREATE TEMPORARY FUNCTION cat.db.fun > >> > >>>>>> -> creates/overrides a catalog function > >> > >>>>>> > >> > >>>>>> Regarding "Flink don't have any other built-in objects (tables, > >> > views) > >> > >>>>>> except functions", this might change in the near future. Take > >> > >>>>>> https://issues.apache.org/jira/browse/FLINK-13900 as an > example. > >> > >>>>>> > >> > >>>>>> Thanks, > >> > >>>>>> Timo > >> > >>>>>> > >> > >>>>>> On 14.09.19 01:40, Bowen Li wrote: > >> > >>>>>>> Hi Fabian, > >> > >>>>>>> > >> > >>>>>>> Yes, I agree 1-part/no-override is the least favorable thus I > >> > didn't > >> > >>>>>>> include that as a voting option, and the discussion is mainly > >> > between > >> > >>>>>>> 1-part/override builtin and 3-part/not override builtin. > >> > >>>>>>> > >> > >>>>>>> Re > However, it means that temp functions are differently > >> treated > >> > >>>> than > >> > >>>>>>> other db objects. > >> > >>>>>>> IMO, the treatment difference results from the fact that > >> functions > >> > >>>> are > >> > >>>>> a > >> > >>>>>>> bit different from other objects - Flink don't have any other > >> > >>>> built-in > >> > >>>>>>> objects (tables, views) except functions. > >> > >>>>>>> > >> > >>>>>>> Cheers, > >> > >>>>>>> Bowen > >> > >>>>>>> > >> > >>>>>> > >> > >>>> > >> > >>>> -- > >> > >>>> Xuefu Zhang > >> > >>>> > >> > >>>> "In Honey We Trust!" > >> > >>>> > >> > >> > >> > > > >> > > >> > > >> > >> -- > >> Xuefu Zhang > >> > >> "In Honey We Trust!" > >> > > > -- Xuefu Zhang "In Honey We Trust!" |
In reply to this post by Dawid Wysakowicz
@Dawid, Re: we also don't need additional referencing the specialcatalog
anywhere. True. But once we allow such reference, then user can do so in any possible place where a function name is expected, for which we have to handle. That's a big difference, I think. Thanks, Xuefu On Wed, Sep 18, 2019 at 5:25 PM Dawid Wysakowicz <[hidden email]> wrote: > @Bowen I am not suggesting introducing additional catalog. I think we need > to get rid of the current built-in catalog. > > @Xuefu in option #3 we also don't need additional referencing the special > catalog anywhere else besides in the CREATE statement. The resolution > behaviour is exactly the same in both options. > > On Thu, 19 Sep 2019, 08:17 Xuefu Z, <[hidden email]> wrote: > > > Hi Dawid, > > > > "GLOBAL" is a temporary keyword that was given to the approach. It can be > > changed to something else for better. > > > > The difference between this and the #3 approach is that we only need the > > keyword for this create DDL. For other places (such as function > > referencing), no keyword or special namespace is needed. > > > > Thanks, > > Xuefu > > > > On Wed, Sep 18, 2019 at 4:32 PM Dawid Wysakowicz < > > [hidden email]> > > wrote: > > > > > Hi, > > > I think it makes sense to start voting at this point. > > > > > > Option 1: Only 1-part identifiers > > > PROS: > > > - allows shadowing built-in functions > > > CONS: > > > - incosistent with all the other objects, both permanent & temporary > > > - does not allow shadowing catalog functions > > > > > > Option 2: Special keyword for built-in function > > > I think this is quite similar to the special catalog/db. The thing I am > > > strongly against in this proposal is the GLOBAL keyword. This keyword > > has a > > > meaning in rdbms systems and means a function that is present for a > > > lifetime of a session in which it was created, but available in all > other > > > sessions. Therefore I really don't want to use this keyword in a > > different > > > context. > > > > > > Option 3: Special catalog/db > > > > > > PROS: > > > - allows shadowing built-in functions > > > - allows shadowing catalog functions > > > - consistent with other objects > > > CONS: > > > - we introduce a special namespace for built-in functions > > > > > > I don't see a problem with introducing the special namespace. In the > end > > it > > > is very similar to the keyword approach. In this case the catalog/db > > > combination would be the "keyword" > > > > > > Therefore my votes: > > > Option 1: -0 > > > Option 2: -1 (I might change to +0 if we can come up with a better > > keyword) > > > Option 3: +1 > > > > > > Best, > > > Dawid > > > > > > > > > On Thu, 19 Sep 2019, 05:12 Xuefu Z, <[hidden email]> wrote: > > > > > > > Hi Aljoscha, > > > > > > > > Thanks for the summary and these are great questions to be answered. > > The > > > > answer to your first question is clear: there is a general agreement > to > > > > override built-in functions with temp functions. > > > > > > > > However, your second and third questions are sort of related, as a > > > function > > > > reference can be either just function name (like "func") or in the > form > > > or > > > > "cat.db.func". When a reference is just function name, it can mean > > > either a > > > > built-in function or a function defined in the current cat/db. If we > > > > support overriding a built-in function with a temp function, such > > > > overriding can also cover a function in the current cat/db. > > > > > > > > I think what Timo referred as "overriding a catalog function" means a > > > temp > > > > function defined as "cat.db.func" overrides a catalog function "func" > > in > > > > cat/db even if cat/db is not current. To support this, temp function > > has > > > to > > > > be tied to a cat/db. What's why I said above that the 2nd and 3rd > > > questions > > > > are related. The problem with such support is the ambiguity when user > > > > defines a function w/o namespace, "CREATE TEMPORARY FUNCTION func > ...". > > > > Here "func" can means a global temp function, or a temp function in > > > current > > > > cat/db. If we can assume the former, this creates an inconsistency > > > because > > > > "CREATE FUNCTION func" actually means a function in current cat/db. > If > > we > > > > assume the latter, then there is no way for user to create a global > > temp > > > > function. > > > > > > > > Giving a special namespace for built-in functions may solve the > > ambiguity > > > > problem above, but it also introduces artificial catalog/database > that > > > > needs special treatment and pollutes the cleanness of the code. I > > would > > > > rather introduce a syntax in DDL to solve the problem, like "CREATE > > > > [GLOBAL] TEMPORARY FUNCTION func". > > > > > > > > Thus, I'd like to summarize a few candidate proposals for voting > > > purposes: > > > > > > > > 1. Support only global, temporary functions without namespace. Such > > temp > > > > functions overrides built-in functions and catalog functions in > current > > > > cat/db. The resolution order is: temp functions -> built-in functions > > -> > > > > catalog functions. (Partially or fully qualified functions has no > > > > ambiguity!) > > > > > > > > 2. In addition to #1, support creating and referencing temporary > > > functions > > > > associated with a cat/db with "GLOBAL" qualifier in DDL for global > temp > > > > functions. The resolution order is: global temp functions -> built-in > > > > functions -> temp functions in current cat/db -> catalog function. > > > > (Resolution for partially or fully qualified function reference is: > > temp > > > > functions -> persistent functions.) > > > > > > > > 3. In addition to #1, support creating and referencing temporary > > > functions > > > > associated with a cat/db with a special namespace for built-in > > functions > > > > and global temp functions. The resolution is the same as #2, except > > that > > > > the special namespace might be prefixed to a reference to a built-in > > > > function or global temp function. (In absence of the special > namespace, > > > the > > > > resolution order is the same as in #2.) > > > > > > > > My personal preference is #1, given the unknown use case and > introduced > > > > complexity for #2 and #3. However, #2 is an acceptable alternative. > > Thus, > > > > my votes are: > > > > > > > > +1 for #1 > > > > +0 for #2 > > > > -1 for #3 > > > > > > > > Everyone, please cast your vote (in above format please!), or let me > > know > > > > if you have more questions or other candidates. > > > > > > > > Thanks, > > > > Xuefu > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Wed, Sep 18, 2019 at 6:42 AM Aljoscha Krettek < > [hidden email]> > > > > wrote: > > > > > > > > > Hi, > > > > > > > > > > I think this discussion and the one for FLIP-64 are very connected. > > To > > > > > resolve the differences, think we have to think about the basic > > > > principles > > > > > and find consensus there. The basic questions I see are: > > > > > > > > > > - Do we want to support overriding builtin functions? > > > > > - Do we want to support overriding catalog functions? > > > > > - And then later: should temporary functions be tied to a > > > > > catalog/database? > > > > > > > > > > I don’t have much to say about these, except that we should > somewhat > > > > stick > > > > > to what the industry does. But I also understand that the industry > is > > > > > already very divided on this. > > > > > > > > > > Best, > > > > > Aljoscha > > > > > > > > > > > On 18. Sep 2019, at 11:41, Jark Wu <[hidden email]> wrote: > > > > > > > > > > > > Hi, > > > > > > > > > > > > +1 to strive for reaching consensus on the remaining topics. We > are > > > > > close to the truth. It will waste a lot of time if we resume the > > topic > > > > some > > > > > time later. > > > > > > > > > > > > +1 to “1-part/override” and I’m also fine with Timo’s > “cat.db.fun” > > > way > > > > > to override a catalog function. > > > > > > > > > > > > I’m not sure about “system.system.fun”, it introduces a > nonexistent > > > cat > > > > > & db? And we still need to do special treatment for the dedicated > > > > > system.system cat & db? > > > > > > > > > > > > Best, > > > > > > Jark > > > > > > > > > > > > > > > > > >> 在 2019年9月18日,06:54,Timo Walther <[hidden email]> 写道: > > > > > >> > > > > > >> Hi everyone, > > > > > >> > > > > > >> @Xuefu: I would like to avoid adding too many things > > incrementally. > > > > > Users should be able to override all catalog objects consistently > > > > according > > > > > to FLIP-64 (Support for Temporary Objects in Table module). If > > > functions > > > > > are treated completely different, we need more code and special > > cases. > > > > From > > > > > an implementation perspective, this topic only affects the lookup > > logic > > > > > which is rather low implementation effort which is why I would like > > to > > > > > clarify the remaining items. As you said, we have a slight consenus > > on > > > > > overriding built-in functions; we should also strive for reaching > > > > consensus > > > > > on the remaining topics. > > > > > >> > > > > > >> @Dawid: I like your idea as it ensures registering catalog > objects > > > > > consistent and the overriding of built-in functions more explicit. > > > > > >> > > > > > >> Thanks, > > > > > >> Timo > > > > > >> > > > > > >> > > > > > >> On 17.09.19 11:59, kai wang wrote: > > > > > >>> hi, everyone > > > > > >>> I think this flip is very meaningful. it supports functions > that > > > can > > > > be > > > > > >>> shared by different catalogs and dbs, reducing the duplication > of > > > > > functions. > > > > > >>> > > > > > >>> Our group based on flink's sql parser module implements create > > > > function > > > > > >>> feature, stores the parsed function metadata and schema into > > mysql, > > > > and > > > > > >>> also customizes the catalog, customizes sql-client to support > > > custom > > > > > >>> schemas and functions. Loaded, but the function is currently > > > global, > > > > > and is > > > > > >>> not subdivided according to catalog and db. > > > > > >>> > > > > > >>> In addition, I very much hope to participate in the development > > of > > > > this > > > > > >>> flip, I have been paying attention to the community, but found > it > > > is > > > > > more > > > > > >>> difficult to join. > > > > > >>> thank you. > > > > > >>> > > > > > >>> Xuefu Z <[hidden email]> 于2019年9月17日周二 上午11:19写道: > > > > > >>> > > > > > >>>> Thanks to Tmo and Dawid for sharing thoughts. > > > > > >>>> > > > > > >>>> It seems to me that there is a general consensus on having > temp > > > > > functions > > > > > >>>> that have no namespaces and overwrite built-in functions. (As > a > > > side > > > > > note > > > > > >>>> for comparability, the current user defined functions are all > > > > > temporary and > > > > > >>>> having no namespaces.) > > > > > >>>> > > > > > >>>> Nevertheless, I can also see the merit of having namespaced > temp > > > > > functions > > > > > >>>> that can overwrite functions defined in a specific cat/db. > > > However, > > > > > this > > > > > >>>> idea appears orthogonal to the former and can be added > > > > incrementally. > > > > > >>>> > > > > > >>>> How about we first implement non-namespaced temp functions now > > and > > > > > leave > > > > > >>>> the door open for namespaced ones for later releases as the > > > > > requirement > > > > > >>>> might become more crystal? This also helps shorten the debate > > and > > > > > allow us > > > > > >>>> to make some progress along this direction. > > > > > >>>> > > > > > >>>> As to Dawid's idea of having a dedicated cat/db to host the > > > > temporary > > > > > temp > > > > > >>>> functions that don't have namespaces, my only concern is the > > > special > > > > > >>>> treatment for a cat/db, which makes code less clean, as > evident > > in > > > > > treating > > > > > >>>> the built-in catalog currently. > > > > > >>>> > > > > > >>>> Thanks, > > > > > >>>> Xuefiu > > > > > >>>> > > > > > >>>> On Mon, Sep 16, 2019 at 5:07 PM Dawid Wysakowicz < > > > > > >>>> [hidden email]> > > > > > >>>> wrote: > > > > > >>>> > > > > > >>>>> Hi, > > > > > >>>>> Another idea to consider on top of Timo's suggestion. How > about > > > we > > > > > have a > > > > > >>>>> special namespace (catalog + database) for built-in objects? > > This > > > > > catalog > > > > > >>>>> would be invisible for users as Xuefu was suggesting. > > > > > >>>>> > > > > > >>>>> Then users could still override built-in functions, if they > > fully > > > > > qualify > > > > > >>>>> object with the built-in namespace, but by default the common > > > logic > > > > > of > > > > > >>>>> current dB & cat would be used. > > > > > >>>>> > > > > > >>>>> CREATE TEMPORARY FUNCTION func ... > > > > > >>>>> registers temporary function in current cat & dB > > > > > >>>>> > > > > > >>>>> CREATE TEMPORARY FUNCTION cat.db.func ... > > > > > >>>>> registers temporary function in cat db > > > > > >>>>> > > > > > >>>>> CREATE TEMPORARY FUNCTION system.system.func ... > > > > > >>>>> Overrides built-in function with temporary function > > > > > >>>>> > > > > > >>>>> The built-in/system namespace would not be writable for > > permanent > > > > > >>>> objects. > > > > > >>>>> WDYT? > > > > > >>>>> > > > > > >>>>> This way I think we can have benefits of both solutions. > > > > > >>>>> > > > > > >>>>> Best, > > > > > >>>>> Dawid > > > > > >>>>> > > > > > >>>>> > > > > > >>>>> On Tue, 17 Sep 2019, 07:24 Timo Walther, <[hidden email] > > > > > > wrote: > > > > > >>>>> > > > > > >>>>>> Hi Bowen, > > > > > >>>>>> > > > > > >>>>>> I understand the potential benefit of overriding certain > > > built-in > > > > > >>>>>> functions. I'm open to such a feature if many people agree. > > > > > However, it > > > > > >>>>>> would be great to still support overriding catalog functions > > > with > > > > > >>>>>> temporary functions in order to prototype a query even > though > > a > > > > > >>>>>> catalog/database might not be available currently or should > > not > > > be > > > > > >>>>>> modified yet. How about we support both cases? > > > > > >>>>>> > > > > > >>>>>> CREATE TEMPORARY FUNCTION abs > > > > > >>>>>> -> creates/overrides a built-in function and never > consideres > > > > > current > > > > > >>>>>> catalog and database; inconsistent with other DDL but > > acceptable > > > > for > > > > > >>>>>> functions I guess. > > > > > >>>>>> CREATE TEMPORARY FUNCTION cat.db.fun > > > > > >>>>>> -> creates/overrides a catalog function > > > > > >>>>>> > > > > > >>>>>> Regarding "Flink don't have any other built-in objects > > (tables, > > > > > views) > > > > > >>>>>> except functions", this might change in the near future. > Take > > > > > >>>>>> https://issues.apache.org/jira/browse/FLINK-13900 as an > > > example. > > > > > >>>>>> > > > > > >>>>>> Thanks, > > > > > >>>>>> Timo > > > > > >>>>>> > > > > > >>>>>> On 14.09.19 01:40, Bowen Li wrote: > > > > > >>>>>>> Hi Fabian, > > > > > >>>>>>> > > > > > >>>>>>> Yes, I agree 1-part/no-override is the least favorable > thus I > > > > > didn't > > > > > >>>>>>> include that as a voting option, and the discussion is > mainly > > > > > between > > > > > >>>>>>> 1-part/override builtin and 3-part/not override builtin. > > > > > >>>>>>> > > > > > >>>>>>> Re > However, it means that temp functions are differently > > > > treated > > > > > >>>> than > > > > > >>>>>>> other db objects. > > > > > >>>>>>> IMO, the treatment difference results from the fact that > > > > functions > > > > > >>>> are > > > > > >>>>> a > > > > > >>>>>>> bit different from other objects - Flink don't have any > other > > > > > >>>> built-in > > > > > >>>>>>> objects (tables, views) except functions. > > > > > >>>>>>> > > > > > >>>>>>> Cheers, > > > > > >>>>>>> Bowen > > > > > >>>>>>> > > > > > >>>>>> > > > > > >>>> > > > > > >>>> -- > > > > > >>>> Xuefu Zhang > > > > > >>>> > > > > > >>>> "In Honey We Trust!" > > > > > >>>> > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > Xuefu Zhang > > > > > > > > "In Honey We Trust!" > > > > > > > > > > > > > -- > > Xuefu Zhang > > > > "In Honey We Trust!" > > > -- Xuefu Zhang "In Honey We Trust!" |
I agree with Xuefu that inconsistent handling with all the other objects is
not a big problem. Regarding to option#3, the special "system.system" namespace may confuse users. Users need to know the set of built-in function names to know when to use "system.system" namespace. What will happen if user registers a non-builtin function name under the "system.system" namespace? Besides, I think it doesn't solve the "explode" problem I mentioned at the beginning of this thread. So here is my vote: +1 for #1 0 for #2 -1 for #3 Best, Jark On Thu, 19 Sep 2019 at 08:38, Xuefu Z <[hidden email]> wrote: > @Dawid, Re: we also don't need additional referencing the specialcatalog > anywhere. > > True. But once we allow such reference, then user can do so in any possible > place where a function name is expected, for which we have to handle. > That's a big difference, I think. > > Thanks, > Xuefu > > On Wed, Sep 18, 2019 at 5:25 PM Dawid Wysakowicz < > [hidden email]> > wrote: > > > @Bowen I am not suggesting introducing additional catalog. I think we > need > > to get rid of the current built-in catalog. > > > > @Xuefu in option #3 we also don't need additional referencing the special > > catalog anywhere else besides in the CREATE statement. The resolution > > behaviour is exactly the same in both options. > > > > On Thu, 19 Sep 2019, 08:17 Xuefu Z, <[hidden email]> wrote: > > > > > Hi Dawid, > > > > > > "GLOBAL" is a temporary keyword that was given to the approach. It can > be > > > changed to something else for better. > > > > > > The difference between this and the #3 approach is that we only need > the > > > keyword for this create DDL. For other places (such as function > > > referencing), no keyword or special namespace is needed. > > > > > > Thanks, > > > Xuefu > > > > > > On Wed, Sep 18, 2019 at 4:32 PM Dawid Wysakowicz < > > > [hidden email]> > > > wrote: > > > > > > > Hi, > > > > I think it makes sense to start voting at this point. > > > > > > > > Option 1: Only 1-part identifiers > > > > PROS: > > > > - allows shadowing built-in functions > > > > CONS: > > > > - incosistent with all the other objects, both permanent & temporary > > > > - does not allow shadowing catalog functions > > > > > > > > Option 2: Special keyword for built-in function > > > > I think this is quite similar to the special catalog/db. The thing I > am > > > > strongly against in this proposal is the GLOBAL keyword. This keyword > > > has a > > > > meaning in rdbms systems and means a function that is present for a > > > > lifetime of a session in which it was created, but available in all > > other > > > > sessions. Therefore I really don't want to use this keyword in a > > > different > > > > context. > > > > > > > > Option 3: Special catalog/db > > > > > > > > PROS: > > > > - allows shadowing built-in functions > > > > - allows shadowing catalog functions > > > > - consistent with other objects > > > > CONS: > > > > - we introduce a special namespace for built-in functions > > > > > > > > I don't see a problem with introducing the special namespace. In the > > end > > > it > > > > is very similar to the keyword approach. In this case the catalog/db > > > > combination would be the "keyword" > > > > > > > > Therefore my votes: > > > > Option 1: -0 > > > > Option 2: -1 (I might change to +0 if we can come up with a better > > > keyword) > > > > Option 3: +1 > > > > > > > > Best, > > > > Dawid > > > > > > > > > > > > On Thu, 19 Sep 2019, 05:12 Xuefu Z, <[hidden email]> wrote: > > > > > > > > > Hi Aljoscha, > > > > > > > > > > Thanks for the summary and these are great questions to be > answered. > > > The > > > > > answer to your first question is clear: there is a general > agreement > > to > > > > > override built-in functions with temp functions. > > > > > > > > > > However, your second and third questions are sort of related, as a > > > > function > > > > > reference can be either just function name (like "func") or in the > > form > > > > or > > > > > "cat.db.func". When a reference is just function name, it can mean > > > > either a > > > > > built-in function or a function defined in the current cat/db. If > we > > > > > support overriding a built-in function with a temp function, such > > > > > overriding can also cover a function in the current cat/db. > > > > > > > > > > I think what Timo referred as "overriding a catalog function" > means a > > > > temp > > > > > function defined as "cat.db.func" overrides a catalog function > "func" > > > in > > > > > cat/db even if cat/db is not current. To support this, temp > function > > > has > > > > to > > > > > be tied to a cat/db. What's why I said above that the 2nd and 3rd > > > > questions > > > > > are related. The problem with such support is the ambiguity when > user > > > > > defines a function w/o namespace, "CREATE TEMPORARY FUNCTION func > > ...". > > > > > Here "func" can means a global temp function, or a temp function in > > > > current > > > > > cat/db. If we can assume the former, this creates an inconsistency > > > > because > > > > > "CREATE FUNCTION func" actually means a function in current cat/db. > > If > > > we > > > > > assume the latter, then there is no way for user to create a global > > > temp > > > > > function. > > > > > > > > > > Giving a special namespace for built-in functions may solve the > > > ambiguity > > > > > problem above, but it also introduces artificial catalog/database > > that > > > > > needs special treatment and pollutes the cleanness of the code. I > > > would > > > > > rather introduce a syntax in DDL to solve the problem, like "CREATE > > > > > [GLOBAL] TEMPORARY FUNCTION func". > > > > > > > > > > Thus, I'd like to summarize a few candidate proposals for voting > > > > purposes: > > > > > > > > > > 1. Support only global, temporary functions without namespace. Such > > > temp > > > > > functions overrides built-in functions and catalog functions in > > current > > > > > cat/db. The resolution order is: temp functions -> built-in > functions > > > -> > > > > > catalog functions. (Partially or fully qualified functions has no > > > > > ambiguity!) > > > > > > > > > > 2. In addition to #1, support creating and referencing temporary > > > > functions > > > > > associated with a cat/db with "GLOBAL" qualifier in DDL for global > > temp > > > > > functions. The resolution order is: global temp functions -> > built-in > > > > > functions -> temp functions in current cat/db -> catalog function. > > > > > (Resolution for partially or fully qualified function reference is: > > > temp > > > > > functions -> persistent functions.) > > > > > > > > > > 3. In addition to #1, support creating and referencing temporary > > > > functions > > > > > associated with a cat/db with a special namespace for built-in > > > functions > > > > > and global temp functions. The resolution is the same as #2, except > > > that > > > > > the special namespace might be prefixed to a reference to a > built-in > > > > > function or global temp function. (In absence of the special > > namespace, > > > > the > > > > > resolution order is the same as in #2.) > > > > > > > > > > My personal preference is #1, given the unknown use case and > > introduced > > > > > complexity for #2 and #3. However, #2 is an acceptable alternative. > > > Thus, > > > > > my votes are: > > > > > > > > > > +1 for #1 > > > > > +0 for #2 > > > > > -1 for #3 > > > > > > > > > > Everyone, please cast your vote (in above format please!), or let > me > > > know > > > > > if you have more questions or other candidates. > > > > > > > > > > Thanks, > > > > > Xuefu > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Wed, Sep 18, 2019 at 6:42 AM Aljoscha Krettek < > > [hidden email]> > > > > > wrote: > > > > > > > > > > > Hi, > > > > > > > > > > > > I think this discussion and the one for FLIP-64 are very > connected. > > > To > > > > > > resolve the differences, think we have to think about the basic > > > > > principles > > > > > > and find consensus there. The basic questions I see are: > > > > > > > > > > > > - Do we want to support overriding builtin functions? > > > > > > - Do we want to support overriding catalog functions? > > > > > > - And then later: should temporary functions be tied to a > > > > > > catalog/database? > > > > > > > > > > > > I don’t have much to say about these, except that we should > > somewhat > > > > > stick > > > > > > to what the industry does. But I also understand that the > industry > > is > > > > > > already very divided on this. > > > > > > > > > > > > Best, > > > > > > Aljoscha > > > > > > > > > > > > > On 18. Sep 2019, at 11:41, Jark Wu <[hidden email]> wrote: > > > > > > > > > > > > > > Hi, > > > > > > > > > > > > > > +1 to strive for reaching consensus on the remaining topics. We > > are > > > > > > close to the truth. It will waste a lot of time if we resume the > > > topic > > > > > some > > > > > > time later. > > > > > > > > > > > > > > +1 to “1-part/override” and I’m also fine with Timo’s > > “cat.db.fun” > > > > way > > > > > > to override a catalog function. > > > > > > > > > > > > > > I’m not sure about “system.system.fun”, it introduces a > > nonexistent > > > > cat > > > > > > & db? And we still need to do special treatment for the dedicated > > > > > > system.system cat & db? > > > > > > > > > > > > > > Best, > > > > > > > Jark > > > > > > > > > > > > > > > > > > > > >> 在 2019年9月18日,06:54,Timo Walther <[hidden email]> 写道: > > > > > > >> > > > > > > >> Hi everyone, > > > > > > >> > > > > > > >> @Xuefu: I would like to avoid adding too many things > > > incrementally. > > > > > > Users should be able to override all catalog objects consistently > > > > > according > > > > > > to FLIP-64 (Support for Temporary Objects in Table module). If > > > > functions > > > > > > are treated completely different, we need more code and special > > > cases. > > > > > From > > > > > > an implementation perspective, this topic only affects the lookup > > > logic > > > > > > which is rather low implementation effort which is why I would > like > > > to > > > > > > clarify the remaining items. As you said, we have a slight > consenus > > > on > > > > > > overriding built-in functions; we should also strive for reaching > > > > > consensus > > > > > > on the remaining topics. > > > > > > >> > > > > > > >> @Dawid: I like your idea as it ensures registering catalog > > objects > > > > > > consistent and the overriding of built-in functions more > explicit. > > > > > > >> > > > > > > >> Thanks, > > > > > > >> Timo > > > > > > >> > > > > > > >> > > > > > > >> On 17.09.19 11:59, kai wang wrote: > > > > > > >>> hi, everyone > > > > > > >>> I think this flip is very meaningful. it supports functions > > that > > > > can > > > > > be > > > > > > >>> shared by different catalogs and dbs, reducing the > duplication > > of > > > > > > functions. > > > > > > >>> > > > > > > >>> Our group based on flink's sql parser module implements > create > > > > > function > > > > > > >>> feature, stores the parsed function metadata and schema into > > > mysql, > > > > > and > > > > > > >>> also customizes the catalog, customizes sql-client to support > > > > custom > > > > > > >>> schemas and functions. Loaded, but the function is currently > > > > global, > > > > > > and is > > > > > > >>> not subdivided according to catalog and db. > > > > > > >>> > > > > > > >>> In addition, I very much hope to participate in the > development > > > of > > > > > this > > > > > > >>> flip, I have been paying attention to the community, but > found > > it > > > > is > > > > > > more > > > > > > >>> difficult to join. > > > > > > >>> thank you. > > > > > > >>> > > > > > > >>> Xuefu Z <[hidden email]> 于2019年9月17日周二 上午11:19写道: > > > > > > >>> > > > > > > >>>> Thanks to Tmo and Dawid for sharing thoughts. > > > > > > >>>> > > > > > > >>>> It seems to me that there is a general consensus on having > > temp > > > > > > functions > > > > > > >>>> that have no namespaces and overwrite built-in functions. > (As > > a > > > > side > > > > > > note > > > > > > >>>> for comparability, the current user defined functions are > all > > > > > > temporary and > > > > > > >>>> having no namespaces.) > > > > > > >>>> > > > > > > >>>> Nevertheless, I can also see the merit of having namespaced > > temp > > > > > > functions > > > > > > >>>> that can overwrite functions defined in a specific cat/db. > > > > However, > > > > > > this > > > > > > >>>> idea appears orthogonal to the former and can be added > > > > > incrementally. > > > > > > >>>> > > > > > > >>>> How about we first implement non-namespaced temp functions > now > > > and > > > > > > leave > > > > > > >>>> the door open for namespaced ones for later releases as the > > > > > > requirement > > > > > > >>>> might become more crystal? This also helps shorten the > debate > > > and > > > > > > allow us > > > > > > >>>> to make some progress along this direction. > > > > > > >>>> > > > > > > >>>> As to Dawid's idea of having a dedicated cat/db to host the > > > > > temporary > > > > > > temp > > > > > > >>>> functions that don't have namespaces, my only concern is the > > > > special > > > > > > >>>> treatment for a cat/db, which makes code less clean, as > > evident > > > in > > > > > > treating > > > > > > >>>> the built-in catalog currently. > > > > > > >>>> > > > > > > >>>> Thanks, > > > > > > >>>> Xuefiu > > > > > > >>>> > > > > > > >>>> On Mon, Sep 16, 2019 at 5:07 PM Dawid Wysakowicz < > > > > > > >>>> [hidden email]> > > > > > > >>>> wrote: > > > > > > >>>> > > > > > > >>>>> Hi, > > > > > > >>>>> Another idea to consider on top of Timo's suggestion. How > > about > > > > we > > > > > > have a > > > > > > >>>>> special namespace (catalog + database) for built-in > objects? > > > This > > > > > > catalog > > > > > > >>>>> would be invisible for users as Xuefu was suggesting. > > > > > > >>>>> > > > > > > >>>>> Then users could still override built-in functions, if they > > > fully > > > > > > qualify > > > > > > >>>>> object with the built-in namespace, but by default the > common > > > > logic > > > > > > of > > > > > > >>>>> current dB & cat would be used. > > > > > > >>>>> > > > > > > >>>>> CREATE TEMPORARY FUNCTION func ... > > > > > > >>>>> registers temporary function in current cat & dB > > > > > > >>>>> > > > > > > >>>>> CREATE TEMPORARY FUNCTION cat.db.func ... > > > > > > >>>>> registers temporary function in cat db > > > > > > >>>>> > > > > > > >>>>> CREATE TEMPORARY FUNCTION system.system.func ... > > > > > > >>>>> Overrides built-in function with temporary function > > > > > > >>>>> > > > > > > >>>>> The built-in/system namespace would not be writable for > > > permanent > > > > > > >>>> objects. > > > > > > >>>>> WDYT? > > > > > > >>>>> > > > > > > >>>>> This way I think we can have benefits of both solutions. > > > > > > >>>>> > > > > > > >>>>> Best, > > > > > > >>>>> Dawid > > > > > > >>>>> > > > > > > >>>>> > > > > > > >>>>> On Tue, 17 Sep 2019, 07:24 Timo Walther, < > [hidden email] > > > > > > > > wrote: > > > > > > >>>>> > > > > > > >>>>>> Hi Bowen, > > > > > > >>>>>> > > > > > > >>>>>> I understand the potential benefit of overriding certain > > > > built-in > > > > > > >>>>>> functions. I'm open to such a feature if many people > agree. > > > > > > However, it > > > > > > >>>>>> would be great to still support overriding catalog > functions > > > > with > > > > > > >>>>>> temporary functions in order to prototype a query even > > though > > > a > > > > > > >>>>>> catalog/database might not be available currently or > should > > > not > > > > be > > > > > > >>>>>> modified yet. How about we support both cases? > > > > > > >>>>>> > > > > > > >>>>>> CREATE TEMPORARY FUNCTION abs > > > > > > >>>>>> -> creates/overrides a built-in function and never > > consideres > > > > > > current > > > > > > >>>>>> catalog and database; inconsistent with other DDL but > > > acceptable > > > > > for > > > > > > >>>>>> functions I guess. > > > > > > >>>>>> CREATE TEMPORARY FUNCTION cat.db.fun > > > > > > >>>>>> -> creates/overrides a catalog function > > > > > > >>>>>> > > > > > > >>>>>> Regarding "Flink don't have any other built-in objects > > > (tables, > > > > > > views) > > > > > > >>>>>> except functions", this might change in the near future. > > Take > > > > > > >>>>>> https://issues.apache.org/jira/browse/FLINK-13900 as an > > > > example. > > > > > > >>>>>> > > > > > > >>>>>> Thanks, > > > > > > >>>>>> Timo > > > > > > >>>>>> > > > > > > >>>>>> On 14.09.19 01:40, Bowen Li wrote: > > > > > > >>>>>>> Hi Fabian, > > > > > > >>>>>>> > > > > > > >>>>>>> Yes, I agree 1-part/no-override is the least favorable > > thus I > > > > > > didn't > > > > > > >>>>>>> include that as a voting option, and the discussion is > > mainly > > > > > > between > > > > > > >>>>>>> 1-part/override builtin and 3-part/not override builtin. > > > > > > >>>>>>> > > > > > > >>>>>>> Re > However, it means that temp functions are > differently > > > > > treated > > > > > > >>>> than > > > > > > >>>>>>> other db objects. > > > > > > >>>>>>> IMO, the treatment difference results from the fact that > > > > > functions > > > > > > >>>> are > > > > > > >>>>> a > > > > > > >>>>>>> bit different from other objects - Flink don't have any > > other > > > > > > >>>> built-in > > > > > > >>>>>>> objects (tables, views) except functions. > > > > > > >>>>>>> > > > > > > >>>>>>> Cheers, > > > > > > >>>>>>> Bowen > > > > > > >>>>>>> > > > > > > >>>>>> > > > > > > >>>> > > > > > > >>>> -- > > > > > > >>>> Xuefu Zhang > > > > > > >>>> > > > > > > >>>> "In Honey We Trust!" > > > > > > >>>> > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > Xuefu Zhang > > > > > > > > > > "In Honey We Trust!" > > > > > > > > > > > > > > > > > > -- > > > Xuefu Zhang > > > > > > "In Honey We Trust!" > > > > > > > > -- > Xuefu Zhang > > "In Honey We Trust!" > |
Hi,
It is a quite long discussion to follow and I hope I didn’t misunderstand anything. From the proposals presented by Xuefu I would vote: -1 for #1 and #2 +1 for #3 Besides #3 being IMO more general and more consistent, having qualified names (#3) would help/make easier for someone to use cross databases/catalogs queries (joining multiple data sets/streams). For example with some functions to manipulate/clean up/convert the stored data in different catalogs registered in the respective catalogs. Piotrek > On 19 Sep 2019, at 06:35, Jark Wu <[hidden email]> wrote: > > I agree with Xuefu that inconsistent handling with all the other objects is > not a big problem. > > Regarding to option#3, the special "system.system" namespace may confuse > users. > Users need to know the set of built-in function names to know when to use > "system.system" namespace. > What will happen if user registers a non-builtin function name under the > "system.system" namespace? > Besides, I think it doesn't solve the "explode" problem I mentioned at the > beginning of this thread. > > So here is my vote: > > +1 for #1 > 0 for #2 > -1 for #3 > > Best, > Jark > > > On Thu, 19 Sep 2019 at 08:38, Xuefu Z <[hidden email]> wrote: > >> @Dawid, Re: we also don't need additional referencing the specialcatalog >> anywhere. >> >> True. But once we allow such reference, then user can do so in any possible >> place where a function name is expected, for which we have to handle. >> That's a big difference, I think. >> >> Thanks, >> Xuefu >> >> On Wed, Sep 18, 2019 at 5:25 PM Dawid Wysakowicz < >> [hidden email]> >> wrote: >> >>> @Bowen I am not suggesting introducing additional catalog. I think we >> need >>> to get rid of the current built-in catalog. >>> >>> @Xuefu in option #3 we also don't need additional referencing the special >>> catalog anywhere else besides in the CREATE statement. The resolution >>> behaviour is exactly the same in both options. >>> >>> On Thu, 19 Sep 2019, 08:17 Xuefu Z, <[hidden email]> wrote: >>> >>>> Hi Dawid, >>>> >>>> "GLOBAL" is a temporary keyword that was given to the approach. It can >> be >>>> changed to something else for better. >>>> >>>> The difference between this and the #3 approach is that we only need >> the >>>> keyword for this create DDL. For other places (such as function >>>> referencing), no keyword or special namespace is needed. >>>> >>>> Thanks, >>>> Xuefu >>>> >>>> On Wed, Sep 18, 2019 at 4:32 PM Dawid Wysakowicz < >>>> [hidden email]> >>>> wrote: >>>> >>>>> Hi, >>>>> I think it makes sense to start voting at this point. >>>>> >>>>> Option 1: Only 1-part identifiers >>>>> PROS: >>>>> - allows shadowing built-in functions >>>>> CONS: >>>>> - incosistent with all the other objects, both permanent & temporary >>>>> - does not allow shadowing catalog functions >>>>> >>>>> Option 2: Special keyword for built-in function >>>>> I think this is quite similar to the special catalog/db. The thing I >> am >>>>> strongly against in this proposal is the GLOBAL keyword. This keyword >>>> has a >>>>> meaning in rdbms systems and means a function that is present for a >>>>> lifetime of a session in which it was created, but available in all >>> other >>>>> sessions. Therefore I really don't want to use this keyword in a >>>> different >>>>> context. >>>>> >>>>> Option 3: Special catalog/db >>>>> >>>>> PROS: >>>>> - allows shadowing built-in functions >>>>> - allows shadowing catalog functions >>>>> - consistent with other objects >>>>> CONS: >>>>> - we introduce a special namespace for built-in functions >>>>> >>>>> I don't see a problem with introducing the special namespace. In the >>> end >>>> it >>>>> is very similar to the keyword approach. In this case the catalog/db >>>>> combination would be the "keyword" >>>>> >>>>> Therefore my votes: >>>>> Option 1: -0 >>>>> Option 2: -1 (I might change to +0 if we can come up with a better >>>> keyword) >>>>> Option 3: +1 >>>>> >>>>> Best, >>>>> Dawid >>>>> >>>>> >>>>> On Thu, 19 Sep 2019, 05:12 Xuefu Z, <[hidden email]> wrote: >>>>> >>>>>> Hi Aljoscha, >>>>>> >>>>>> Thanks for the summary and these are great questions to be >> answered. >>>> The >>>>>> answer to your first question is clear: there is a general >> agreement >>> to >>>>>> override built-in functions with temp functions. >>>>>> >>>>>> However, your second and third questions are sort of related, as a >>>>> function >>>>>> reference can be either just function name (like "func") or in the >>> form >>>>> or >>>>>> "cat.db.func". When a reference is just function name, it can mean >>>>> either a >>>>>> built-in function or a function defined in the current cat/db. If >> we >>>>>> support overriding a built-in function with a temp function, such >>>>>> overriding can also cover a function in the current cat/db. >>>>>> >>>>>> I think what Timo referred as "overriding a catalog function" >> means a >>>>> temp >>>>>> function defined as "cat.db.func" overrides a catalog function >> "func" >>>> in >>>>>> cat/db even if cat/db is not current. To support this, temp >> function >>>> has >>>>> to >>>>>> be tied to a cat/db. What's why I said above that the 2nd and 3rd >>>>> questions >>>>>> are related. The problem with such support is the ambiguity when >> user >>>>>> defines a function w/o namespace, "CREATE TEMPORARY FUNCTION func >>> ...". >>>>>> Here "func" can means a global temp function, or a temp function in >>>>> current >>>>>> cat/db. If we can assume the former, this creates an inconsistency >>>>> because >>>>>> "CREATE FUNCTION func" actually means a function in current cat/db. >>> If >>>> we >>>>>> assume the latter, then there is no way for user to create a global >>>> temp >>>>>> function. >>>>>> >>>>>> Giving a special namespace for built-in functions may solve the >>>> ambiguity >>>>>> problem above, but it also introduces artificial catalog/database >>> that >>>>>> needs special treatment and pollutes the cleanness of the code. I >>>> would >>>>>> rather introduce a syntax in DDL to solve the problem, like "CREATE >>>>>> [GLOBAL] TEMPORARY FUNCTION func". >>>>>> >>>>>> Thus, I'd like to summarize a few candidate proposals for voting >>>>> purposes: >>>>>> >>>>>> 1. Support only global, temporary functions without namespace. Such >>>> temp >>>>>> functions overrides built-in functions and catalog functions in >>> current >>>>>> cat/db. The resolution order is: temp functions -> built-in >> functions >>>> -> >>>>>> catalog functions. (Partially or fully qualified functions has no >>>>>> ambiguity!) >>>>>> >>>>>> 2. In addition to #1, support creating and referencing temporary >>>>> functions >>>>>> associated with a cat/db with "GLOBAL" qualifier in DDL for global >>> temp >>>>>> functions. The resolution order is: global temp functions -> >> built-in >>>>>> functions -> temp functions in current cat/db -> catalog function. >>>>>> (Resolution for partially or fully qualified function reference is: >>>> temp >>>>>> functions -> persistent functions.) >>>>>> >>>>>> 3. In addition to #1, support creating and referencing temporary >>>>> functions >>>>>> associated with a cat/db with a special namespace for built-in >>>> functions >>>>>> and global temp functions. The resolution is the same as #2, except >>>> that >>>>>> the special namespace might be prefixed to a reference to a >> built-in >>>>>> function or global temp function. (In absence of the special >>> namespace, >>>>> the >>>>>> resolution order is the same as in #2.) >>>>>> >>>>>> My personal preference is #1, given the unknown use case and >>> introduced >>>>>> complexity for #2 and #3. However, #2 is an acceptable alternative. >>>> Thus, >>>>>> my votes are: >>>>>> >>>>>> +1 for #1 >>>>>> +0 for #2 >>>>>> -1 for #3 >>>>>> >>>>>> Everyone, please cast your vote (in above format please!), or let >> me >>>> know >>>>>> if you have more questions or other candidates. >>>>>> >>>>>> Thanks, >>>>>> Xuefu >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> On Wed, Sep 18, 2019 at 6:42 AM Aljoscha Krettek < >>> [hidden email]> >>>>>> wrote: >>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> I think this discussion and the one for FLIP-64 are very >> connected. >>>> To >>>>>>> resolve the differences, think we have to think about the basic >>>>>> principles >>>>>>> and find consensus there. The basic questions I see are: >>>>>>> >>>>>>> - Do we want to support overriding builtin functions? >>>>>>> - Do we want to support overriding catalog functions? >>>>>>> - And then later: should temporary functions be tied to a >>>>>>> catalog/database? >>>>>>> >>>>>>> I don’t have much to say about these, except that we should >>> somewhat >>>>>> stick >>>>>>> to what the industry does. But I also understand that the >> industry >>> is >>>>>>> already very divided on this. >>>>>>> >>>>>>> Best, >>>>>>> Aljoscha >>>>>>> >>>>>>>> On 18. Sep 2019, at 11:41, Jark Wu <[hidden email]> wrote: >>>>>>>> >>>>>>>> Hi, >>>>>>>> >>>>>>>> +1 to strive for reaching consensus on the remaining topics. We >>> are >>>>>>> close to the truth. It will waste a lot of time if we resume the >>>> topic >>>>>> some >>>>>>> time later. >>>>>>>> >>>>>>>> +1 to “1-part/override” and I’m also fine with Timo’s >>> “cat.db.fun” >>>>> way >>>>>>> to override a catalog function. >>>>>>>> >>>>>>>> I’m not sure about “system.system.fun”, it introduces a >>> nonexistent >>>>> cat >>>>>>> & db? And we still need to do special treatment for the dedicated >>>>>>> system.system cat & db? >>>>>>>> >>>>>>>> Best, >>>>>>>> Jark >>>>>>>> >>>>>>>> >>>>>>>>> 在 2019年9月18日,06:54,Timo Walther <[hidden email]> 写道: >>>>>>>>> >>>>>>>>> Hi everyone, >>>>>>>>> >>>>>>>>> @Xuefu: I would like to avoid adding too many things >>>> incrementally. >>>>>>> Users should be able to override all catalog objects consistently >>>>>> according >>>>>>> to FLIP-64 (Support for Temporary Objects in Table module). If >>>>> functions >>>>>>> are treated completely different, we need more code and special >>>> cases. >>>>>> From >>>>>>> an implementation perspective, this topic only affects the lookup >>>> logic >>>>>>> which is rather low implementation effort which is why I would >> like >>>> to >>>>>>> clarify the remaining items. As you said, we have a slight >> consenus >>>> on >>>>>>> overriding built-in functions; we should also strive for reaching >>>>>> consensus >>>>>>> on the remaining topics. >>>>>>>>> >>>>>>>>> @Dawid: I like your idea as it ensures registering catalog >>> objects >>>>>>> consistent and the overriding of built-in functions more >> explicit. >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Timo >>>>>>>>> >>>>>>>>> >>>>>>>>> On 17.09.19 11:59, kai wang wrote: >>>>>>>>>> hi, everyone >>>>>>>>>> I think this flip is very meaningful. it supports functions >>> that >>>>> can >>>>>> be >>>>>>>>>> shared by different catalogs and dbs, reducing the >> duplication >>> of >>>>>>> functions. >>>>>>>>>> >>>>>>>>>> Our group based on flink's sql parser module implements >> create >>>>>> function >>>>>>>>>> feature, stores the parsed function metadata and schema into >>>> mysql, >>>>>> and >>>>>>>>>> also customizes the catalog, customizes sql-client to support >>>>> custom >>>>>>>>>> schemas and functions. Loaded, but the function is currently >>>>> global, >>>>>>> and is >>>>>>>>>> not subdivided according to catalog and db. >>>>>>>>>> >>>>>>>>>> In addition, I very much hope to participate in the >> development >>>> of >>>>>> this >>>>>>>>>> flip, I have been paying attention to the community, but >> found >>> it >>>>> is >>>>>>> more >>>>>>>>>> difficult to join. >>>>>>>>>> thank you. >>>>>>>>>> >>>>>>>>>> Xuefu Z <[hidden email]> 于2019年9月17日周二 上午11:19写道: >>>>>>>>>> >>>>>>>>>>> Thanks to Tmo and Dawid for sharing thoughts. >>>>>>>>>>> >>>>>>>>>>> It seems to me that there is a general consensus on having >>> temp >>>>>>> functions >>>>>>>>>>> that have no namespaces and overwrite built-in functions. >> (As >>> a >>>>> side >>>>>>> note >>>>>>>>>>> for comparability, the current user defined functions are >> all >>>>>>> temporary and >>>>>>>>>>> having no namespaces.) >>>>>>>>>>> >>>>>>>>>>> Nevertheless, I can also see the merit of having namespaced >>> temp >>>>>>> functions >>>>>>>>>>> that can overwrite functions defined in a specific cat/db. >>>>> However, >>>>>>> this >>>>>>>>>>> idea appears orthogonal to the former and can be added >>>>>> incrementally. >>>>>>>>>>> >>>>>>>>>>> How about we first implement non-namespaced temp functions >> now >>>> and >>>>>>> leave >>>>>>>>>>> the door open for namespaced ones for later releases as the >>>>>>> requirement >>>>>>>>>>> might become more crystal? This also helps shorten the >> debate >>>> and >>>>>>> allow us >>>>>>>>>>> to make some progress along this direction. >>>>>>>>>>> >>>>>>>>>>> As to Dawid's idea of having a dedicated cat/db to host the >>>>>> temporary >>>>>>> temp >>>>>>>>>>> functions that don't have namespaces, my only concern is the >>>>> special >>>>>>>>>>> treatment for a cat/db, which makes code less clean, as >>> evident >>>> in >>>>>>> treating >>>>>>>>>>> the built-in catalog currently. >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Xuefiu >>>>>>>>>>> >>>>>>>>>>> On Mon, Sep 16, 2019 at 5:07 PM Dawid Wysakowicz < >>>>>>>>>>> [hidden email]> >>>>>>>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>>> Hi, >>>>>>>>>>>> Another idea to consider on top of Timo's suggestion. How >>> about >>>>> we >>>>>>> have a >>>>>>>>>>>> special namespace (catalog + database) for built-in >> objects? >>>> This >>>>>>> catalog >>>>>>>>>>>> would be invisible for users as Xuefu was suggesting. >>>>>>>>>>>> >>>>>>>>>>>> Then users could still override built-in functions, if they >>>> fully >>>>>>> qualify >>>>>>>>>>>> object with the built-in namespace, but by default the >> common >>>>> logic >>>>>>> of >>>>>>>>>>>> current dB & cat would be used. >>>>>>>>>>>> >>>>>>>>>>>> CREATE TEMPORARY FUNCTION func ... >>>>>>>>>>>> registers temporary function in current cat & dB >>>>>>>>>>>> >>>>>>>>>>>> CREATE TEMPORARY FUNCTION cat.db.func ... >>>>>>>>>>>> registers temporary function in cat db >>>>>>>>>>>> >>>>>>>>>>>> CREATE TEMPORARY FUNCTION system.system.func ... >>>>>>>>>>>> Overrides built-in function with temporary function >>>>>>>>>>>> >>>>>>>>>>>> The built-in/system namespace would not be writable for >>>> permanent >>>>>>>>>>> objects. >>>>>>>>>>>> WDYT? >>>>>>>>>>>> >>>>>>>>>>>> This way I think we can have benefits of both solutions. >>>>>>>>>>>> >>>>>>>>>>>> Best, >>>>>>>>>>>> Dawid >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On Tue, 17 Sep 2019, 07:24 Timo Walther, < >> [hidden email] >>>> >>>>>> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> Hi Bowen, >>>>>>>>>>>>> >>>>>>>>>>>>> I understand the potential benefit of overriding certain >>>>> built-in >>>>>>>>>>>>> functions. I'm open to such a feature if many people >> agree. >>>>>>> However, it >>>>>>>>>>>>> would be great to still support overriding catalog >> functions >>>>> with >>>>>>>>>>>>> temporary functions in order to prototype a query even >>> though >>>> a >>>>>>>>>>>>> catalog/database might not be available currently or >> should >>>> not >>>>> be >>>>>>>>>>>>> modified yet. How about we support both cases? >>>>>>>>>>>>> >>>>>>>>>>>>> CREATE TEMPORARY FUNCTION abs >>>>>>>>>>>>> -> creates/overrides a built-in function and never >>> consideres >>>>>>> current >>>>>>>>>>>>> catalog and database; inconsistent with other DDL but >>>> acceptable >>>>>> for >>>>>>>>>>>>> functions I guess. >>>>>>>>>>>>> CREATE TEMPORARY FUNCTION cat.db.fun >>>>>>>>>>>>> -> creates/overrides a catalog function >>>>>>>>>>>>> >>>>>>>>>>>>> Regarding "Flink don't have any other built-in objects >>>> (tables, >>>>>>> views) >>>>>>>>>>>>> except functions", this might change in the near future. >>> Take >>>>>>>>>>>>> https://issues.apache.org/jira/browse/FLINK-13900 as an >>>>> example. >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> Timo >>>>>>>>>>>>> >>>>>>>>>>>>> On 14.09.19 01:40, Bowen Li wrote: >>>>>>>>>>>>>> Hi Fabian, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Yes, I agree 1-part/no-override is the least favorable >>> thus I >>>>>>> didn't >>>>>>>>>>>>>> include that as a voting option, and the discussion is >>> mainly >>>>>>> between >>>>>>>>>>>>>> 1-part/override builtin and 3-part/not override builtin. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Re > However, it means that temp functions are >> differently >>>>>> treated >>>>>>>>>>> than >>>>>>>>>>>>>> other db objects. >>>>>>>>>>>>>> IMO, the treatment difference results from the fact that >>>>>> functions >>>>>>>>>>> are >>>>>>>>>>>> a >>>>>>>>>>>>>> bit different from other objects - Flink don't have any >>> other >>>>>>>>>>> built-in >>>>>>>>>>>>>> objects (tables, views) except functions. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Cheers, >>>>>>>>>>>>>> Bowen >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> -- >>>>>>>>>>> Xuefu Zhang >>>>>>>>>>> >>>>>>>>>>> "In Honey We Trust!" >>>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>>>>> >>>>>> >>>>>> -- >>>>>> Xuefu Zhang >>>>>> >>>>>> "In Honey We Trust!" >>>>>> >>>>> >>>> >>>> >>>> -- >>>> Xuefu Zhang >>>> >>>> "In Honey We Trust!" >>>> >>> >> >> >> -- >> Xuefu Zhang >> >> "In Honey We Trust!" >> |
Looks like I'm the only person who is willing to +1 to #2 for now :-)
But I would suggest to change the keyword from GLOBAL to something like BUILTIN. I think #2 and #3 are almost the same proposal, just with different format to indicate whether it want to override built-in functions. My biggest reason to choose it is I want this behavior be consistent with temporal tables. I will give some examples to show the behavior and also make sure I'm not misunderstanding anything here. For most DBs, when user create a temporary table with: CREATE TEMPORARY TABLE t1 It's actually equivalent with: CREATE TEMPORARY TABLE `curent_db`.t1 If user change current database, they will not be able to access t1 without fully qualified name, .i.e db1.t1 (assuming db1 is current database when this temporary table is created). Only #2 and #3 followed this behavior and I would vote for this since this makes such behavior consistent through temporal tables and functions. Why I'm not voting for #3 is a special catalog and database just looks very hacky to me. It gave a imply that our built-in functions saved at a special catalog and database, which is actually not. Introducing a dedicated keyword like CREATE TEMPORARY BUILTIN FUNCTION looks more clear and straightforward. One can argue that we should avoid introducing new keyword, but it's also very rare that a system can overwrite built-in functions. Since we decided to support this, introduce a new keyword is not a big deal IMO. Best, Kurt On Thu, Sep 19, 2019 at 3:07 PM Piotr Nowojski <[hidden email]> wrote: > Hi, > > It is a quite long discussion to follow and I hope I didn’t misunderstand > anything. From the proposals presented by Xuefu I would vote: > > -1 for #1 and #2 > +1 for #3 > > Besides #3 being IMO more general and more consistent, having qualified > names (#3) would help/make easier for someone to use cross > databases/catalogs queries (joining multiple data sets/streams). For > example with some functions to manipulate/clean up/convert the stored data > in different catalogs registered in the respective catalogs. > > Piotrek > > > On 19 Sep 2019, at 06:35, Jark Wu <[hidden email]> wrote: > > > > I agree with Xuefu that inconsistent handling with all the other objects > is > > not a big problem. > > > > Regarding to option#3, the special "system.system" namespace may confuse > > users. > > Users need to know the set of built-in function names to know when to use > > "system.system" namespace. > > What will happen if user registers a non-builtin function name under the > > "system.system" namespace? > > Besides, I think it doesn't solve the "explode" problem I mentioned at > the > > beginning of this thread. > > > > So here is my vote: > > > > +1 for #1 > > 0 for #2 > > -1 for #3 > > > > Best, > > Jark > > > > > > On Thu, 19 Sep 2019 at 08:38, Xuefu Z <[hidden email]> wrote: > > > >> @Dawid, Re: we also don't need additional referencing the specialcatalog > >> anywhere. > >> > >> True. But once we allow such reference, then user can do so in any > possible > >> place where a function name is expected, for which we have to handle. > >> That's a big difference, I think. > >> > >> Thanks, > >> Xuefu > >> > >> On Wed, Sep 18, 2019 at 5:25 PM Dawid Wysakowicz < > >> [hidden email]> > >> wrote: > >> > >>> @Bowen I am not suggesting introducing additional catalog. I think we > >> need > >>> to get rid of the current built-in catalog. > >>> > >>> @Xuefu in option #3 we also don't need additional referencing the > special > >>> catalog anywhere else besides in the CREATE statement. The resolution > >>> behaviour is exactly the same in both options. > >>> > >>> On Thu, 19 Sep 2019, 08:17 Xuefu Z, <[hidden email]> wrote: > >>> > >>>> Hi Dawid, > >>>> > >>>> "GLOBAL" is a temporary keyword that was given to the approach. It can > >> be > >>>> changed to something else for better. > >>>> > >>>> The difference between this and the #3 approach is that we only need > >> the > >>>> keyword for this create DDL. For other places (such as function > >>>> referencing), no keyword or special namespace is needed. > >>>> > >>>> Thanks, > >>>> Xuefu > >>>> > >>>> On Wed, Sep 18, 2019 at 4:32 PM Dawid Wysakowicz < > >>>> [hidden email]> > >>>> wrote: > >>>> > >>>>> Hi, > >>>>> I think it makes sense to start voting at this point. > >>>>> > >>>>> Option 1: Only 1-part identifiers > >>>>> PROS: > >>>>> - allows shadowing built-in functions > >>>>> CONS: > >>>>> - incosistent with all the other objects, both permanent & temporary > >>>>> - does not allow shadowing catalog functions > >>>>> > >>>>> Option 2: Special keyword for built-in function > >>>>> I think this is quite similar to the special catalog/db. The thing I > >> am > >>>>> strongly against in this proposal is the GLOBAL keyword. This keyword > >>>> has a > >>>>> meaning in rdbms systems and means a function that is present for a > >>>>> lifetime of a session in which it was created, but available in all > >>> other > >>>>> sessions. Therefore I really don't want to use this keyword in a > >>>> different > >>>>> context. > >>>>> > >>>>> Option 3: Special catalog/db > >>>>> > >>>>> PROS: > >>>>> - allows shadowing built-in functions > >>>>> - allows shadowing catalog functions > >>>>> - consistent with other objects > >>>>> CONS: > >>>>> - we introduce a special namespace for built-in functions > >>>>> > >>>>> I don't see a problem with introducing the special namespace. In the > >>> end > >>>> it > >>>>> is very similar to the keyword approach. In this case the catalog/db > >>>>> combination would be the "keyword" > >>>>> > >>>>> Therefore my votes: > >>>>> Option 1: -0 > >>>>> Option 2: -1 (I might change to +0 if we can come up with a better > >>>> keyword) > >>>>> Option 3: +1 > >>>>> > >>>>> Best, > >>>>> Dawid > >>>>> > >>>>> > >>>>> On Thu, 19 Sep 2019, 05:12 Xuefu Z, <[hidden email]> wrote: > >>>>> > >>>>>> Hi Aljoscha, > >>>>>> > >>>>>> Thanks for the summary and these are great questions to be > >> answered. > >>>> The > >>>>>> answer to your first question is clear: there is a general > >> agreement > >>> to > >>>>>> override built-in functions with temp functions. > >>>>>> > >>>>>> However, your second and third questions are sort of related, as a > >>>>> function > >>>>>> reference can be either just function name (like "func") or in the > >>> form > >>>>> or > >>>>>> "cat.db.func". When a reference is just function name, it can mean > >>>>> either a > >>>>>> built-in function or a function defined in the current cat/db. If > >> we > >>>>>> support overriding a built-in function with a temp function, such > >>>>>> overriding can also cover a function in the current cat/db. > >>>>>> > >>>>>> I think what Timo referred as "overriding a catalog function" > >> means a > >>>>> temp > >>>>>> function defined as "cat.db.func" overrides a catalog function > >> "func" > >>>> in > >>>>>> cat/db even if cat/db is not current. To support this, temp > >> function > >>>> has > >>>>> to > >>>>>> be tied to a cat/db. What's why I said above that the 2nd and 3rd > >>>>> questions > >>>>>> are related. The problem with such support is the ambiguity when > >> user > >>>>>> defines a function w/o namespace, "CREATE TEMPORARY FUNCTION func > >>> ...". > >>>>>> Here "func" can means a global temp function, or a temp function in > >>>>> current > >>>>>> cat/db. If we can assume the former, this creates an inconsistency > >>>>> because > >>>>>> "CREATE FUNCTION func" actually means a function in current cat/db. > >>> If > >>>> we > >>>>>> assume the latter, then there is no way for user to create a global > >>>> temp > >>>>>> function. > >>>>>> > >>>>>> Giving a special namespace for built-in functions may solve the > >>>> ambiguity > >>>>>> problem above, but it also introduces artificial catalog/database > >>> that > >>>>>> needs special treatment and pollutes the cleanness of the code. I > >>>> would > >>>>>> rather introduce a syntax in DDL to solve the problem, like "CREATE > >>>>>> [GLOBAL] TEMPORARY FUNCTION func". > >>>>>> > >>>>>> Thus, I'd like to summarize a few candidate proposals for voting > >>>>> purposes: > >>>>>> > >>>>>> 1. Support only global, temporary functions without namespace. Such > >>>> temp > >>>>>> functions overrides built-in functions and catalog functions in > >>> current > >>>>>> cat/db. The resolution order is: temp functions -> built-in > >> functions > >>>> -> > >>>>>> catalog functions. (Partially or fully qualified functions has no > >>>>>> ambiguity!) > >>>>>> > >>>>>> 2. In addition to #1, support creating and referencing temporary > >>>>> functions > >>>>>> associated with a cat/db with "GLOBAL" qualifier in DDL for global > >>> temp > >>>>>> functions. The resolution order is: global temp functions -> > >> built-in > >>>>>> functions -> temp functions in current cat/db -> catalog function. > >>>>>> (Resolution for partially or fully qualified function reference is: > >>>> temp > >>>>>> functions -> persistent functions.) > >>>>>> > >>>>>> 3. In addition to #1, support creating and referencing temporary > >>>>> functions > >>>>>> associated with a cat/db with a special namespace for built-in > >>>> functions > >>>>>> and global temp functions. The resolution is the same as #2, except > >>>> that > >>>>>> the special namespace might be prefixed to a reference to a > >> built-in > >>>>>> function or global temp function. (In absence of the special > >>> namespace, > >>>>> the > >>>>>> resolution order is the same as in #2.) > >>>>>> > >>>>>> My personal preference is #1, given the unknown use case and > >>> introduced > >>>>>> complexity for #2 and #3. However, #2 is an acceptable alternative. > >>>> Thus, > >>>>>> my votes are: > >>>>>> > >>>>>> +1 for #1 > >>>>>> +0 for #2 > >>>>>> -1 for #3 > >>>>>> > >>>>>> Everyone, please cast your vote (in above format please!), or let > >> me > >>>> know > >>>>>> if you have more questions or other candidates. > >>>>>> > >>>>>> Thanks, > >>>>>> Xuefu > >>>>>> > >>>>>> > >>>>>> > >>>>>> > >>>>>> > >>>>>> > >>>>>> > >>>>>> On Wed, Sep 18, 2019 at 6:42 AM Aljoscha Krettek < > >>> [hidden email]> > >>>>>> wrote: > >>>>>> > >>>>>>> Hi, > >>>>>>> > >>>>>>> I think this discussion and the one for FLIP-64 are very > >> connected. > >>>> To > >>>>>>> resolve the differences, think we have to think about the basic > >>>>>> principles > >>>>>>> and find consensus there. The basic questions I see are: > >>>>>>> > >>>>>>> - Do we want to support overriding builtin functions? > >>>>>>> - Do we want to support overriding catalog functions? > >>>>>>> - And then later: should temporary functions be tied to a > >>>>>>> catalog/database? > >>>>>>> > >>>>>>> I don’t have much to say about these, except that we should > >>> somewhat > >>>>>> stick > >>>>>>> to what the industry does. But I also understand that the > >> industry > >>> is > >>>>>>> already very divided on this. > >>>>>>> > >>>>>>> Best, > >>>>>>> Aljoscha > >>>>>>> > >>>>>>>> On 18. Sep 2019, at 11:41, Jark Wu <[hidden email]> wrote: > >>>>>>>> > >>>>>>>> Hi, > >>>>>>>> > >>>>>>>> +1 to strive for reaching consensus on the remaining topics. We > >>> are > >>>>>>> close to the truth. It will waste a lot of time if we resume the > >>>> topic > >>>>>> some > >>>>>>> time later. > >>>>>>>> > >>>>>>>> +1 to “1-part/override” and I’m also fine with Timo’s > >>> “cat.db.fun” > >>>>> way > >>>>>>> to override a catalog function. > >>>>>>>> > >>>>>>>> I’m not sure about “system.system.fun”, it introduces a > >>> nonexistent > >>>>> cat > >>>>>>> & db? And we still need to do special treatment for the dedicated > >>>>>>> system.system cat & db? > >>>>>>>> > >>>>>>>> Best, > >>>>>>>> Jark > >>>>>>>> > >>>>>>>> > >>>>>>>>> 在 2019年9月18日,06:54,Timo Walther <[hidden email]> 写道: > >>>>>>>>> > >>>>>>>>> Hi everyone, > >>>>>>>>> > >>>>>>>>> @Xuefu: I would like to avoid adding too many things > >>>> incrementally. > >>>>>>> Users should be able to override all catalog objects consistently > >>>>>> according > >>>>>>> to FLIP-64 (Support for Temporary Objects in Table module). If > >>>>> functions > >>>>>>> are treated completely different, we need more code and special > >>>> cases. > >>>>>> From > >>>>>>> an implementation perspective, this topic only affects the lookup > >>>> logic > >>>>>>> which is rather low implementation effort which is why I would > >> like > >>>> to > >>>>>>> clarify the remaining items. As you said, we have a slight > >> consenus > >>>> on > >>>>>>> overriding built-in functions; we should also strive for reaching > >>>>>> consensus > >>>>>>> on the remaining topics. > >>>>>>>>> > >>>>>>>>> @Dawid: I like your idea as it ensures registering catalog > >>> objects > >>>>>>> consistent and the overriding of built-in functions more > >> explicit. > >>>>>>>>> > >>>>>>>>> Thanks, > >>>>>>>>> Timo > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> On 17.09.19 11:59, kai wang wrote: > >>>>>>>>>> hi, everyone > >>>>>>>>>> I think this flip is very meaningful. it supports functions > >>> that > >>>>> can > >>>>>> be > >>>>>>>>>> shared by different catalogs and dbs, reducing the > >> duplication > >>> of > >>>>>>> functions. > >>>>>>>>>> > >>>>>>>>>> Our group based on flink's sql parser module implements > >> create > >>>>>> function > >>>>>>>>>> feature, stores the parsed function metadata and schema into > >>>> mysql, > >>>>>> and > >>>>>>>>>> also customizes the catalog, customizes sql-client to support > >>>>> custom > >>>>>>>>>> schemas and functions. Loaded, but the function is currently > >>>>> global, > >>>>>>> and is > >>>>>>>>>> not subdivided according to catalog and db. > >>>>>>>>>> > >>>>>>>>>> In addition, I very much hope to participate in the > >> development > >>>> of > >>>>>> this > >>>>>>>>>> flip, I have been paying attention to the community, but > >> found > >>> it > >>>>> is > >>>>>>> more > >>>>>>>>>> difficult to join. > >>>>>>>>>> thank you. > >>>>>>>>>> > >>>>>>>>>> Xuefu Z <[hidden email]> 于2019年9月17日周二 上午11:19写道: > >>>>>>>>>> > >>>>>>>>>>> Thanks to Tmo and Dawid for sharing thoughts. > >>>>>>>>>>> > >>>>>>>>>>> It seems to me that there is a general consensus on having > >>> temp > >>>>>>> functions > >>>>>>>>>>> that have no namespaces and overwrite built-in functions. > >> (As > >>> a > >>>>> side > >>>>>>> note > >>>>>>>>>>> for comparability, the current user defined functions are > >> all > >>>>>>> temporary and > >>>>>>>>>>> having no namespaces.) > >>>>>>>>>>> > >>>>>>>>>>> Nevertheless, I can also see the merit of having namespaced > >>> temp > >>>>>>> functions > >>>>>>>>>>> that can overwrite functions defined in a specific cat/db. > >>>>> However, > >>>>>>> this > >>>>>>>>>>> idea appears orthogonal to the former and can be added > >>>>>> incrementally. > >>>>>>>>>>> > >>>>>>>>>>> How about we first implement non-namespaced temp functions > >> now > >>>> and > >>>>>>> leave > >>>>>>>>>>> the door open for namespaced ones for later releases as the > >>>>>>> requirement > >>>>>>>>>>> might become more crystal? This also helps shorten the > >> debate > >>>> and > >>>>>>> allow us > >>>>>>>>>>> to make some progress along this direction. > >>>>>>>>>>> > >>>>>>>>>>> As to Dawid's idea of having a dedicated cat/db to host the > >>>>>> temporary > >>>>>>> temp > >>>>>>>>>>> functions that don't have namespaces, my only concern is the > >>>>> special > >>>>>>>>>>> treatment for a cat/db, which makes code less clean, as > >>> evident > >>>> in > >>>>>>> treating > >>>>>>>>>>> the built-in catalog currently. > >>>>>>>>>>> > >>>>>>>>>>> Thanks, > >>>>>>>>>>> Xuefiu > >>>>>>>>>>> > >>>>>>>>>>> On Mon, Sep 16, 2019 at 5:07 PM Dawid Wysakowicz < > >>>>>>>>>>> [hidden email]> > >>>>>>>>>>> wrote: > >>>>>>>>>>> > >>>>>>>>>>>> Hi, > >>>>>>>>>>>> Another idea to consider on top of Timo's suggestion. How > >>> about > >>>>> we > >>>>>>> have a > >>>>>>>>>>>> special namespace (catalog + database) for built-in > >> objects? > >>>> This > >>>>>>> catalog > >>>>>>>>>>>> would be invisible for users as Xuefu was suggesting. > >>>>>>>>>>>> > >>>>>>>>>>>> Then users could still override built-in functions, if they > >>>> fully > >>>>>>> qualify > >>>>>>>>>>>> object with the built-in namespace, but by default the > >> common > >>>>> logic > >>>>>>> of > >>>>>>>>>>>> current dB & cat would be used. > >>>>>>>>>>>> > >>>>>>>>>>>> CREATE TEMPORARY FUNCTION func ... > >>>>>>>>>>>> registers temporary function in current cat & dB > >>>>>>>>>>>> > >>>>>>>>>>>> CREATE TEMPORARY FUNCTION cat.db.func ... > >>>>>>>>>>>> registers temporary function in cat db > >>>>>>>>>>>> > >>>>>>>>>>>> CREATE TEMPORARY FUNCTION system.system.func ... > >>>>>>>>>>>> Overrides built-in function with temporary function > >>>>>>>>>>>> > >>>>>>>>>>>> The built-in/system namespace would not be writable for > >>>> permanent > >>>>>>>>>>> objects. > >>>>>>>>>>>> WDYT? > >>>>>>>>>>>> > >>>>>>>>>>>> This way I think we can have benefits of both solutions. > >>>>>>>>>>>> > >>>>>>>>>>>> Best, > >>>>>>>>>>>> Dawid > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> On Tue, 17 Sep 2019, 07:24 Timo Walther, < > >> [hidden email] > >>>> > >>>>>> wrote: > >>>>>>>>>>>> > >>>>>>>>>>>>> Hi Bowen, > >>>>>>>>>>>>> > >>>>>>>>>>>>> I understand the potential benefit of overriding certain > >>>>> built-in > >>>>>>>>>>>>> functions. I'm open to such a feature if many people > >> agree. > >>>>>>> However, it > >>>>>>>>>>>>> would be great to still support overriding catalog > >> functions > >>>>> with > >>>>>>>>>>>>> temporary functions in order to prototype a query even > >>> though > >>>> a > >>>>>>>>>>>>> catalog/database might not be available currently or > >> should > >>>> not > >>>>> be > >>>>>>>>>>>>> modified yet. How about we support both cases? > >>>>>>>>>>>>> > >>>>>>>>>>>>> CREATE TEMPORARY FUNCTION abs > >>>>>>>>>>>>> -> creates/overrides a built-in function and never > >>> consideres > >>>>>>> current > >>>>>>>>>>>>> catalog and database; inconsistent with other DDL but > >>>> acceptable > >>>>>> for > >>>>>>>>>>>>> functions I guess. > >>>>>>>>>>>>> CREATE TEMPORARY FUNCTION cat.db.fun > >>>>>>>>>>>>> -> creates/overrides a catalog function > >>>>>>>>>>>>> > >>>>>>>>>>>>> Regarding "Flink don't have any other built-in objects > >>>> (tables, > >>>>>>> views) > >>>>>>>>>>>>> except functions", this might change in the near future. > >>> Take > >>>>>>>>>>>>> https://issues.apache.org/jira/browse/FLINK-13900 as an > >>>>> example. > >>>>>>>>>>>>> > >>>>>>>>>>>>> Thanks, > >>>>>>>>>>>>> Timo > >>>>>>>>>>>>> > >>>>>>>>>>>>> On 14.09.19 01:40, Bowen Li wrote: > >>>>>>>>>>>>>> Hi Fabian, > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Yes, I agree 1-part/no-override is the least favorable > >>> thus I > >>>>>>> didn't > >>>>>>>>>>>>>> include that as a voting option, and the discussion is > >>> mainly > >>>>>>> between > >>>>>>>>>>>>>> 1-part/override builtin and 3-part/not override builtin. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Re > However, it means that temp functions are > >> differently > >>>>>> treated > >>>>>>>>>>> than > >>>>>>>>>>>>>> other db objects. > >>>>>>>>>>>>>> IMO, the treatment difference results from the fact that > >>>>>> functions > >>>>>>>>>>> are > >>>>>>>>>>>> a > >>>>>>>>>>>>>> bit different from other objects - Flink don't have any > >>> other > >>>>>>>>>>> built-in > >>>>>>>>>>>>>> objects (tables, views) except functions. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Cheers, > >>>>>>>>>>>>>> Bowen > >>>>>>>>>>>>>> > >>>>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> -- > >>>>>>>>>>> Xuefu Zhang > >>>>>>>>>>> > >>>>>>>>>>> "In Honey We Trust!" > >>>>>>>>>>> > >>>>>>>>> > >>>>>>>> > >>>>>>> > >>>>>>> > >>>>>> > >>>>>> -- > >>>>>> Xuefu Zhang > >>>>>> > >>>>>> "In Honey We Trust!" > >>>>>> > >>>>> > >>>> > >>>> > >>>> -- > >>>> Xuefu Zhang > >>>> > >>>> "In Honey We Trust!" > >>>> > >>> > >> > >> > >> -- > >> Xuefu Zhang > >> > >> "In Honey We Trust!" > >> > > |
And let me make my vote complete:
-1 for #1 +1 for #2 with different keyword -0 for #3 Best, Kurt On Thu, Sep 19, 2019 at 4:40 PM Kurt Young <[hidden email]> wrote: > Looks like I'm the only person who is willing to +1 to #2 for now :-) > But I would suggest to change the keyword from GLOBAL to > something like BUILTIN. > > I think #2 and #3 are almost the same proposal, just with different > format to indicate whether it want to override built-in functions. > > My biggest reason to choose it is I want this behavior be consistent > with temporal tables. I will give some examples to show the behavior > and also make sure I'm not misunderstanding anything here. > > For most DBs, when user create a temporary table with: > > CREATE TEMPORARY TABLE t1 > > It's actually equivalent with: > > CREATE TEMPORARY TABLE `curent_db`.t1 > > If user change current database, they will not be able to access t1 without > fully qualified name, .i.e db1.t1 (assuming db1 is current database when > this temporary table is created). > > Only #2 and #3 followed this behavior and I would vote for this since this > makes such behavior consistent through temporal tables and functions. > > Why I'm not voting for #3 is a special catalog and database just looks very > hacky to me. It gave a imply that our built-in functions saved at a > special > catalog and database, which is actually not. Introducing a dedicated > keyword > like CREATE TEMPORARY BUILTIN FUNCTION looks more clear and > straightforward. One can argue that we should avoid introducing new > keyword, > but it's also very rare that a system can overwrite built-in functions. > Since we > decided to support this, introduce a new keyword is not a big deal IMO. > > Best, > Kurt > > > On Thu, Sep 19, 2019 at 3:07 PM Piotr Nowojski <[hidden email]> > wrote: > >> Hi, >> >> It is a quite long discussion to follow and I hope I didn’t misunderstand >> anything. From the proposals presented by Xuefu I would vote: >> >> -1 for #1 and #2 >> +1 for #3 >> >> Besides #3 being IMO more general and more consistent, having qualified >> names (#3) would help/make easier for someone to use cross >> databases/catalogs queries (joining multiple data sets/streams). For >> example with some functions to manipulate/clean up/convert the stored data >> in different catalogs registered in the respective catalogs. >> >> Piotrek >> >> > On 19 Sep 2019, at 06:35, Jark Wu <[hidden email]> wrote: >> > >> > I agree with Xuefu that inconsistent handling with all the other >> objects is >> > not a big problem. >> > >> > Regarding to option#3, the special "system.system" namespace may confuse >> > users. >> > Users need to know the set of built-in function names to know when to >> use >> > "system.system" namespace. >> > What will happen if user registers a non-builtin function name under the >> > "system.system" namespace? >> > Besides, I think it doesn't solve the "explode" problem I mentioned at >> the >> > beginning of this thread. >> > >> > So here is my vote: >> > >> > +1 for #1 >> > 0 for #2 >> > -1 for #3 >> > >> > Best, >> > Jark >> > >> > >> > On Thu, 19 Sep 2019 at 08:38, Xuefu Z <[hidden email]> wrote: >> > >> >> @Dawid, Re: we also don't need additional referencing the >> specialcatalog >> >> anywhere. >> >> >> >> True. But once we allow such reference, then user can do so in any >> possible >> >> place where a function name is expected, for which we have to handle. >> >> That's a big difference, I think. >> >> >> >> Thanks, >> >> Xuefu >> >> >> >> On Wed, Sep 18, 2019 at 5:25 PM Dawid Wysakowicz < >> >> [hidden email]> >> >> wrote: >> >> >> >>> @Bowen I am not suggesting introducing additional catalog. I think we >> >> need >> >>> to get rid of the current built-in catalog. >> >>> >> >>> @Xuefu in option #3 we also don't need additional referencing the >> special >> >>> catalog anywhere else besides in the CREATE statement. The resolution >> >>> behaviour is exactly the same in both options. >> >>> >> >>> On Thu, 19 Sep 2019, 08:17 Xuefu Z, <[hidden email]> wrote: >> >>> >> >>>> Hi Dawid, >> >>>> >> >>>> "GLOBAL" is a temporary keyword that was given to the approach. It >> can >> >> be >> >>>> changed to something else for better. >> >>>> >> >>>> The difference between this and the #3 approach is that we only need >> >> the >> >>>> keyword for this create DDL. For other places (such as function >> >>>> referencing), no keyword or special namespace is needed. >> >>>> >> >>>> Thanks, >> >>>> Xuefu >> >>>> >> >>>> On Wed, Sep 18, 2019 at 4:32 PM Dawid Wysakowicz < >> >>>> [hidden email]> >> >>>> wrote: >> >>>> >> >>>>> Hi, >> >>>>> I think it makes sense to start voting at this point. >> >>>>> >> >>>>> Option 1: Only 1-part identifiers >> >>>>> PROS: >> >>>>> - allows shadowing built-in functions >> >>>>> CONS: >> >>>>> - incosistent with all the other objects, both permanent & temporary >> >>>>> - does not allow shadowing catalog functions >> >>>>> >> >>>>> Option 2: Special keyword for built-in function >> >>>>> I think this is quite similar to the special catalog/db. The thing I >> >> am >> >>>>> strongly against in this proposal is the GLOBAL keyword. This >> keyword >> >>>> has a >> >>>>> meaning in rdbms systems and means a function that is present for a >> >>>>> lifetime of a session in which it was created, but available in all >> >>> other >> >>>>> sessions. Therefore I really don't want to use this keyword in a >> >>>> different >> >>>>> context. >> >>>>> >> >>>>> Option 3: Special catalog/db >> >>>>> >> >>>>> PROS: >> >>>>> - allows shadowing built-in functions >> >>>>> - allows shadowing catalog functions >> >>>>> - consistent with other objects >> >>>>> CONS: >> >>>>> - we introduce a special namespace for built-in functions >> >>>>> >> >>>>> I don't see a problem with introducing the special namespace. In the >> >>> end >> >>>> it >> >>>>> is very similar to the keyword approach. In this case the catalog/db >> >>>>> combination would be the "keyword" >> >>>>> >> >>>>> Therefore my votes: >> >>>>> Option 1: -0 >> >>>>> Option 2: -1 (I might change to +0 if we can come up with a better >> >>>> keyword) >> >>>>> Option 3: +1 >> >>>>> >> >>>>> Best, >> >>>>> Dawid >> >>>>> >> >>>>> >> >>>>> On Thu, 19 Sep 2019, 05:12 Xuefu Z, <[hidden email]> wrote: >> >>>>> >> >>>>>> Hi Aljoscha, >> >>>>>> >> >>>>>> Thanks for the summary and these are great questions to be >> >> answered. >> >>>> The >> >>>>>> answer to your first question is clear: there is a general >> >> agreement >> >>> to >> >>>>>> override built-in functions with temp functions. >> >>>>>> >> >>>>>> However, your second and third questions are sort of related, as a >> >>>>> function >> >>>>>> reference can be either just function name (like "func") or in the >> >>> form >> >>>>> or >> >>>>>> "cat.db.func". When a reference is just function name, it can mean >> >>>>> either a >> >>>>>> built-in function or a function defined in the current cat/db. If >> >> we >> >>>>>> support overriding a built-in function with a temp function, such >> >>>>>> overriding can also cover a function in the current cat/db. >> >>>>>> >> >>>>>> I think what Timo referred as "overriding a catalog function" >> >> means a >> >>>>> temp >> >>>>>> function defined as "cat.db.func" overrides a catalog function >> >> "func" >> >>>> in >> >>>>>> cat/db even if cat/db is not current. To support this, temp >> >> function >> >>>> has >> >>>>> to >> >>>>>> be tied to a cat/db. What's why I said above that the 2nd and 3rd >> >>>>> questions >> >>>>>> are related. The problem with such support is the ambiguity when >> >> user >> >>>>>> defines a function w/o namespace, "CREATE TEMPORARY FUNCTION func >> >>> ...". >> >>>>>> Here "func" can means a global temp function, or a temp function in >> >>>>> current >> >>>>>> cat/db. If we can assume the former, this creates an inconsistency >> >>>>> because >> >>>>>> "CREATE FUNCTION func" actually means a function in current cat/db. >> >>> If >> >>>> we >> >>>>>> assume the latter, then there is no way for user to create a global >> >>>> temp >> >>>>>> function. >> >>>>>> >> >>>>>> Giving a special namespace for built-in functions may solve the >> >>>> ambiguity >> >>>>>> problem above, but it also introduces artificial catalog/database >> >>> that >> >>>>>> needs special treatment and pollutes the cleanness of the code. I >> >>>> would >> >>>>>> rather introduce a syntax in DDL to solve the problem, like "CREATE >> >>>>>> [GLOBAL] TEMPORARY FUNCTION func". >> >>>>>> >> >>>>>> Thus, I'd like to summarize a few candidate proposals for voting >> >>>>> purposes: >> >>>>>> >> >>>>>> 1. Support only global, temporary functions without namespace. Such >> >>>> temp >> >>>>>> functions overrides built-in functions and catalog functions in >> >>> current >> >>>>>> cat/db. The resolution order is: temp functions -> built-in >> >> functions >> >>>> -> >> >>>>>> catalog functions. (Partially or fully qualified functions has no >> >>>>>> ambiguity!) >> >>>>>> >> >>>>>> 2. In addition to #1, support creating and referencing temporary >> >>>>> functions >> >>>>>> associated with a cat/db with "GLOBAL" qualifier in DDL for global >> >>> temp >> >>>>>> functions. The resolution order is: global temp functions -> >> >> built-in >> >>>>>> functions -> temp functions in current cat/db -> catalog function. >> >>>>>> (Resolution for partially or fully qualified function reference is: >> >>>> temp >> >>>>>> functions -> persistent functions.) >> >>>>>> >> >>>>>> 3. In addition to #1, support creating and referencing temporary >> >>>>> functions >> >>>>>> associated with a cat/db with a special namespace for built-in >> >>>> functions >> >>>>>> and global temp functions. The resolution is the same as #2, except >> >>>> that >> >>>>>> the special namespace might be prefixed to a reference to a >> >> built-in >> >>>>>> function or global temp function. (In absence of the special >> >>> namespace, >> >>>>> the >> >>>>>> resolution order is the same as in #2.) >> >>>>>> >> >>>>>> My personal preference is #1, given the unknown use case and >> >>> introduced >> >>>>>> complexity for #2 and #3. However, #2 is an acceptable alternative. >> >>>> Thus, >> >>>>>> my votes are: >> >>>>>> >> >>>>>> +1 for #1 >> >>>>>> +0 for #2 >> >>>>>> -1 for #3 >> >>>>>> >> >>>>>> Everyone, please cast your vote (in above format please!), or let >> >> me >> >>>> know >> >>>>>> if you have more questions or other candidates. >> >>>>>> >> >>>>>> Thanks, >> >>>>>> Xuefu >> >>>>>> >> >>>>>> >> >>>>>> >> >>>>>> >> >>>>>> >> >>>>>> >> >>>>>> >> >>>>>> On Wed, Sep 18, 2019 at 6:42 AM Aljoscha Krettek < >> >>> [hidden email]> >> >>>>>> wrote: >> >>>>>> >> >>>>>>> Hi, >> >>>>>>> >> >>>>>>> I think this discussion and the one for FLIP-64 are very >> >> connected. >> >>>> To >> >>>>>>> resolve the differences, think we have to think about the basic >> >>>>>> principles >> >>>>>>> and find consensus there. The basic questions I see are: >> >>>>>>> >> >>>>>>> - Do we want to support overriding builtin functions? >> >>>>>>> - Do we want to support overriding catalog functions? >> >>>>>>> - And then later: should temporary functions be tied to a >> >>>>>>> catalog/database? >> >>>>>>> >> >>>>>>> I don’t have much to say about these, except that we should >> >>> somewhat >> >>>>>> stick >> >>>>>>> to what the industry does. But I also understand that the >> >> industry >> >>> is >> >>>>>>> already very divided on this. >> >>>>>>> >> >>>>>>> Best, >> >>>>>>> Aljoscha >> >>>>>>> >> >>>>>>>> On 18. Sep 2019, at 11:41, Jark Wu <[hidden email]> wrote: >> >>>>>>>> >> >>>>>>>> Hi, >> >>>>>>>> >> >>>>>>>> +1 to strive for reaching consensus on the remaining topics. We >> >>> are >> >>>>>>> close to the truth. It will waste a lot of time if we resume the >> >>>> topic >> >>>>>> some >> >>>>>>> time later. >> >>>>>>>> >> >>>>>>>> +1 to “1-part/override” and I’m also fine with Timo’s >> >>> “cat.db.fun” >> >>>>> way >> >>>>>>> to override a catalog function. >> >>>>>>>> >> >>>>>>>> I’m not sure about “system.system.fun”, it introduces a >> >>> nonexistent >> >>>>> cat >> >>>>>>> & db? And we still need to do special treatment for the dedicated >> >>>>>>> system.system cat & db? >> >>>>>>>> >> >>>>>>>> Best, >> >>>>>>>> Jark >> >>>>>>>> >> >>>>>>>> >> >>>>>>>>> 在 2019年9月18日,06:54,Timo Walther <[hidden email]> 写道: >> >>>>>>>>> >> >>>>>>>>> Hi everyone, >> >>>>>>>>> >> >>>>>>>>> @Xuefu: I would like to avoid adding too many things >> >>>> incrementally. >> >>>>>>> Users should be able to override all catalog objects consistently >> >>>>>> according >> >>>>>>> to FLIP-64 (Support for Temporary Objects in Table module). If >> >>>>> functions >> >>>>>>> are treated completely different, we need more code and special >> >>>> cases. >> >>>>>> From >> >>>>>>> an implementation perspective, this topic only affects the lookup >> >>>> logic >> >>>>>>> which is rather low implementation effort which is why I would >> >> like >> >>>> to >> >>>>>>> clarify the remaining items. As you said, we have a slight >> >> consenus >> >>>> on >> >>>>>>> overriding built-in functions; we should also strive for reaching >> >>>>>> consensus >> >>>>>>> on the remaining topics. >> >>>>>>>>> >> >>>>>>>>> @Dawid: I like your idea as it ensures registering catalog >> >>> objects >> >>>>>>> consistent and the overriding of built-in functions more >> >> explicit. >> >>>>>>>>> >> >>>>>>>>> Thanks, >> >>>>>>>>> Timo >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> On 17.09.19 11:59, kai wang wrote: >> >>>>>>>>>> hi, everyone >> >>>>>>>>>> I think this flip is very meaningful. it supports functions >> >>> that >> >>>>> can >> >>>>>> be >> >>>>>>>>>> shared by different catalogs and dbs, reducing the >> >> duplication >> >>> of >> >>>>>>> functions. >> >>>>>>>>>> >> >>>>>>>>>> Our group based on flink's sql parser module implements >> >> create >> >>>>>> function >> >>>>>>>>>> feature, stores the parsed function metadata and schema into >> >>>> mysql, >> >>>>>> and >> >>>>>>>>>> also customizes the catalog, customizes sql-client to support >> >>>>> custom >> >>>>>>>>>> schemas and functions. Loaded, but the function is currently >> >>>>> global, >> >>>>>>> and is >> >>>>>>>>>> not subdivided according to catalog and db. >> >>>>>>>>>> >> >>>>>>>>>> In addition, I very much hope to participate in the >> >> development >> >>>> of >> >>>>>> this >> >>>>>>>>>> flip, I have been paying attention to the community, but >> >> found >> >>> it >> >>>>> is >> >>>>>>> more >> >>>>>>>>>> difficult to join. >> >>>>>>>>>> thank you. >> >>>>>>>>>> >> >>>>>>>>>> Xuefu Z <[hidden email]> 于2019年9月17日周二 上午11:19写道: >> >>>>>>>>>> >> >>>>>>>>>>> Thanks to Tmo and Dawid for sharing thoughts. >> >>>>>>>>>>> >> >>>>>>>>>>> It seems to me that there is a general consensus on having >> >>> temp >> >>>>>>> functions >> >>>>>>>>>>> that have no namespaces and overwrite built-in functions. >> >> (As >> >>> a >> >>>>> side >> >>>>>>> note >> >>>>>>>>>>> for comparability, the current user defined functions are >> >> all >> >>>>>>> temporary and >> >>>>>>>>>>> having no namespaces.) >> >>>>>>>>>>> >> >>>>>>>>>>> Nevertheless, I can also see the merit of having namespaced >> >>> temp >> >>>>>>> functions >> >>>>>>>>>>> that can overwrite functions defined in a specific cat/db. >> >>>>> However, >> >>>>>>> this >> >>>>>>>>>>> idea appears orthogonal to the former and can be added >> >>>>>> incrementally. >> >>>>>>>>>>> >> >>>>>>>>>>> How about we first implement non-namespaced temp functions >> >> now >> >>>> and >> >>>>>>> leave >> >>>>>>>>>>> the door open for namespaced ones for later releases as the >> >>>>>>> requirement >> >>>>>>>>>>> might become more crystal? This also helps shorten the >> >> debate >> >>>> and >> >>>>>>> allow us >> >>>>>>>>>>> to make some progress along this direction. >> >>>>>>>>>>> >> >>>>>>>>>>> As to Dawid's idea of having a dedicated cat/db to host the >> >>>>>> temporary >> >>>>>>> temp >> >>>>>>>>>>> functions that don't have namespaces, my only concern is the >> >>>>> special >> >>>>>>>>>>> treatment for a cat/db, which makes code less clean, as >> >>> evident >> >>>> in >> >>>>>>> treating >> >>>>>>>>>>> the built-in catalog currently. >> >>>>>>>>>>> >> >>>>>>>>>>> Thanks, >> >>>>>>>>>>> Xuefiu >> >>>>>>>>>>> >> >>>>>>>>>>> On Mon, Sep 16, 2019 at 5:07 PM Dawid Wysakowicz < >> >>>>>>>>>>> [hidden email]> >> >>>>>>>>>>> wrote: >> >>>>>>>>>>> >> >>>>>>>>>>>> Hi, >> >>>>>>>>>>>> Another idea to consider on top of Timo's suggestion. How >> >>> about >> >>>>> we >> >>>>>>> have a >> >>>>>>>>>>>> special namespace (catalog + database) for built-in >> >> objects? >> >>>> This >> >>>>>>> catalog >> >>>>>>>>>>>> would be invisible for users as Xuefu was suggesting. >> >>>>>>>>>>>> >> >>>>>>>>>>>> Then users could still override built-in functions, if they >> >>>> fully >> >>>>>>> qualify >> >>>>>>>>>>>> object with the built-in namespace, but by default the >> >> common >> >>>>> logic >> >>>>>>> of >> >>>>>>>>>>>> current dB & cat would be used. >> >>>>>>>>>>>> >> >>>>>>>>>>>> CREATE TEMPORARY FUNCTION func ... >> >>>>>>>>>>>> registers temporary function in current cat & dB >> >>>>>>>>>>>> >> >>>>>>>>>>>> CREATE TEMPORARY FUNCTION cat.db.func ... >> >>>>>>>>>>>> registers temporary function in cat db >> >>>>>>>>>>>> >> >>>>>>>>>>>> CREATE TEMPORARY FUNCTION system.system.func ... >> >>>>>>>>>>>> Overrides built-in function with temporary function >> >>>>>>>>>>>> >> >>>>>>>>>>>> The built-in/system namespace would not be writable for >> >>>> permanent >> >>>>>>>>>>> objects. >> >>>>>>>>>>>> WDYT? >> >>>>>>>>>>>> >> >>>>>>>>>>>> This way I think we can have benefits of both solutions. >> >>>>>>>>>>>> >> >>>>>>>>>>>> Best, >> >>>>>>>>>>>> Dawid >> >>>>>>>>>>>> >> >>>>>>>>>>>> >> >>>>>>>>>>>> On Tue, 17 Sep 2019, 07:24 Timo Walther, < >> >> [hidden email] >> >>>> >> >>>>>> wrote: >> >>>>>>>>>>>> >> >>>>>>>>>>>>> Hi Bowen, >> >>>>>>>>>>>>> >> >>>>>>>>>>>>> I understand the potential benefit of overriding certain >> >>>>> built-in >> >>>>>>>>>>>>> functions. I'm open to such a feature if many people >> >> agree. >> >>>>>>> However, it >> >>>>>>>>>>>>> would be great to still support overriding catalog >> >> functions >> >>>>> with >> >>>>>>>>>>>>> temporary functions in order to prototype a query even >> >>> though >> >>>> a >> >>>>>>>>>>>>> catalog/database might not be available currently or >> >> should >> >>>> not >> >>>>> be >> >>>>>>>>>>>>> modified yet. How about we support both cases? >> >>>>>>>>>>>>> >> >>>>>>>>>>>>> CREATE TEMPORARY FUNCTION abs >> >>>>>>>>>>>>> -> creates/overrides a built-in function and never >> >>> consideres >> >>>>>>> current >> >>>>>>>>>>>>> catalog and database; inconsistent with other DDL but >> >>>> acceptable >> >>>>>> for >> >>>>>>>>>>>>> functions I guess. >> >>>>>>>>>>>>> CREATE TEMPORARY FUNCTION cat.db.fun >> >>>>>>>>>>>>> -> creates/overrides a catalog function >> >>>>>>>>>>>>> >> >>>>>>>>>>>>> Regarding "Flink don't have any other built-in objects >> >>>> (tables, >> >>>>>>> views) >> >>>>>>>>>>>>> except functions", this might change in the near future. >> >>> Take >> >>>>>>>>>>>>> https://issues.apache.org/jira/browse/FLINK-13900 as an >> >>>>> example. >> >>>>>>>>>>>>> >> >>>>>>>>>>>>> Thanks, >> >>>>>>>>>>>>> Timo >> >>>>>>>>>>>>> >> >>>>>>>>>>>>> On 14.09.19 01:40, Bowen Li wrote: >> >>>>>>>>>>>>>> Hi Fabian, >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> Yes, I agree 1-part/no-override is the least favorable >> >>> thus I >> >>>>>>> didn't >> >>>>>>>>>>>>>> include that as a voting option, and the discussion is >> >>> mainly >> >>>>>>> between >> >>>>>>>>>>>>>> 1-part/override builtin and 3-part/not override builtin. >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> Re > However, it means that temp functions are >> >> differently >> >>>>>> treated >> >>>>>>>>>>> than >> >>>>>>>>>>>>>> other db objects. >> >>>>>>>>>>>>>> IMO, the treatment difference results from the fact that >> >>>>>> functions >> >>>>>>>>>>> are >> >>>>>>>>>>>> a >> >>>>>>>>>>>>>> bit different from other objects - Flink don't have any >> >>> other >> >>>>>>>>>>> built-in >> >>>>>>>>>>>>>> objects (tables, views) except functions. >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> Cheers, >> >>>>>>>>>>>>>> Bowen >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>> >> >>>>>>>>>>> >> >>>>>>>>>>> -- >> >>>>>>>>>>> Xuefu Zhang >> >>>>>>>>>>> >> >>>>>>>>>>> "In Honey We Trust!" >> >>>>>>>>>>> >> >>>>>>>>> >> >>>>>>>> >> >>>>>>> >> >>>>>>> >> >>>>>> >> >>>>>> -- >> >>>>>> Xuefu Zhang >> >>>>>> >> >>>>>> "In Honey We Trust!" >> >>>>>> >> >>>>> >> >>>> >> >>>> >> >>>> -- >> >>>> Xuefu Zhang >> >>>> >> >>>> "In Honey We Trust!" >> >>>> >> >>> >> >> >> >> >> >> -- >> >> Xuefu Zhang >> >> >> >> "In Honey We Trust!" >> >> >> >> |
I know Hive and Spark can shadow built-in functions by temporary function.
Mysql, Oracle, Sql server can not shadow. User can use full names to access functions instead of shadowing. So I think it is a completely new thing, and the direct way to deal with new things is to add new grammar. So, +1 for #2, +0 for #3, -1 for #1 Best, Jingsong Lee ------------------------------------------------------------------ From:Kurt Young <[hidden email]> Send Time:2019年9月19日(星期四) 16:43 To:dev <[hidden email]> Subject:Re: [DISCUSS] FLIP-57 - Rework FunctionCatalog And let me make my vote complete: -1 for #1 +1 for #2 with different keyword -0 for #3 Best, Kurt On Thu, Sep 19, 2019 at 4:40 PM Kurt Young <[hidden email]> wrote: > Looks like I'm the only person who is willing to +1 to #2 for now :-) > But I would suggest to change the keyword from GLOBAL to > something like BUILTIN. > > I think #2 and #3 are almost the same proposal, just with different > format to indicate whether it want to override built-in functions. > > My biggest reason to choose it is I want this behavior be consistent > with temporal tables. I will give some examples to show the behavior > and also make sure I'm not misunderstanding anything here. > > For most DBs, when user create a temporary table with: > > CREATE TEMPORARY TABLE t1 > > It's actually equivalent with: > > CREATE TEMPORARY TABLE `curent_db`.t1 > > If user change current database, they will not be able to access t1 without > fully qualified name, .i.e db1.t1 (assuming db1 is current database when > this temporary table is created). > > Only #2 and #3 followed this behavior and I would vote for this since this > makes such behavior consistent through temporal tables and functions. > > Why I'm not voting for #3 is a special catalog and database just looks very > hacky to me. It gave a imply that our built-in functions saved at a > special > catalog and database, which is actually not. Introducing a dedicated > keyword > like CREATE TEMPORARY BUILTIN FUNCTION looks more clear and > straightforward. One can argue that we should avoid introducing new > keyword, > but it's also very rare that a system can overwrite built-in functions. > Since we > decided to support this, introduce a new keyword is not a big deal IMO. > > Best, > Kurt > > > On Thu, Sep 19, 2019 at 3:07 PM Piotr Nowojski <[hidden email]> > wrote: > >> Hi, >> >> It is a quite long discussion to follow and I hope I didn’t misunderstand >> anything. From the proposals presented by Xuefu I would vote: >> >> -1 for #1 and #2 >> +1 for #3 >> >> Besides #3 being IMO more general and more consistent, having qualified >> names (#3) would help/make easier for someone to use cross >> databases/catalogs queries (joining multiple data sets/streams). For >> example with some functions to manipulate/clean up/convert the stored data >> in different catalogs registered in the respective catalogs. >> >> Piotrek >> >> > On 19 Sep 2019, at 06:35, Jark Wu <[hidden email]> wrote: >> > >> > I agree with Xuefu that inconsistent handling with all the other >> objects is >> > not a big problem. >> > >> > Regarding to option#3, the special "system.system" namespace may confuse >> > users. >> > Users need to know the set of built-in function names to know when to >> use >> > "system.system" namespace. >> > What will happen if user registers a non-builtin function name under the >> > "system.system" namespace? >> > Besides, I think it doesn't solve the "explode" problem I mentioned at >> the >> > beginning of this thread. >> > >> > So here is my vote: >> > >> > +1 for #1 >> > 0 for #2 >> > -1 for #3 >> > >> > Best, >> > Jark >> > >> > >> > On Thu, 19 Sep 2019 at 08:38, Xuefu Z <[hidden email]> wrote: >> > >> >> @Dawid, Re: we also don't need additional referencing the >> specialcatalog >> >> anywhere. >> >> >> >> True. But once we allow such reference, then user can do so in any >> possible >> >> place where a function name is expected, for which we have to handle. >> >> That's a big difference, I think. >> >> >> >> Thanks, >> >> Xuefu >> >> >> >> On Wed, Sep 18, 2019 at 5:25 PM Dawid Wysakowicz < >> >> [hidden email]> >> >> wrote: >> >> >> >>> @Bowen I am not suggesting introducing additional catalog. I think we >> >> need >> >>> to get rid of the current built-in catalog. >> >>> >> >>> @Xuefu in option #3 we also don't need additional referencing the >> special >> >>> catalog anywhere else besides in the CREATE statement. The resolution >> >>> behaviour is exactly the same in both options. >> >>> >> >>> On Thu, 19 Sep 2019, 08:17 Xuefu Z, <[hidden email]> wrote: >> >>> >> >>>> Hi Dawid, >> >>>> >> >>>> "GLOBAL" is a temporary keyword that was given to the approach. It >> can >> >> be >> >>>> changed to something else for better. >> >>>> >> >>>> The difference between this and the #3 approach is that we only need >> >> the >> >>>> keyword for this create DDL. For other places (such as function >> >>>> referencing), no keyword or special namespace is needed. >> >>>> >> >>>> Thanks, >> >>>> Xuefu >> >>>> >> >>>> On Wed, Sep 18, 2019 at 4:32 PM Dawid Wysakowicz < >> >>>> [hidden email]> >> >>>> wrote: >> >>>> >> >>>>> Hi, >> >>>>> I think it makes sense to start voting at this point. >> >>>>> >> >>>>> Option 1: Only 1-part identifiers >> >>>>> PROS: >> >>>>> - allows shadowing built-in functions >> >>>>> CONS: >> >>>>> - incosistent with all the other objects, both permanent & temporary >> >>>>> - does not allow shadowing catalog functions >> >>>>> >> >>>>> Option 2: Special keyword for built-in function >> >>>>> I think this is quite similar to the special catalog/db. The thing I >> >> am >> >>>>> strongly against in this proposal is the GLOBAL keyword. This >> keyword >> >>>> has a >> >>>>> meaning in rdbms systems and means a function that is present for a >> >>>>> lifetime of a session in which it was created, but available in all >> >>> other >> >>>>> sessions. Therefore I really don't want to use this keyword in a >> >>>> different >> >>>>> context. >> >>>>> >> >>>>> Option 3: Special catalog/db >> >>>>> >> >>>>> PROS: >> >>>>> - allows shadowing built-in functions >> >>>>> - allows shadowing catalog functions >> >>>>> - consistent with other objects >> >>>>> CONS: >> >>>>> - we introduce a special namespace for built-in functions >> >>>>> >> >>>>> I don't see a problem with introducing the special namespace. In the >> >>> end >> >>>> it >> >>>>> is very similar to the keyword approach. In this case the catalog/db >> >>>>> combination would be the "keyword" >> >>>>> >> >>>>> Therefore my votes: >> >>>>> Option 1: -0 >> >>>>> Option 2: -1 (I might change to +0 if we can come up with a better >> >>>> keyword) >> >>>>> Option 3: +1 >> >>>>> >> >>>>> Best, >> >>>>> Dawid >> >>>>> >> >>>>> >> >>>>> On Thu, 19 Sep 2019, 05:12 Xuefu Z, <[hidden email]> wrote: >> >>>>> >> >>>>>> Hi Aljoscha, >> >>>>>> >> >>>>>> Thanks for the summary and these are great questions to be >> >> answered. >> >>>> The >> >>>>>> answer to your first question is clear: there is a general >> >> agreement >> >>> to >> >>>>>> override built-in functions with temp functions. >> >>>>>> >> >>>>>> However, your second and third questions are sort of related, as a >> >>>>> function >> >>>>>> reference can be either just function name (like "func") or in the >> >>> form >> >>>>> or >> >>>>>> "cat.db.func". When a reference is just function name, it can mean >> >>>>> either a >> >>>>>> built-in function or a function defined in the current cat/db. If >> >> we >> >>>>>> support overriding a built-in function with a temp function, such >> >>>>>> overriding can also cover a function in the current cat/db. >> >>>>>> >> >>>>>> I think what Timo referred as "overriding a catalog function" >> >> means a >> >>>>> temp >> >>>>>> function defined as "cat.db.func" overrides a catalog function >> >> "func" >> >>>> in >> >>>>>> cat/db even if cat/db is not current. To support this, temp >> >> function >> >>>> has >> >>>>> to >> >>>>>> be tied to a cat/db. What's why I said above that the 2nd and 3rd >> >>>>> questions >> >>>>>> are related. The problem with such support is the ambiguity when >> >> user >> >>>>>> defines a function w/o namespace, "CREATE TEMPORARY FUNCTION func >> >>> ...". >> >>>>>> Here "func" can means a global temp function, or a temp function in >> >>>>> current >> >>>>>> cat/db. If we can assume the former, this creates an inconsistency >> >>>>> because >> >>>>>> "CREATE FUNCTION func" actually means a function in current cat/db. >> >>> If >> >>>> we >> >>>>>> assume the latter, then there is no way for user to create a global >> >>>> temp >> >>>>>> function. >> >>>>>> >> >>>>>> Giving a special namespace for built-in functions may solve the >> >>>> ambiguity >> >>>>>> problem above, but it also introduces artificial catalog/database >> >>> that >> >>>>>> needs special treatment and pollutes the cleanness of the code. I >> >>>> would >> >>>>>> rather introduce a syntax in DDL to solve the problem, like "CREATE >> >>>>>> [GLOBAL] TEMPORARY FUNCTION func". >> >>>>>> >> >>>>>> Thus, I'd like to summarize a few candidate proposals for voting >> >>>>> purposes: >> >>>>>> >> >>>>>> 1. Support only global, temporary functions without namespace. Such >> >>>> temp >> >>>>>> functions overrides built-in functions and catalog functions in >> >>> current >> >>>>>> cat/db. The resolution order is: temp functions -> built-in >> >> functions >> >>>> -> >> >>>>>> catalog functions. (Partially or fully qualified functions has no >> >>>>>> ambiguity!) >> >>>>>> >> >>>>>> 2. In addition to #1, support creating and referencing temporary >> >>>>> functions >> >>>>>> associated with a cat/db with "GLOBAL" qualifier in DDL for global >> >>> temp >> >>>>>> functions. The resolution order is: global temp functions -> >> >> built-in >> >>>>>> functions -> temp functions in current cat/db -> catalog function. >> >>>>>> (Resolution for partially or fully qualified function reference is: >> >>>> temp >> >>>>>> functions -> persistent functions.) >> >>>>>> >> >>>>>> 3. In addition to #1, support creating and referencing temporary >> >>>>> functions >> >>>>>> associated with a cat/db with a special namespace for built-in >> >>>> functions >> >>>>>> and global temp functions. The resolution is the same as #2, except >> >>>> that >> >>>>>> the special namespace might be prefixed to a reference to a >> >> built-in >> >>>>>> function or global temp function. (In absence of the special >> >>> namespace, >> >>>>> the >> >>>>>> resolution order is the same as in #2.) >> >>>>>> >> >>>>>> My personal preference is #1, given the unknown use case and >> >>> introduced >> >>>>>> complexity for #2 and #3. However, #2 is an acceptable alternative. >> >>>> Thus, >> >>>>>> my votes are: >> >>>>>> >> >>>>>> +1 for #1 >> >>>>>> +0 for #2 >> >>>>>> -1 for #3 >> >>>>>> >> >>>>>> Everyone, please cast your vote (in above format please!), or let >> >> me >> >>>> know >> >>>>>> if you have more questions or other candidates. >> >>>>>> >> >>>>>> Thanks, >> >>>>>> Xuefu >> >>>>>> >> >>>>>> >> >>>>>> >> >>>>>> >> >>>>>> >> >>>>>> >> >>>>>> >> >>>>>> On Wed, Sep 18, 2019 at 6:42 AM Aljoscha Krettek < >> >>> [hidden email]> >> >>>>>> wrote: >> >>>>>> >> >>>>>>> Hi, >> >>>>>>> >> >>>>>>> I think this discussion and the one for FLIP-64 are very >> >> connected. >> >>>> To >> >>>>>>> resolve the differences, think we have to think about the basic >> >>>>>> principles >> >>>>>>> and find consensus there. The basic questions I see are: >> >>>>>>> >> >>>>>>> - Do we want to support overriding builtin functions? >> >>>>>>> - Do we want to support overriding catalog functions? >> >>>>>>> - And then later: should temporary functions be tied to a >> >>>>>>> catalog/database? >> >>>>>>> >> >>>>>>> I don’t have much to say about these, except that we should >> >>> somewhat >> >>>>>> stick >> >>>>>>> to what the industry does. But I also understand that the >> >> industry >> >>> is >> >>>>>>> already very divided on this. >> >>>>>>> >> >>>>>>> Best, >> >>>>>>> Aljoscha >> >>>>>>> >> >>>>>>>> On 18. Sep 2019, at 11:41, Jark Wu <[hidden email]> wrote: >> >>>>>>>> >> >>>>>>>> Hi, >> >>>>>>>> >> >>>>>>>> +1 to strive for reaching consensus on the remaining topics. We >> >>> are >> >>>>>>> close to the truth. It will waste a lot of time if we resume the >> >>>> topic >> >>>>>> some >> >>>>>>> time later. >> >>>>>>>> >> >>>>>>>> +1 to “1-part/override” and I’m also fine with Timo’s >> >>> “cat.db.fun” >> >>>>> way >> >>>>>>> to override a catalog function. >> >>>>>>>> >> >>>>>>>> I’m not sure about “system.system.fun”, it introduces a >> >>> nonexistent >> >>>>> cat >> >>>>>>> & db? And we still need to do special treatment for the dedicated >> >>>>>>> system.system cat & db? >> >>>>>>>> >> >>>>>>>> Best, >> >>>>>>>> Jark >> >>>>>>>> >> >>>>>>>> >> >>>>>>>>> 在 2019年9月18日,06:54,Timo Walther <[hidden email]> 写道: >> >>>>>>>>> >> >>>>>>>>> Hi everyone, >> >>>>>>>>> >> >>>>>>>>> @Xuefu: I would like to avoid adding too many things >> >>>> incrementally. >> >>>>>>> Users should be able to override all catalog objects consistently >> >>>>>> according >> >>>>>>> to FLIP-64 (Support for Temporary Objects in Table module). If >> >>>>> functions >> >>>>>>> are treated completely different, we need more code and special >> >>>> cases. >> >>>>>> From >> >>>>>>> an implementation perspective, this topic only affects the lookup >> >>>> logic >> >>>>>>> which is rather low implementation effort which is why I would >> >> like >> >>>> to >> >>>>>>> clarify the remaining items. As you said, we have a slight >> >> consenus >> >>>> on >> >>>>>>> overriding built-in functions; we should also strive for reaching >> >>>>>> consensus >> >>>>>>> on the remaining topics. >> >>>>>>>>> >> >>>>>>>>> @Dawid: I like your idea as it ensures registering catalog >> >>> objects >> >>>>>>> consistent and the overriding of built-in functions more >> >> explicit. >> >>>>>>>>> >> >>>>>>>>> Thanks, >> >>>>>>>>> Timo >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> On 17.09.19 11:59, kai wang wrote: >> >>>>>>>>>> hi, everyone >> >>>>>>>>>> I think this flip is very meaningful. it supports functions >> >>> that >> >>>>> can >> >>>>>> be >> >>>>>>>>>> shared by different catalogs and dbs, reducing the >> >> duplication >> >>> of >> >>>>>>> functions. >> >>>>>>>>>> >> >>>>>>>>>> Our group based on flink's sql parser module implements >> >> create >> >>>>>> function >> >>>>>>>>>> feature, stores the parsed function metadata and schema into >> >>>> mysql, >> >>>>>> and >> >>>>>>>>>> also customizes the catalog, customizes sql-client to support >> >>>>> custom >> >>>>>>>>>> schemas and functions. Loaded, but the function is currently >> >>>>> global, >> >>>>>>> and is >> >>>>>>>>>> not subdivided according to catalog and db. >> >>>>>>>>>> >> >>>>>>>>>> In addition, I very much hope to participate in the >> >> development >> >>>> of >> >>>>>> this >> >>>>>>>>>> flip, I have been paying attention to the community, but >> >> found >> >>> it >> >>>>> is >> >>>>>>> more >> >>>>>>>>>> difficult to join. >> >>>>>>>>>> thank you. >> >>>>>>>>>> >> >>>>>>>>>> Xuefu Z <[hidden email]> 于2019年9月17日周二 上午11:19写道: >> >>>>>>>>>> >> >>>>>>>>>>> Thanks to Tmo and Dawid for sharing thoughts. >> >>>>>>>>>>> >> >>>>>>>>>>> It seems to me that there is a general consensus on having >> >>> temp >> >>>>>>> functions >> >>>>>>>>>>> that have no namespaces and overwrite built-in functions. >> >> (As >> >>> a >> >>>>> side >> >>>>>>> note >> >>>>>>>>>>> for comparability, the current user defined functions are >> >> all >> >>>>>>> temporary and >> >>>>>>>>>>> having no namespaces.) >> >>>>>>>>>>> >> >>>>>>>>>>> Nevertheless, I can also see the merit of having namespaced >> >>> temp >> >>>>>>> functions >> >>>>>>>>>>> that can overwrite functions defined in a specific cat/db. >> >>>>> However, >> >>>>>>> this >> >>>>>>>>>>> idea appears orthogonal to the former and can be added >> >>>>>> incrementally. >> >>>>>>>>>>> >> >>>>>>>>>>> How about we first implement non-namespaced temp functions >> >> now >> >>>> and >> >>>>>>> leave >> >>>>>>>>>>> the door open for namespaced ones for later releases as the >> >>>>>>> requirement >> >>>>>>>>>>> might become more crystal? This also helps shorten the >> >> debate >> >>>> and >> >>>>>>> allow us >> >>>>>>>>>>> to make some progress along this direction. >> >>>>>>>>>>> >> >>>>>>>>>>> As to Dawid's idea of having a dedicated cat/db to host the >> >>>>>> temporary >> >>>>>>> temp >> >>>>>>>>>>> functions that don't have namespaces, my only concern is the >> >>>>> special >> >>>>>>>>>>> treatment for a cat/db, which makes code less clean, as >> >>> evident >> >>>> in >> >>>>>>> treating >> >>>>>>>>>>> the built-in catalog currently. >> >>>>>>>>>>> >> >>>>>>>>>>> Thanks, >> >>>>>>>>>>> Xuefiu >> >>>>>>>>>>> >> >>>>>>>>>>> On Mon, Sep 16, 2019 at 5:07 PM Dawid Wysakowicz < >> >>>>>>>>>>> [hidden email]> >> >>>>>>>>>>> wrote: >> >>>>>>>>>>> >> >>>>>>>>>>>> Hi, >> >>>>>>>>>>>> Another idea to consider on top of Timo's suggestion. How >> >>> about >> >>>>> we >> >>>>>>> have a >> >>>>>>>>>>>> special namespace (catalog + database) for built-in >> >> objects? >> >>>> This >> >>>>>>> catalog >> >>>>>>>>>>>> would be invisible for users as Xuefu was suggesting. >> >>>>>>>>>>>> >> >>>>>>>>>>>> Then users could still override built-in functions, if they >> >>>> fully >> >>>>>>> qualify >> >>>>>>>>>>>> object with the built-in namespace, but by default the >> >> common >> >>>>> logic >> >>>>>>> of >> >>>>>>>>>>>> current dB & cat would be used. >> >>>>>>>>>>>> >> >>>>>>>>>>>> CREATE TEMPORARY FUNCTION func ... >> >>>>>>>>>>>> registers temporary function in current cat & dB >> >>>>>>>>>>>> >> >>>>>>>>>>>> CREATE TEMPORARY FUNCTION cat.db.func ... >> >>>>>>>>>>>> registers temporary function in cat db >> >>>>>>>>>>>> >> >>>>>>>>>>>> CREATE TEMPORARY FUNCTION system.system.func ... >> >>>>>>>>>>>> Overrides built-in function with temporary function >> >>>>>>>>>>>> >> >>>>>>>>>>>> The built-in/system namespace would not be writable for >> >>>> permanent >> >>>>>>>>>>> objects. >> >>>>>>>>>>>> WDYT? >> >>>>>>>>>>>> >> >>>>>>>>>>>> This way I think we can have benefits of both solutions. >> >>>>>>>>>>>> >> >>>>>>>>>>>> Best, >> >>>>>>>>>>>> Dawid >> >>>>>>>>>>>> >> >>>>>>>>>>>> >> >>>>>>>>>>>> On Tue, 17 Sep 2019, 07:24 Timo Walther, < >> >> [hidden email] >> >>>> >> >>>>>> wrote: >> >>>>>>>>>>>> >> >>>>>>>>>>>>> Hi Bowen, >> >>>>>>>>>>>>> >> >>>>>>>>>>>>> I understand the potential benefit of overriding certain >> >>>>> built-in >> >>>>>>>>>>>>> functions. I'm open to such a feature if many people >> >> agree. >> >>>>>>> However, it >> >>>>>>>>>>>>> would be great to still support overriding catalog >> >> functions >> >>>>> with >> >>>>>>>>>>>>> temporary functions in order to prototype a query even >> >>> though >> >>>> a >> >>>>>>>>>>>>> catalog/database might not be available currently or >> >> should >> >>>> not >> >>>>> be >> >>>>>>>>>>>>> modified yet. How about we support both cases? >> >>>>>>>>>>>>> >> >>>>>>>>>>>>> CREATE TEMPORARY FUNCTION abs >> >>>>>>>>>>>>> -> creates/overrides a built-in function and never >> >>> consideres >> >>>>>>> current >> >>>>>>>>>>>>> catalog and database; inconsistent with other DDL but >> >>>> acceptable >> >>>>>> for >> >>>>>>>>>>>>> functions I guess. >> >>>>>>>>>>>>> CREATE TEMPORARY FUNCTION cat.db.fun >> >>>>>>>>>>>>> -> creates/overrides a catalog function >> >>>>>>>>>>>>> >> >>>>>>>>>>>>> Regarding "Flink don't have any other built-in objects >> >>>> (tables, >> >>>>>>> views) >> >>>>>>>>>>>>> except functions", this might change in the near future. >> >>> Take >> >>>>>>>>>>>>> https://issues.apache.org/jira/browse/FLINK-13900 as an >> >>>>> example. >> >>>>>>>>>>>>> >> >>>>>>>>>>>>> Thanks, >> >>>>>>>>>>>>> Timo >> >>>>>>>>>>>>> >> >>>>>>>>>>>>> On 14.09.19 01:40, Bowen Li wrote: >> >>>>>>>>>>>>>> Hi Fabian, >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> Yes, I agree 1-part/no-override is the least favorable >> >>> thus I >> >>>>>>> didn't >> >>>>>>>>>>>>>> include that as a voting option, and the discussion is >> >>> mainly >> >>>>>>> between >> >>>>>>>>>>>>>> 1-part/override builtin and 3-part/not override builtin. >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> Re > However, it means that temp functions are >> >> differently >> >>>>>> treated >> >>>>>>>>>>> than >> >>>>>>>>>>>>>> other db objects. >> >>>>>>>>>>>>>> IMO, the treatment difference results from the fact that >> >>>>>> functions >> >>>>>>>>>>> are >> >>>>>>>>>>>> a >> >>>>>>>>>>>>>> bit different from other objects - Flink don't have any >> >>> other >> >>>>>>>>>>> built-in >> >>>>>>>>>>>>>> objects (tables, views) except functions. >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> Cheers, >> >>>>>>>>>>>>>> Bowen >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>> >> >>>>>>>>>>> >> >>>>>>>>>>> -- >> >>>>>>>>>>> Xuefu Zhang >> >>>>>>>>>>> >> >>>>>>>>>>> "In Honey We Trust!" >> >>>>>>>>>>> >> >>>>>>>>> >> >>>>>>>> >> >>>>>>> >> >>>>>>> >> >>>>>> >> >>>>>> -- >> >>>>>> Xuefu Zhang >> >>>>>> >> >>>>>> "In Honey We Trust!" >> >>>>>> >> >>>>> >> >>>> >> >>>> >> >>>> -- >> >>>> Xuefu Zhang >> >>>> >> >>>> "In Honey We Trust!" >> >>>> >> >>> >> >> >> >> >> >> -- >> >> Xuefu Zhang >> >> >> >> "In Honey We Trust!" >> >> >> >> |
After reading Kurt’s reasoning I’m bumping my vote for #2 from -1 to +0, or even +0.5, so my final vote is:
-1 for #1 +0.5 for #2 +1 for #3 Re confusion about “system_db”. I think quite a lot of DBs are storing some meta tables in some system and often hidden db/schema, so I don’t think that if we do the same with built in functions will be that big of a deal. In the end, both for #2 and #3 user will have to check in the documentation what’s the syntax for overriding built-in functions for the first time he will want to do it. Piotrek > On 19 Sep 2019, at 11:03, JingsongLee <[hidden email]> wrote: > > I know Hive and Spark can shadow built-in functions by temporary function. > Mysql, Oracle, Sql server can not shadow. > User can use full names to access functions instead of shadowing. > > So I think it is a completely new thing, and the direct way to deal with new things is to add new grammar. So, > +1 for #2, +0 for #3, -1 for #1 > > Best, > Jingsong Lee > > > ------------------------------------------------------------------ > From:Kurt Young <[hidden email]> > Send Time:2019年9月19日(星期四) 16:43 > To:dev <[hidden email]> > Subject:Re: [DISCUSS] FLIP-57 - Rework FunctionCatalog > > And let me make my vote complete: > > -1 for #1 > +1 for #2 with different keyword > -0 for #3 > > Best, > Kurt > > > On Thu, Sep 19, 2019 at 4:40 PM Kurt Young <[hidden email]> wrote: > >> Looks like I'm the only person who is willing to +1 to #2 for now :-) >> But I would suggest to change the keyword from GLOBAL to >> something like BUILTIN. >> >> I think #2 and #3 are almost the same proposal, just with different >> format to indicate whether it want to override built-in functions. >> >> My biggest reason to choose it is I want this behavior be consistent >> with temporal tables. I will give some examples to show the behavior >> and also make sure I'm not misunderstanding anything here. >> >> For most DBs, when user create a temporary table with: >> >> CREATE TEMPORARY TABLE t1 >> >> It's actually equivalent with: >> >> CREATE TEMPORARY TABLE `curent_db`.t1 >> >> If user change current database, they will not be able to access t1 without >> fully qualified name, .i.e db1.t1 (assuming db1 is current database when >> this temporary table is created). >> >> Only #2 and #3 followed this behavior and I would vote for this since this >> makes such behavior consistent through temporal tables and functions. >> >> Why I'm not voting for #3 is a special catalog and database just looks very >> hacky to me. It gave a imply that our built-in functions saved at a >> special >> catalog and database, which is actually not. Introducing a dedicated >> keyword >> like CREATE TEMPORARY BUILTIN FUNCTION looks more clear and >> straightforward. One can argue that we should avoid introducing new >> keyword, >> but it's also very rare that a system can overwrite built-in functions. >> Since we >> decided to support this, introduce a new keyword is not a big deal IMO. >> >> Best, >> Kurt >> >> >> On Thu, Sep 19, 2019 at 3:07 PM Piotr Nowojski <[hidden email]> >> wrote: >> >>> Hi, >>> >>> It is a quite long discussion to follow and I hope I didn’t misunderstand >>> anything. From the proposals presented by Xuefu I would vote: >>> >>> -1 for #1 and #2 >>> +1 for #3 >>> >>> Besides #3 being IMO more general and more consistent, having qualified >>> names (#3) would help/make easier for someone to use cross >>> databases/catalogs queries (joining multiple data sets/streams). For >>> example with some functions to manipulate/clean up/convert the stored data >>> in different catalogs registered in the respective catalogs. >>> >>> Piotrek >>> >>>> On 19 Sep 2019, at 06:35, Jark Wu <[hidden email]> wrote: >>>> >>>> I agree with Xuefu that inconsistent handling with all the other >>> objects is >>>> not a big problem. >>>> >>>> Regarding to option#3, the special "system.system" namespace may confuse >>>> users. >>>> Users need to know the set of built-in function names to know when to >>> use >>>> "system.system" namespace. >>>> What will happen if user registers a non-builtin function name under the >>>> "system.system" namespace? >>>> Besides, I think it doesn't solve the "explode" problem I mentioned at >>> the >>>> beginning of this thread. >>>> >>>> So here is my vote: >>>> >>>> +1 for #1 >>>> 0 for #2 >>>> -1 for #3 >>>> >>>> Best, >>>> Jark >>>> >>>> >>>> On Thu, 19 Sep 2019 at 08:38, Xuefu Z <[hidden email]> wrote: >>>> >>>>> @Dawid, Re: we also don't need additional referencing the >>> specialcatalog >>>>> anywhere. >>>>> >>>>> True. But once we allow such reference, then user can do so in any >>> possible >>>>> place where a function name is expected, for which we have to handle. >>>>> That's a big difference, I think. >>>>> >>>>> Thanks, >>>>> Xuefu >>>>> >>>>> On Wed, Sep 18, 2019 at 5:25 PM Dawid Wysakowicz < >>>>> [hidden email]> >>>>> wrote: >>>>> >>>>>> @Bowen I am not suggesting introducing additional catalog. I think we >>>>> need >>>>>> to get rid of the current built-in catalog. >>>>>> >>>>>> @Xuefu in option #3 we also don't need additional referencing the >>> special >>>>>> catalog anywhere else besides in the CREATE statement. The resolution >>>>>> behaviour is exactly the same in both options. >>>>>> >>>>>> On Thu, 19 Sep 2019, 08:17 Xuefu Z, <[hidden email]> wrote: >>>>>> >>>>>>> Hi Dawid, >>>>>>> >>>>>>> "GLOBAL" is a temporary keyword that was given to the approach. It >>> can >>>>> be >>>>>>> changed to something else for better. >>>>>>> >>>>>>> The difference between this and the #3 approach is that we only need >>>>> the >>>>>>> keyword for this create DDL. For other places (such as function >>>>>>> referencing), no keyword or special namespace is needed. >>>>>>> >>>>>>> Thanks, >>>>>>> Xuefu >>>>>>> >>>>>>> On Wed, Sep 18, 2019 at 4:32 PM Dawid Wysakowicz < >>>>>>> [hidden email]> >>>>>>> wrote: >>>>>>> >>>>>>>> Hi, >>>>>>>> I think it makes sense to start voting at this point. >>>>>>>> >>>>>>>> Option 1: Only 1-part identifiers >>>>>>>> PROS: >>>>>>>> - allows shadowing built-in functions >>>>>>>> CONS: >>>>>>>> - incosistent with all the other objects, both permanent & temporary >>>>>>>> - does not allow shadowing catalog functions >>>>>>>> >>>>>>>> Option 2: Special keyword for built-in function >>>>>>>> I think this is quite similar to the special catalog/db. The thing I >>>>> am >>>>>>>> strongly against in this proposal is the GLOBAL keyword. This >>> keyword >>>>>>> has a >>>>>>>> meaning in rdbms systems and means a function that is present for a >>>>>>>> lifetime of a session in which it was created, but available in all >>>>>> other >>>>>>>> sessions. Therefore I really don't want to use this keyword in a >>>>>>> different >>>>>>>> context. >>>>>>>> >>>>>>>> Option 3: Special catalog/db >>>>>>>> >>>>>>>> PROS: >>>>>>>> - allows shadowing built-in functions >>>>>>>> - allows shadowing catalog functions >>>>>>>> - consistent with other objects >>>>>>>> CONS: >>>>>>>> - we introduce a special namespace for built-in functions >>>>>>>> >>>>>>>> I don't see a problem with introducing the special namespace. In the >>>>>> end >>>>>>> it >>>>>>>> is very similar to the keyword approach. In this case the catalog/db >>>>>>>> combination would be the "keyword" >>>>>>>> >>>>>>>> Therefore my votes: >>>>>>>> Option 1: -0 >>>>>>>> Option 2: -1 (I might change to +0 if we can come up with a better >>>>>>> keyword) >>>>>>>> Option 3: +1 >>>>>>>> >>>>>>>> Best, >>>>>>>> Dawid >>>>>>>> >>>>>>>> >>>>>>>> On Thu, 19 Sep 2019, 05:12 Xuefu Z, <[hidden email]> wrote: >>>>>>>> >>>>>>>>> Hi Aljoscha, >>>>>>>>> >>>>>>>>> Thanks for the summary and these are great questions to be >>>>> answered. >>>>>>> The >>>>>>>>> answer to your first question is clear: there is a general >>>>> agreement >>>>>> to >>>>>>>>> override built-in functions with temp functions. >>>>>>>>> >>>>>>>>> However, your second and third questions are sort of related, as a >>>>>>>> function >>>>>>>>> reference can be either just function name (like "func") or in the >>>>>> form >>>>>>>> or >>>>>>>>> "cat.db.func". When a reference is just function name, it can mean >>>>>>>> either a >>>>>>>>> built-in function or a function defined in the current cat/db. If >>>>> we >>>>>>>>> support overriding a built-in function with a temp function, such >>>>>>>>> overriding can also cover a function in the current cat/db. >>>>>>>>> >>>>>>>>> I think what Timo referred as "overriding a catalog function" >>>>> means a >>>>>>>> temp >>>>>>>>> function defined as "cat.db.func" overrides a catalog function >>>>> "func" >>>>>>> in >>>>>>>>> cat/db even if cat/db is not current. To support this, temp >>>>> function >>>>>>> has >>>>>>>> to >>>>>>>>> be tied to a cat/db. What's why I said above that the 2nd and 3rd >>>>>>>> questions >>>>>>>>> are related. The problem with such support is the ambiguity when >>>>> user >>>>>>>>> defines a function w/o namespace, "CREATE TEMPORARY FUNCTION func >>>>>> ...". >>>>>>>>> Here "func" can means a global temp function, or a temp function in >>>>>>>> current >>>>>>>>> cat/db. If we can assume the former, this creates an inconsistency >>>>>>>> because >>>>>>>>> "CREATE FUNCTION func" actually means a function in current cat/db. >>>>>> If >>>>>>> we >>>>>>>>> assume the latter, then there is no way for user to create a global >>>>>>> temp >>>>>>>>> function. >>>>>>>>> >>>>>>>>> Giving a special namespace for built-in functions may solve the >>>>>>> ambiguity >>>>>>>>> problem above, but it also introduces artificial catalog/database >>>>>> that >>>>>>>>> needs special treatment and pollutes the cleanness of the code. I >>>>>>> would >>>>>>>>> rather introduce a syntax in DDL to solve the problem, like "CREATE >>>>>>>>> [GLOBAL] TEMPORARY FUNCTION func". >>>>>>>>> >>>>>>>>> Thus, I'd like to summarize a few candidate proposals for voting >>>>>>>> purposes: >>>>>>>>> >>>>>>>>> 1. Support only global, temporary functions without namespace. Such >>>>>>> temp >>>>>>>>> functions overrides built-in functions and catalog functions in >>>>>> current >>>>>>>>> cat/db. The resolution order is: temp functions -> built-in >>>>> functions >>>>>>> -> >>>>>>>>> catalog functions. (Partially or fully qualified functions has no >>>>>>>>> ambiguity!) >>>>>>>>> >>>>>>>>> 2. In addition to #1, support creating and referencing temporary >>>>>>>> functions >>>>>>>>> associated with a cat/db with "GLOBAL" qualifier in DDL for global >>>>>> temp >>>>>>>>> functions. The resolution order is: global temp functions -> >>>>> built-in >>>>>>>>> functions -> temp functions in current cat/db -> catalog function. >>>>>>>>> (Resolution for partially or fully qualified function reference is: >>>>>>> temp >>>>>>>>> functions -> persistent functions.) >>>>>>>>> >>>>>>>>> 3. In addition to #1, support creating and referencing temporary >>>>>>>> functions >>>>>>>>> associated with a cat/db with a special namespace for built-in >>>>>>> functions >>>>>>>>> and global temp functions. The resolution is the same as #2, except >>>>>>> that >>>>>>>>> the special namespace might be prefixed to a reference to a >>>>> built-in >>>>>>>>> function or global temp function. (In absence of the special >>>>>> namespace, >>>>>>>> the >>>>>>>>> resolution order is the same as in #2.) >>>>>>>>> >>>>>>>>> My personal preference is #1, given the unknown use case and >>>>>> introduced >>>>>>>>> complexity for #2 and #3. However, #2 is an acceptable alternative. >>>>>>> Thus, >>>>>>>>> my votes are: >>>>>>>>> >>>>>>>>> +1 for #1 >>>>>>>>> +0 for #2 >>>>>>>>> -1 for #3 >>>>>>>>> >>>>>>>>> Everyone, please cast your vote (in above format please!), or let >>>>> me >>>>>>> know >>>>>>>>> if you have more questions or other candidates. >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Xuefu >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On Wed, Sep 18, 2019 at 6:42 AM Aljoscha Krettek < >>>>>> [hidden email]> >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>>> Hi, >>>>>>>>>> >>>>>>>>>> I think this discussion and the one for FLIP-64 are very >>>>> connected. >>>>>>> To >>>>>>>>>> resolve the differences, think we have to think about the basic >>>>>>>>> principles >>>>>>>>>> and find consensus there. The basic questions I see are: >>>>>>>>>> >>>>>>>>>> - Do we want to support overriding builtin functions? >>>>>>>>>> - Do we want to support overriding catalog functions? >>>>>>>>>> - And then later: should temporary functions be tied to a >>>>>>>>>> catalog/database? >>>>>>>>>> >>>>>>>>>> I don’t have much to say about these, except that we should >>>>>> somewhat >>>>>>>>> stick >>>>>>>>>> to what the industry does. But I also understand that the >>>>> industry >>>>>> is >>>>>>>>>> already very divided on this. >>>>>>>>>> >>>>>>>>>> Best, >>>>>>>>>> Aljoscha >>>>>>>>>> >>>>>>>>>>> On 18. Sep 2019, at 11:41, Jark Wu <[hidden email]> wrote: >>>>>>>>>>> >>>>>>>>>>> Hi, >>>>>>>>>>> >>>>>>>>>>> +1 to strive for reaching consensus on the remaining topics. We >>>>>> are >>>>>>>>>> close to the truth. It will waste a lot of time if we resume the >>>>>>> topic >>>>>>>>> some >>>>>>>>>> time later. >>>>>>>>>>> >>>>>>>>>>> +1 to “1-part/override” and I’m also fine with Timo’s >>>>>> “cat.db.fun” >>>>>>>> way >>>>>>>>>> to override a catalog function. >>>>>>>>>>> >>>>>>>>>>> I’m not sure about “system.system.fun”, it introduces a >>>>>> nonexistent >>>>>>>> cat >>>>>>>>>> & db? And we still need to do special treatment for the dedicated >>>>>>>>>> system.system cat & db? >>>>>>>>>>> >>>>>>>>>>> Best, >>>>>>>>>>> Jark >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> 在 2019年9月18日,06:54,Timo Walther <[hidden email]> 写道: >>>>>>>>>>>> >>>>>>>>>>>> Hi everyone, >>>>>>>>>>>> >>>>>>>>>>>> @Xuefu: I would like to avoid adding too many things >>>>>>> incrementally. >>>>>>>>>> Users should be able to override all catalog objects consistently >>>>>>>>> according >>>>>>>>>> to FLIP-64 (Support for Temporary Objects in Table module). If >>>>>>>> functions >>>>>>>>>> are treated completely different, we need more code and special >>>>>>> cases. >>>>>>>>> From >>>>>>>>>> an implementation perspective, this topic only affects the lookup >>>>>>> logic >>>>>>>>>> which is rather low implementation effort which is why I would >>>>> like >>>>>>> to >>>>>>>>>> clarify the remaining items. As you said, we have a slight >>>>> consenus >>>>>>> on >>>>>>>>>> overriding built-in functions; we should also strive for reaching >>>>>>>>> consensus >>>>>>>>>> on the remaining topics. >>>>>>>>>>>> >>>>>>>>>>>> @Dawid: I like your idea as it ensures registering catalog >>>>>> objects >>>>>>>>>> consistent and the overriding of built-in functions more >>>>> explicit. >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> Timo >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On 17.09.19 11:59, kai wang wrote: >>>>>>>>>>>>> hi, everyone >>>>>>>>>>>>> I think this flip is very meaningful. it supports functions >>>>>> that >>>>>>>> can >>>>>>>>> be >>>>>>>>>>>>> shared by different catalogs and dbs, reducing the >>>>> duplication >>>>>> of >>>>>>>>>> functions. >>>>>>>>>>>>> >>>>>>>>>>>>> Our group based on flink's sql parser module implements >>>>> create >>>>>>>>> function >>>>>>>>>>>>> feature, stores the parsed function metadata and schema into >>>>>>> mysql, >>>>>>>>> and >>>>>>>>>>>>> also customizes the catalog, customizes sql-client to support >>>>>>>> custom >>>>>>>>>>>>> schemas and functions. Loaded, but the function is currently >>>>>>>> global, >>>>>>>>>> and is >>>>>>>>>>>>> not subdivided according to catalog and db. >>>>>>>>>>>>> >>>>>>>>>>>>> In addition, I very much hope to participate in the >>>>> development >>>>>>> of >>>>>>>>> this >>>>>>>>>>>>> flip, I have been paying attention to the community, but >>>>> found >>>>>> it >>>>>>>> is >>>>>>>>>> more >>>>>>>>>>>>> difficult to join. >>>>>>>>>>>>> thank you. >>>>>>>>>>>>> >>>>>>>>>>>>> Xuefu Z <[hidden email]> 于2019年9月17日周二 上午11:19写道: >>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks to Tmo and Dawid for sharing thoughts. >>>>>>>>>>>>>> >>>>>>>>>>>>>> It seems to me that there is a general consensus on having >>>>>> temp >>>>>>>>>> functions >>>>>>>>>>>>>> that have no namespaces and overwrite built-in functions. >>>>> (As >>>>>> a >>>>>>>> side >>>>>>>>>> note >>>>>>>>>>>>>> for comparability, the current user defined functions are >>>>> all >>>>>>>>>> temporary and >>>>>>>>>>>>>> having no namespaces.) >>>>>>>>>>>>>> >>>>>>>>>>>>>> Nevertheless, I can also see the merit of having namespaced >>>>>> temp >>>>>>>>>> functions >>>>>>>>>>>>>> that can overwrite functions defined in a specific cat/db. >>>>>>>> However, >>>>>>>>>> this >>>>>>>>>>>>>> idea appears orthogonal to the former and can be added >>>>>>>>> incrementally. >>>>>>>>>>>>>> >>>>>>>>>>>>>> How about we first implement non-namespaced temp functions >>>>> now >>>>>>> and >>>>>>>>>> leave >>>>>>>>>>>>>> the door open for namespaced ones for later releases as the >>>>>>>>>> requirement >>>>>>>>>>>>>> might become more crystal? This also helps shorten the >>>>> debate >>>>>>> and >>>>>>>>>> allow us >>>>>>>>>>>>>> to make some progress along this direction. >>>>>>>>>>>>>> >>>>>>>>>>>>>> As to Dawid's idea of having a dedicated cat/db to host the >>>>>>>>> temporary >>>>>>>>>> temp >>>>>>>>>>>>>> functions that don't have namespaces, my only concern is the >>>>>>>> special >>>>>>>>>>>>>> treatment for a cat/db, which makes code less clean, as >>>>>> evident >>>>>>> in >>>>>>>>>> treating >>>>>>>>>>>>>> the built-in catalog currently. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> Xuefiu >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Mon, Sep 16, 2019 at 5:07 PM Dawid Wysakowicz < >>>>>>>>>>>>>> [hidden email]> >>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>>> Another idea to consider on top of Timo's suggestion. How >>>>>> about >>>>>>>> we >>>>>>>>>> have a >>>>>>>>>>>>>>> special namespace (catalog + database) for built-in >>>>> objects? >>>>>>> This >>>>>>>>>> catalog >>>>>>>>>>>>>>> would be invisible for users as Xuefu was suggesting. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Then users could still override built-in functions, if they >>>>>>> fully >>>>>>>>>> qualify >>>>>>>>>>>>>>> object with the built-in namespace, but by default the >>>>> common >>>>>>>> logic >>>>>>>>>> of >>>>>>>>>>>>>>> current dB & cat would be used. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> CREATE TEMPORARY FUNCTION func ... >>>>>>>>>>>>>>> registers temporary function in current cat & dB >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> CREATE TEMPORARY FUNCTION cat.db.func ... >>>>>>>>>>>>>>> registers temporary function in cat db >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> CREATE TEMPORARY FUNCTION system.system.func ... >>>>>>>>>>>>>>> Overrides built-in function with temporary function >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> The built-in/system namespace would not be writable for >>>>>>> permanent >>>>>>>>>>>>>> objects. >>>>>>>>>>>>>>> WDYT? >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> This way I think we can have benefits of both solutions. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Best, >>>>>>>>>>>>>>> Dawid >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Tue, 17 Sep 2019, 07:24 Timo Walther, < >>>>> [hidden email] >>>>>>> >>>>>>>>> wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Hi Bowen, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I understand the potential benefit of overriding certain >>>>>>>> built-in >>>>>>>>>>>>>>>> functions. I'm open to such a feature if many people >>>>> agree. >>>>>>>>>> However, it >>>>>>>>>>>>>>>> would be great to still support overriding catalog >>>>> functions >>>>>>>> with >>>>>>>>>>>>>>>> temporary functions in order to prototype a query even >>>>>> though >>>>>>> a >>>>>>>>>>>>>>>> catalog/database might not be available currently or >>>>> should >>>>>>> not >>>>>>>> be >>>>>>>>>>>>>>>> modified yet. How about we support both cases? >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> CREATE TEMPORARY FUNCTION abs >>>>>>>>>>>>>>>> -> creates/overrides a built-in function and never >>>>>> consideres >>>>>>>>>> current >>>>>>>>>>>>>>>> catalog and database; inconsistent with other DDL but >>>>>>> acceptable >>>>>>>>> for >>>>>>>>>>>>>>>> functions I guess. >>>>>>>>>>>>>>>> CREATE TEMPORARY FUNCTION cat.db.fun >>>>>>>>>>>>>>>> -> creates/overrides a catalog function >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Regarding "Flink don't have any other built-in objects >>>>>>> (tables, >>>>>>>>>> views) >>>>>>>>>>>>>>>> except functions", this might change in the near future. >>>>>> Take >>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/FLINK-13900 as an >>>>>>>> example. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>> Timo >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On 14.09.19 01:40, Bowen Li wrote: >>>>>>>>>>>>>>>>> Hi Fabian, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Yes, I agree 1-part/no-override is the least favorable >>>>>> thus I >>>>>>>>>> didn't >>>>>>>>>>>>>>>>> include that as a voting option, and the discussion is >>>>>> mainly >>>>>>>>>> between >>>>>>>>>>>>>>>>> 1-part/override builtin and 3-part/not override builtin. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Re > However, it means that temp functions are >>>>> differently >>>>>>>>> treated >>>>>>>>>>>>>> than >>>>>>>>>>>>>>>>> other db objects. >>>>>>>>>>>>>>>>> IMO, the treatment difference results from the fact that >>>>>>>>> functions >>>>>>>>>>>>>> are >>>>>>>>>>>>>>> a >>>>>>>>>>>>>>>>> bit different from other objects - Flink don't have any >>>>>> other >>>>>>>>>>>>>> built-in >>>>>>>>>>>>>>>>> objects (tables, views) except functions. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Cheers, >>>>>>>>>>>>>>>>> Bowen >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> -- >>>>>>>>>>>>>> Xuefu Zhang >>>>>>>>>>>>>> >>>>>>>>>>>>>> "In Honey We Trust!" >>>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> Xuefu Zhang >>>>>>>>> >>>>>>>>> "In Honey We Trust!" >>>>>>>>> >>>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> Xuefu Zhang >>>>>>> >>>>>>> "In Honey We Trust!" >>>>>>> >>>>>> >>>>> >>>>> >>>>> -- >>>>> Xuefu Zhang >>>>> >>>>> "In Honey We Trust!" >>>>> >>> >>> |
Free forum by Nabble | Edit this page |